Overview
Website indexing allows AIQA to use publicly available internet content to provide answers for questionnaires.
Examples include existing privacy and security pages, help centers, product sites, and an organization’s main website.
The SafeBase web crawler will catalog the websites specified by your organization and save them in a dedicated data store alongside your Trust Center content, uploaded documents, and knowledge base entries.
Note: There is a 100-page limit to prevent performance degradation. Choose websites whose content is substantial and useful to AIQA. This feature is not meant to ingest entire help centers, hundreds of knowledge base articles, etc.
Site Requirements
The site must be publicly accessible and cannot be gated behind a login.
If behind bot/crawler protection, our web crawler user agent SB-AI-Scanner/1.0 must be added to an allowlist.
The following static source IP can be used for exemptions in WAF rules:
35.224.116.27
Note: Our bot is designed to be a good citizen and will not attempt to bypass any anti-scraping measures organizations have on their websites.
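Before submitting a URL, it can help to confirm that the page responds to a request carrying the crawler's user agent. The following is a minimal sketch in Python, not a SafeBase tool: the function names and the status categories are assumptions chosen for illustration, and only the user agent string comes from this article.

```python
import urllib.error
import urllib.request

# User agent string documented in this article.
CRAWLER_USER_AGENT = "SB-AI-Scanner/1.0"


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the crawler user agent."""
    return urllib.request.Request(
        url, headers={"User-Agent": CRAWLER_USER_AGENT}, method="GET"
    )


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough category (illustrative, not official)."""
    if 200 <= status < 300:
        return "ok"
    if status in (401, 403):
        return "blocked-or-gated"  # login wall or bot protection is likely
    if status == 429:
        return "rate-limited"
    return "other"


def check_url(url: str, timeout: float = 10.0) -> str:
    """Fetch the URL once and report how the site responded."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_status(err.code)
    except urllib.error.URLError:
        return "unreachable"
```

Calling `check_url("https://example.com/security")` (a placeholder URL) returns one of the category strings. Two caveats apply: the request originates from your own network rather than from 35.224.116.27, so it cannot verify IP-based WAF exemptions, and an "ok" result does not rule out a login page that is served with a 200 status.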
Additional Information
Website content is indexed when it is added to the system and then re-indexed on a daily recurring cycle. This schedule is subject to change.
One URL specified equates to one page stored in the data store.
We do not follow/crawl any links on the page.
This means sub-page URLs need to be individually provided!
To add or remove a website from the data store, please contact our Support team.
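Because each URL maps to exactly one stored page and links are not followed, the list sent to Support has to name every page explicitly and stay within the 100-page limit. The sketch below shows one way to prepare such a list in Python. The helper names are hypothetical, and treating URLs that differ only by a `#fragment` as the same page is an assumption of this sketch, not documented crawler behavior.

```python
from urllib.parse import urlsplit

# Page limit stated in this article.
PAGE_LIMIT = 100


def normalize(url: str) -> str:
    """Validate an absolute http(s) URL and strip its fragment."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    # Assumption: a fragment points into the same page, so it is dropped.
    return parts._replace(fragment="").geturl()


def prepare_url_list(urls: list[str]) -> list[str]:
    """Deduplicate URLs in order and enforce the page limit."""
    seen: set[str] = set()
    pages: list[str] = []
    for raw in urls:
        url = normalize(raw)
        if url not in seen:
            seen.add(url)
            pages.append(url)
    if len(pages) > PAGE_LIMIT:
        raise ValueError(
            f"{len(pages)} pages requested, but the limit is {PAGE_LIMIT}"
        )
    return pages
```

For example, passing a main page together with each sub-page you want indexed yields one entry per page, while a list that expands past 100 distinct pages raises an error before anything is submitted.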
