AI Questionnaire Assistance (AIQA) - Website Indexing

SafeBase can index external websites and ingest their content for the AIQA tool.

Written by Matt Szczurek
Updated over 3 months ago

Overview

Website indexing allows AIQA to use publicly available internet content to provide answers for questionnaires.

Examples include existing privacy and security pages, help centers, product sites, and an organization’s main website.

The SafeBase web crawler catalogs the websites your organization specifies and saves them in a dedicated data store alongside its Trust Center content, uploaded documents, and knowledge base entries.

Note: There is a 100-page limit to prevent performance degradation. Choose websites whose content is substantial and useful to the AIQA tool. This feature is not meant to ingest entire help centers, hundreds of knowledge base articles, etc.

Site Requirements

  • The site must be publicly accessible and cannot be gated behind a login.

  • If the site is behind bot/crawler protection, our web crawler user agent SB-AI-Scanner/1.0 must be added to an allowlist.

  • The following static source IP can be used for exemptions in WAF rules.

    • 35.224.116.27

Note: Our bot is designed to be a good citizen and will not attempt to bypass any anti-scraping measures organizations have on their websites.
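
As a rough illustration of the allowlist requirement above, the following Python sketch shows how a site-side check might recognize the crawler. The function name is hypothetical, and requiring both the user agent and the source IP to match is a stricter choice than the requirements state; it also assumes the user agent string appears somewhere in the request's User-Agent header. Real WAF products express this as rules in their own configuration, so treat this only as a model of the logic.

```python
import ipaddress

# Values taken from the site requirements above.
CRAWLER_USER_AGENT = "SB-AI-Scanner/1.0"
CRAWLER_SOURCE_IP = ipaddress.ip_address("35.224.116.27")


def is_safebase_crawler(user_agent: str, source_ip: str) -> bool:
    """Hypothetical helper: True only when user agent and source IP both match."""
    try:
        ip = ipaddress.ip_address(source_ip)
    except ValueError:
        # An unparseable address can never be the crawler.
        return False
    # Assumption: the token may be embedded in a longer User-Agent header.
    return CRAWLER_USER_AGENT in user_agent and ip == CRAWLER_SOURCE_IP
```

A request that passes this check could then be exempted from bot protection, while all other traffic continues through the normal rules.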

Additional Information

Website content is indexed when it is added to the system and then re-indexed on a daily recurring cycle. This schedule is subject to change.

One URL specified equates to one page stored in the data store.

We do not follow/crawl any links on the page.

This means each sub-page URL must be provided individually!
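
Because each URL maps to exactly one stored page and links are not followed, it can help to prepare the full list of pages before contacting Support. The Python sketch below is one possible way to do that: it removes duplicates, rejects entries that are not absolute http(s) URLs, and checks the list against the 100-page limit. The function name and validation rules are assumptions for illustration and are not part of any SafeBase tooling.

```python
from urllib.parse import urlsplit

MAX_PAGES = 100  # page limit stated in the Overview note


def prepare_url_list(urls: list[str]) -> list[str]:
    """Hypothetical helper: deduplicate URLs in order and enforce the page limit."""
    seen = set()
    pages = []
    for url in urls:
        url = url.strip()
        parts = urlsplit(url)
        # Each entry must be a full page URL, since no links are crawled.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {url!r}")
        if url not in seen:
            seen.add(url)
            pages.append(url)
    if len(pages) > MAX_PAGES:
        raise ValueError(f"{len(pages)} pages exceeds the {MAX_PAGES}-page limit")
    return pages
```

For example, to index a security page and its sub-pages, every sub-page URL would appear in the input list as its own entry.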

To add or remove a website from the data store, please contact our Support team.