AI Questionnaire Assistance (AIQA) - Website Indexing

AIQA can index external websites and ingest their content for use by the AIQA tool.

Written by Matt Szczurek
Updated this week

Overview

Website indexing allows AIQA to use publicly available internet content to provide answers for questionnaires.

Examples include existing privacy and security pages, help centers, product sites, and an organization’s main website.

The AIQA web crawler catalogs the websites your organization specifies and saves them in a dedicated data store alongside your Trust Center content, uploaded documents, and knowledge base entries.

Note: There is a 100-page limit to prevent performance degradation. Choose websites whose content is substantive and useful to the AIQA tool. This feature is not meant to ingest entire help centers, hundreds of knowledge base articles, etc.

Site Requirements

  • The site must be publicly accessible and cannot be gated behind a login.

  • If the site is behind bot or crawler protection, our web crawler's user agent, SB-AI-Scanner/1.0, must be added to an allowlist.

  • The following static source IP can be used for exemptions in WAF rules.

    • 35.224.116.27

Note: Our bot is designed to be a good citizen and will not attempt to bypass any anti-scraping measures organizations have on their websites.
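
As a rough illustration of how an allowlist check for the crawler might look, the sketch below matches an incoming request against the user agent and static source IP listed above. The function name is_aiqa_crawler is hypothetical, and requiring both values to match is an assumption made for this example; your WAF or bot-protection product will have its own rule syntax.

```python
import ipaddress

# Values taken from the site requirements above.
AIQA_USER_AGENT = "SB-AI-Scanner/1.0"
AIQA_SOURCE_IP = ipaddress.ip_address("35.224.116.27")


def is_aiqa_crawler(user_agent: str, source_ip: str) -> bool:
    """Return True when a request matches the documented user agent and source IP.

    Hypothetical helper: requiring BOTH values to match is an assumption made
    for this sketch, not a rule stated by AIQA.
    """
    try:
        ip_matches = ipaddress.ip_address(source_ip) == AIQA_SOURCE_IP
    except ValueError:
        # A malformed address never matches.
        return False
    return ip_matches and AIQA_USER_AGENT in user_agent


if __name__ == "__main__":
    print(is_aiqa_crawler("SB-AI-Scanner/1.0", "35.224.116.27"))
    print(is_aiqa_crawler("Mozilla/5.0", "35.224.116.27"))
```

Checking the source IP in addition to the user agent is a design choice for this sketch: a user agent string is easy to spoof, so pairing it with the static IP gives a narrower exemption.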

Additional Information

  • Website content is indexed twice weekly, on Monday and Thursday mornings. This schedule is subject to change.

  • Each URL you specify corresponds to exactly one page in the data store. AIQA does not follow or crawl any links, sub-links, or directories on that page.

    • This means sub-page URLs need to be individually provided!

  • If a website is removed, its previously indexed content remains in the data store until the next sync detects the removal and deletes it.

  • To add or remove a website from the data store, please contact our Support team.
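
Because each URL maps to exactly one stored page and the total is capped at 100 pages, it can help to tidy your URL list before sending it to Support. The sketch below is one possible way to do that. The function name prepare_url_list and the example.com URLs are placeholders, and the check for absolute http(s) URLs is an assumption based on the requirement that sites be publicly accessible.

```python
from urllib.parse import urlsplit

PAGE_LIMIT = 100  # page limit stated in the Overview note


def prepare_url_list(urls):
    """Return de-duplicated URLs in their original order, enforcing the page limit.

    Hypothetical helper: every sub-page must appear as its own entry,
    since linked pages are not crawled.
    """
    seen = set()
    cleaned = []
    for raw in urls:
        url = raw.strip()
        parts = urlsplit(url)
        # Assumption: only absolute http(s) URLs are valid submissions.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {raw!r}")
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    if len(cleaned) > PAGE_LIMIT:
        raise ValueError(f"{len(cleaned)} URLs exceed the {PAGE_LIMIT}-page limit")
    return cleaned


if __name__ == "__main__":
    # Placeholder URLs; each sub-page is listed individually.
    pages = [
        "https://example.com/security",
        "https://example.com/security/encryption",
        "https://example.com/privacy",
        "https://example.com/security",  # duplicate, dropped
    ]
    for page in prepare_url_list(pages):
        print(page)
```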
