Sources
In Biel.ai, you can configure various types of sources to index content for your chatbot to reference. These sources allow the chatbot to provide accurate and relevant answers based on the content you want to make accessible. Each source type serves different needs, depending on the scope of content you want to index.
This guide will explain the different source types, when to use them, and how to configure each.
Source types
1. URLs
Use this method when you want to index specific individual web pages. It's ideal for cases where you need the chatbot to reference a particular piece of content, such as a blog post, article, or landing page.
Key points:
- Best for single-page indexing.
- Not suitable for indexing an entire website. To index an entire website or a large number of pages, use a sitemap instead.
2. Sitemaps
If you need to index an entire website or multiple pages at once, use the sitemap method. Sitemaps are XML files that contain a list of all the pages on your website, allowing Biel.ai to automatically crawl and index them.
Key points:
- Ideal for large websites with multiple pages.
- Supports nested sitemaps, allowing comprehensive indexing of large sites.
- Sitemaps must end with .xml.
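To make the idea of nested sitemaps concrete, here is a minimal Python sketch of how a sitemap index can be expanded into a flat list of page URLs. It follows the public sitemaps.org XML format and is only an illustration, not a description of how Biel.ai crawls internally. The `example.com` URLs, the in-memory `SITEMAPS` dictionary (standing in for HTTP responses), and the `collect_page_urls` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap documents keyed by URL, standing in for HTTP responses.
SITEMAPS = {
    "https://example.com/sitemap.xml": (
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<sitemap><loc>https://example.com/sitemap-docs.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/sitemap-blog.xml</loc></sitemap>"
        "</sitemapindex>"
    ),
    "https://example.com/sitemap-docs.xml": (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/docs/intro</loc></url>"
        "<url><loc>https://example.com/docs/setup</loc></url>"
        "</urlset>"
    ),
    "https://example.com/sitemap-blog.xml": (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/blog/hello</loc></url>"
        "</urlset>"
    ),
}


def collect_page_urls(sitemap_url, fetch):
    """Return the page URLs in a sitemap, following nested sitemap indexes."""
    # Mirrors the rule above: a sitemap source must end with .xml.
    if not sitemap_url.endswith(".xml"):
        raise ValueError(f"sitemap URL must end with .xml: {sitemap_url}")

    root = ET.fromstring(fetch(sitemap_url))

    # A <sitemapindex> lists other sitemaps; a <urlset> lists pages.
    if root.tag.endswith("sitemapindex"):
        urls = []
        for loc in root.findall("sm:sitemap/sm:loc", NS):
            urls.extend(collect_page_urls(loc.text.strip(), fetch))
        return urls
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


pages = collect_page_urls("https://example.com/sitemap.xml", SITEMAPS.__getitem__)
for page in pages:
    print(page)
```

Running a check like this against your own sitemap before adding it as a source can help confirm that the pages you expect to be indexed are actually listed.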
3. Files
Use this method to upload and index various document formats, such as:
- CSV
- Excel
- Word
- TXT
- MD
This method is useful when you want the chatbot to reference information from static documents.
Key points:
- Accepted file formats include .pdf, .word, .csv, .xls, .md, and .txt.
- Files remain private and are accessible only to the chatbot for generating responses.
- Widgets will not expose links to the files as sources or search results.
- This feature is currently under development and will be available soon.
Configuring sources
Regardless of the source type, configuring sources in Biel.ai follows a similar process. Here’s a general guide on how to add a source to your project:
- Open app.biel.ai.
- Log in with your account credentials.
- In the dashboard, click Projects in the top navigation bar.
- Find and select your project from the list.
- Click Settings.
- In the Sources section, choose the type of source you want to add (URL, Sitemap, or Files):
  - For URLs, enter the full URL of the web page you want to index.
  - For Sitemaps, enter the URL of the .xml file that lists the pages to index.
  - For Files (coming soon), you will be able to upload documents directly.
- Click Save to apply your changes.
Once saved, Biel.ai will start indexing the content from the selected source(s). You can verify the setup by interacting with the chatbot and checking whether it references the indexed content in its responses.
Filtering sources
When implementing a chatbot for documentation, it is crucial that the chatbot retrieves relevant information from a large, complex knowledge base. Biel.ai provides a Filter Sources feature that lets you define specific URL patterns the chatbot should include or exclude. By keeping irrelevant pages out of the chatbot’s scope, this feature improves both the chatbot's accuracy and the user experience.
The filter supports both standard URL patterns and regular expressions (regex), allowing precise control over which URLs are crawled.
To filter sources, follow these steps:
- Open app.biel.ai.
- Log in with your account credentials.
- In the dashboard, click Projects in the top navigation bar.
- Find and select your project from the list.
- Click Settings.
- In the Sources section, scroll to Filter URLs.
- Add patterns to include or exclude as needed:
  Examples:
  - Setting /es to Exclude will prevent the chatbot from crawling any pages containing /es in the URL.
  - Setting /es to Include will only include pages that contain /es in the URL.
  - Setting /es to Include and /post to Exclude will only include pages that contain /es, excluding those that also contain /post.
  - Using a regex pattern such as ^https://example.com/docs/old/.*$ with Exclude will filter out all URLs under the /docs/old/ path.
- Click Save to apply your changes.
This filtering option allows you to refine the chatbot’s knowledge base, ensuring that users only access relevant documentation content.
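The include and exclude examples above can be pictured with a short Python sketch. It is only a model of the described behavior, not Biel.ai's actual matching code. It assumes that every pattern is applied as a regular-expression search (so a plain pattern such as /es behaves like a "contains" test) and that Exclude takes precedence over Include. The `is_allowed` function and the `example.com` URLs are hypothetical.

```python
import re


def is_allowed(url, include=(), exclude=()):
    """Return True if a URL passes the include and exclude patterns.

    Assumptions (not confirmed by the Biel.ai documentation):
    - each pattern is applied with re.search, so "/es" acts as a substring test;
    - an Exclude match always wins over an Include match;
    - with no Include patterns, every URL that is not excluded is allowed.
    """
    if any(re.search(pattern, url) for pattern in exclude):
        return False
    if include:
        return any(re.search(pattern, url) for pattern in include)
    return True


urls = [
    "https://example.com/docs/intro",
    "https://example.com/es/docs/intro",
    "https://example.com/es/post/hello",
    "https://example.com/docs/old/v1",
]

# Exclude /es: drops both Spanish pages.
no_spanish = [u for u in urls if is_allowed(u, exclude=["/es"])]

# Include /es and exclude /post: keeps only the Spanish docs page.
spanish_docs = [u for u in urls if is_allowed(u, include=["/es"], exclude=["/post"])]

# Regex exclude: drops everything under /docs/old/.
current_only = [
    u for u in urls
    if is_allowed(u, exclude=[r"^https://example.com/docs/old/.*$"])
]

print(no_spanish)
print(spanish_docs)
print(current_only)
```

Under this substring assumption, a short pattern such as /es would also match a path like /essays. If that matters for your site, a more specific pattern such as /es/ or an anchored regex may be a safer choice; test the result in the chatbot after saving.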