Knowledge Sources

Knowledge sources are the content your agent draws on to answer questions. This guide covers how to add, manage, and optimize your agent's knowledge base.

How Knowledge Works

When you add a knowledge source, 88Agents processes it through several steps:

Ingestion — Content is extracted from your source
Chunking — Text is split into smaller, manageable pieces
Embedding — Each chunk is converted to a vector representation
Indexing — Vectors are stored for fast retrieval

When a user asks a question, the system finds the most relevant chunks and provides them to the LLM as context. This is called Retrieval-Augmented Generation (RAG).
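The four steps and the retrieval stage can be illustrated with a minimal sketch. It is not the 88Agents implementation: the function names are hypothetical, and the "embedding" is a toy bag-of-words count vector standing in for a learned embedding model.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 40) -> list[str]:
    """Chunking: split text into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Embedding (toy): word counts instead of a learned vector model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents: list[str], max_words: int = 40) -> list[tuple[str, Counter]]:
    """Indexing: keep every chunk next to its vector."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc, max_words)]


def retrieve(index: list[tuple[str, Counter]], question: str, top_k: int = 2) -> list[str]:
    """Retrieval: return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


docs = [
    "Refunds are available within 30 days of purchase.",
    "Support is open Monday to Friday from 9am to 5pm.",
]
index = build_index(docs)
context = retrieve(index, "When is support open?", top_k=1)

# The retrieved chunks are handed to the LLM as context for its answer.
prompt = "Answer using only this context:\n" + "\n".join(context)
```

In this sketch the question shares three words with the support-hours chunk and none with the refund chunk, so only the support-hours chunk reaches the prompt.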

Note

Your agent can only answer questions based on its knowledge base. If the information isn't in a source, the agent won't know about it.

Source Types

Each source type has its own capture form. Click + Add Source from the Knowledge tab to choose from the available types.

Documents

Upload documents directly to your agent. Supported formats include PDF, Word, plain text, and Markdown. Documents are the simplest source type and usually process quickly.

Click + Add Source → Upload Document
Drag and drop or browse to select files
Click Upload

Tip

For best results, ensure PDFs are text-based (not scanned images). Scanned documents may not extract correctly.

Websites

Crawl and index web pages as knowledge sources. Websites require a crawler integration (like Firecrawl) configured in your workspace credentials.

Enter a starting URL and configure the crawl limits: max pages, max depth, and included or excluded paths. Indexing begins right away, but large sites can take a while.
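To make the effect of these limits concrete, here is a rough sketch of how a crawler might decide whether a discovered URL is in scope. The `CrawlLimits` fields and the `in_scope` function are hypothetical names chosen to mirror the form's options; the actual crawler integration may apply its rules differently (for example, with pattern matching rather than path prefixes).

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class CrawlLimits:
    # Hypothetical field names mirroring the options on the form.
    max_pages: int = 100
    max_depth: int = 2
    include_paths: list[str] = field(default_factory=list)
    exclude_paths: list[str] = field(default_factory=list)


def in_scope(url: str, start_url: str, depth: int, pages_seen: int, limits: CrawlLimits) -> bool:
    """Return True if url should be crawled under the given limits."""
    start, target = urlparse(start_url), urlparse(url)
    if target.netloc != start.netloc:
        return False  # stay on the starting site
    if depth > limits.max_depth or pages_seen >= limits.max_pages:
        return False  # depth or page budget exhausted
    path = target.path or "/"
    if any(path.startswith(p) for p in limits.exclude_paths):
        return False  # exclusions win over inclusions in this sketch
    if limits.include_paths and not any(path.startswith(p) for p in limits.include_paths):
        return False  # an include list, when present, acts as an allow-list
    return True


limits = CrawlLimits(max_pages=50, max_depth=2,
                     include_paths=["/docs"], exclude_paths=["/docs/archive"])
start = "https://example.com/docs"

ok = in_scope("https://example.com/docs/setup", start, depth=1, pages_seen=0, limits=limits)
archived = in_scope("https://example.com/docs/archive/2019", start, depth=1, pages_seen=0, limits=limits)
off_path = in_scope("https://example.com/blog/post", start, depth=1, pages_seen=0, limits=limits)
too_deep = in_scope("https://example.com/docs/a/b/c", start, depth=3, pages_seen=0, limits=limits)
```

Here only the first URL is crawled; the others are rejected by the exclude list, the include list, and the depth limit respectively.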

Figure: Add Website form with crawler configuration

Note

Crawl results can vary from one website to another. If the crawler runs into problems, feedback appears on the Knowledge Sources page.

Datasets

Datasets are for structured data, and adding one is a two-step process: upload a CSV or JSON file, then configure how it gets indexed.

Click + Add Source → Add Dataset
Upload your CSV or JSON file

After upload, the file is analysed and a configuration screen appears. Set the dataset title and primary key, then review the Markdown template that will be used to generate indexed content from each record.

Figure: Dataset configuration with Markdown template

Click Create Dataset to begin indexing. The template is applied to each record, and the resulting content is stored in your vector database, ready to answer questions.
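As an illustration of what applying a template to each record means, the sketch below renders one Markdown document per CSV row, keyed by the primary key. The sample data, the `render_records` helper, and the `$field` placeholder syntax (Python's `string.Template`) are all assumptions made for the example; the placeholder syntax used by the actual configuration screen may differ.

```python
import csv
import io
from string import Template

# Hypothetical dataset: three columns, with "sku" as the primary key.
csv_text = """sku,name,price
A1,Desk Lamp,29.00
B2,Office Chair,149.00
"""

# Hypothetical Markdown template; $field is replaced by that column's value.
template = Template("## $name\n\n- SKU: $sku\n- Price: $price\n")


def render_records(raw_csv: str, template: Template, primary_key: str) -> dict[str, str]:
    """Render one Markdown document per record, keyed by the primary key."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return {row[primary_key]: template.substitute(row) for row in reader}


rendered = render_records(csv_text, template, primary_key="sku")
print(rendered["A1"])
```

Each rendered document is what would then be chunked, embedded, and indexed, so the wording of the template directly shapes what the agent can retrieve.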

FAQs

Add question-answer pairs for common queries. FAQs are simple to set up and process quickly. They are useful for exact-match scenarios where you want precise, controlled answers.

Transcripts

Ingest YouTube content by providing a video URL, playlist URL, or channel URL. Transcripts are extracted automatically.
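The three accepted URL kinds can be told apart by their shape. The following is a small sketch of such a classifier based on common public YouTube URL patterns; `classify_youtube_url` is a hypothetical helper, not part of the product, and the platform may recognise more or fewer forms than this.

```python
from urllib.parse import parse_qs, urlparse


def classify_youtube_url(url: str) -> str:
    """Return "video", "playlist", "channel", or "unknown" for a URL."""
    parsed = urlparse(url)
    host = parsed.netloc.lower().removeprefix("www.").removeprefix("m.")
    path = parsed.path
    query = parse_qs(parsed.query)

    # Short links such as https://youtu.be/<id> always point at a video.
    if host == "youtu.be" and path.strip("/"):
        return "video"
    if host != "youtube.com":
        return "unknown"
    if path == "/watch" and "v" in query:
        return "video"
    if path == "/playlist" and "list" in query:
        return "playlist"
    # Channels appear under several path prefixes, including @handles.
    if path.startswith(("/channel/", "/@", "/c/", "/user/")):
        return "channel"
    return "unknown"


kinds = [
    classify_youtube_url("https://www.youtube.com/watch?v=abc123"),
    classify_youtube_url("https://youtu.be/abc123"),
    classify_youtube_url("https://www.youtube.com/playlist?list=PL123"),
    classify_youtube_url("https://www.youtube.com/@somechannel"),
]
```

A watch URL that also carries a `list` parameter is treated as a single video in this sketch, which is one possible choice rather than a documented rule.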

Note

Results can vary when crawling YouTube channels and playlists. If problems occur, feedback appears on the Knowledge Sources page.

Managing Sources

The Knowledge tab shows all your sources with their current status. Long-running tasks display progress updates until they complete.

Figure: Knowledge sources list showing processing status

From this page you can:

View status — See processing state (Processing, Completed, Error)
Re-sync sources — Refresh content from web sources or YouTube
Edit sources — Update configuration
Delete sources — Remove sources from the knowledge base

Note

Changes to knowledge sources take effect immediately. There's no need to redeploy your agent.

Best Practices

Quality over quantity — A focused, accurate knowledge base outperforms a large, noisy one
Keep content current — Set up regular re-syncs for website sources or update files when information changes
Structure your content — Use clear headings and organize information logically
Test after changes — Always test your agent in the Playground after adding or modifying sources
Remove outdated sources — Delete sources that contain obsolete information