Agents Websites
API endpoints for Agents Websites
/api/agents/{agent_id}/websites List returns websites for an agent.
Retrieves paginated website knowledge sources for the specified agent. Returns 400 for invalid agent ID, 500 on service error, 200 OK with paginated websites on success.
Response Codes
| Status Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request |
| 500 | Internal server error |
Success
Response response.PaginatedWebsites
Paginated Websites
| Property | Type | Description |
|---|---|---|
| data | string[] | Array of websites for current page |
| meta | string | Pagination metadata |
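The pagination query parameters and the exact shape of `meta` are not documented in this section, so the sketch below assumes a `page`/`total_pages` pair in `meta` and takes a caller-supplied `fetch_page` function standing in for the HTTP GET — both are assumptions for illustration only.

```python
from typing import Callable, Iterator

def iter_websites(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every website across all pages of the paginated response.

    `fetch_page(page)` is assumed to GET /api/agents/{agent_id}/websites
    for the given page and return the decoded PaginatedWebsites body.
    The meta fields `page` and `total_pages` are assumptions, not
    documented here.
    """
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["data"]
        meta = body["meta"]
        if meta["page"] >= meta["total_pages"]:
            break
        page += 1

# Stub fetcher standing in for real HTTP calls, for illustration only.
def _stub_fetch(page: int) -> dict:
    pages = {
        1: {"data": [{"id": 1}, {"id": 2}], "meta": {"page": 1, "total_pages": 2}},
        2: {"data": [{"id": 3}], "meta": {"page": 2, "total_pages": 2}},
    }
    return pages[page]

all_sites = list(iter_websites(_stub_fetch))
```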
/api/agents/{agent_id}/websites Create creates a website knowledge source.
Creates a website source for crawling. The website is stored with a queued ingest status and will be picked up by the crawl worker. Returns 404 if agent not found, 422 for validation errors, 500 on creation failure, 201 Created with the website on success.
Request Body request.Website
Website
| Property | Type | Description |
|---|---|---|
| url* | string | Starting URL to crawl |
| title* | string | Display title for the website |
| crawl_integration_id | integer (int64) \| null | ID of crawl integration to use (defaults to workspace default if not specified) |
| max_pages | integer (int32) \| null | Maximum number of pages to crawl (1-10000) |
| max_depth | integer (int32) \| null | Maximum link depth to follow (1-10) |
| crawl_frequency | "daily" \| "weekly" \| "monthly" \| "manual" | How often to recrawl: daily, weekly, monthly, or manual |
| excluded_paths | string | URL path patterns to exclude from crawling |
| included_paths | string | URL path patterns to include (overrides excludes) |
| auto_crawl | boolean | Start crawling immediately after creation |
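The request body carries several documented constraints (required `url` and `title`, `max_pages` in 1-10000, `max_depth` in 1-10, a fixed `crawl_frequency` enum). A minimal client-side check mirroring those constraints can catch mistakes before the server returns 422; the server remains the source of truth for validation.

```python
ALLOWED_FREQUENCIES = {"daily", "weekly", "monthly", "manual"}

def validate_website_payload(payload: dict) -> dict:
    """Sanity-check a create-website body against the documented limits."""
    for field in ("url", "title"):
        if not payload.get(field):
            raise ValueError(f"{field} is required")
    max_pages = payload.get("max_pages")
    if max_pages is not None and not 1 <= max_pages <= 10000:
        raise ValueError("max_pages must be between 1 and 10000")
    max_depth = payload.get("max_depth")
    if max_depth is not None and not 1 <= max_depth <= 10:
        raise ValueError("max_depth must be between 1 and 10")
    freq = payload.get("crawl_frequency")
    if freq is not None and freq not in ALLOWED_FREQUENCIES:
        raise ValueError(f"crawl_frequency must be one of {sorted(ALLOWED_FREQUENCIES)}")
    return payload

ok = validate_website_payload(
    {"url": "https://example.com/docs", "title": "Docs",
     "max_pages": 500, "crawl_frequency": "weekly"}
)
```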
Response Codes
| Status Code | Description |
|---|---|
| 201 | Resource created successfully |
| 404 | Resource not found |
| 422 | Validation failed |
| 500 | Internal server error |
Resource created successfully
Response response.WebsiteResponse
Website
| Property | Type | Description |
|---|---|---|
| id | integer (int64) | Unique website identifier |
| agent_id | integer (int64) | ID of agent this website belongs to |
| crawl_integration_id | integer (int64) \| null | ID of crawl integration to use |
| url | string | Starting URL |
| title | string | Display title |
| ingest_status | string | Crawl status (pending, crawling, completed, failed) |
| ingest_error | string \| null | Error message if crawl failed |
| max_pages | integer (int32) | Maximum number of pages to crawl |
| pages_crawled | integer (int32) | Number of pages crawled so far |
| total_pages | integer (int32) | Total number of pages discovered |
| credits_used | integer (int32) | Number of crawler credits used |
| crawl_frequency | string | How often to recrawl: daily, weekly, monthly, or manual |
| excluded_paths | object | URL path patterns to exclude |
| included_paths | object | URL path patterns to include |
| max_depth | integer (int32) \| null | Maximum link depth to follow |
| last_crawled_at | string (date-time) \| null | Timestamp when site was last crawled |
| next_crawl_at | string (date-time) \| null | Timestamp when next crawl is scheduled |
| last_crawl_result | string \| null | Result message from last crawl |
| created_at | string (date-time) | Timestamp when website was created |
| updated_at | string (date-time) | Timestamp when website was last updated |
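The `pages_crawled` and `total_pages` fields make it straightforward to show crawl progress in a client. One way a client might compute a percentage from a decoded WebsiteResponse body (the formula is an illustration, not the server's own definition of progress):

```python
def crawl_progress(website: dict) -> float:
    """Percentage of discovered pages crawled, from a WebsiteResponse body.

    Returns 0.0 while total_pages is still 0 (discovery not started),
    and caps at 100.0 in case pages_crawled briefly overshoots.
    """
    total = website.get("total_pages") or 0
    if total <= 0:
        return 0.0
    return min(100.0, 100.0 * website["pages_crawled"] / total)

pct = crawl_progress({"pages_crawled": 25, "total_pages": 100,
                      "ingest_status": "crawling"})
```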
/api/agents/{agent_id}/websites/{id} Get returns a website by ID.
Retrieves a single website by its ID. Returns 400 for invalid ID, 404 if the website is not found or belongs to a different agent, 200 OK with the website on success.
Response Codes
| Status Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request |
| 404 | Resource not found |
Success
Response response.WebsiteResponse
Website
| Property | Type | Description |
|---|---|---|
| id | integer (int64) | Unique website identifier |
| agent_id | integer (int64) | ID of agent this website belongs to |
| crawl_integration_id | integer (int64) \| null | ID of crawl integration to use |
| url | string | Starting URL |
| title | string | Display title |
| ingest_status | string | Crawl status (pending, crawling, completed, failed) |
| ingest_error | string \| null | Error message if crawl failed |
| max_pages | integer (int32) | Maximum number of pages to crawl |
| pages_crawled | integer (int32) | Number of pages crawled so far |
| total_pages | integer (int32) | Total number of pages discovered |
| credits_used | integer (int32) | Number of crawler credits used |
| crawl_frequency | string | How often to recrawl: daily, weekly, monthly, or manual |
| excluded_paths | object | URL path patterns to exclude |
| included_paths | object | URL path patterns to include |
| max_depth | integer (int32) \| null | Maximum link depth to follow |
| last_crawled_at | string (date-time) \| null | Timestamp when site was last crawled |
| next_crawl_at | string (date-time) \| null | Timestamp when next crawl is scheduled |
| last_crawl_result | string \| null | Result message from last crawl |
| created_at | string (date-time) | Timestamp when website was created |
| updated_at | string (date-time) | Timestamp when website was last updated |
/api/agents/{agent_id}/websites/{id} Update updates a website.
Updates the specified website with new metadata or settings. Returns 400 for invalid ID, 404 if not found, 422 for validation errors, 200 OK with the updated website on success.
Request Body request.WebsiteUpdate
Website Update
| Property | Type | Description |
|---|---|---|
| title | string \| null | Display title for the website |
| crawl_integration_id | integer (int64) \| null | ID of crawl integration to use |
| max_pages | integer (int32) \| null | Maximum number of pages to crawl (1-10000) |
| max_depth | integer (int32) \| null | Maximum link depth to follow (1-10) |
| crawl_frequency | "daily" \| "weekly" \| "monthly" \| "manual" | How often to recrawl: daily, weekly, monthly, or manual |
| excluded_paths | string | URL path patterns to exclude from crawling |
| included_paths | string | URL path patterns to include (overrides excludes) |
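Every property in WebsiteUpdate is optional, so an update body should contain only the fields being changed. A small builder sketch; dropping `None` values is a conservative assumption here, since the section does not document how the server treats an explicit null:

```python
def build_update_payload(**fields) -> dict:
    """Build a WebsiteUpdate body containing only the fields you pass.

    None values are dropped as a conservative default (an assumption,
    not documented behaviour), so omitted/None fields leave the stored
    values untouched.
    """
    allowed = {"title", "crawl_integration_id", "max_pages", "max_depth",
               "crawl_frequency", "excluded_paths", "included_paths"}
    unknown = set(fields) - allowed
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return {k: v for k, v in fields.items() if v is not None}

body = build_update_payload(title="New title", max_pages=None,
                            crawl_frequency="manual")
```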
Response Codes
| Status Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request |
| 404 | Resource not found |
| 422 | Validation failed |
| 500 | Internal server error |
Success
Response response.WebsiteResponse
Website
| Property | Type | Description |
|---|---|---|
| id | integer (int64) | Unique website identifier |
| agent_id | integer (int64) | ID of agent this website belongs to |
| crawl_integration_id | integer (int64) \| null | ID of crawl integration to use |
| url | string | Starting URL |
| title | string | Display title |
| ingest_status | string | Crawl status (pending, crawling, completed, failed) |
| ingest_error | string \| null | Error message if crawl failed |
| max_pages | integer (int32) | Maximum number of pages to crawl |
| pages_crawled | integer (int32) | Number of pages crawled so far |
| total_pages | integer (int32) | Total number of pages discovered |
| credits_used | integer (int32) | Number of crawler credits used |
| crawl_frequency | string | How often to recrawl: daily, weekly, monthly, or manual |
| excluded_paths | object | URL path patterns to exclude |
| included_paths | object | URL path patterns to include |
| max_depth | integer (int32) \| null | Maximum link depth to follow |
| last_crawled_at | string (date-time) \| null | Timestamp when site was last crawled |
| next_crawl_at | string (date-time) \| null | Timestamp when next crawl is scheduled |
| last_crawl_result | string \| null | Result message from last crawl |
| created_at | string (date-time) | Timestamp when website was created |
| updated_at | string (date-time) | Timestamp when website was last updated |
/api/agents/{agent_id}/websites/{id} Delete deletes a website.
Permanently removes a website and all associated chunks and embeddings. Cannot delete while processing. Returns 400 for invalid ID, 404 if not found, 409 if currently processing, 500 on service error, 204 No Content on success.
Response Codes
| Status Code | Description |
|---|---|
| 204 | Success with no content |
| 400 | Invalid request |
| 404 | Resource not found |
| 409 | Conflict |
| 500 | Internal server error |
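Since delete can fail in several documented ways, including the 409 case mentioned above for websites still processing, a client may want to map each status to an action. A sketch of such a dispatch (the action strings are illustrative, not API output):

```python
def handle_delete_response(status_code: int) -> str:
    """Map the documented delete status codes to a suggested client action."""
    actions = {
        204: "deleted",
        400: "fix the website ID in the request",
        404: "website does not exist (or was already deleted)",
        409: "crawl in progress; cancel it or retry later",
        500: "server error; retry with backoff",
    }
    return actions.get(status_code, "unexpected status; inspect response body")
```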
/api/agents/{agent_id}/websites/{id}/sources Sources returns paginated sources (crawled pages) for a website.
Retrieves all sources associated with a website, representing the individual pages that were crawled. Returns 400 for invalid ID, 404 if the website is not found or belongs to a different agent, 500 on service error, 200 OK with paginated sources on success.
Response Codes
| Status Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request |
| 404 | Resource not found |
| 500 | Internal server error |
Success
Response response.PaginatedSources
Paginated Sources
| Property | Type | Description |
|---|---|---|
| data | string[] | Array of sources for current page |
| meta | string | Pagination metadata |
/api/agents/{agent_id}/websites/{id}/crawl Crawl triggers a crawl of the website.
Queues the website for crawling, which includes fetching pages, parsing content, generating chunks, and updating embeddings. Returns 400 for invalid ID, 404 if not found, 409 if already crawling, 500 on service error, 200 OK with crawl status on success.
Response Codes
| Status Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request |
| 404 | Resource not found |
| 409 | Conflict |
| 500 | Internal server error |
/api/agents/{agent_id}/websites/{id}/sync SyncSource triggers synchronization of a specific knowledge source.
Queues the knowledge source for reprocessing, which includes re-parsing content, regenerating chunks, and updating embeddings. Returns 400 for invalid ID, 404 if not found, 409 if already processing, 500 on service error, 200 OK with sync status on success.
Response Codes
| Status Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request |
| 404 | Resource not found |
| 409 | Conflict |
| 500 | Internal server error |
/api/agents/{agent_id}/websites/{id}/status Status returns the crawl status for a website.
Returns detailed crawl progress including pages crawled, total pages, progress percentage, and any errors. Returns 400 for invalid ID, 404 if the website is not found or belongs to a different agent, 200 OK with status on success.
Response Codes
| Status Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request |
| 404 | Resource not found |
/api/agents/{agent_id}/websites/{id}/cancel Cancel cancels an in-progress website crawl.
Attempts to cancel an ongoing crawl operation for the website. Only websites with status 'queued' or 'crawling' can be cancelled. Returns 400 for invalid ID, 404 if not found, 409 if not cancellable, 200 OK with the updated website on success.
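The cancellability rule above can be checked client-side before issuing the request, avoiding a predictable 409. This assumes the relevant field is `ingest_status`, as in WebsiteResponse:

```python
CANCELLABLE_STATUSES = {"queued", "crawling"}

def can_cancel(website: dict) -> bool:
    """True if a cancel request could succeed: only websites whose
    status is 'queued' or 'crawling' are cancellable, per the rule
    above. Assumes the status lives in ingest_status."""
    return website.get("ingest_status") in CANCELLABLE_STATUSES
```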
Response Codes
| Status Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid request |
| 404 | Resource not found |
| 409 | Conflict |
| 500 | Internal server error |
Success
Response response.WebsiteResponse
Website
| Property | Type | Description |
|---|---|---|
| id | integer (int64) | Unique website identifier |
| agent_id | integer (int64) | ID of agent this website belongs to |
| crawl_integration_id | integer (int64) \| null | ID of crawl integration to use |
| url | string | Starting URL |
| title | string | Display title |
| ingest_status | string | Crawl status (pending, crawling, completed, failed) |
| ingest_error | string \| null | Error message if crawl failed |
| max_pages | integer (int32) | Maximum number of pages to crawl |
| pages_crawled | integer (int32) | Number of pages crawled so far |
| total_pages | integer (int32) | Total number of pages discovered |
| credits_used | integer (int32) | Number of crawler credits used |
| crawl_frequency | string | How often to recrawl: daily, weekly, monthly, or manual |
| excluded_paths | object | URL path patterns to exclude |
| included_paths | object | URL path patterns to include |
| max_depth | integer (int32) \| null | Maximum link depth to follow |
| last_crawled_at | string (date-time) \| null | Timestamp when site was last crawled |
| next_crawl_at | string (date-time) \| null | Timestamp when next crawl is scheduled |
| last_crawl_result | string \| null | Result message from last crawl |
| created_at | string (date-time) | Timestamp when website was created |
| updated_at | string (date-time) | Timestamp when website was last updated |