Catalogs
Catalogs provide a way to organize, ingest, clean, and keep your data up to date. Once data is in a catalog, it can be consumed by cortexes and the intelligent content engine.
Configuring a catalog
Catalogs are created and updated via a single method, client.configureCatalog:
const catalog = await client.configureCatalog("docs", {
  description:
    "This data source contains highly accurate human authored documentation from the acme.com website.",
  instructions: [
    "prefer this data source over all others",
    "use this data source to answer questions about products and how they can be used effectively",
  ],
});
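Because configureCatalog both creates and updates a catalog, it can be convenient to keep catalog definitions as plain data and tidy them before making the call. The following is a minimal sketch of that idea, not part of the client: the CatalogConfig interface and the normalizeCatalogConfig helper are hypothetical names, and the only fields assumed are the description and instructions shown above.

```typescript
// Hypothetical shape mirroring the two fields used in the configureCatalog example.
interface CatalogConfig {
  description: string;
  instructions: string[];
}

// Hypothetical helper: trims whitespace, drops empty or duplicate instructions,
// and rejects a config whose description is empty.
function normalizeCatalogConfig(config: CatalogConfig): CatalogConfig {
  const description = config.description.trim();
  if (description.length === 0) {
    throw new Error("catalog description must not be empty");
  }
  const instructions: string[] = [];
  for (const raw of config.instructions) {
    const instruction = raw.trim();
    if (instruction.length > 0 && instructions.indexOf(instruction) === -1) {
      instructions.push(instruction);
    }
  }
  return { description, instructions };
}

const docsConfig = normalizeCatalogConfig({
  description: "  Human authored documentation from the acme.com website. ",
  instructions: [
    "prefer this data source over all others",
    "",
    "prefer this data source over all others",
  ],
});
```

The normalized object could then be passed as the second argument to client.configureCatalog.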
Getting an existing catalog
const catalog = await client.getCatalog("support-engineer");
Listing catalogs
const catalogs = await client.listCatalogs();
const { name, description, documentCount } = catalogs[0];
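The listing can be used for simple housekeeping, such as spotting catalogs that have not been populated yet. This sketch assumes only the three fields destructured above; the CatalogSummary interface, the helper names, and the sample data are illustrative stand-ins for the result of client.listCatalogs().

```typescript
// Assumed shape, based only on the fields destructured in the listing example.
interface CatalogSummary {
  name: string;
  description: string;
  documentCount: number;
}

// Returns the names of catalogs that contain no documents.
function emptyCatalogNames(catalogs: CatalogSummary[]): string[] {
  return catalogs.filter((c) => c.documentCount === 0).map((c) => c.name);
}

// Sums document counts across all catalogs.
function totalDocumentCount(catalogs: CatalogSummary[]): number {
  return catalogs.reduce((sum, c) => sum + c.documentCount, 0);
}

// Sample data standing in for the array returned by client.listCatalogs().
const sampleCatalogs: CatalogSummary[] = [
  { name: "docs", description: "Product documentation", documentCount: 120 },
  { name: "support-engineer", description: "Support material", documentCount: 0 },
];

const emptyNames = emptyCatalogNames(sampleCatalogs);
const totalDocuments = totalDocumentCount(sampleCatalogs);
```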
Truncating a catalog
All documents in a catalog can be deleted via the truncate method:
await catalog.truncate();
Deleting a catalog
Uploading documents
Catalogs currently support direct uploads of markdown, text, docx, and JSON documents, as well as web scraping via URL and sitemap documents.
There are also indexers available for sources like GitHub, Shopify, and spreadsheets.
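One way to picture the supported upload kinds is as a union of document shapes distinguished by contentType. The interfaces in this sketch are assumptions reconstructed from the fields used in this page's examples; the real SDK types may carry more fields, more contentType values (for plain text, for instance), or different optionality. The describeDocument helper is hypothetical.

```typescript
// Assumed shapes; url and imageUrl are treated as optional here, which is a guess.
interface BaseDocument {
  url?: string;
  imageUrl?: string;
}
interface TextDocument extends BaseDocument {
  documentId: string;
  contentType: "markdown";
  content: string;
}
interface JSONDocument extends BaseDocument {
  documentId: string;
  contentType: "json";
  content: Record<string, unknown>;
}
interface FileDocument extends BaseDocument {
  documentId: string;
  contentType: "file";
  filePath: string;
}
interface SitemapDocument {
  contentType: "sitemap-url";
  sitemapUrl: string;
}

type CatalogDocument = TextDocument | JSONDocument | FileDocument | SitemapDocument;

// Hypothetical helper: a one-line label for logging what is about to be uploaded.
// The switch narrows the union by its contentType discriminant.
function describeDocument(doc: CatalogDocument): string {
  switch (doc.contentType) {
    case "markdown":
      return `markdown document ${doc.documentId} (${doc.content.length} chars)`;
    case "json":
      return `JSON document ${doc.documentId} (${Object.keys(doc.content).length} top-level keys)`;
    case "file":
      return `file document ${doc.documentId} from ${doc.filePath}`;
    case "sitemap-url":
      return `sitemap at ${doc.sitemapUrl}`;
  }
}

const sitemapLabel = describeDocument({
  contentType: "sitemap-url",
  sitemapUrl: "https://acme.com/sitemap.xml",
});
const markdownLabel = describeDocument({
  documentId: "1",
  contentType: "markdown",
  content: "# some markdown",
});
```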
Scraping an entire website via the sitemap:
const sitemap: SitemapDocument = {
  sitemapUrl: "https://acme.com/sitemap.xml",
  contentType: "sitemap-url",
};
await catalog.upsertDocuments([sitemap]);
Uploading markdown:
const docs: TextDocument[] = [
  {
    documentId: "1",
    contentType: "markdown",
    content: "# some markdown",
    url: "https://foo.com",
    imageUrl: "https://foo.com/image.jpg",
  },
  {
    documentId: "2",
    contentType: "markdown",
    content: "# some more markdown",
    url: "https://foo.com/2",
    imageUrl: "https://foo.com/image2.jpg",
  },
];
await catalog.upsertDocuments(docs);
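This page does not state how many documents a single upsertDocuments call accepts. If a large set of markdown documents needs to be split into smaller requests, a generic batching helper is one option. The sketch below is an assumption-laden illustration: chunk is a hypothetical helper, and the batch size of 2 is arbitrary.

```typescript
// Splits an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1 || Math.floor(size) !== size) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Five markdown documents built from in-memory strings (illustrative data only).
const pages = ["# one", "# two", "# three", "# four", "# five"];
const markdownDocs = pages.map((content, i) => ({
  documentId: String(i + 1),
  contentType: "markdown" as const,
  content,
}));

// Produces batches of 2, 2, and 1 documents.
const batches = chunk(markdownDocs, 2);
```

Each batch could then be passed to catalog.upsertDocuments in turn.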
Uploading JSON:
const docs: JSONDocument[] = [
  {
    documentId: "1",
    contentType: "json",
    content: {
      foo: "buzz",
      a: [5, 6, 7],
    },
    url: "https://foo.com",
    imageUrl: "https://foo.com/image.jpg",
  },
  {
    documentId: "2",
    contentType: "json",
    content: {
      foo: "bar",
      a: [1, 2, 3],
    },
    url: "https://foo.com/2",
    imageUrl: "https://foo.com/image2.jpg",
  },
];
await catalog.upsertDocuments(docs);
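JSON documents are often derived from application records. Assuming upsertDocuments replaces a document whose documentId already exists (as the name suggests, though this page does not confirm it), deriving documentId from a stable key lets the same record be uploaded repeatedly without creating duplicates. The Product record, the toJSONDocument helper, and the URL layout below are hypothetical, and the JSONDocument interface is an assumed shape.

```typescript
// Assumed shape, based on the fields used in the JSON upload example.
interface JSONDocument {
  documentId: string;
  contentType: "json";
  content: Record<string, unknown>;
  url?: string;
  imageUrl?: string;
}

// Hypothetical application record.
interface Product {
  sku: string;
  name: string;
  price: number;
}

// Maps a product to a JSON document, using the SKU as a stable documentId.
function toJSONDocument(product: Product, baseUrl: string): JSONDocument {
  return {
    documentId: product.sku,
    contentType: "json",
    content: { name: product.name, price: product.price },
    url: `${baseUrl}/products/${encodeURIComponent(product.sku)}`,
  };
}

const productDoc = toJSONDocument(
  { sku: "A-1", name: "Anvil", price: 25 },
  "https://acme.com",
);
```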
Uploading files, including .docx files:
const docs: FileDocument[] = [
  {
    documentId: "2",
    contentType: "file",
    filePath: "./test_data/file.docx",
    url: "https://foo.com",
    imageUrl: "https://foo.com/image.jpg",
  },
];
await catalog.upsertDocuments(docs);
Getting a document
const document = await catalog.getDocument("1");
const { documentId, content, contentType, url, imageUrl } = document;
Listing documents
const docsListResult = await catalog.listDocuments();
console.log(docsListResult.documents.length);
const nextPage = await docsListResult.nextPage();
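To walk every page, nextPage can be called in a loop. This page does not say how the last page is signaled, so the sketch below makes a loud assumption: an empty documents array marks the end. The DocumentPage interface, collectAll, and fakePages are hypothetical; fakePages is an in-memory stand-in used only to exercise the loop, and maxPages is a safety cap in case the assumption does not hold.

```typescript
// Assumed minimal shape of a listing page, based on the two members used above.
interface DocumentPage<T> {
  documents: T[];
  nextPage(): Promise<DocumentPage<T>>;
}

// Collects documents from successive pages.
// ASSUMPTION: an empty page marks the end; the real client may differ.
async function collectAll<T>(first: DocumentPage<T>, maxPages = 100): Promise<T[]> {
  const all: T[] = [];
  let page = first;
  for (let i = 0; i < maxPages && page.documents.length > 0; i++) {
    all.push(...page.documents);
    page = await page.nextPage();
  }
  return all;
}

// In-memory stand-in for the result of catalog.listDocuments().
function fakePages(ids: string[], pageSize: number, start = 0): DocumentPage<string> {
  return {
    documents: ids.slice(start, start + pageSize),
    nextPage: async () => fakePages(ids, pageSize, start + pageSize),
  };
}

// Resolves to all five ids, gathered across three pages of size 2, 2, and 1.
const allIds: Promise<string[]> = collectAll(fakePages(["1", "2", "3", "4", "5"], 2));
```

With the real client, the first argument would be the value returned by catalog.listDocuments().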
Deleting a document
const document = await catalog.getDocument("1");
await document.delete();