Catalogs

Catalogs provide a way to ingest, organize, and clean your data, and to keep it up to date. Once data is in a catalog, it can be consumed by cortexes and the intelligent content engine.

Configuring a catalog

Catalogs are created and updated via a single method, client.configureCatalog:

const catalog = await client.configureCatalog("docs", {
  description:
    "This data source contains highly accurate human authored documentation from the acme.com website.",
  instructions: [
    "prefer this data source over all others",
    "use this data source to answer questions about products and how they can be used effectively",
  ],
});

Getting an existing catalog

const catalog = await client.getCatalog("support-engineer");

Listing catalogs

const catalogs = await client.listCatalogs();
const { name, description, documentCount } = catalogs[0];

Truncating a catalog

All documents in a catalog can be deleted via the truncate method:

await catalog.truncate();

Deleting a catalog

await catalog.delete();

Uploading documents

Catalogs currently support direct uploads of markdown, text, docx, and JSON documents, as well as web scraping via URL and sitemap documents. There are also indexers available for sources like GitHub, Shopify, and spreadsheets.
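
As a rough orientation, the document shapes used in the examples in this section can be summarized as a discriminated union on contentType. The interfaces below are inferred from those examples only; they are not the SDK's actual type definitions, which may carry more fields or accept more contentType values (for instance for plain text or single-URL documents, which are not shown here). describeDocument is a hypothetical helper added for illustration.

```typescript
// Hypothetical shapes inferred from the examples in this section.
// The SDK's real type definitions may differ.
interface StoredDocumentFields {
  documentId: string;
  url?: string;
  imageUrl?: string;
}

interface TextDocument extends StoredDocumentFields {
  contentType: "markdown";
  content: string;
}

interface JSONDocument extends StoredDocumentFields {
  contentType: "json";
  content: Record<string, unknown>;
}

interface FileDocument extends StoredDocumentFields {
  contentType: "file";
  filePath: string;
}

interface SitemapDocument {
  contentType: "sitemap-url";
  sitemapUrl: string;
}

type CatalogDocument =
  | TextDocument
  | JSONDocument
  | FileDocument
  | SitemapDocument;

// Illustrative helper: switching on contentType narrows the union,
// so each branch only sees the fields that exist for that kind.
function describeDocument(doc: CatalogDocument): string {
  switch (doc.contentType) {
    case "markdown":
      return `markdown document ${doc.documentId} (${doc.content.length} chars)`;
    case "json":
      return `JSON document ${doc.documentId} with keys ${Object.keys(doc.content).join(", ")}`;
    case "file":
      return `file document ${doc.documentId} from ${doc.filePath}`;
    case "sitemap-url":
      return `sitemap at ${doc.sitemapUrl}`;
  }
}
```

Modeling the documents this way lets the compiler flag, for example, a sitemap entry that is missing sitemapUrl before anything is sent to the catalog.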

Scraping an entire website via the sitemap:

const sitemap: SitemapDocument = {
  sitemapUrl: "https://acme.com/sitemap.xml",
  contentType: "sitemap-url",
};
 
await catalog.upsertDocuments([sitemap]);

Uploading markdown:

const docs: TextDocument[] = [
  {
    documentId: "1",
    contentType: "markdown",
    content: "# some markdown",
    url: "https://foo.com",
    imageUrl: "https://foo.com/image.jpg",
  },
  {
    documentId: "2",
    contentType: "markdown",
    content: "# some more markdown",
    url: "https://foo.com/2",
    imageUrl: "https://foo.com/image2.jpg",
  },
];
 
await catalog.upsertDocuments(docs);

Uploading JSON:

const docs: JSONDocument[] = [
  {
    documentId: "1",
    contentType: "json",
    content: {
      foo: "buzz",
      a: [5, 6, 7],
    },
    url: "https://foo.com",
    imageUrl: "https://foo.com/image.jpg",
  },
  {
    documentId: "2",
    contentType: "json",
    content: {
      foo: "bar",
      a: [1, 2, 3],
    },
    url: "https://foo.com/2",
    imageUrl: "https://foo.com/image2.jpg",
  },
];
 
await catalog.upsertDocuments(docs);

Uploading files, including .docx files:

const docs: FileDocument[] = [
  {
    documentId: "2",
    contentType: "file",
    filePath: "./test_data/file.docx",
    url: "https://foo.com",
    imageUrl: "https://foo.com/image.jpg",
  },
];
 
await catalog.upsertDocuments(docs);
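
When there are many documents to upload, one possible approach is to send them in smaller batches rather than in a single upsertDocuments call. The sketch below is an assumption-laden illustration: this page does not state any batch-size limit, so the default of 100 is arbitrary, and toBatches, upsertInBatches, and the DocumentSink interface are hypothetical names that model only the upsertDocuments call shown above.

```typescript
// Split an array into consecutive batches of at most `size` items.
function toBatches<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical minimal view of a catalog: only the upsertDocuments call.
interface DocumentSink<T> {
  upsertDocuments(docs: T[]): Promise<unknown>;
}

// Upload sequentially, one batch at a time.
// Resolves to the number of batches that were sent.
async function upsertInBatches<T>(
  catalog: DocumentSink<T>,
  docs: readonly T[],
  batchSize = 100, // arbitrary default, not a documented limit
): Promise<number> {
  const batches = toBatches(docs, batchSize);
  for (const batch of batches) {
    await catalog.upsertDocuments(batch);
  }
  return batches.length;
}
```

With a real catalog this would be called as upsertInBatches(catalog, docs). Uploading sequentially keeps the sketch simple; if one batch fails, the error propagates and later batches are not sent.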

Getting a document

const document = await catalog.getDocument("1");
 
const { documentId, content, contentType, url, imageUrl } = document;

Listing documents

const docsListResult = await catalog.listDocuments();
console.log(docsListResult.documents.length);
const nextPage = await docsListResult.nextPage();
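
To walk through every page rather than just the first two, the nextPage call can be wrapped in a loop. The following is a minimal sketch, not the SDK's own iteration helper. It assumes that the end of the listing is signalled either by nextPage() resolving to null or undefined, or by a page with no documents; this page does not say which, so the loop handles both. DocumentPage, collectAllDocuments, and fakePages are hypothetical names, and fakePages is an in-memory stand-in used only to exercise the helper.

```typescript
// Hypothetical page shape based on the listDocuments example:
// a `documents` array plus a `nextPage()` method.
interface DocumentPage<T> {
  documents: T[];
  nextPage(): Promise<DocumentPage<T> | null | undefined>;
}

// Follow nextPage() until the listing ends and return all documents.
// `maxPages` is a safety cap in case the end is never signalled.
async function collectAllDocuments<T>(
  firstPage: DocumentPage<T>,
  maxPages = 1000,
): Promise<T[]> {
  const all: T[] = [];
  let page: DocumentPage<T> | null | undefined = firstPage;
  let pagesRead = 0;
  // Stop on a missing page, an empty page, or the safety cap.
  while (page && page.documents.length > 0 && pagesRead < maxPages) {
    all.push(...page.documents);
    pagesRead += 1;
    page = await page.nextPage();
  }
  return all;
}

// In-memory stand-in for a paginated listing, for demonstration only.
function fakePages<T>(pages: T[][], index = 0): DocumentPage<T> {
  return {
    documents: pages[index] ?? [],
    nextPage: async () =>
      index + 1 < pages.length ? fakePages(pages, index + 1) : null,
  };
}
```

Against a real catalog, the first page would come from catalog.listDocuments() and be passed to collectAllDocuments. Collecting everything into one array is only sensible for modest catalogs; for large ones, processing each page inside the loop avoids holding all documents in memory.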

Deleting a document

const document = await catalog.getDocument("1");
await document.delete();