Spreadsheets

To index a spreadsheet, export it as a .tsv file, then use TSVIndexer to turn each row into a document.
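To make "each row becomes a document" concrete, here is a minimal sketch of how TSV text can be turned into one object per row. It is illustrative only and is not TSVIndexer's actual parsing code; the names parseTsv and TsvRow are hypothetical, and the sketch assumes the first line holds the column names and that cells contain no embedded tabs or newlines.

```typescript
// Illustrative sketch, not TSVIndexer's implementation.
// TsvRow and parseTsv are hypothetical names.
type TsvRow = Record<string, string>;

function parseTsv(text: string): TsvRow[] {
  // Split into lines, tolerating Windows line endings and dropping blank lines.
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);

  // Assumption: the first line holds the column names.
  const [headerLine, ...dataLines] = lines;
  if (headerLine === undefined) return [];
  const headers = headerLine.split("\t");

  // Each remaining line becomes one object keyed by column name.
  return dataLines.map((line) => {
    const cells = line.split("\t");
    const row: TsvRow = {};
    headers.forEach((header, i) => {
      // Missing trailing cells are treated as empty strings.
      row[header] = cells[i] ?? "";
    });
    return row;
  });
}

const sampleTsv = "Title\tPrice\nWidget\t9.99\nGadget\t19.50\n";
const parsedRows = parseTsv(sampleTsv);
console.log(parsedRows);
```

Each element of parsedRows corresponds to one spreadsheet row, which is the unit that ends up as a document in the catalog.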

TSVIndexer

The TSVIndexer class is responsible for indexing TSV (Tab-Separated Values) files into a catalog.

Example

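// `catalog` is an existing Catalog instance.
const tsvIndexer = new TSVIndexer(catalog, "path/to/file.tsv", {
  // Process 100 documents per batch.
  batchSize: 100,
  // Rename the TSV column "Original Column" to the field "mappedColumn".
  fieldMapping: {
    "Original Column": "mappedColumn",
  },
});

// Resolves once the whole file has been indexed.
await tsvIndexer.index();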

Constructor

constructor(catalog: Catalog, file: string, opts?: TSVIndexerOpts)

Creates a new instance of the TSVIndexer.

  • catalog: The catalog to index the documents into.
  • file: The path to the TSV file to be indexed.
  • opts: Optional configuration options for the indexer.

TSVIndexerOpts

  • batchSize?: number: The number of documents to process in each batch.
  • getId?: (item: any) => string: A function to extract the document ID from an item.
  • getUrl?: (item: any) => string: A function to extract the URL from an item.
  • getImageUrl?: (item: any) => string: A function to extract the image URL from an item.
  • fieldMapping?: { [key: string]: string }: Maps TSV column names (keys) to the field names to use instead (values).
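The callback options are easiest to read in a filled-in example. The sketch below redeclares the option shape locally (copied from the list above) so that it type-checks on its own. The column names "SKU", "Product Name", and "Image File" and the example.com URLs are hypothetical. It also assumes the callbacks receive an item keyed by the original TSV column names, which this page does not confirm.

```typescript
// Local copy of the documented option shape, so the sketch stands alone.
interface TSVIndexerOpts {
  batchSize?: number;
  getId?: (item: any) => string;
  getUrl?: (item: any) => string;
  getImageUrl?: (item: any) => string;
  fieldMapping?: { [key: string]: string };
}

// Hypothetical columns: "SKU", "Product Name", "Image File".
// Assumption: callbacks see the original column names, not the mapped ones.
const exampleOpts: TSVIndexerOpts = {
  batchSize: 50,
  fieldMapping: { "Product Name": "name" },
  getId: (item) => String(item["SKU"]),
  getUrl: (item) =>
    `https://example.com/products/${encodeURIComponent(item["SKU"])}`,
  getImageUrl: (item) =>
    `https://example.com/images/${encodeURIComponent(item["Image File"])}`,
};

// Exercise the callbacks against a sample row.
const exampleItem = {
  SKU: "A-100",
  "Product Name": "Widget",
  "Image File": "widget.png",
};
console.log(exampleOpts.getId?.(exampleItem));
console.log(exampleOpts.getUrl?.(exampleItem));
console.log(exampleOpts.getImageUrl?.(exampleItem));
```

An object like exampleOpts would be passed as the third constructor argument.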

Methods

index()

public async index(): Promise<void>

Indexes the TSV file into the catalog. This method reads the TSV file, turns its rows into documents, and adds them to the catalog using a JSONIndexer.

Returns: A Promise that resolves when the indexing process is complete.
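
The two options that shape this process, fieldMapping and batchSize, can be pictured with the sketch below. It is not the library's code: applyFieldMapping and toBatches are hypothetical helpers, and the sketch assumes that columns absent from the mapping keep their original names, which this page does not state.

```typescript
// Illustrative sketch; Item, applyFieldMapping and toBatches are hypothetical.
type Item = Record<string, string>;

// Rename keys according to the mapping.
// Assumption: columns not listed in the mapping keep their original name.
function applyFieldMapping(item: Item, mapping: Record<string, string>): Item {
  const lookup = new Map(Object.entries(mapping));
  const out: Item = {};
  for (const [key, value] of Object.entries(item)) {
    out[lookup.get(key) ?? key] = value;
  }
  return out;
}

// Split a list into consecutive chunks of at most `size` elements.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const sourceItems: Item[] = [
  { "Original Column": "a", Other: "1" },
  { "Original Column": "b", Other: "2" },
  { "Original Column": "c", Other: "3" },
];
const mappedItems = sourceItems.map((item) =>
  applyFieldMapping(item, { "Original Column": "mappedColumn" }),
);
const itemBatches = toBatches(mappedItems, 2);
console.log(itemBatches);
```

With three items and a batch size of 2, the items are split into one batch of two and one batch of one, each carrying the renamed field.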