A transformer that converts HTML content to plain text.

Example

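// Import paths assume the classic `langchain` entrypoints; newer releases
// move these to `@langchain/community` and `@langchain/textsplitters`.
import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { HtmlToTextTransformer } from "langchain/document_transformers/html_to_text";

const loader = new CheerioWebBaseLoader("https://example.com/some-page");
const docs = await loader.load();

// chunkSize is the maximum number of characters per chunk
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
});
const transformer = new HtmlToTextTransformer();

// The sequence of text splitting followed by HTML to text transformation
const sequence = splitter.pipe(transformer);

// Processing the loaded documents through the sequence
const newDocuments = await sequence.invoke(docs);

console.log(newDocuments);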

Constructors

Properties

options: HtmlToTextOptions = {}
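
The `options` object appears to be handed to the underlying `html-to-text` converter, so its fields would follow that library's option names. A minimal sketch, assuming the html-to-text v9 option shape (`wordwrap`, `selectors`); the values are illustrative only, and the constructor call is shown as a comment because it needs the LangChain package installed:

```typescript
// Illustrative options; field names assume the html-to-text v9 option shape.
const options = {
  wordwrap: 130, // wrap output lines at 130 characters
  selectors: [
    { selector: "img", format: "skip" }, // drop images entirely
    { selector: "a", options: { ignoreHref: true } }, // keep link text, omit URLs
  ],
};

// With the LangChain package installed, this would be passed as:
//   const transformer = new HtmlToTextTransformer(options);
console.log(JSON.stringify(options, null, 2));
```

Leaving `options` at its default of `{}` falls back to whatever defaults the converter itself applies.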

Methods

  • Parameters

    • documents: Document<Record<string, any>>[]

    Returns Promise<Document<Record<string, any>>[]>
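
The signature above (an array of documents in, a promise of an array of documents out) can be sketched without any dependencies. This is not the library's implementation: `Doc`, `naiveHtmlToText`, and `transformDocumentsSketch` are hypothetical names, the regex-based tag stripper is a crude stand-in for a real HTML-to-text converter, and carrying metadata over unchanged is an assumption.

```typescript
// Minimal local stand-in for the Document type (hypothetical, not imported from LangChain).
interface Doc {
  pageContent: string;
  metadata: Record<string, any>;
}

// Naive tag stripper, used here only so the sketch runs without dependencies.
// A real converter handles entities, block layout, links, and malformed markup.
function naiveHtmlToText(html: string): string {
  return html
    .replace(/<[^>]*>/g, " ") // replace each tag with a space
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}

// Same shape as the documented method: Doc[] in, Promise<Doc[]> out.
async function transformDocumentsSketch(documents: Doc[]): Promise<Doc[]> {
  return documents.map((doc) => ({
    pageContent: naiveHtmlToText(doc.pageContent),
    metadata: { ...doc.metadata }, // assumption: metadata is carried over unchanged
  }));
}

const input: Doc[] = [
  {
    pageContent: "<h1>Title</h1><p>Hello <b>world</b></p>",
    metadata: { source: "example" },
  },
];

// prints "Title Hello world"
transformDocumentsSketch(input).then((out) => console.log(out[0].pageContent));
```

Returning new objects rather than mutating the inputs keeps the original documents available for other steps in a pipeline.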
