Extract Text in React PDF Viewer component

08 Dec 2021 / 1 minute to read

The PDF Viewer library lets you extract the text of a page along with its bounds. To do so, set the isExtractText property to true and handle the extractTextCompleted event, which is triggered when text extraction completes.

The following steps are used to extract the text from the page.

Step 1: Follow the steps provided in the link to create a simple PDF Viewer sample.

Step 2: The following code snippet shows how to extract the text from a page.

<PdfViewerComponent
    id="container"
    documentPath="PDF_Succinctly.pdf"
    serviceUrl="https://ej2services.syncfusion.com/production/web-services/api/pdfviewer"
    isExtractText={true}
    extractTextCompleted={this.extractTextCompleted}
    style={{ height: '640px' }}
></PdfViewerComponent>

extractTextCompleted = (args) => {
    // Log the complete event arguments for the loaded document.
    console.log(args);
    // Log the text collection entry at index 1.
    console.log(args.documentTextCollection[1]);
    // Log the text data stored under key 1 of that entry.
    console.log(args.documentTextCollection[1][1].TextData);
    // Log the plain text of the page.
    console.log(args.documentTextCollection[1][1].PageText);
    // Log the bounds of the first text item.
    console.log(args.documentTextCollection[1][1].TextData[0].Bounds);
};
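
The handler above reads one page by fixed index. As a minimal sketch, the same data could be gathered for every page with a small helper. Note that collectPageText is a hypothetical function, not part of the PDF Viewer API, and it assumes that each entry of documentTextCollection is an object keyed by page index whose value carries PageText and TextData, as the access pattern args.documentTextCollection[1][1].PageText suggests. The mock data and the field names inside Bounds (X, Y, Width, Height) are illustrative assumptions as well.

```javascript
// Hypothetical helper, not part of the PDF Viewer API.
// Assumed shape: [{ 0: { PageText, TextData } }, { 1: { PageText, TextData } }, ...]
function collectPageText(documentTextCollection) {
  const pages = {};
  for (const entry of documentTextCollection || []) {
    if (!entry) continue; // skip empty slots defensively
    for (const pageIndex of Object.keys(entry)) {
      const page = entry[pageIndex];
      pages[pageIndex] = {
        // Plain text of the page, or an empty string if missing.
        text: page.PageText || '',
        // Bounds of each text item on the page, in document order.
        bounds: (page.TextData || []).map((item) => item.Bounds),
      };
    }
  }
  return pages;
}

// Mock event arguments shaped like the assumption above (illustrative values only).
const mockArgs = {
  documentTextCollection: [
    { 0: { PageText: 'Hello', TextData: [{ Bounds: { X: 10, Y: 20, Width: 50, Height: 12 } }] } },
    { 1: { PageText: 'World', TextData: [] } },
  ],
};

const pages = collectPageText(mockArgs.documentTextCollection);
console.log(pages['0'].text);
console.log(pages['1'].bounds.length);
```

Inside extractTextCompleted, the helper would be called as collectPageText(args.documentTextCollection). Guarding against missing entries and fields avoids a runtime error if a page has no extracted text, which direct indexing such as TextData[0].Bounds does not.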

Find the sample: how to extract text.