Extract Text

31 Aug 2023

The PDF Viewer library allows you to extract the text of a page along with the bounds of its text items. To do this, enable the isExtractText property and handle the extractTextCompleted event.

Follow these steps to extract the text from a page.

Step 1: Follow the steps provided in the link to create a simple PDF Viewer sample.

Step 2: The following code snippet shows how to extract the text from a page.

<ejs-pdfviewer #pdfViewer id="pdfViewer"
               [serviceUrl]='service'
               [documentPath]='document'
               (extractTextCompleted)='extractTextCompleted($event)'
               [isExtractText]='true'
               style="height:640px;display:block">
</ejs-pdfviewer>

public extractTextCompleted(e: any): void {
    // Log the complete event arguments for the loaded document.
    console.log(e);
    // Log the text collection entry at index 1.
    console.log(e.documentTextCollection[1]);
    // Log the text data of that entry (text items with their bounds).
    console.log(e.documentTextCollection[1][1].TextData);
    // Log the full text of the page.
    console.log(e.documentTextCollection[1][1].PageText);
    // Log the bounds of the first text item.
    console.log(e.documentTextCollection[1][1].TextData[0].Bounds);
}
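
The handler above reads one hard-coded entry. As a minimal sketch, the helper below gathers the PageText of every entry in the collection instead. The interfaces and the collectPageText function are hypothetical names introduced for illustration; the payload shape (each collection element being an object keyed by a page index, holding TextData and PageText) is an assumption inferred from the indexing used in the snippet, not a documented type.

```typescript
// Assumed shape of one text item; only Bounds is known from the snippet above.
interface TextItem {
    Bounds: unknown;
}

// Assumed shape of the data stored for one page.
interface PageTextEntry {
    TextData: TextItem[];
    PageText: string;
}

// Assumption: each element of the collection is an object keyed by a page index,
// which is why the snippet reads documentTextCollection[1][1].
type DocumentTextCollection = Array<Record<number, PageTextEntry>>;

// Hypothetical helper: returns an object mapping each page index to its text.
function collectPageText(collection: DocumentTextCollection): Record<number, string> {
    const pages: Record<number, string> = {};
    collection.forEach((entry) => {
        if (!entry) {
            return; // skip empty slots
        }
        for (const key of Object.keys(entry)) {
            const page = entry[Number(key)];
            if (page && typeof page.PageText === "string") {
                pages[Number(key)] = page.PageText;
            }
        }
    });
    return pages;
}
```

Inside extractTextCompleted, this could be called as collectPageText(e.documentTextCollection), after which the text of a given page is read by its index from the returned object. Iterating over the keys of each entry avoids relying on the element position matching the page index.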

Find the sample showing how to extract text.