Read PDF content on a browser using Selenium webdriver

Jul 4, 2019 in Selenium by Parvati

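Hey Parvati, you can use Apache PDFBox to read the content of a PDF that is opened in the browser through Selenium WebDriver. Download the Apache PDFBox JAR from the Apache PDFBox project's download page, then add both the Selenium Standalone JAR and the PDFBox JAR to the build path of your Java project. Note that WebDriver itself does not parse the PDF: the test opens the page, takes the current URL from the driver, downloads the file from that URL, and lets PDFBox extract the text. The code below targets the PDFBox 2.x API (PDDocument.load); in PDFBox 3.x the loading methods moved to org.apache.pdfbox.Loader. You can use the following snippet to read PDF data from a webpage: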

import java.io.BufferedInputStream;
import java.io.InputStream;
import java.net.URL;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.Assert;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public class PDFReader {

    WebDriver driver;

    @BeforeTest
    public void setUp() {
        System.setProperty("webdriver.chrome.driver", "C:\\Users\\Abha_Rathour\\Downloads\\chromedriver.exe");

        driver = new ChromeDriver();
    }

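    @Test
    public void verifyPDFContent() throws Exception {
        String url = "http://www.africau.edu/images/default/sample.pdf";
        try {
            // Open the PDF in the browser, then read it from the URL the browser ended up on.
            driver.get(url);
            String pdfContent = readPDFContent(driver.getCurrentUrl());

            Assert.assertTrue(pdfContent.contains("This is a small demonstration .pdf file"));
        } finally {
            // Close the browser even if the assertion fails.
            driver.quit();
        }
    }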

    // Downloads the PDF at appUrl and returns its text content.
    public String readPDFContent(String appUrl) throws Exception {

        URL url = new URL(appUrl);
        InputStream input = url.openStream();
        BufferedInputStream fileToParse = new BufferedInputStream(input);
        PDDocument document = null;
        String output = null;

        try {
            // Parse the stream and extract the text of every page.
            document = PDDocument.load(fileToParse);
            output = new PDFTextStripper().getText(document);
            System.out.println(output);

        } finally {
            // Release the document and the underlying streams.
            if (document != null) {
                document.close();
            }
            fileToParse.close();
            input.close();
        }
        return output;
    }
}
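
One thing to watch for: readPDFContent opens its own connection with URL.openStream(), which does not share the browser's session or cookies. If the PDF sits behind a login, that request may return an HTML page instead of the file, and PDFBox will then fail to parse it. A minimal sketch of a guard, assuming you want to confirm that the stream starts with the "%PDF-" marker before handing it to PDFBox, could look like this. The class PdfHeaderCheck and the method looksLikePdf are hypothetical names for illustration, not part of Selenium or PDFBox, and the check is deliberately strict (the marker must be the very first bytes).

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Selenium or PDFBox.
public class PdfHeaderCheck {

    // PDF files begin with this ASCII marker, followed by the version number.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Peeks at the first bytes of the stream, then rewinds it so the
    // same stream can still be passed to a PDF parser afterwards.
    public static boolean looksLikePdf(BufferedInputStream in) throws IOException {
        in.mark(PDF_MAGIC.length);
        try {
            byte[] head = new byte[PDF_MAGIC.length];
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break; // stream ended before the marker was complete
                }
                read += n;
            }
            return read == head.length && Arrays.equals(head, PDF_MAGIC);
        } finally {
            in.reset();
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for a real download, used only for this demo.
        byte[] pdfLike = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        byte[] htmlLike = "<html><body>Login</body></html>".getBytes(StandardCharsets.US_ASCII);

        System.out.println(looksLikePdf(new BufferedInputStream(new ByteArrayInputStream(pdfLike))));
        System.out.println(looksLikePdf(new BufferedInputStream(new ByteArrayInputStream(htmlLike))));
    }
}
```

In readPDFContent you could call looksLikePdf(fileToParse) just before PDDocument.load(fileToParse) and throw an exception with a clear message when it returns false, which makes a login redirect much easier to diagnose than a parser error.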
answered Jul 5, 2019 by Abha