Read PDF content in a browser using Selenium WebDriver

Jul 4, 2019 in Selenium by Parvati

1 answer to this question.


Hey Parvati, you can use the Apache PDFBox library to read the content of a PDF that opens in the browser during a Selenium WebDriver test. Download the Apache PDFBox JAR from here, then add both the Selenium Standalone JAR and the PDFBox JAR to the build path of your Java project. The following code snippet opens the PDF URL in the browser, downloads the same URL, and extracts its text with PDFBox:

import java.io.BufferedInputStream;
import java.io.InputStream;
import java.net.URL;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.Assert;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public class PDFReader {

    WebDriver driver;

    @BeforeTest
    public void setUp() {
        System.setProperty("webdriver.chrome.driver", "C:\\Users\\Abha_Rathour\\Downloads\\chromedriver.exe");

        driver = new ChromeDriver();
    }

    @Test
    public void verifyPDFContent() throws Exception {
        String url = "http://www.africau.edu/images/default/sample.pdf";
        driver.get(url);
        String pdfContent = readPDFContent(driver.getCurrentUrl());

        Assert.assertTrue(pdfContent.contains("This is a small demonstration .pdf file"));
        driver.quit();
    }

    // Downloads the PDF at appUrl over a plain URL connection (not through the
    // browser session) and returns the text that PDFBox extracts from it.
    public String readPDFContent(String appUrl) throws Exception {
        URL url = new URL(appUrl);

        // try-with-resources closes the document first and then the stream,
        // even if loading or text extraction throws.
        // PDDocument.load(InputStream) is the PDFBox 2.x API; PDFBox 3.x moved
        // document loading to the org.apache.pdfbox.Loader class.
        try (InputStream input = new BufferedInputStream(url.openStream());
             PDDocument document = PDDocument.load(input)) {
            String output = new PDFTextStripper().getText(document);
            System.out.println(output);
            return output;
        }
    }
}
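
Two things commonly make a test like this fail for reasons unrelated to the PDF text itself: the URL may return something other than a PDF (for example an HTML login page), and the extracted text may contain line breaks in the middle of the phrase you are asserting on. The sketch below is one possible way to guard against both using only the JDK. The class name `PdfTextChecks` and its method names are hypothetical helpers made up for illustration; they are not part of Selenium or PDFBox.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class (not part of Selenium or PDFBox).
final class PdfTextChecks {

    // A well-formed PDF file starts with the ASCII marker "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes start with the PDF header marker. */
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Collapses every run of whitespace, including line breaks, into one space. */
    static String normalizeWhitespace(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    /** Like String.contains, but ignores differences in whitespace and line breaks. */
    static boolean containsIgnoringLayout(String extractedText, String expected) {
        return normalizeWhitespace(extractedText).contains(normalizeWhitespace(expected));
    }

    public static void main(String[] args) {
        byte[] html = "<html>Login</html>".getBytes(StandardCharsets.US_ASCII);
        byte[] pdf = "%PDF-1.4\n...".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikePdf(html));
        System.out.println(looksLikePdf(pdf));

        // Simulated extractor output with a line break inside the expected phrase.
        String extracted = "This is a small\ndemonstration .pdf file";
        System.out.println(
            containsIgnoringLayout(extracted, "This is a small demonstration .pdf file"));
    }
}
```

To use this with the test above, you could read the stream into a byte array first (for example with `InputStream.readAllBytes()` on Java 9 or later), fail fast if `looksLikePdf` returns false, pass the bytes to `PDDocument.load(byte[])` in PDFBox 2.x, and replace the `pdfContent.contains(...)` assertion with `containsIgnoringLayout(pdfContent, ...)`.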
answered Jul 5, 2019 by Abha
