Conversion of PDF file to Excel in R

0 votes

I want to convert a two-page PDF file into an Excel spreadsheet. The PDF contains two tables for a company: a balance statement and a profit and loss table. I found the R code below online. It runs without errors, but the output only contains the second page, not the first. I searched around and tried several options, but nothing worked. Online conversion tools are too expensive because I have so many of these files. Could someone please help me with this? Ideally it would convert both pages at once.

library("pdftools")
tx1<-pdf_text("C:/Users/Snehal Salaskar/Desktop/Companies/CanFin/2013-14.pdf")
tx3<-strsplit(tx1,"\n")
lapply(tx3, function(x) write.table( data.frame(x), 'Profit.csv'  , sep=',' ))

I want it to convert both pages at once.

Oct 16 in Others by Kithuzzz
• 20,660 points
80 views

1 answer to this question.

0 votes

I looked at the PDF, and it appears that formatting it into a clean table will take some time. If all you want is to capture the output in a file, the problem is that every call to write.table wrote to the same file name, so each page overwrote the previous one and only the last page was left.

You can either write each page to its own file, or use append=TRUE to save both pages to one file. Both options are shown below.

# save to two files
lapply(seq_along(tx3), function(i){
  write.table( data.frame(tx3[[i]]), sprintf('Profit_%s.csv', i), sep=',' )
}) 

# save to a single file, with append=TRUE adding each page to the end
# (delete any existing Profit.csv first, since append=TRUE adds to it)
lapply(seq_along(tx3), function(i){
  write.table( data.frame(tx3[[i]]), 'Profit.csv', sep=',' ,
               append = TRUE)
})
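
If you also want a first pass at real columns rather than one line of text per row, something along these lines might help. This is only a rough sketch in base R: the sample text in tx1 is made up to stand in for the pdf_text() output, the output name Profit_all_pages.csv is hypothetical, and it assumes the columns in your PDF are separated by two or more spaces, which may not hold for every row.

```r
# Stand-in for pdf_text() output: one string per page (made-up sample text).
tx1 <- c("Particulars      2014     2013\nShare capital    100      100\n",
         "Particulars      2014     2013\nInterest income  500      450\n")

# Split every page into lines and stack them, remembering the page number.
pages <- strsplit(tx1, "\n")
all_lines <- data.frame(
  page = rep(seq_along(pages), lengths(pages)),
  text = unlist(pages),
  stringsAsFactors = FALSE
)

# Assumption: runs of two or more spaces separate the columns.
cells <- strsplit(trimws(all_lines$text), " {2,}")

# Pad short rows with NA so every row has the same number of columns.
width <- max(lengths(cells))
padded <- do.call(rbind, lapply(cells, function(x) {
  c(x, rep(NA_character_, width - length(x)))
}))

# One table for all pages, written once (hypothetical file name).
result <- data.frame(page = all_lines$page, padded, stringsAsFactors = FALSE)
write.csv(result, "Profit_all_pages.csv", row.names = FALSE)
```

The resulting CSV opens directly in Excel, and the page column lets you filter the balance statement and the profit and loss table apart. Rows where a label itself contains wide gaps will still need manual cleanup.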
answered Oct 16 by narikkadan
• 37,660 points
