Please describe the best method to extract specific text from a PDF and input that text into a C++ application text field

+1 vote
I would like to know the best method of extracting specific data (Text) from a PDF and inputting that data (Text) into a text field of a Tax software application.

A user will fill in different sections of the PDF form as "answers", e.g. 1/ Question... [Text Answer]

I will need to copy or extract answer 1 and input answer 1 into the correct text field of the Tax software at 1/ Answer 1.

Any advice is appreciated.
Sep 14 in RPA by Chris
• 130 points
162 views
@Chris, could you please post the complete error that you have encountered?

Steps broken down.

Here is a snip of the PDF. The red boxes indicate the areas where users can enter text, which are the areas I wish to GET. I made the PDF myself, so I can change it if that helps.

I have stripped this to the bare bones. Process-Sequence

Activate:

 

Variables set as strings or Int32 and converted via ToString in MessageBox

I have used Edge, Chrome, AVG Browser, and Adobe. I can use what is recommended.

Result - The Message Box is either displayed at the end of compiling but is empty, or it contains the words “Chrome Legacy Window”. If I grab the full text, it works and includes the text from the users. No problems. If I use OCR screen scraping with Tesseract only, it seems to work.

I am open to suggestions for the best way to do this, as I have the flexibility to follow best practice for the best results.

Keep in mind that I need to put the results or variables into another application after this.
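
Since grabbing the full text works, one option is to pull the individual answers out of that text afterwards instead of scraping each box. Below is a minimal sketch in Python, assuming each answer shows up in the full text on its own line as `N/ question text [answer]`. The square-bracket delimiters, the sample text, and the `extract_answers` name are assumptions for illustration only; the real PDF output may be laid out differently.

```python
import re
from typing import Dict

# Assumed layout (hypothetical): "<number>/ <question text> [<answer>]", one per line.
ANSWER_PATTERN = re.compile(
    r"^[ \t]*(\d+)/[ \t]*(.*?)[ \t]*\[(.*?)\][ \t]*$",
    re.MULTILINE,
)

def extract_answers(full_text: str) -> Dict[int, str]:
    """Map each question number to the text found inside its brackets."""
    answers: Dict[int, str] = {}
    for match in ANSWER_PATTERN.finditer(full_text):
        number = int(match.group(1))
        answers[number] = match.group(3).strip()
    return answers

# Made-up sample standing in for the full text read from the PDF.
sample = "1/ Business name? [Acme Pty Ltd]\n2/ Tax year? [2020]\n"
answers = extract_answers(sample)
print(answers[1])
```

Keying the result by question number means answer 1 can later be matched to field 1 in the other application without relying on screen position.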

You are most probably getting this exception because you used a partial selector. Try using a complete selector instead of a partial selector.

Hope this helps!
The selector details didn't make any difference, but I added a hotkey stroke to set the PDF to "Actual Size" before reading the text. This has worked consistently since. I can display the text (captured in a variable) in a Message Box or write it to a text file.

Part 2/ I can open the Business Tax Application, navigate, and input the text in the correct field, but it seems to lose focus after the first input of text, and the text also won't save in the application. It is not consistent either, as it sometimes inputs the text in the wrong spot.

I am using the "Type Into" activity. This seems to work, but again the UI element seems to lose focus. I have checked the selectors and they have been validated. I have tried clicking in the element first and tabbing through to the correct field, but I get inconsistent results.

Any ideas (there are no errors) to help make this solid and work 100% of the time?
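
One general way to harden an input step that sometimes loses focus is to read the field back after typing and retry when the value is not there. The sketch below shows that loop in Python. `set_text` and `get_text` are hypothetical callables standing in for whatever the automation tool offers to write to and read from a field; they are not UiPath APIs, and `FlakyField` is only a stand-in that simulates a field dropping its first input.

```python
import time
from typing import Callable

def type_and_verify(set_text: Callable[[str], None],
                    get_text: Callable[[], str],
                    value: str,
                    attempts: int = 3,
                    delay_seconds: float = 0.5) -> bool:
    """Write value, read it back, and retry until the field holds it."""
    for _ in range(attempts):
        set_text(value)
        if get_text() == value:
            return True
        time.sleep(delay_seconds)  # let the UI settle before the next try
    return False

class FlakyField:
    """Hypothetical field that ignores the first write, like a lost focus."""
    def __init__(self) -> None:
        self.text = ""
        self.writes = 0

    def set_text(self, value: str) -> None:
        self.writes += 1
        if self.writes > 1:  # the first write is dropped on purpose
            self.text = value

    def get_text(self) -> str:
        return self.text

field = FlakyField()
ok = type_and_verify(field.set_text, field.get_text, "Acme Pty Ltd", delay_seconds=0.0)
print(ok, field.writes)
```

Returning a flag instead of raising lets the workflow decide what to do when every attempt fails, for example logging the field and stopping before anything is saved.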
