I need to automate data printing from a web site.
I have a list of hundreds of items that I need to pass to this website (10 to 20 at a time).
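As a rough illustration of the batching step, the item list could be cut into groups before each submission. This is only a sketch: the function name `batches` and the default size of 20 are assumptions for illustration, not part of any existing script.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int = 20) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        # The last group may be shorter than `size`.
        yield list(items[start:start + size])
```

Each yielded group would then be submitted to the site as one request.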
Once the site responds with the results, the list of items found needs to be stored in memory and/or on disk from a table that the website generates dynamically. I then select the items and click Print, after which I select the format I need and click Print to PDF.
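One possible way to capture the results table, assuming the rendered HTML of the page has already been obtained (for example from a browser automation tool), is to collect the cell text row by row. The class name `TableRows` and the function `parse_rows` are hypothetical, and the real table markup may differ.

```python
from html.parser import HTMLParser
from typing import List, Optional


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: Optional[List[str]] = None
        self._cell: Optional[List[str]] = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_rows(html: str) -> List[List[str]]:
    """Return the table contents as a list of rows of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

The resulting rows can be kept in memory or written to disk, for instance with the standard `csv` module.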
All items print to a single PDF, which I then need to split and name accordingly (using the previously saved list of results).
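The split-and-name step could be planned as below. This sketch assumes every item occupies the same number of pages in the combined PDF, which may not hold for the real output. The names `safe_name` and `plan_split` are hypothetical, and the actual page extraction would be done by whichever PDF tool is chosen.

```python
import re
from typing import List, Tuple


def safe_name(item: str) -> str:
    """Turn an item label into a filesystem-friendly PDF file name."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", item.strip()).strip("_")
    return (cleaned or "item") + ".pdf"


def plan_split(items: List[str], pages_per_item: int = 1) -> List[Tuple[str, int, int]]:
    """Return (file name, first page, last page) per item; pages are 0-based, inclusive.

    Assumption: items appear in the PDF in list order, each with
    exactly `pages_per_item` pages.
    """
    plan = []
    for index, item in enumerate(items):
        first = index * pages_per_item
        plan.append((safe_name(item), first, first + pages_per_item - 1))
    return plan
```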
The PDFs generated by this process are searchable (PDF/A) and contain information that needs to be extracted (3 lines of information, easily found via a text search). Finally, the last step is to add a small image to the PDFs which contain certain keywords.
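A minimal sketch of the extraction step, assuming the text of each PDF has already been pulled out by a separate tool and that each of the three lines can be recognised by a fixed label. The function name `extract_fields` and the labels passed to it are placeholders, not the real ones.

```python
from typing import Dict, Iterable


def extract_fields(text: str, labels: Iterable[str]) -> Dict[str, str]:
    """Map each label to the first line of `text` containing it (case-insensitive)."""
    found: Dict[str, str] = {}
    lines = text.splitlines()
    for label in labels:
        needle = label.lower()
        for line in lines:
            if needle in line.lower():
                found[label] = line.strip()
                break  # keep only the first matching line per label
    return found
```

Labels that never occur are simply missing from the result, which makes incomplete documents easy to flag.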
I will provide remote access to the credentials of the site.
It looks like one or more of the following technologies could be applied to the project
I would like to provide these links as a guide; please read them, including the comments.
The end result could use a combination of scripts from different programs/technologies.
Maybe a bash script could handle the PDF part.
Instead of adding the image to the PDFs, increasing the font size, bolding, and underlining the keywords is enough.
I need the project completed in 2 days.
The order of completion is a priority:
First, the PDF manipulation: searching for the list of keywords in each file, then increasing the font size, bolding, and underlining those keywords in the document.
Second, the scraping and downloading of the files.
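For the first priority, the keyword search itself could look something like the sketch below. It works on text already extracted from each PDF and only locates the keywords; changing the font size, bolding, and underlining inside the PDF would need a separate PDF tool and is not shown. The function name `find_keywords` is hypothetical.

```python
import re
from typing import Dict, Iterable, List


def find_keywords(text: str, keywords: Iterable[str]) -> Dict[str, List[int]]:
    """Return the character offsets of every whole-word, case-insensitive hit."""
    hits: Dict[str, List[int]] = {}
    for keyword in keywords:
        # Lookarounds stop "urgent" from matching inside "urgently".
        pattern = re.compile(
            r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE
        )
        offsets = [match.start() for match in pattern.finditer(text)]
        if offsets:
            hits[keyword] = offsets
    return hits
```

A file with an empty result contains none of the keywords and can be left unchanged.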