Data Scraping/Mining
$30-5000 USD
Paid on delivery
To scrape/mine bibliographic information and enhanced descriptive product information from the websites of 2 technical book publishers, and to collect book jacket images. Each publisher will have no more than 500 records to collect. The resulting data should be presented in CSV format.
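As a rough illustration of the requested CSV output, here is a minimal Python sketch that uses only the standard library. The column names in `FIELDS` and the helper `records_to_csv` are hypothetical; the posting does not say which fields are required, so the real columns would have to be agreed with the client.

```python
import csv
import io

# Hypothetical column names; the posting does not specify the exact fields.
FIELDS = ["publisher", "title", "author", "isbn", "description", "jacket_image"]


def records_to_csv(records):
    """Serialise a list of dict records to CSV text, one row per book."""
    buf = io.StringIO()
    # extrasaction="ignore" drops any keys that are not in FIELDS.
    writer = csv.DictWriter(buf, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for rec in records:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({name: rec.get(name, "") for name in FIELDS})
    return buf.getvalue()
```

The `csv` module quotes values that contain commas or line breaks, which matters for long descriptive text. The `jacket_image` column is assumed here to hold the file name of the downloaded cover image.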
## Deliverables
One of the sites to be mined requires a user name and password, which we do not wish to give out at this stage. Instead, the ZIP file contains 4 files as samples of the source HTML, which we hope will be enough for the coder to advise on the work that needs to be done. The 4 files are:

- [url removed, login to view]: the main record for one book title, with all the required bibliographic information, including descriptions.
- [url removed, login to view]: the contents list for this title. The website provides this as a link from the main title page.

The two files above are taken from our password-protected account on this company's website. The other 2 files are taken from the freely accessible public website of [url removed, login to view] and are, as above, the main title record plus the linked contents list:

- [url removed, login to view]
- [url removed, login to view]
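Since the samples are saved HTML files, a first pass could be developed offline against them. The sketch below is only an assumption about how that might look: it uses Python's standard `html.parser` to pull the page `<title>` and any `<img>` sources from one page. The class `TitlePageParser` and function `parse_title_page` are hypothetical names, and the real pages will almost certainly need selectors matched to their actual markup, which is not shown in this posting.

```python
from html.parser import HTMLParser


class TitlePageParser(HTMLParser):
    """Collects the <title> text and every <img> src from one saved page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "img":
            # attrs is a list of (name, value) pairs.
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_title_page(html_text):
    """Return the page title and image URLs found in the HTML text."""
    parser = TitlePageParser()
    parser.feed(html_text)
    parser.close()
    return {"title": parser.title.strip(), "images": parser.images}
```

The same function could be run over each of the 4 sample files after reading them from disk, which would show how far the public and password-protected layouts differ before any live requests are made.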
Project ID: #2930431