Web-scrape data for 120,929 companies, including 7,000+ email addresses, from [login to view URL] supplier references
I work as a salesperson at a surveying firm in Hong Kong. I need to find clients on a daily basis, but my employer provides virtually no support for lead generation. I have decided to build an Excel spreadsheet based on this Trade Development Council (TDC) link: [login to view URL]
If you type "limited" into the company-name search box on the website, data for 120,929 companies is available as at yesterday, 16/6/2018, across 2,016 pages at 60 results per page. I have found some problems.
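The page count above can be checked with a little arithmetic. This is a minimal sketch using only the figures quoted in this post; the function names are my own, and the real site may paginate differently.

```python
# Figures quoted in the post (as at 16/6/2018).
TOTAL_COMPANIES = 120_929
RESULTS_PER_PAGE = 60


def page_count(total: int, per_page: int) -> int:
    """Number of result pages needed to list `total` items (ceiling division)."""
    return (total + per_page - 1) // per_page


def results_on_page(page: int, total: int, per_page: int) -> int:
    """Number of results expected on a 1-based page number."""
    pages = page_count(total, per_page)
    if not 1 <= page <= pages:
        raise ValueError(f"page must be between 1 and {pages}")
    if page < pages:
        return per_page
    # The last page holds whatever is left over.
    return total - per_page * (pages - 1)


print(page_count(TOTAL_COMPANIES, RESULTS_PER_PAGE))             # 2016
print(results_on_page(2016, TOTAL_COMPANIES, RESULTS_PER_PAGE))  # 29
```

So a complete download should end with 2,015 full pages and a final page of 29 companies, which is a quick way to confirm that nothing was missed.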
The first problem: both the fax and the telephone numbers are stored as JPEG images. I suggest using OCR to convert the two sets of numbers to text.
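OCR output for numbers usually needs a clean-up pass. The sketch below covers only that step, after whichever OCR tool is used has returned a raw string. The character substitutions are assumptions about typical OCR confusions, not something observed on the TDC images, and the sample numbers are made up.

```python
import re

# Assumed letter-for-digit confusions in OCR output; adjust after
# inspecting what the chosen OCR tool actually returns.
OCR_DIGIT_FIXES = str.maketrans({
    "O": "0", "o": "0",
    "I": "1", "l": "1",
    "S": "5",
    "B": "8",
})


def clean_ocr_number(raw: str) -> str:
    """Turn a raw OCR string into a digits-only phone or fax number.

    A leading '+' (international prefix) is kept; every other
    non-digit character is dropped.
    """
    text = raw.strip().translate(OCR_DIGIT_FIXES)
    prefix = "+" if text.startswith("+") else ""
    return prefix + re.sub(r"\D", "", text)


# Made-up sample strings, for illustration only.
print(clean_ocr_number("(852) 1234-56O8"))
print(clean_ocr_number(" +852 12B4 5l78 "))
```

Numbers that come out with an unexpected length after cleaning are worth flagging for a manual check rather than trusting the OCR result.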
The second problem: downloading the first 12 pages goes smoothly, but downloading the 13th page is blocked by the website.
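I do not know why the 13th page is blocked. If it is a rate limit (an assumption), slowing down and retrying politely may be enough. The sketch below shows that idea; `fetch_pages`, the `fetch` callback, and the delay values are all hypothetical, and the actual download code is left to whoever takes the job.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def fetch_pages(
    pages: Iterable[int],
    fetch: Callable[[int], Optional[str]],
    delay: float = 2.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, str]:
    """Fetch pages one by one, pausing between requests.

    `fetch(page)` is supplied by the caller and should return the page
    HTML, or None when the site refuses the request. A refused page is
    retried with exponentially growing waits; if it is still refused
    after `max_retries` retries, the pages collected so far are
    returned so the job can be resumed later.
    """
    results: Dict[int, str] = {}
    for page in pages:
        for attempt in range(max_retries + 1):
            html = fetch(page)
            if html is not None:
                results[page] = html
                break
            if attempt < max_retries:
                sleep(delay * 2 ** (attempt + 1))  # back off: 2x, 4x, 8x delay
        else:
            return results  # still blocked: stop instead of hammering the site
        sleep(delay)  # fixed pause between successful pages
    return results


# Demo with a stand-in fetcher that refuses page 13 once; no real waiting.
seen: Dict[int, int] = {}


def demo_fetch(page: int) -> Optional[str]:
    seen[page] = seen.get(page, 0) + 1
    return None if (page == 13 and seen[page] == 1) else f"<html>page {page}</html>"


print(sorted(fetch_pages(range(12, 15), demo_fetch, sleep=lambda s: None)))
```

Returning partial results keeps the pages that were already downloaded, so a blocked run can continue from the last good page.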
The third problem: some 7,000 email addresses are on the TDC pages of suppliers that have supplier references.
In short, the project consists of 2 parts. The first part is to turn the web pages into data: download the datasets for 120,000+ companies into an MS Excel spreadsheet with the following fields:
1. Company name
2. Year of establishment
3. Number of staff
4. Nature of business
5. Annual turnover
6. Industry product/services range
7. Office address
8. Country / Region
9. Telephone (needs to be converted from the image with OCR)
10. Fax (needs to be converted from the image with OCR)
12. Contact person
13. Title of contact person
Visit: [login to view URL]
Product pages are NOT needed. The important pages are COMPANY and CONTACT.
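To make the expected spreadsheet layout concrete, here is a minimal sketch that writes the fields listed above as CSV, which Excel can open. The column names follow my list; the function name and the sample row are made up for illustration, and a real delivery could just as well be an .xlsx file.

```python
import csv
import io
from typing import Dict, Iterable

# Column order follows the numbered field list in this post.
FIELDS = [
    "Company name",
    "Year of establishment",
    "Number of staff",
    "Nature of business",
    "Annual turnover",
    "Industry product/services range",
    "Office address",
    "Country / Region",
    "Telephone",
    "Fax",
    "Contact person",
    "Title of contact person",
]


def to_csv(rows: Iterable[Dict[str, str]]) -> str:
    """Serialise company records to CSV text with a header row.

    Missing fields are left blank; keys not in FIELDS are ignored.
    """
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample record, for illustration only.
sample = [{"Company name": "Example Trading Limited", "Country / Region": "Hong Kong"}]
print(to_csv(sample))
```

Leaving missing values blank rather than dropping the company keeps every record in the sheet, so incomplete entries can still be filtered or filled in later.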
The second part of this project is to collect the 7,000+ email addresses of the 7,000+ companies that have supplier references.
14. Email address (7,000+ email addresses from the 7,000+ companies marked with the "supplier references" ✓✓ logo)
The project is expected to take 3 days. The deadline is 21 June 2018.
Language: English.
My budget is US $50, awarded to the project winner. I will pay the NDA (non-disclosure agreement) fee.