An existing recursive Python crawler built with Beautiful Soup and SQLite needs the following fixes:
1) The crawler is too aggressive during parsing, and its recursive threads sometimes grab the same links. Fix this so that each running thread claims a different link (no link is ever fetched twice) and so the crawler throttles its requests instead of hammering target servers. Both modes must then be tested: a crawl restricted to a single site, and an open internet-wide crawl, confirming that both work correctly.
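One way requirement 1 is commonly handled is an atomic check-and-mark on a shared "seen" set, guarded by a lock, plus a small per-request delay. The sketch below is illustrative only (the URLs, the `claim`/`worker` names, and the delay value are placeholders, not part of the existing crawler); it shows that deliberately enqueued duplicate links are each processed exactly once across four threads:

```python
import threading
import time
from queue import Queue, Empty

seen = set()                    # URLs any thread has already claimed
seen_lock = threading.Lock()    # guards the check-and-mark on `seen`
processed = []                  # URLs actually handled (for the demo)
processed_lock = threading.Lock()

CRAWL_DELAY = 0.01              # politeness delay per request (tune for real sites)

def claim(url):
    """Atomically decide whether this thread wins the right to crawl `url`."""
    with seen_lock:
        if url in seen:
            return False        # another thread already took it
        seen.add(url)
        return True

def worker(frontier):
    while True:
        try:
            url = frontier.get_nowait()
        except Empty:
            return
        if claim(url):
            time.sleep(CRAWL_DELAY)   # throttle instead of hammering the host
            with processed_lock:
                processed.append(url)
        frontier.task_done()

frontier = Queue()
# Deliberately enqueue duplicates to show the deduplication works.
for u in ["http://example.com/a", "http://example.com/b"] * 5:
    frontier.put(u)

threads = [threading.Thread(target=worker, args=(frontier,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In the real crawler the `seen` set would be backed by (or synced to) the SQLite table of visited URLs, but the lock-protected check-and-mark is the part that stops two threads from grabbing the same link.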
2) Add proxy support, so that all crawling and parsing traffic goes through rotating proxies. The proxy pool must be defined as a single array, so the proxy list can be changed at any time in one place and the crawler will keep working exactly as it did before the change.
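A minimal sketch of how the single-array pool could work, assuming the crawler makes `requests`-style HTTP calls (the proxy addresses below are placeholders, and `next_proxy` is a hypothetical helper, not an existing function). Editing `PROXY_POOL` is the one spot where proxies change; the cycle and the lock make rotation safe under the crawler's threads:

```python
import itertools
import threading

# The one editable spot: swap, add, or remove proxies here and the
# crawler keeps rotating through whatever the list currently holds.
PROXY_POOL = [
    "http://proxy1.example:8080",
    "http://proxy2.example:8080",
    "http://proxy3.example:8080",
]

_proxy_cycle = itertools.cycle(PROXY_POOL)
_proxy_lock = threading.Lock()   # next() on a shared iterator needs guarding

def next_proxy():
    """Return a requests-style proxies dict, rotating through the pool."""
    with _proxy_lock:
        p = next(_proxy_cycle)
    return {"http": p, "https": p}
```

Each fetch would then pass the dict along, e.g. `requests.get(url, proxies=next_proxy())`, so rotation happens per request without any other code change.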
**IMPORTANT:**
Please quote the full name of the inventor of Python for your bid to be considered.
The employer will not act as an unpaid tester for the worker. Testing to 100% of the spec is a requirement of this project; the employer will test the final deliverables only once the worker is 100% confident they work to spec.
The 3-day deadline WILL be extended at the worker's request IF sufficient progress is observed during the first 3 days of the project (and during each successive extension).