I have a MySQL DB with website URLs and server paths of pictures in it.
I am looking for someone who can program the following multithreaded Python crawler:
- Each website URL that has not been checked yet is visited.
- The crawler checks whether the source text of the website still contains the picture URL.
- If it does, a "YES" is written to the DB column "online". If not, a "NO" is written to that column.
- If the picture URL is still online, the crawler checks whether a certain variable text string (stored in the DB) appears on the website (important: in the visible text of the page, not in the source text, to exclude alt tags, etc.). If yes, a "named" is written to the column "photographer". If not, a "NO" is written to that column.
- Proxies need to be used for that project (available here)
- I want the option to set a delay between requests to website URLs on the same domain.
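
For bidders, the two checks and the per-domain delay described above could be sketched roughly as follows. This is only an illustration under assumptions, not a required design: the names `VisibleTextParser`, `visible_text`, `check_page` and `DomainThrottle` are hypothetical, fetching through proxies and the MySQL reads/writes are left out, and returning `None` for the "photographer" value when the picture is offline is an assumption, since the posting does not say what to store in that case.

```python
import threading
import time
from html.parser import HTMLParser
from urllib.parse import urlsplit


class VisibleTextParser(HTMLParser):
    """Collects text nodes only, so attribute values such as alt="..." are ignored."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text inside <script>/<style>/<noscript> is not shown to visitors.
        if self._skip_depth == 0:
            self.parts.append(data)


def visible_text(html):
    """Return the page text with whitespace collapsed to single spaces."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def check_page(html, picture_url, photographer_text):
    """Return the (online, photographer) column values for one fetched page."""
    if picture_url not in html:
        # Assumption: the photographer check is skipped when the picture is gone.
        return "NO", None
    found = photographer_text.lower() in visible_text(html).lower()
    return "YES", ("named" if found else "NO")


class DomainThrottle:
    """Thread-safe minimum delay between requests to the same domain."""

    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self._lock = threading.Lock()
        self._next_allowed = {}  # domain -> earliest monotonic time for next request

    def wait(self, url):
        """Sleep until this URL's domain may be requested; return seconds slept."""
        domain = urlsplit(url).netloc.lower()
        with self._lock:
            now = time.monotonic()
            start = max(now, self._next_allowed.get(domain, now))
            # Reserve the slot before releasing the lock so threads queue up in order.
            self._next_allowed[domain] = start + self.delay
        time.sleep(start - now)
        return start - now
```

Each worker thread would call `throttle.wait(url)` before fetching a page and pass the downloaded HTML to `check_page`. The string match here is case-insensitive, which is a choice of this sketch and easy to change.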
Looking forward to your bids!
I can have this done in Python very, very quickly. I also wanted to know if you're okay with it not having a GUI, because Python GUIs and multithreading do not mix well.
$25 USD in 1 day
4.9 (2 reviews)
4 freelancers are ready to do this job for an average of $62 USD
Python master here. I have worked with many Python bots in the past. I am sure I can have this done in a day.
Please also check my feedback and portfolio. Let me know when I can start.