Bidders must be familiar with the Google/Yahoo sitemap protocols.
English-speaking programmers only, please.
I am looking for a script that generates XML/TXT sitemaps in Google/Yahoo formats.
The script must be able to 'fully spider' the external, user-submitted site without using excessive resources on my server.
The script must:-
Accept a submitted page URL as well as a domain name.
Have a 'crawl' progress meter.
Have the complete resulting output available as a variable or array.
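To illustrate the three points above, here is a rough sketch of the kind of crawl loop I have in mind. It is written in Python purely for brevity (the delivered script must still be PHP), and every name in it (crawl, fetch_links, on_progress, max_pages, delay) is hypothetical. The page cap and the pause between requests are only assumed ways of keeping server load down; bidders are welcome to propose something better.

```python
import time
from collections import deque


def crawl(start_url, fetch_links, on_progress=None, max_pages=500, delay=0.0):
    """Breadth-first crawl that returns every visited URL as a list.

    fetch_links is a caller-supplied (hypothetical) function that downloads
    one page and returns the URLs it links to.
    """
    queue = deque([start_url])   # URLs waiting to be fetched
    seen = {start_url}           # URLs already queued, to avoid repeats
    found = []                   # the complete result, in visit order

    while queue and len(found) < max_pages:
        url = queue.popleft()
        found.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
        if on_progress:
            # Progress meter hook: pages done so far, pages still queued.
            on_progress(len(found), len(queue))
        if delay:
            time.sleep(delay)    # pause between requests to limit load
    return found
```

The start URL can be a single page or a domain root, the callback can drive whatever progress display is used, and the returned list is the "output as an array" that the later export steps would consume.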
It must also have the following user 'options':-
Obey [login to view URL] file?
Obey robots meta tags?
Filtering of certain query-string parameters? e.g. sessionid, id, sess
Options to pick up hrefs to other indexable files, e.g. pdf, xls, doc
Start from a directory deeper than the domain root and never spider upwards from it.
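As a rough guide to how I picture some of these options behaving, here is a small sketch, again in Python for brevity only (the real script must be PHP). All function names and default values are hypothetical, and the parameter and extension lists are just the examples given above. It covers the robots meta tag, query-string filtering, extra file types and the directory restriction.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical defaults taken from the examples in this posting.
SESSION_PARAMS = {"sessionid", "id", "sess"}
EXTRA_EXTENSIONS = {".pdf", ".xls", ".doc"}


def strip_params(url, blocked=SESSION_PARAMS):
    """Remove blocked query-string parameters (and any #fragment) from a URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in blocked]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def in_scope(url, start_url):
    """True if url is on the same host and at or below the start directory."""
    u, s = urlsplit(url), urlsplit(start_url)
    base_dir = s.path[: s.path.rfind("/") + 1] or "/"
    return u.netloc.lower() == s.netloc.lower() and (u.path or "/").startswith(base_dir)


def is_extra_file(url, extensions=EXTRA_EXTENSIONS):
    """True if the URL path ends in one of the extra indexable file types."""
    path = urlsplit(url).path.lower()
    return any(path.endswith(ext) for ext in extensions)


class _RobotsMeta(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            for token in (attr.get("content") or "").split(","):
                if token.strip():
                    self.directives.add(token.strip().lower())


def robots_meta_directives(html):
    """Return the set of robots meta directives found in a page, lower-cased."""
    parser = _RobotsMeta()
    parser.feed(html)
    return parser.directives
```

With the meta option switched on, a page whose directives include "noindex" would be left out of the sitemap and one with "nofollow" would not have its links queued. Each check is a separate function so that every user option can be toggled independently.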
The resulting output of the script must be able to be:-
Edited online.
Downloadable in Google (XML), ROR (XML), Yahoo (text) and HTML formats.
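For reference, the sketch below shows roughly what I expect three of these exports to look like, assuming the crawl result is a plain list of URLs. It is in Python for brevity only (the delivered code must be PHP) and the function names are hypothetical. The XML output uses the standard sitemaps.org urlset namespace with only the required loc element, and the text output is one URL per line. The ROR format is deliberately left out of the sketch.

```python
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def to_xml_sitemap(urls):
    """Build a minimal XML sitemap containing only <loc> entries."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="%s">' % SITEMAP_NS]
    for url in urls:
        # Characters such as & must be entity-escaped inside <loc>.
        lines.append("  <url><loc>%s</loc></url>" % xml_escape(url))
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def to_text_list(urls):
    """Build a plain-text URL list, one URL per line."""
    return "".join(url + "\n" for url in urls)


def to_html_list(urls):
    """Build a simple HTML list of links to drop into a page template."""
    items = ['  <li><a href="{0}">{0}</a></li>'.format(html_escape(url, quote=True))
             for url in urls]
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n"
```

Because all three functions take the same list, an online edit step only needs to change that list before any of the downloads are generated.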
Similar scripts and tips can be found at:-
[login to view URL]
[login to view URL]
[login to view URL]
Overall design is not important as I have a template to drop it into.
Working functionality and well-commented PHP code are a must, please.