I want a web crawler script with a search engine, where users can submit sites for crawling.
Project details are:
Admin features:
Add/manage/delete domains.
Specify news domains for faster crawling.
Restrict domains even after crawling has started.
Delete already crawled results of the restricted domains.
Search for domains.
View the queued pages and crawled pages.
View the list of all identified keywords from each page.
View the query keywords along with the number of total results.
Set the number of pages to be crawled by the bot for a single execution.
Limit page search by depth.
Limit the number of meta keywords collected from each page while crawling.
Specify the size and width of thumbshots for image results.
Create domain sets (a group of websites or domains) to create a service-specific search channel.
Manage/delete domain sets.
Create categories for a theme-based search channel (Sports, Cricket, etc.).
Manage/delete categories.
Log every search query request the engine receives, with all parameters related to the query.
Create any number of API keys.
Use API keys to authorize/sell search data access to external parties.
Track and control the usage of spider data under each API key.
View the statistics of different searches based on API keys.
Track the search history and evaluate performance from each API key.
Manage Family Filter to perform screening on results.
Add/manage/delete countries and languages.
Statistics, including page queue details, crawled page details, etc.
Advertising space at the top, bottom, and left of the search results.
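
To make three of the crawl controls above concrete (pages per execution, depth limit, and restricting domains after crawling has started), here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a proposed implementation: the names crawl, LinkParser, max_pages, max_depth and restricted are hypothetical, and the fetch argument is assumed to be any function that takes a URL and returns the page HTML as a string (or None on failure), so the sketch itself does no network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, max_pages=100, max_depth=2, restricted=frozenset()):
    """Breadth-first crawl of the seed URLs.

    max_pages  -- page budget for a single execution (assumed admin setting)
    max_depth  -- link depth limit, seeds are depth 0 (assumed admin setting)
    restricted -- host names that must be skipped (assumed admin setting)
    Returns (crawled_urls, still_queued_urls).
    """
    queue = deque((url, 0) for url in seeds)
    seen = set(seeds)
    crawled = []
    while queue and len(crawled) < max_pages:
        url, depth = queue.popleft()
        # The check happens at dequeue time, so a domain restricted after
        # its pages were queued is still skipped.
        if urlparse(url).netloc in restricted:
            continue
        html = fetch(url)
        if html is None:
            continue
        crawled.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return crawled, [u for u, _ in queue]
```

The second return value is what a "queued pages" view could display after a run stops at its page budget. A real crawler would also need robots.txt handling, politeness delays, and persistent storage, none of which are shown here.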
Efficient communication is the key to success.
I would like to ask some questions related to your project for a better understanding and to prepare my proposal. My price/time will be determined if and when a discussion takes place.
-
CL