Information Retrieval

I need to build something like a search engine. I have a collection of documents containing Chinese news articles, and I need to do text processing on them. By text processing I mean tokenization, normalization, and building a matrix of the terms and the documents in which each term occurs. I also have a set of queries: when I enter a query, I need to get back the query ID and the article IDs sorted by relevance. For example:

1 (query id)

12 1923 182 7192 19 2988 3999 (news IDs, separated by spaces, sorted by relevance)

The system should retrieve at most 100 relevant news articles. I have attached one sample document for you to look at. I look forward to hearing from you.
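
A minimal sketch of the pipeline described above, not a finished implementation. It makes several assumptions that are not in the posting: articles arrive as a dict mapping news ID to text, Chinese text is tokenized into overlapping character bigrams (a dictionary-based word segmenter could be swapped in), normalization is Unicode NFKC plus lowercasing, and relevance is TF-IDF with cosine-style length normalization. The function names and the sample articles are made up for illustration.

```python
import math
import re
import unicodedata
from collections import Counter

# Runs of ASCII letters/digits, or runs of common CJK ideographs.
RUN = re.compile(r"[a-z0-9]+|[\u4e00-\u9fff]+")

def tokenize(text):
    """Normalize, then emit ASCII words whole and CJK runs as bigrams."""
    # NFKC folds full-width letters and digits to their ASCII forms.
    text = unicodedata.normalize("NFKC", text).lower()
    tokens = []
    for run in RUN.findall(text):
        if run[0].isascii() or len(run) == 1:
            tokens.append(run)
        else:
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens

def build_matrix(docs):
    """Sparse term-document matrix: term -> {news_id: term frequency}."""
    matrix = {}
    for news_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            matrix.setdefault(term, {})[news_id] = tf
    return matrix

def rank(query, matrix, n_docs, limit=100):
    """Return at most `limit` news IDs, most relevant first."""
    idf = {t: math.log(n_docs / len(p)) + 1.0 for t, p in matrix.items()}
    # Squared length of each document's TF-IDF vector.
    norms = Counter()
    for term, postings in matrix.items():
        for news_id, tf in postings.items():
            norms[news_id] += (tf * idf[term]) ** 2
    # Dot product between the query vector and each matching document.
    scores = Counter()
    for term, qtf in Counter(tokenize(query)).items():
        for news_id, tf in matrix.get(term, {}).items():
            scores[news_id] += (qtf * idf[term]) * (tf * idf[term])
    # The query's own length is the same for every document, so it is omitted.
    ranked = sorted(scores, key=lambda d: (-scores[d] / math.sqrt(norms[d]), d))
    return ranked[:limit]

def format_result(query_id, news_ids):
    """Query ID on one line, then space-separated news IDs."""
    return f"{query_id}\n{' '.join(str(i) for i in news_ids)}"

# Hypothetical sample articles, for illustration only.
docs = {
    12: "台北市長今天宣布新的交通政策",
    1923: "股市今天大漲",
    182: "台北捷運新路線通車",
}
matrix = build_matrix(docs)
print(format_result(1, rank("台北交通", matrix, len(docs))))
```

Articles that share no term with the query are left out of the result, so a query can return fewer than 100 IDs. For a large collection, the IDF values and document lengths would be computed once at indexing time rather than on every query.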

Skills: Topical Ad Placement, Java, PHP, Python, Software Architecture


About the employer:
(0 reviews) Taipei, Taiwan

Project ID: #6839864

3 freelancers are willing to do this job for an average of $40


Please check [login to view URL]~supporttest/[login to view URL], where you can see a sample of the work, and let me know if you need the same thing. Thanks, Subhasish

$66 USD in 1 day
(7 reviews)

A proposal has not yet been provided

$25 USD in 1 day
(1 review)

A proposal has not yet been provided

$35 USD in 1 day
(3 reviews)

We could have something up and running in no time using a Solr server, which sits on top of the Lucene information retrieval library. I can personally attest to the quality of the search results, having worked with Solr/Lu…

$30 USD in 1 day
(0 reviews)