Cluster Analysis (using existing code) / MySQL database

$30-250 USD

Closed
Posted over 11 years ago

Paid on delivery
I have .[login to view URL] that I think contains all the pieces you need; if you think something is missing, please let me know. The rough workflow is as follows.
[login to view URL] will set up a database for clustering in the correct format. Ideally I would like to use my existing database on GoDaddy, but I am open to other suggestions. You will need to change the "data" table at the very bottom so that it is a view across your actual page data. The view is expected to expose the page id (a unique identifier) and a hash of the DOM. When you run the script you can specify a database schema, and all of the tables will go in that schema.
Compile qfp.c with "gcc -o qfp qfp.c". Then run [login to view URL]. This script takes a lot of options and lets you customize where the database and all the tables are. Once you have done all this, congratulations, you have clusters in your database!
The [login to view URL] table contains the actual clusters. For each page it holds a (rep_id, page_id) pair, where rep_id is essentially the cluster id (it is actually just the id of the lowest page in the cluster). Depending on what you want to do with the clusters, this may be all you need.
You can compile [login to view URL] with "javac -cp [login to view URL]:. web_clustering/[login to view URL]". You may want to make a copy of this file before modifying it, so that you can refer back to the original if you delete too much and break something. If you compile and run web_clustering/[login to view URL], it will generate a web site that shows your clusters, gives screenshots of common pages in the clusters (assuming you have screenshots enabled on Neha's crawler), and lets you look at their DOMs easily. You have to compile and run it from the main folder, not from within web_clustering, because it is part of the web_clustering Java package.
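To make the expected data shapes concrete, here is a minimal Python sketch of the two structures described above: rows of (page id, DOM hash) like the "data" view, and (rep_id, page_id) pairs where rep_id is the lowest page id in a cluster. It is not the algorithm in qfp.c, which is not shown in this posting. Grouping by identical hash is swapped in purely for illustration, and the function names and the choice of SHA-1 are assumptions.

```python
import hashlib
from collections import defaultdict


def dom_hash(dom: str) -> str:
    """Return a hex digest of the DOM text (SHA-1 is an assumption)."""
    return hashlib.sha1(dom.encode("utf-8")).hexdigest()


def build_data_rows(pages):
    """Mimic the "data" view: one (page_id, dom_hash) row per page."""
    return [(page_id, dom_hash(dom)) for page_id, dom in sorted(pages.items())]


def cluster_pairs(rows):
    """Group rows with identical hashes (illustrative stand-in, not qfp.c).

    Emits (rep_id, page_id) pairs where rep_id is the lowest page id
    in each group, matching the convention described in the posting.
    """
    groups = defaultdict(list)
    for page_id, digest in rows:
        groups[digest].append(page_id)
    pairs = []
    for ids in groups.values():
        rep_id = min(ids)
        pairs.extend((rep_id, page_id) for page_id in ids)
    return sorted(pairs)


# Hypothetical sample pages keyed by page id.
pages = {
    3: "<html><body><p>a</p></body></html>",
    1: "<html><body><p>a</p></body></html>",
    2: "<html><body><h1>b</h1></body></html>",
}
rows = build_data_rows(pages)
pairs = cluster_pairs(rows)
print(pairs)
```

In a real setup the rows would come from the MySQL view rather than an in-memory dictionary, and the pairs would be read from the clusters table that the provided script populates.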
Unfortunately you will need to dig into the Java file to change things like table names, the output location, and the location of your screenshot and DOM files. These are all hard-coded and spread across multiple files, so this part will be a little time consuming. Run it with the "-M" flag and delete any code that does not follow this execution path (there is a lot of it, because the original author added many different options to this code over time). Then you will probably need to modify the SQL queries to grab the page data correctly; I am not certain how much work this will be. If you can get this to compile and run, you should be left with an output directory that contains a number of folders and files, one of which is "[login to view URL]". Opening this file in a web browser gives you a main page that shows your 225 most common clusters and the most common screenshots of the pages in those clusters, with links to more information about the clusters, DOMs, etc. Let me know if you have any questions about any of this; I would be happy to answer them. Good luck! Happy Bidding!
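As a rough illustration of what "most common clusters" could mean, the sketch below ranks clusters by page count from (rep_id, page_id) pairs. How the Java generator actually ranks clusters is not stated in this posting, so the ordering rule (size descending, then rep_id ascending) and the function name are assumptions. The default of 225 only mirrors the number mentioned above.

```python
from collections import Counter


def top_clusters(pairs, n=225):
    """Return up to n (rep_id, size) tuples, largest clusters first.

    Ties are broken by the lower rep_id (an assumed, arbitrary rule).
    """
    sizes = Counter(rep_id for rep_id, _page_id in pairs)
    return sorted(sizes.items(), key=lambda item: (-item[1], item[0]))[:n]


# Hypothetical (rep_id, page_id) pairs as they might appear in the clusters table.
sample_pairs = [(1, 1), (1, 3), (1, 7), (2, 2), (4, 4), (4, 5)]
ranking = top_clusters(sample_pairs, n=2)
print(ranking)
```

The same ranking could be done in SQL with a GROUP BY on rep_id; the Python version is only meant to show the shape of the result.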
Project ID: 3996385

About the project

4 proposals
Remote project
Active 11 yrs ago

4 freelancers are bidding on average $208 USD for this job
I am a Java expert and I want to help you here. Please check your personal inbox for more details. I will wait for your reply. Thanks, AMit
$250 USD in 7 days
4.7 (100 reviews)
6.3
Hello sir. I have read all your requirements, and I am good at all of that. Please check the attached doc for my previous work. Hope to hear from you soon. Thanks!!
$200 USD in 10 days
5.0 (41 reviews)
4.8
Hi, I am confident I can handle this and will work until you are satisfied. I am keenly interested in this project. Thanks, with regards.
$195 USD in 4 days
5.0 (16 reviews)
4.0
Petra is a developer group with 5 years of experience in web development, desktop programming, and database design and programming. We have excellent expertise in web development languages and tools (PHP, Joomla, Drupal, Magento, HTML, CSS, AJAX, JavaScript, SEO, WordPress, etc.), programming languages (Java, C#), and database design (Oracle SQL, MySQL, MS SQL Server).
$185 USD in 7 days
0.0 (1 review)
1.6

About the client

Alexandria, United States
5.0
16
Payment method verified
Member since Jun 30, 2012
Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)