Find Jobs
Hire Freelancers

cluster-density-reducing unsupervised feature selection algorithm

$10-30 CAD

Закрито
Опублікований over 9 years ago

$10-30 CAD

Оплачується при отриманні
(1) Given a distance measure between two vectors X and Y, D(X,Y) = 1⁄2 { (var(X) + var(Y))2 - √[ (var(X) + var(Y))2 - 4 var(X) var(Y) (1-ρ(X,Y)2 ] } ρ(x,y) = cov(X,Y)/√(var(X)var(Y)). var(X) = (1/n) (∑(X-meanX)2), where n is the dimensionality of vector X. cov(X,Y) = (1/n) (∑(X-meanX)(Y-meanY)), where both X and Y have the same dimensionality, n. (2) Given a number of features (n), there exists a feature set O = {Fi, i= 1, 2... n}, which may be reducible. For the sake of this test, the features are represented by ‘feature vectors’ (e.g., X=<FV1, FV2, ... FVn>), where the dimensions of each and every vector is equal to the number of randomly-picked instances of a dataset used to construct the feature vectors (say 10% of the available instances). (3) Implement the following cluster-density-reducing unsupervised feature selection algorithm (or CDR) For every feature vector, compute the D distance between the vector and all other vectors; Find the smallest distance (>0) between any two vectors: call that D_min_global; Let e = k x D_min_global (where k is a real-valued constant =>1 entered by the user); Repeat { For all vectors whose minimum D distance to another vector =~ D_min_global Do { Find a nearest vector (this will give you a pair of vectors, Va and Vb); Discard one of the pair of vectors (Va, Vb); // function description below } Re-compute D_min_global; // as the elimination of features will alter it } Until (D_min_global > e) Output retained features; // just a list of feature indexes, such as FV1, FV3, FV14. Discard one of the pair of vectors (Va, Vb) { Create a vector_cluster = {Va,Vb}; Let k=2; // meaning 2nd nearest neighbour 10: If there are no more nearest neighbours Then { Randomly discard Va or Vb; Flip switch; Exit function }; } Find the k-nearest_neighbour of Va and of Vb; // they may be two distinct vectors or the same one Let vector_cluster = vector_cluster + k-nearest_neighbour of Va and of Vb; Calculate D_average of Va and of Vb to the other vectors in the vector_cluster; If D_average of (Va>Vb) OR (Va<Vb) Do { If (switch = true) Then discard Va or Vb with lowest D_average Else discard the other; Switch = NOT(switch); // flip the Switch, where switch is a global Boolean variable } Else { k++; go to 10 } // in order to increase the size of the vector_cluster (4) To solve the test, you will need to: - Implementing & debugging the distance measure (described above) - Possibly, adapt a nearest neighbor classifier to the needs of applying CDR to one of the data set below. Description: [login to view URL] Example code in Java: [login to view URL] - Design & Implement a function to read from one of the data sets at: [login to view URL] (e.g., REALDISP, Image Segmentation and Robot Execution Failures datasets). - Incrementally Implement & Debug the algorithm (CDR): first to compute D, then the Discard function and finally, to implement the whole algorithm. - Provide a theoretical assessment of the time complexity of CDR as a function of n (the initial number of features); test your prediction empirically by running your program on sub-sets with different n values. - Present the program & output file and a written report on the results- especially, theoretical & empirical time scalability results. Marks are out of 100% and they are assigned as follows: 35% accurate & efficient Java or C++ program implementation of the CDR algorithm (USB key); 15% a complete output file exhibiting feature reduction for one dataset (USB key); 15% correct theoretical analysis of the order of time complexity of your implementation (paper); 35% correct analysis of the empirical time-performance of your program, with justification for any divergence between it and the theoretical analysis (on paper). Attach a completed & signed statement of originality form to your submission.
ID проекту: 6677637

Про проект

2 пропозицій(-ї)
Дистанційний проект
Активність 9 yrs ago

Хочете заробити?

Переваги подання заявок на Freelancer

Вкажіть свій бюджет та терміни
Отримайте гроші за свою роботу
Опишіть свою пропозицію
Реєстрація та подання заявок у проекти є безкоштовними
2 фрілансерів(-и) готові виконати цю роботу у середньому за $84 CAD
Аватарка користувача
Hello I'm interesting your project very well I'm a Good C/C++, Java, Math, Algorithm expert. I understand your req exactly. I m quite well experienced in these assignment jobs. Let's go ahead with me I want to service for you continously. Thanks
$90 CAD за 1 день
4,9 (436 відгуки(-ів))
8,5
8,5
Аватарка користувача
Dear i have done a lot of work in it and also currently working on these type of projects. I will deliver you best quality work on time.
$77 CAD за 1 день
4,0 (8 відгуки(-ів))
2,8
2,8

Про клієнта

Прапор CANADA
Montreal, Canada
5,0
10
Спосіб оплати верифіковано
На сайті з лист. 11, 2013

Верифікація клієнта

Дякуємо! Ми надіслали на вашу електронну пошту посилання для отримання безкоштовного кредиту.
Під час надсилання електронного листа сталася помилка. Будь ласка, спробуйте ще раз.
Зареєстрованих користувачів Загальна кількість опублікованих робіт
Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)
Завантажуємо для перегляду
Дозвіл на визначення геолокації надано.
Ваш сеанс входу закінчився, і сеанс було закрито. Будь ласка, увійдіть знову.