Need a person who can do mathematical analysis using tools like SAS and Excel
₹100-400 INR / hour
Closed
Posted over 7 years ago
Cluster Analysis to Find Patterns in Patients with Heart Disease
The SAS VA HEART dataset on the Teradata University Network site contains data about patients with heart disease. It includes such variables as gender, age of death, age of diagnosis, weight status, cholesterol status, and smoking status (Non-Smoker has Smoking variable coded as 0; Light has Smoking variable coded as 1-5; Moderate has Smoking variable coded as 6-15; Heavy has Smoking variable coded as 16-25; Very Heavy has Smoking variable coded as >25). In this problem, you will perform a cluster analysis to group similar patients among those who have died in this dataset.
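The smoking-status coding above can be written as a small lookup, which is handy when checking the categories outside SAS. This is only a sketch: the function name `smoking_status` is hypothetical, and it assumes the Heavy band ends at 25 so that it does not overlap the Very Heavy band (>25).

```python
def smoking_status(smoking):
    """Map the numeric Smoking value to the status label from the description.

    Assumption: Heavy covers 16-25, so it does not overlap Very Heavy (>25).
    """
    if smoking is None:
        return None  # missing value: no status can be assigned
    if smoking < 0:
        raise ValueError("Smoking value cannot be negative")
    if smoking == 0:
        return "Non-Smoker"
    if smoking <= 5:
        return "Light"
    if smoking <= 15:
        return "Moderate"
    if smoking <= 25:
        return "Heavy"
    return "Very Heavy"


# Example: label a few illustrative (made-up) Smoking values.
for value in (0, 3, 10, 20, 40):
    print(value, smoking_status(value))
```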
1. Open the HEART dataset and create some visualizations to get familiar with the data. (Note: You do not need to submit these visualizations to Moodle for this problem.)
2. Create clusters over patients who have died. To do so, filter the entire dataset on Status = Dead. Remove missing values.
3. Click the New Cluster icon on the toolbar and assign all the measure variables except Metropolitan Relative Weight and Age at Start to it.
4. Click the Properties tab. Notice that the number of clusters is set to 5, which is the default. Five clusters were created with cluster IDs 0-4. Change the number of clusters to 4.
5. Increase the Visible Roles to 7. Maximize the cluster matrix. Right-click on one of the cells that have Age of Death on the X axis. Select Plot Age of Death by Cluster ID. Which cluster has the patients who died the youngest?
6. Create a box plot of Smoking by Cluster ID. Which cluster represents those that were heavy smokers in this dataset?
7. Minimize the cluster matrix and maximize the parallel coordinates plot. The plot shows the cluster IDs on the left side of the plot and the effects along the top. The clusters are colored differently. The bar sizes on the left represent the number of observations in each cluster. The minimum and maximum values for each effect are shown at the top and bottom of the effect. By looking at the plot with all the clusters shown, what can you assess? For example, which cluster appears to have the patients with the highest cholesterol?
8. Which cluster can be classified as follows: Non-smokers who were older in age at death, had lower cholesterol, and had lower blood pressure?
9. Characterize each of the other clusters.
10. Which cluster is the most diverse; that is, which one has the largest Within-Cluster SS?
11. Right-click any of the cells in the cluster matrix. Select Derive a Cluster ID Variable. A new variable is created and appears in the Data pane. This may be used now as an input to other models.
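For anyone who wants to check the logic of the steps above outside SAS Visual Analytics, here is a minimal pure-Python sketch of the same workflow: filter to Status = Dead, drop rows with missing values, run k-means, compute the within-cluster sum of squares (SS) per cluster, and derive a cluster ID variable. It is not the SAS VA implementation. The column names, the choice to standardize the variables, the helper names, and the tiny made-up rows are all assumptions for illustration; the toy data uses 2 clusters rather than the 4 requested in the assignment.

```python
import random

MEASURES = ["AgeAtDeath", "Cholesterol", "Smoking"]  # assumed column names


def filter_dead_complete(rows, measures):
    """Keep rows with Status == 'Dead' and no missing measure values."""
    return [r for r in rows
            if r.get("Status") == "Dead"
            and all(r.get(m) is not None for m in measures)]


def standardize(points):
    """Scale each column to mean 0 and std 1 so no variable dominates distances."""
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(p, means, stds)] for p in points]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means (Lloyd's algorithm) with random initial centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    ss = [0.0] * len(centroids)
    for p, l in zip(points, labels):
        ss[l] += sq_dist(p, centroids[l])
    return ss


# Made-up rows for illustration only; these are NOT values from the HEART dataset.
rows = [
    {"Status": "Dead", "AgeAtDeath": 52, "Cholesterol": 280, "Smoking": 30},
    {"Status": "Dead", "AgeAtDeath": 55, "Cholesterol": 265, "Smoking": 25},
    {"Status": "Dead", "AgeAtDeath": 58, "Cholesterol": 270, "Smoking": 20},
    {"Status": "Dead", "AgeAtDeath": 81, "Cholesterol": 190, "Smoking": 0},
    {"Status": "Dead", "AgeAtDeath": 84, "Cholesterol": 200, "Smoking": 0},
    {"Status": "Dead", "AgeAtDeath": 79, "Cholesterol": 185, "Smoking": 5},
    {"Status": "Alive", "AgeAtDeath": None, "Cholesterol": 210, "Smoking": 10},
    {"Status": "Dead", "AgeAtDeath": 70, "Cholesterol": None, "Smoking": 15},
]

dead = filter_dead_complete(rows, MEASURES)            # step 2
points = standardize([[r[m] for m in MEASURES] for r in dead])
labels, centroids = kmeans(points, k=2)                # steps 3-4 (k is adjustable)
ss = within_cluster_ss(points, labels, centroids)      # step 10

# Step 11 analogue: store the derived cluster ID on each row.
for row, label in zip(dead, labels):
    row["ClusterID"] = label

most_spread = max(range(len(ss)), key=lambda c: ss[c])
print("Within-cluster SS per cluster:", ss)
print("Cluster with the largest within-cluster SS:", most_spread)
```

With the real data, the per-cluster means of each variable (for example, mean Smoking or mean Age of Death by `ClusterID`) would answer the characterization questions in steps 5 through 9.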
Hi
I have 10 years of experience in data analysis using Excel, ACL, IDEA, and other analytical tools. I have analyzed business processes such as Sales, Marketing, Procurement, Inventory, and Finance and Accounts.
I look forward to working on this project.
Thanks
Sundaram
Hi, I am Nikhil Gupta. I have 10 years of extensive experience in analytics. I rarely bid on this website, but I am planning to become more active here. I have worked with tools like Excel, SAS, and SPSS, and I have good exposure to statistical analysis. I can do this task at a nominal price and in very little time. Experience quality work by awarding me this project. Thanks, Nikhil Gupta
Hi Freelancer Employer, I am a university Associate Professor teaching research methodology and statistical analysis. I love doing research, and I have 15 years of research experience with a good track record. I have published my research in various journals indexed in SCOPUS and the Social Science citation impact database. I think my achievements could match your expectations if I am given the opportunity to analyze your data. Thanks for your kind consideration! Kindly contact me should you need more details.