Linear Regression Methods on Cardiovascular Disease Dataset

A Group Class Project in R, November - December 2022


Objective:
The dataset we examined for this project records various factors related to cardiovascular health and contains 12 features and 70,000 observations. Our primary focus is identifying factors that contribute to high blood pressure, specifically systolic blood pressure, as it is associated with more negative health outcomes.
● To create a regression function that selects the best model for predictive or explanatory purposes.
● To evaluate factors contributing to high systolic blood pressure using the dataset.
● To perform diagnostics and transformations to enhance model analysis.
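
As a rough illustration of the first objective, here is a minimal sketch of a function that fits Ordinary Least Squares, Ridge Regression, and LASSO models and picks the one with the lowest held-out error. It is not the code from our report: the data are simulated, the column names and the `compare_models` helper are hypothetical, and it assumes the `glmnet` package is installed.

```r
# Sketch only: simulated data stands in for the cardiovascular dataset.
library(glmnet)  # assumed to be installed; provides cv.glmnet()

set.seed(1)
n <- 500
# Hypothetical predictor names, chosen for illustration.
x <- matrix(rnorm(n * 5), n, 5,
            dimnames = list(NULL, c("age", "height", "weight",
                                    "cholesterol", "active")))
# Simulated systolic pressure driven by two of the predictors.
y <- 120 + 4 * x[, "weight"] + 3 * x[, "cholesterol"] + rnorm(n, sd = 5)

# Hypothetical helper: fit OLS, ridge, and LASSO on a training split,
# then compare them by root mean squared error on the held-out rows.
compare_models <- function(x, y, train_frac = 0.8) {
  idx <- sample(nrow(x), floor(train_frac * nrow(x)))
  rmse <- function(pred) sqrt(mean((y[-idx] - pred)^2))

  train_df <- data.frame(y = y[idx], x[idx, , drop = FALSE])
  test_df <- data.frame(x[-idx, , drop = FALSE])
  ols <- lm(y ~ ., data = train_df)

  # alpha = 0 gives ridge, alpha = 1 gives LASSO; the penalty strength
  # lambda is chosen by cross-validation on the training rows.
  ridge <- cv.glmnet(x[idx, ], y[idx], alpha = 0)
  lasso <- cv.glmnet(x[idx, ], y[idx], alpha = 1)

  errors <- c(
    ols = rmse(predict(ols, newdata = test_df)),
    ridge = rmse(predict(ridge, newx = x[-idx, ], s = "lambda.min")),
    lasso = rmse(predict(lasso, newx = x[-idx, ], s = "lambda.min"))
  )
  list(rmse = errors, best = names(which.min(errors)))
}

result <- compare_models(x, y)
print(result$rmse)
print(result$best)
```

Comparing on held-out rows suits the predictive goal; for explanatory purposes, the OLS coefficients and their significance tests would likely matter more than test error.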

Outcomes:
● We developed reusable functions that apply LASSO, Ridge Regression, and Ordinary Least Squares methods for predictive and explanatory analysis on any dataset.
● We selected the best predictive model for high systolic blood pressure in patients with cardiovascular health issues.
● We conducted diagnostics and transformations, discovering that the data are highly sensitive to outliers and that removing them is necessary to optimize regression function performance.
● Our explanatory model identified significant factors associated with high blood pressure, including gender, height, weight, cholesterol, activity level, and the presence of cardiovascular disease.
● We concluded that further study is needed to examine the efficacy of treatment plans involving dietary changes and/or increased activity levels to target weight and cholesterol.
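
To give a sense of the outlier sensitivity noted above, here is a minimal sketch in base R that flags influential observations with Cook's distance and refits without them. This is one common diagnostic and not necessarily the one used in our report; the data are simulated, the variable names are hypothetical, and the `4 / n` cutoff is only a rule of thumb.

```r
# Sketch only: simulated data with a few implausible blood pressure entries.
set.seed(2)
n <- 200
weight <- rnorm(n, mean = 75, sd = 12)                  # hypothetical predictor
systolic <- 100 + 0.4 * weight + rnorm(n, sd = 8)       # hypothetical response
systolic[1:3] <- c(400, 500, 20)                        # injected outliers
dat <- data.frame(systolic, weight)

# Fit on all rows, then measure how much each row influences the fit.
fit_all <- lm(systolic ~ weight, data = dat)
cooks <- cooks.distance(fit_all)

# Rule-of-thumb cutoff (an assumption, not a fixed standard).
flagged <- cooks > 4 / n
clean <- dat[!flagged, ]

# Refit without the flagged rows and compare residual standard errors.
fit_clean <- lm(systolic ~ weight, data = clean)
sigma_all <- summary(fit_all)$sigma
sigma_clean <- summary(fit_clean)$sigma

cat("Rows flagged:", sum(flagged), "\n")
cat("Residual SE with outliers:", round(sigma_all, 2), "\n")
cat("Residual SE without flagged rows:", round(sigma_clean, 2), "\n")
```

A single pass may not catch every bad entry, because extreme values inflate the error estimate and can mask milder ones, so checking the diagnostics again after the refit is a reasonable precaution.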

Snippet of Report File:


Download Report File:


Contact

I also do freelance web development! If you need a website built, feel free to reach out.

[email protected]

[email protected]

Based in San Francisco, CA 94118