Stroke Prediction Analysis Using Healthcare Data
Coding Assignment
This project involved analyzing healthcare data to identify potential factors contributing to stroke risk. Using Python, Seaborn, and Matplotlib, I visualized relationships between key variables like age, BMI, glucose levels, and smoking status, aiming to uncover patterns that might help in early stroke prediction.
Google Colab
Problem Statement
Objective: Explore correlations between health indicators (e.g., age, glucose levels, smoking status) and stroke occurrence. Understanding these relationships can help inform early detection strategies.
Data Collection and Methodology
Data Source: The dataset includes demographic and health data, such as age, hypertension status, heart disease history, glucose levels, and stroke occurrence.
Process:
Imported and cleaned the dataset.
Used exploratory data analysis (EDA) techniques to investigate variable distributions and relationships.
Created visualizations to highlight trends and relationships between health indicators and stroke risk.
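The process above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the rows are made-up sample data, and the column names (`avg_glucose_level`, `bmi`, `smoking_status`, `stroke`, and so on) are assumptions about the schema.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumed, not taken from the project.
df = pd.DataFrame({
    "age": [67, 45, 80, 33, 59, 72],
    "hypertension": [0, 0, 1, 0, 1, 1],
    "heart_disease": [1, 0, 0, 0, 0, 1],
    "avg_glucose_level": [228.7, 95.1, 105.9, 88.0, 171.2, 190.4],
    "bmi": [36.6, None, 32.5, 24.1, None, 29.0],
    "smoking_status": ["formerly smoked", "never smoked", "never smoked",
                       "smokes", "smokes", "formerly smoked"],
    "stroke": [1, 0, 1, 0, 0, 1],
})

# Basic cleaning: drop exact duplicate rows and normalize category labels.
df = df.drop_duplicates().reset_index(drop=True)
df["smoking_status"] = df["smoking_status"].str.strip().str.lower()

# Quick EDA: missing values per column, numeric summary, overall stroke rate,
# and mean age / glucose split by stroke outcome.
missing_per_column = df.isna().sum()
summary = df.describe()
stroke_rate = df["stroke"].mean()
mean_by_stroke = df.groupby("stroke")[["age", "avg_glucose_level"]].mean()

print(missing_per_column)
print(mean_by_stroke)
```

With a real dataset, the inline DataFrame would be replaced by a file load such as `pd.read_csv(...)`.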
Key Analyses and Visualizations
Stroke Incidence by Smoking Status:
Visualization: A bar plot showing average glucose levels by smoking status, grouped by stroke occurrence.
Insight: Shows whether average glucose levels differ by smoking status among individuals with and without stroke, pointing to potential risk factors.
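A grouped bar plot like this one might be produced as in the sketch below. The data is invented for illustration, and the column names and the `stroke_label` helper column are assumptions rather than the project's actual code.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical sample data; column names are assumed.
df = pd.DataFrame({
    "smoking_status": ["never smoked", "never smoked", "smokes", "smokes",
                       "formerly smoked", "formerly smoked", "never smoked", "smokes"],
    "stroke": [0, 1, 0, 1, 0, 1, 0, 1],
    "avg_glucose_level": [90.0, 130.0, 100.0, 180.0, 95.0, 150.0, 110.0, 200.0],
})

# Readable legend labels instead of 0/1 (helper column, hypothetical).
df["stroke_label"] = df["stroke"].map({0: "no stroke", 1: "stroke"})

# The same numbers the bars display: mean glucose per (smoking status, stroke) group.
mean_glucose = df.groupby(["smoking_status", "stroke"])["avg_glucose_level"].mean()

# sns.barplot plots the mean of y for each x/hue combination by default.
fig, ax = plt.subplots(figsize=(7, 4))
sns.barplot(data=df, x="smoking_status", y="avg_glucose_level",
            hue="stroke_label", ax=ax)
ax.set_xlabel("Smoking status")
ax.set_ylabel("Average glucose level")
ax.set_title("Average glucose level by smoking status and stroke occurrence")
fig.tight_layout()
```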
Age and Stroke Distribution:
Visualization: A distribution plot showing the age range of individuals with and without stroke.
Insight: Identifies the age groups at higher risk, providing clues for targeted stroke prevention.
BMI and Stroke Risk:
Visualization: Scatter plots and regression analyses between BMI and glucose levels.
Insight: Shows how BMI relates to glucose levels, a relationship that may bear on stroke risk and suggest additional health interventions.
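A scatter plot with a fitted regression line could look like the sketch below. The BMI and glucose values are synthetic, generated with an assumed linear relationship purely for demonstration, and the column names are assumptions.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic data: glucose is built to rise with BMI, plus noise.
# The coefficients are arbitrary and do not reflect the project's findings.
rng = np.random.default_rng(1)
bmi = rng.normal(28, 5, 200)
glucose = 70 + 1.5 * bmi + rng.normal(0, 15, 200)
df = pd.DataFrame({"bmi": bmi, "avg_glucose_level": glucose})

# Pearson correlation as a one-number summary of the linear relationship.
r = df["bmi"].corr(df["avg_glucose_level"])

# regplot draws the scatter points and a fitted linear regression line.
fig, ax = plt.subplots(figsize=(6, 4))
sns.regplot(data=df, x="bmi", y="avg_glucose_level",
            scatter_kws={"alpha": 0.5}, ax=ax)
ax.set_xlabel("BMI")
ax.set_ylabel("Average glucose level")
ax.set_title("BMI vs. average glucose level")
fig.tight_layout()
```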
Heart Disease and Stroke:
Visualization: Count plot of individuals with heart disease status, separated by stroke occurrence.
Insight: This plot examines the overlap between heart disease and stroke, highlighting potential comorbidities.
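A count plot of this kind, together with the cross-tabulation behind it, might be sketched as follows. The ten rows are invented, and the column names and label mappings are assumptions.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical sample data; 1 = condition present, 0 = absent.
df = pd.DataFrame({
    "heart_disease": [0, 0, 0, 0, 1, 1, 1, 0, 1, 0],
    "stroke":        [0, 0, 1, 0, 1, 0, 1, 0, 1, 0],
})

# Cross-tabulation: rows are heart disease status, columns are stroke status.
counts = pd.crosstab(df["heart_disease"], df["stroke"])
print(counts)

# Readable labels for the plot (helper columns, hypothetical).
df["heart_disease_label"] = df["heart_disease"].map({0: "no heart disease", 1: "heart disease"})
df["stroke_label"] = df["stroke"].map({0: "no stroke", 1: "stroke"})

# countplot shows the number of rows in each x/hue combination.
fig, ax = plt.subplots(figsize=(6, 4))
sns.countplot(data=df, x="heart_disease_label", hue="stroke_label", ax=ax)
ax.set_xlabel("Heart disease status")
ax.set_ylabel("Number of individuals")
fig.tight_layout()
```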
Technical Challenges and Solutions
Handling Missing Data: Addressed missing values for variables like BMI.
Visualization Complexity: Used Seaborn and Matplotlib to layer data efficiently and reveal patterns.
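As an illustration of the missing-data step, the sketch below fills missing BMI values with the column median. The write-up does not say which strategy was used, so median imputation is an assumption here, as are the column names and the `bmi_was_missing` flag.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with two missing BMI values.
df = pd.DataFrame({
    "age": [67, 45, 80, 33, 59, 72],
    "bmi": [36.6, np.nan, 32.5, 24.1, np.nan, 29.0],
})

# Count missing values before deciding how to handle them.
n_missing = int(df["bmi"].isna().sum())

# Assumed strategy: median imputation, which is less sensitive to outliers
# than the mean. Keep a flag so imputed rows stay identifiable later.
bmi_median = df["bmi"].median()
df["bmi_was_missing"] = df["bmi"].isna()
df["bmi"] = df["bmi"].fillna(bmi_median)

print(f"Filled {n_missing} missing BMI values with median {bmi_median:.2f}")
```

Dropping the affected rows with `df.dropna(subset=["bmi"])` is a simpler alternative when only a few values are missing.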
Tools and Technologies Used
Languages & Libraries: Python, with Seaborn and Matplotlib for data visualization.
Platform: Google Colab for data processing and visualization.
Learning Outcomes
Improved my ability to handle and visualize healthcare datasets.
Gained insights into using data analysis to identify health risk factors.





