Stroke Prediction Analysis Using Healthcare Data

Coding Assignment

This project involved analyzing healthcare data to identify potential factors contributing to stroke risk. Using Python, Seaborn, and Matplotlib, I visualized relationships between key variables like age, BMI, glucose levels, and smoking status, aiming to uncover patterns that might help in early stroke prediction.

Google Colab

Problem Statement

Objective: Explore correlations between health indicators (e.g., age, glucose levels, smoking status) and stroke occurrence. Understanding these relationships can inform early detection strategies.

Data Collection and Methodology

Data Source: The dataset includes demographic and health data, such as age, hypertension status, heart disease history, glucose levels, and stroke occurrence.
Process:
Imported and cleaned the dataset.
Used exploratory data analysis (EDA) techniques to investigate variable distributions and relationships.
Created visualizations to highlight trends and relationships between health indicators and stroke risk.
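
The import and cleaning step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names are assumptions, and the inline CSV holds made-up placeholder rows standing in for the real file.

```python
import io

import pandas as pd

# Placeholder rows with ASSUMED column names; the real dataset may differ.
SAMPLE_CSV = """Age,hypertension,heart_disease,avg_glucose_level,bmi,smoking_status,stroke
70,0,1,210.5,31.2,formerly smoked,1
58,1,0,180.0,,never smoked,1
44,0,0,92.3,26.8,smokes,0
44,0,0,92.3,26.8,smokes,0
"""


def load_and_clean(source) -> pd.DataFrame:
    """Read the CSV, normalise column names, and drop exact duplicate rows."""
    df = pd.read_csv(source)
    df.columns = df.columns.str.strip().str.lower()
    return df.drop_duplicates().reset_index(drop=True)


df = load_and_clean(io.StringIO(SAMPLE_CSV))
print(df.dtypes)        # column types after loading
print(df.isna().sum())  # missing values per column
```

With a real file, a path or URL would be passed to `load_and_clean` in place of the `io.StringIO` object.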

Key Analyses and Visualizations

Stroke Incidence by Smoking Status:

Visualization: A bar plot showing average glucose levels by smoking status, grouped by stroke occurrence.
Insight: Shows whether stroke patients in a given smoking category tend to have higher average glucose levels, which would flag smoking and glucose together as potential risk factors.
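
A grouped bar plot of this kind might be produced along these lines. The column names (`smoking_status`, `avg_glucose_level`, `stroke`) and the toy values are assumptions for illustration only, not figures from the project.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy placeholder data with ASSUMED column names (not real measurements).
toy = pd.DataFrame({
    "smoking_status": ["smokes", "smokes", "smokes", "smokes",
                       "never smoked", "never smoked", "never smoked", "never smoked"],
    "avg_glucose_level": [100.0, 110.0, 140.0, 150.0, 90.0, 95.0, 120.0, 130.0],
    "stroke": [0, 0, 1, 1, 0, 0, 1, 1],
})


def plot_glucose_by_smoking(df: pd.DataFrame):
    """Bar plot of mean glucose per smoking category, split by stroke outcome."""
    labelled = df.assign(outcome=df["stroke"].map({0: "No stroke", 1: "Stroke"}))
    fig, ax = plt.subplots(figsize=(7, 4))
    # barplot aggregates with the mean by default
    sns.barplot(data=labelled, x="smoking_status", y="avg_glucose_level",
                hue="outcome", ax=ax)
    ax.set_xlabel("Smoking status")
    ax.set_ylabel("Average glucose level")
    ax.set_title("Average glucose level by smoking status and stroke outcome")
    return ax


ax = plot_glucose_by_smoking(toy)
```

In Colab the figure renders inline when the cell finishes; in a plain script, `plt.show()` would be called afterwards.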

Age and Stroke Distribution:

Visualization: A distribution plot showing the age range of individuals with and without stroke.
Insight: Identifies the age groups at higher risk, providing clues for targeted stroke prevention.

BMI and Stroke Risk:

Visualization: Scatter plots and regression analyses between BMI and glucose levels.
Insight: Shows how BMI relates to glucose levels; read alongside stroke occurrence, this relationship may suggest additional health interventions.
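
A scatter plot with a fitted regression line, plus a Pearson correlation as a numeric summary, could look like the sketch below. The `bmi` and `avg_glucose_level` column names and the toy values are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy placeholder data with ASSUMED column names (not real measurements).
toy = pd.DataFrame({
    "bmi": [22.0, 27.0, 31.0, 36.0, 29.0],
    "avg_glucose_level": [85.0, 110.0, 105.0, 150.0, 95.0],
})


def plot_bmi_vs_glucose(df: pd.DataFrame):
    """Scatter plot of BMI against glucose with a linear regression line."""
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.regplot(data=df, x="bmi", y="avg_glucose_level",
                scatter_kws={"alpha": 0.6}, ax=ax)
    ax.set_xlabel("BMI")
    ax.set_ylabel("Average glucose level")
    return ax


ax = plot_bmi_vs_glucose(toy)

# Pearson correlation; rows with a missing value in either column are skipped.
r = toy["bmi"].corr(toy["avg_glucose_level"])
print(f"Pearson r between BMI and glucose: {r:.2f}")
```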

Heart Disease and Stroke:

Visualization: Count plot of individuals with heart disease status, separated by stroke occurrence.
Insight: This plot examines the overlap between heart disease and stroke, highlighting potential comorbidities.
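
The count plot and the numbers behind it can be sketched as follows, assuming binary `heart_disease` and `stroke` columns (hypothetical names) and using toy values. A cross-tabulation is included because it gives the exact counts that the bars represent.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy placeholder data with ASSUMED column names (not real measurements).
toy = pd.DataFrame({
    "heart_disease": [0, 0, 0, 1, 1, 1],
    "stroke": [0, 0, 1, 0, 1, 1],
})


def plot_heart_disease_counts(df: pd.DataFrame):
    """Count individuals per heart disease status, split by stroke outcome."""
    labelled = df.assign(outcome=df["stroke"].map({0: "No stroke", 1: "Stroke"}))
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.countplot(data=labelled, x="heart_disease", hue="outcome", ax=ax)
    ax.set_xlabel("Heart disease (0 = no, 1 = yes)")
    ax.set_ylabel("Number of individuals")
    return ax


ax = plot_heart_disease_counts(toy)

# Same information as a table: rows are heart disease status, columns are stroke.
table = pd.crosstab(toy["heart_disease"], toy["stroke"])
print(table)
```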

Technical Challenges and Solutions

Handling Missing Data: Addressed missing values for variables like BMI.
Visualization Complexity: Used Seaborn on top of Matplotlib to layer several variables in a single plot, for example by grouping on stroke occurrence, so that patterns are easier to see.
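
The write-up does not say how the missing BMI values were handled. One common option is median imputation, sketched below purely as an assumption; the `bmi` column name and the toy values are also illustrative.

```python
import numpy as np
import pandas as pd

# Toy placeholder data with an ASSUMED column name (not real measurements).
toy = pd.DataFrame({
    "bmi": [22.0, np.nan, 30.0, 26.0],
    "stroke": [0, 1, 0, 1],
})


def impute_bmi_median(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with missing BMI values replaced by the column median.

    Median imputation is an assumed strategy here, not necessarily the one
    used in the project. A flag column records which rows were filled in.
    """
    out = df.copy()
    out["bmi_was_missing"] = out["bmi"].isna()
    out["bmi"] = out["bmi"].fillna(out["bmi"].median())
    return out


filled = impute_bmi_median(toy)
print(filled)
```

Keeping the `bmi_was_missing` flag makes it possible to check later whether the imputed rows behave differently from the observed ones.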

Tools and Technologies Used

Languages & Libraries: Python, with Seaborn and Matplotlib for data visualization.
Platform: Google Colab for data processing and visualization.

Learning Outcomes

Improved my ability to handle and visualize healthcare datasets.
Gained insights into using data analysis to identify health risk factors.

+972-54-994-5850

+1-202-643-9308
