Titanic Survival Analysis (NumPy + Matplotlib)

📌 Project Overview

This project analyzes the Titanic passenger dataset using NumPy and Matplotlib.
It covers data preprocessing, feature engineering, statistical analysis, and visualization to uncover survival patterns.


⚙️ Data Processing Steps

1️⃣ Data Loading & Cleaning

  • Merged Name columns
  • Handled missing values (Age, Fare → mean | Embarked → mode)
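
The imputation step can be sketched in plain NumPy as follows. This is a minimal illustration, not necessarily the project's actual code: the toy arrays, the helper names `fill_with_mean` and `fill_with_mode`, and the use of an empty string to mark a missing `Embarked` value are all assumptions.

```python
import numpy as np

# Toy columns standing in for the real dataset (illustrative values only).
age = np.array([22.0, np.nan, 26.0, 35.0, np.nan])
fare = np.array([7.25, 71.28, np.nan, 53.10, 8.05])
embarked = np.array(["S", "C", "", "S", "Q"])  # assumption: "" marks a missing entry


def fill_with_mean(column):
    """Return a copy with NaN entries replaced by the mean of the non-NaN values."""
    filled = column.copy()
    filled[np.isnan(filled)] = np.nanmean(column)
    return filled


def fill_with_mode(column, missing=""):
    """Return a copy with missing entries replaced by the most frequent value."""
    present = column[column != missing]
    values, counts = np.unique(present, return_counts=True)
    filled = column.copy()
    filled[column == missing] = values[np.argmax(counts)]
    return filled


age_filled = fill_with_mean(age)
fare_filled = fill_with_mean(fare)
embarked_filled = fill_with_mode(embarked)
```

Using `np.nanmean` keeps the missing entries from contaminating the mean that is used to fill them.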

2️⃣ Encoding

  • Sex encoded (female=0, male=1)
  • Embarked encoded (S=0, C=1, Q=2)
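
The mappings above can be applied with a small lookup helper. The sketch below is an assumption about how this might be done in NumPy; the function name `encode` and the sample arrays are hypothetical.

```python
import numpy as np

# Mappings taken from the list above.
SEX_CODES = {"female": 0, "male": 1}
EMBARKED_CODES = {"S": 0, "C": 1, "Q": 2}


def encode(column, codes):
    """Map each string label in `column` to its integer code."""
    return np.array([codes[value] for value in column], dtype=np.int64)


# Illustrative input, not real passenger records.
sex = np.array(["male", "female", "female", "male"])
embarked = np.array(["S", "C", "Q", "S"])

sex_encoded = encode(sex, SEX_CODES)
embarked_encoded = encode(embarked, EMBARKED_CODES)
```

A dictionary lookup raises `KeyError` on an unexpected label, which surfaces any value that was missed during cleaning.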

3️⃣ Feature Engineering

  • Dropped Name, Ticket, Cabin
  • Added FamilySize & IsAlone features
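
The README does not spell out how the two features are derived. A common convention, assumed here, is `FamilySize = SibSp + Parch + 1` (the passenger plus siblings/spouses and parents/children aboard) and `IsAlone = 1` when `FamilySize` equals 1. The arrays below are illustrative.

```python
import numpy as np

# Illustrative SibSp and Parch columns (not real passenger records).
sibsp = np.array([1, 0, 0, 3])
parch = np.array([0, 0, 2, 1])

# Assumption: family size counts the passenger as well, hence the +1.
family_size = sibsp + parch + 1

# Assumption: a passenger is "alone" when no relatives are aboard.
is_alone = (family_size == 1).astype(np.int64)
```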

4️⃣ Normalization

  • Applied Z-score scaling on Age & Fare
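
Z-score scaling subtracts the column mean and divides by the column standard deviation, so the result has mean 0 and standard deviation 1. A minimal sketch follows, assuming the population standard deviation (NumPy's default, `ddof=0`); the helper name `zscore` and the sample values are hypothetical.

```python
import numpy as np


def zscore(column):
    """Scale a 1-D array to zero mean and unit (population) standard deviation."""
    return (column - column.mean()) / column.std()


# Illustrative values, assumed to be already imputed (no NaN entries).
age = np.array([22.0, 38.0, 26.0, 35.0, 54.0])
fare = np.array([7.25, 71.28, 7.92, 53.10, 51.86])

age_scaled = zscore(age)
fare_scaled = zscore(fare)
```

The helper assumes the column is not constant; a constant column has zero standard deviation and would produce a division by zero.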

5️⃣ Statistical Analysis

  • Computed mean, median, std for key features
  • Calculated survival rates by gender & class
  • Computed the correlation matrix of numerical features
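
These statistics map directly onto NumPy calls. The sketch below uses small made-up arrays, and the helper name `survival_rate_by` is hypothetical; it is one way to compute the quantities listed above, not necessarily the project's own code.

```python
import numpy as np

# Illustrative encoded columns (not real passenger records).
survived = np.array([0, 1, 1, 1, 0, 0])
sex = np.array([1, 0, 0, 0, 1, 1])      # female=0, male=1
pclass = np.array([3, 1, 3, 1, 3, 2])
fare = np.array([7.25, 71.28, 7.92, 53.10, 8.05, 13.00])

# Descriptive statistics for one feature.
fare_summary = {
    "mean": float(fare.mean()),
    "median": float(np.median(fare)),
    "std": float(fare.std()),
}


def survival_rate_by(group, survived):
    """Return {group value: fraction of passengers in that group who survived}."""
    return {int(g): float(survived[group == g].mean()) for g in np.unique(group)}


rate_by_sex = survival_rate_by(sex, survived)
rate_by_class = survival_rate_by(pclass, survived)

# np.corrcoef treats each row as one variable, so stack the features as rows.
corr = np.corrcoef(np.vstack([survived, pclass, fare]))
```

Because `survived` is coded 0/1, the mean over a boolean mask is exactly the survival rate of that group.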

6️⃣ Visualizations

  • Survival Rate by Gender (bar chart)
  • Fare Distribution (histogram)
  • Correlation Heatmap
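
The three plots can be drafted with Matplotlib as shown below. The data is randomly generated purely so the script runs on its own; it says nothing about the real dataset. The layout, labels, and colormap are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view plots on screen
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (NOT real Titanic values).
rng = np.random.default_rng(0)
sex = rng.integers(0, 2, size=200)        # female=0, male=1
survived = rng.integers(0, 2, size=200)
fare = rng.exponential(scale=30.0, size=200)

rates = [survived[sex == 0].mean(), survived[sex == 1].mean()]
labels = ["Survived", "Sex", "Fare"]
corr = np.corrcoef(np.vstack([survived, sex, fare]))

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# 1. Survival rate by gender (bar chart).
axes[0].bar(["female", "male"], rates)
axes[0].set_ylabel("Survival rate")
axes[0].set_title("Survival Rate by Gender")

# 2. Fare distribution (histogram).
axes[1].hist(fare, bins=20)
axes[1].set_xlabel("Fare")
axes[1].set_title("Fare Distribution")

# 3. Correlation heatmap, with the color scale fixed to [-1, 1].
image = axes[2].imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
axes[2].set_xticks(range(len(labels)))
axes[2].set_xticklabels(labels)
axes[2].set_yticks(range(len(labels)))
axes[2].set_yticklabels(labels)
axes[2].set_title("Correlation Heatmap")
fig.colorbar(image, ax=axes[2])

fig.tight_layout()
```

Call `fig.savefig(...)` with a path of your choice to write the figure to disk, or `plt.show()` under an interactive backend.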

7️⃣ Train/Test Split

  • Random shuffle
  • 80% training, 20% testing
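
A shuffled 80/20 split needs only a random permutation of the row indices. The following is a minimal sketch; the function name `train_test_split`, its parameters, and the fixed seed are assumptions made for reproducibility of the example.

```python
import numpy as np


def train_test_split(features, labels, test_fraction=0.2, seed=0):
    """Shuffle the rows, then hold out `test_fraction` of them for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(round(len(features) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], features[test_idx], labels[train_idx], labels[test_idx]


# Illustrative data: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

X_train, X_test, y_train, y_test = train_test_split(X, y)
```

Indexing both arrays with the same permutation keeps each feature row aligned with its label.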