Property Price Prediction

A Jupyter Notebook that builds a regularised regression model to predict residential property sale prices in the Australian market, helping Surprise Housing, a US-based investment company, identify undervalued properties for strategic acquisition.

Overview

Surprise Housing, a US-based property investment firm, is entering the Australian real estate market. Its strategy is to buy properties below market value and sell them at a profit. This project builds a predictive regression model, using Lasso and Ridge regularisation, to estimate the market value of prospective properties and give investment decisions a data-driven basis.

Background

Predicting property prices is a regression problem with a high-dimensional feature space, multicollinearity between predictors, and a mix of categorical and continuous variables. Ordinary least-squares regression is prone to overfitting in such settings. Regularisation techniques, namely Ridge (L2) and Lasso (L1), penalise large coefficients to improve generalisation. Lasso additionally performs implicit feature selection by shrinking the coefficients of less informative features to exactly zero, which makes the fitted model easier to interpret when identifying the main drivers of property value.
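The difference between the two penalties can be seen on a small example. The sketch below is not taken from the notebook: it uses synthetic data (30 features, only three of which drive the target, plus one nearly collinear pair) and arbitrarily chosen `alpha` values, all of which are assumptions made for illustration. It counts how many coefficients each model sets to exactly zero.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for housing data (assumption, not the real dataset):
# 200 rows, 30 features, only features 0, 2 and 3 influence the target.
n_samples, n_features = 200, 30
X = rng.normal(size=(n_samples, n_features))
# Make feature 1 nearly collinear with feature 0 to mimic multicollinearity.
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n_samples)
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] - 1.5 * X[:, 3] + rng.normal(size=n_samples)

# Both penalties depend on coefficient size, so features are standardised first.
X_scaled = StandardScaler().fit_transform(X)

# The alpha values are illustrative, not tuned.
models = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=10.0),   # L2: penalises the sum of squared coefficients
    "lasso": Lasso(alpha=0.1),    # L1: penalises the sum of absolute coefficients
}

n_zero = {}
for name, model in models.items():
    model.fit(X_scaled, y)
    n_zero[name] = int(np.sum(model.coef_ == 0.0))
    print(f"{name:>5}: {n_zero[name]} of {n_features} coefficients are exactly zero")
```

Ridge shrinks every coefficient but keeps all of them in the model, whereas Lasso drops most of the uninformative features outright.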

Dataset

| File | Description |
| --- | --- |
| `train.csv` | Training data with 80+ property features and sale prices (Australian market) |
| `SubjectiveQuestions.pdf` | Written analysis of model choices, regularisation interpretation, and business insights |

Key features include structural attributes (floor area, year built), quality ratings (kitchen quality, overall condition), location (neighbourhood), and sale conditions.
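Because the features mix numeric and categorical columns, the categorical ones have to be encoded before fitting a linear model. The following is a minimal sketch of one common approach, not necessarily the notebook's: the rows are invented, the column names `KitchenQual`, `Neighborhood` and `SalePrice` and the category `Birchwood` are hypothetical, and the log transform of the target is an assumed way of handling skew.

```python
import numpy as np
import pandas as pd

# Tiny invented sample (assumption); the notebook reads train.csv instead.
df = pd.DataFrame({
    "YearBuilt": [1995, 2008, 1972, 2015],
    "1stFlrSF": [1100, 1450, 900, 1600],
    "KitchenQual": ["TA", "Gd", "TA", "Ex"],
    "Neighborhood": ["Crawford", "Birchwood", "Birchwood", "Crawford"],
    "SalePrice": [180000, 240000, 130000, 320000],
})

# Sale prices are typically right-skewed; log1p is one common remedy
# (assumed here, the notebook may use a different transform).
df["LogSalePrice"] = np.log1p(df["SalePrice"])

# One-hot encode the categorical columns. drop_first=True removes the
# alphabetically first level of each column, which becomes the reference
# level that the remaining dummy coefficients are measured against.
encoded = pd.get_dummies(
    df.drop(columns="SalePrice"),
    columns=["KitchenQual", "Neighborhood"],
    drop_first=True,
    dtype=int,
)
print(encoded.columns.tolist())
```

With this encoding, a column such as `KitchenQual_TA` is an indicator for one category, and its coefficient is read relative to the dropped level (`Ex` in this sample).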

Notebook Contents

| Section | Description |
| --- | --- |
| Data Loading & EDA | Loading data, distribution analysis, missing value treatment, outlier inspection |
| Feature Engineering | Encoding categorical variables, handling skewed distributions, creating derived features |
| Train/Validation Split | Splitting data for unbiased hyperparameter tuning |
| Linear Regression Baseline | Establishing an unregularised benchmark |
| Ridge Regression | L2 regularisation with cross-validated alpha selection |
| Lasso Regression | L1 regularisation with feature selection and alpha tuning |
| Model Evaluation | RMSE, R², residual analysis on validation and test sets |
| Feature Importance | Identifying the 12 most influential predictors of sale price |
| Business Insights | Translating model coefficients into actionable investment guidance |
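The modelling steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the notebook's code: the data comes from `make_regression` instead of `train.csv`, and the split ratio, alpha grid and fold count are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, LinearRegression, RidgeCV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data standing in for the encoded housing features.
X, y = make_regression(
    n_samples=300, n_features=40, n_informative=8, noise=15.0, random_state=42
)

# Hold out a validation set so the comparison is not made on training data.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Candidate regularisation strengths (illustrative grid).
alphas = np.logspace(-3, 3, 25)

models = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "ridge": make_pipeline(StandardScaler(), RidgeCV(alphas=alphas)),
    "lasso": make_pipeline(
        StandardScaler(), LassoCV(alphas=alphas, cv=5, max_iter=10000)
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)  # RidgeCV/LassoCV pick alpha by cross-validation
    pred = model.predict(X_val)
    rmse = float(np.sqrt(mean_squared_error(y_val, pred)))
    scores[name] = {"rmse": rmse, "r2": float(r2_score(y_val, pred))}
    print(f"{name:>6}: RMSE={rmse:.2f}  R2={scores[name]['r2']:.3f}")
```

Putting the scaler inside each pipeline means it is fitted on the training rows only, so the validation rows do not leak into preprocessing.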

Technologies Used

| Library | Version | Purpose |
| --- | --- | --- |
| `pandas` | 2.0.3 | Data manipulation |
| `numpy` | 1.24.3 | Numerical operations |
| `scikit-learn` | 1.3.0 | Regression models, regularisation, cross-validation |
| `matplotlib` | 3.7.2 | EDA and residual plots |
| `python` | 3.11.5 | Runtime environment |
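To compare a local environment against the versions listed above, a short check along these lines may help. The helper name `installed_versions` is hypothetical, and a mismatch does not necessarily mean the notebook will fail to run.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Versions as documented in the table above.
documented = {
    "pandas": "2.0.3",
    "numpy": "1.24.3",
    "scikit-learn": "1.3.0",
    "matplotlib": "3.7.2",
}


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


installed = installed_versions(documented)
print(f"python: {platform.python_version()} (documented: 3.11.5)")
for name, wanted in documented.items():
    status = "matches" if installed[name] == wanted else "differs"
    print(f"{name}: {installed[name]} (documented: {wanted}) {status}")
```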

Setup and Installation

git clone https://github.com/chetnapriyadarshini/PropertyPricePrediction.git
cd PropertyPricePrediction
pip install pandas numpy scikit-learn matplotlib notebook
jupyter notebook "Property Pricing.ipynb"

Results and Conclusions

The Lasso model identifies 12 variables as the primary predictors of property price. Key findings:

| Predictor | Direction | Interpretation |
| --- | --- | --- |
| `YearBuilt` | Positive | Newer properties command higher prices |
| `1stFlrSF` | Positive | Larger ground floor area increases value |
| `2ndFlrSF` | Positive | Multi-storey properties are valued higher |
| `KitchenQual_TA` (Typical) | Negative | A kitchen rated merely typical is associated with a lower predicted price |
| `Crawford` neighbourhood | Positive | Crawford is the highest-value neighbourhood in the dataset |
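A table like the one above can be derived from a fitted Lasso model by ranking its non-zero coefficients by magnitude. The sketch below shows one way to do that on synthetic data. The feature names, `alpha` value and data are assumptions, so its output does not reproduce the findings reported here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic data with placeholder feature names (assumption).
feature_names = [f"feature_{i}" for i in range(20)]
X = pd.DataFrame(rng.normal(size=(250, 20)), columns=feature_names)
y = (
    5.0 * X["feature_0"]
    - 4.0 * X["feature_1"]
    + 3.0 * X["feature_2"]
    + rng.normal(size=250)
)

# Standardise so coefficient magnitudes are comparable across features.
X_scaled = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.2).fit(X_scaled, y)  # alpha chosen for illustration

coefs = pd.Series(lasso.coef_, index=feature_names)
selected = coefs[coefs != 0]  # features the L1 penalty kept

# Rank the surviving features by absolute coefficient and keep at most 12.
order = selected.abs().sort_values(ascending=False).index
top = selected.reindex(order).head(12)

summary = pd.DataFrame({
    "coefficient": top,
    "direction": np.where(top > 0, "Positive", "Negative"),
})
print(summary)
```

The ranking is only meaningful when features share a scale, which is why the features are standardised before fitting.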

Investment implication: Properties in Crawford with large floor areas and a recent construction year represent the highest-value acquisition targets for Surprise Housing's Australian strategy.

Contact

Created by @chetnapriyadarshini — feel free to reach out with questions or suggestions.
