Overview/Goal
This project predicts apartment prices and visualizes apartment listing data. It combines several datasets with geographic and demographic information to build predictive models and interactive visualizations. Models tested include a custom random forest, XGBoost, and H2O AutoML. The final dashboard is built in JavaScript using D3.js.
Data Sources
- Apt Listing Data: UCI Apartment Listing Dataset
- GIS Zip Data: ArcGIS Zip Data
- IRS Zip Data: IRS Individual Income Tax Statistics
- USA GeoJSON Data: US GeoJSON Data
- US States GeoJSON Data: State-zip-code-GeoJSON
Key Files
- dataJoining.py: Joins datasets, generates zip codes, and integrates IRS data.
- dataFeatureEng.py: Cleans data and creates dummy variables for analysis.
- modelCreation.ipynb: Builds predictive models including random forest and XGBoost.
- aptVis.html: Interactive visualization tool using D3.js for predicting apartment prices.
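The joining and dummy-variable steps described for dataJoining.py and dataFeatureEng.py can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in those scripts: the column names (zip, pets_allowed, avg_agi), the sample rows, and the helper functions left_join_on_zip and add_dummies are all hypothetical, and the actual scripts may well rely on a library such as pandas.

```python
# Hypothetical listing rows; the real dataset's column names may differ.
listings = [
    {"id": 1, "zip": "30309", "pets_allowed": "Cats", "price": 1850},
    {"id": 2, "zip": "30318", "pets_allowed": "None", "price": 1400},
    {"id": 3, "zip": "99999", "pets_allowed": "Cats,Dogs", "price": 2100},
]

# Hypothetical IRS summary keyed by zip code (made-up values).
irs_by_zip = {
    "30309": {"avg_agi": 98000},
    "30318": {"avg_agi": 61000},
}


def left_join_on_zip(rows, lookup):
    """Attach lookup fields to each row; unmatched zip codes get None."""
    fields = sorted({key for record in lookup.values() for key in record})
    joined = []
    for row in rows:
        match = lookup.get(row["zip"], {})
        joined.append({**row, **{field: match.get(field) for field in fields}})
    return joined


def add_dummies(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    encoded_rows = []
    for row in rows:
        encoded = {key: value for key, value in row.items() if key != column}
        for category in categories:
            encoded[f"{column}_{category}"] = int(row[column] == category)
        encoded_rows.append(encoded)
    return encoded_rows


joined = left_join_on_zip(listings, irs_by_zip)
features = add_dummies(joined, "pets_allowed")
```

A left join keeps listings whose zip code has no IRS match, so the modeling step can decide later whether to drop or impute them.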
How to Use
- Clone the repository and set up the required dependencies.
- Run dataJoining.py to combine datasets and generate zip codes.
- Execute dataFeatureEng.py to clean data and create dummy variables.
- Open and run modelCreation.ipynb to build predictive models and save predicted prices.
- Launch aptVis.html in a web browser to interact with the visualization tool.
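The steps above could be scripted in order, for example with a small runner like the one below. It is only a sketch under several assumptions: that the scripts take no command-line arguments and are run from the repository root, that Jupyter with nbconvert is installed for executing the notebook headlessly, and that serving the folder over a local HTTP server is useful for aptVis.html (browsers often block pages opened from file:// from loading local data files). The function names are hypothetical.

```python
import subprocess
import sys


def pipeline_commands(python=sys.executable):
    """Return the pipeline steps as argument lists, in the order they should run."""
    return [
        [python, "dataJoining.py"],
        [python, "dataFeatureEng.py"],
        # Assumes Jupyter with nbconvert is installed; runs the notebook in place.
        ["jupyter", "nbconvert", "--to", "notebook", "--execute",
         "--inplace", "modelCreation.ipynb"],
    ]


def run_pipeline(commands=None):
    """Run each step in order, stopping at the first step that fails."""
    for cmd in commands or pipeline_commands():
        subprocess.run(cmd, check=True)


def serve_command(port=8000, python=sys.executable):
    """Command that serves the current folder so aptVis.html can be opened over HTTP."""
    return [python, "-m", "http.server", str(port)]


# Print the planned commands without executing anything.
for cmd in pipeline_commands(python="python"):
    print(" ".join(cmd))
```

Calling run_pipeline() executes the three steps. Afterwards, running the command returned by serve_command() and opening http://localhost:8000/aptVis.html is one way to view the dashboard.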