Overview
This project automates scraping Lexus GX 460 listings from Cars.com with Python scripts, then cleans and analyzes the scraped data for insights. It includes scripts for scraping and cleaning, plus a notebook for analysis.
Files
- carsDotComScrape.py: Python script for scraping data related to the Lexus GX 460 model from Cars.com.
- carsFileCleaning.py: Python script for cleaning the scraped data, including removing irrelevant information and formatting data.
- GX460_scrape.py: Script combining scraping and cleaning operations specific to Lexus GX 460.
- carsVis.ipynb: Jupyter Notebook for data visualization and analysis using matplotlib, seaborn, and scikit-learn.
- output_DATE.csv: Raw scraped data from Cars.com listings.
- output_DATE_cleaned.csv: Cleaned and formatted data ready for analysis.
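To make the relationship between the raw and cleaned files concrete, here is a minimal sketch of what a cleaning pass could look like. It is not the code in carsFileCleaning.py: the column names (`price`, `mileage`), the helper names, and the sample rows are all assumptions made up for illustration, and it assumes whole-dollar prices.

```python
import re
from pathlib import Path


def cleaned_path(raw_path: str) -> str:
    """Map a raw file name such as output_DATE.csv to output_DATE_cleaned.csv."""
    p = Path(raw_path)
    return str(p.with_name(f"{p.stem}_cleaned{p.suffix}"))


def to_int(text):
    """Keep only the digits in text; return None when there are none."""
    digits = re.sub(r"\D", "", text or "")
    return int(digits) if digits else None


def clean_rows(rows):
    """Drop rows without a usable price and convert price/mileage to ints.

    The 'price' and 'mileage' keys are hypothetical column names.
    """
    cleaned = []
    for row in rows:
        price = to_int(row.get("price"))
        if price is None:
            continue  # e.g. a listing with no numeric price
        cleaned.append({**row, "price": price, "mileage": to_int(row.get("mileage"))})
    return cleaned


# Illustrative rows only, not real listings.
raw = [
    {"title": "2019 Lexus GX 460", "price": "$38,995", "mileage": "41,210 mi."},
    {"title": "2017 Lexus GX 460", "price": "Not Priced", "mileage": "60,118 mi."},
]
print(clean_rows(raw))
print(cleaned_path("output_DATE.csv"))
```

In practice the rows would come from `csv.DictReader` over the raw file and be written back out with `csv.DictWriter` to the path returned by `cleaned_path`.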
Usage
- Scraping and Cleaning Data:
  - Edit GX460_scrape.py to modify the URL with desired filters for scraping Lexus GX 460 data.
  - Run GX460_scrape.py to extract and clean data, storing it in output_DATE.csv.
- Visualizing Data:
  - Open and run carsVis.ipynb in Jupyter Notebook to visualize cleaned data and perform analysis.
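For the URL edit in the scraping step, one option is to build the search URL from a dictionary of filters instead of editing a long string by hand. The sketch below is an assumption-heavy illustration: the base URL, the parameter names, and the `build_search_url` helper are hypothetical and do not come from GX460_scrape.py, so copy the real query string from a Cars.com search in your browser before relying on it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed base URL; verify it against an actual search in the browser.
BASE_URL = "https://www.cars.com/shopping/results/"


def build_search_url(filters: dict) -> str:
    """Return BASE_URL with the given filters encoded as a query string."""
    return f"{BASE_URL}?{urlencode(filters, doseq=True)}"


# Hypothetical filter names and values, shown only to illustrate the pattern.
url = build_search_url({
    "makes[]": "lexus",
    "models[]": "lexus-gx_460",
    "zip": "60606",
    "maximum_distance": "100",
})
print(url)

# Round-trip check: the filters can be read back from the URL.
print(parse_qs(urlsplit(url).query))
```

Keeping the filters in a dictionary makes it easier to change one value (for example the search radius) without touching the rest of the URL.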