🚲 Bikeshare Usage Analysis – Washington D.C.

Tools Used: Python, pandas, seaborn, matplotlib, statsmodels

🧠 Objective

The goal of this project is to analyze bikeshare ridership patterns by distinguishing between two key user groups: casual riders and registered riders. Using real-world data and exploratory visualizations, the analysis investigates:

📊 Dataset Overview

The dataset includes daily ride counts segmented by user type (casual vs registered), as well as temporal and contextual variables such as:

Before analysis, the data was cleaned and filtered to enable conditional comparisons and plotting.
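As a rough illustration of that cleaning and filtering step, the sketch below uses a tiny hand-made table. The column names (`casual`, `registered`, `workingday`) and the values are assumptions for the example, not the project's actual schema or data.

```python
import pandas as pd

# Hypothetical toy data; column names and values are assumptions for illustration.
bike = pd.DataFrame({
    "casual": [120, 35, None, 410, 60],
    "registered": [900, 1500, 1300, 700, 1600],
    "workingday": [0, 1, 1, 0, 1],
})

# Drop rows with a missing count, then store the casual counts as integers.
clean = bike.dropna(subset=["casual", "registered"]).astype({"casual": int})

# Boolean masks give the conditional subsets that later plots compare.
working = clean[clean["workingday"] == 1]
non_working = clean[clean["workingday"] == 0]

print(len(clean), len(working), len(non_working))
```

Keeping the subsets as separate DataFrames makes it easy to pass either one straight to a plotting call.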

📈 Methods and Visualizations

1. Scatterplot of Casual vs Registered Riders

A scatterplot compares casual and registered ride volumes across the full dataset. The trend is positive but non-linear: periods with more casual rides tend to have more registered rides as well, though the two groups operate at different scales.

Scatterplot Casual vs Registered
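A plot of this kind could be drawn roughly as follows. The data here is synthetic, and the column names (`casual`, `registered`, `day_type`) are assumptions; the project's own notebook may use different names and styling.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the real ride counts).
registered = rng.integers(500, 5000, size=200)
casual = (0.05 * registered ** 1.1 + rng.normal(0, 50, size=200)).clip(min=0)
bike = pd.DataFrame({
    "casual": casual,
    "registered": registered,
    "day_type": rng.choice(["Working day", "Non-working day"], size=200),
})

# One point per observation, colored by day type.
ax = sns.scatterplot(data=bike, x="casual", y="registered", hue="day_type", alpha=0.6)
ax.set(xlabel="Casual rides", ylabel="Registered rides",
       title="Casual vs registered rides (synthetic data)")
ax.figure.savefig("scatter_casual_registered.png")
```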

2. LOWESS Smoothing

Locally Weighted Scatterplot Smoothing (LOWESS) was applied to reveal underlying trends in noisy data. Because LOWESS fits many small local regressions rather than one global curve, it isolates the general pattern without assuming a specific parametric model. The relationship between temperature and casual ridership was most evident on weekends.

LOWESS Curves
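A minimal LOWESS sketch using statsmodels is shown below. The synthetic temperature and ridership values, and the choice of `frac=0.3`, are assumptions made for the example; the smoothing span used in the project is not stated.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(1)

# Synthetic data: normalized temperature in [0, 1] and a noisy casual-ride count.
temp = rng.uniform(0, 1, size=300)
casual = 800 * temp ** 2 + rng.normal(0, 60, size=300)

# lowess takes the response first, then the predictor.
# frac is the share of points used in each local fit (assumed value).
smoothed = lowess(casual, temp, frac=0.3)

# The result is a two-column array sorted by the predictor: [x, fitted y].
x_fit, y_fit = smoothed[:, 0], smoothed[:, 1]
print(smoothed.shape)
```

A smaller `frac` follows the data more closely, while a larger one gives a smoother curve.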

3. KDE Density Plot (Working Days Only)

Kernel Density Estimation (KDE) visualized the joint distribution of casual and registered riders on working days. The contours represent areas of higher density, indicating where the counts of casual and registered riders are most concentrated. The marginal distributions on the top and right show the individual density of casual and registered rider counts, respectively.

KDE Contour Working Day

4. Overlay KDE Comparison (Workday vs Non-Workday)

Overlaid KDE plots show how rider dynamics shift between working and non-working days. Registered ridership dominates on working days, while casual ridership peaks on non-working days.

Overlay KDE Legend

5. Temperature Effects

Linear regression plots explore how normalized temperature affects the proportion of casual riders across different days of the week. Warmer temperatures are strongly associated with a higher share of casual ridership, especially on weekends.

Linear Regression Temp
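The sketch below fits one straight line per day type and compares the slopes. It groups days into weekend and weekday for brevity rather than fitting each day of the week, and the data, the `prop_casual` column, and the built-in slopes are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(4)
n = 300

# Synthetic data: the weekend slope is deliberately steeper (an assumption).
temp = rng.uniform(0, 1, size=n)
is_weekend = rng.integers(0, 2, size=n).astype(bool)
true_slope = np.where(is_weekend, 0.30, 0.10)
bike = pd.DataFrame({
    "temp": temp,
    "prop_casual": 0.05 + true_slope * temp + rng.normal(0, 0.02, size=n),
    "day_type": np.where(is_weekend, "weekend", "weekday"),
})

# Least-squares line per group; polyfit returns [slope, intercept] for deg=1.
slopes = {}
for day_type, group in bike.groupby("day_type"):
    slope, intercept = np.polyfit(group["temp"], group["prop_casual"], deg=1)
    slopes[day_type] = slope
print(slopes)

# The same fits drawn as regression lines, one color per day type.
g = sns.lmplot(data=bike, x="temp", y="prop_casual", hue="day_type", ci=None)
g.savefig("regression_temp.png")
```

Comparing the fitted slopes gives a numeric counterpart to the visual impression that the lines diverge as temperature rises.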

6. Hourly Usage Patterns

This line plot compares average hourly ride counts for the two groups. Registered riders peak during the morning and evening commute, while casual riders are most active midday.

Hourly Pattern Line Chart
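Hourly averages like these can be computed with a pandas `groupby`, as in the following sketch. The data is synthetic, with commute and midday peaks built in by assumption, and the `hr` column name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)

# 30 synthetic days of hourly records; peak shapes are assumptions.
hours = np.tile(np.arange(24), 30)
commute = np.exp(-((hours - 8) ** 2) / 4) + np.exp(-((hours - 17) ** 2) / 4)
midday = np.exp(-((hours - 14) ** 2) / 18)
bike = pd.DataFrame({
    "hr": hours,
    "registered": 300 * commute + rng.normal(0, 10, size=hours.size),
    "casual": 80 * midday + rng.normal(0, 5, size=hours.size),
})

# Average rides per hour of day for each rider type.
hourly = bike.groupby("hr")[["casual", "registered"]].mean()

# One line per rider type across the 24 hours.
ax = hourly.plot(kind="line")
ax.set(xlabel="Hour of day", ylabel="Average rides")
ax.figure.savefig("hourly_pattern.png")
```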

🔍 Key Insights

📁 Conclusion

This bikeshare ridership analysis provides a visual, data-driven lens into how different types of users engage with urban transit systems. These findings could inform: