Case study: Forecasting / Time series / Evaluation

Climate CO2 Forecasting ML

Atmospheric CO2 forecasting needs leakage-safe time-series evaluation rather than a visually impressive model trained on future information.

Figure: CO2 Forecast Lab dashboard showing model comparison and atmospheric trend.
Project type: Data and ML
Core stack: Python, pandas, statsmodels
Delivery: Case study

Case Study

The problem, implementation decisions, measured evidence, and next improvements.

Overview

An end-to-end forecasting and anomaly-analysis system with reproducible data preparation, model comparison, API serving, and dashboard evidence.

Problem

The weekly Mauna Loa CO2 series contains missing values, long-term trend, and strong annual seasonality. A credible project must compare simple and complex models on the same chronological holdout.

Solution

Built a reproducible pipeline covering data validation, monthly resampling, chronological splits, feature engineering, training of baseline, statistical, scikit-learn, and PyTorch models, shared metrics, residual analysis, anomaly signals, FastAPI inference, and a React dashboard.
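The monthly resampling step can be illustrated with a minimal pandas sketch. It is not the project's code: the weekly series is synthetic, the helper name `to_monthly` is hypothetical, and averaging per month before interpolating is an assumed order of operations.

```python
import numpy as np
import pandas as pd

# Hypothetical weekly series standing in for the real data:
# a linear trend plus an annual cycle, sampled every Saturday.
weeks = pd.date_range("2000-01-01", periods=104, freq="W-SAT")
t = np.arange(len(weeks))
values = 370.0 + 0.03 * t + 3.0 * np.sin(2 * np.pi * t / 52.0)
weekly = pd.Series(values, index=weeks, name="co2")
weekly.iloc[13:18] = np.nan  # simulated gap: every reading in April 2000 is missing


def to_monthly(series: pd.Series) -> pd.Series:
    """Average weekly readings per calendar month, then fill empty months."""
    monthly = series.resample("MS").mean()      # month-start bins; NaN weeks are ignored
    return monthly.interpolate(method="time")   # fills months that had no valid reading


monthly = to_monthly(weekly)
print(len(weekly), "weekly rows ->", len(monthly), "monthly observations")
```

Interpolating only after aggregation keeps the number of invented values small: a month with at least one valid week needs no filling at all.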

Outcome

Exponential Smoothing and SARIMA clearly outperformed the more complex machine-learning models. The project reports that result directly and keeps anomaly detection framed as exploratory rather than verified climate-event detection.
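One way to produce exploratory anomaly signals from forecast residuals is a robust z-score. The source does not say which method the project uses, so the sketch below is an assumption throughout: the function name `flag_residual_anomalies`, the median/MAD scaling, and the 3.5 threshold are all illustrative choices.

```python
import statistics
from typing import List, Sequence


def flag_residual_anomalies(
    actual: Sequence[float],
    predicted: Sequence[float],
    threshold: float = 3.5,  # assumed cut-off, not taken from the project
) -> List[int]:
    """Return indices whose residual is unusually far from the median residual."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    center = statistics.median(residuals)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = statistics.median([abs(r - center) for r in residuals])
    if mad == 0:
        return []  # no spread to scale by, so nothing can be flagged
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    scores = [0.6745 * (r - center) / mad for r in residuals]
    return [i for i, score in enumerate(scores) if abs(score) > threshold]


# Toy data: one observation sits far from an otherwise accurate forecast.
actual = [10.0, 10.2, 9.9, 10.1, 13.0, 10.0, 9.8]
predicted = [10.0] * 7
print(flag_residual_anomalies(actual, predicted))
```

A flag produced this way only says that the model was surprised at that point. It does not identify a cause, which is why such signals are best treated as exploratory.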

What It Proves

Time-series methodology, leakage prevention, honest model evaluation, statistical forecasting, PyTorch experimentation, FastAPI, React, and reproducibility.

Key Features

  • Chronological train, validation, and test splits prevent future leakage.
  • All models share the same rolling one-step evaluation and residual analysis.
  • Generated artifacts feed both the FastAPI service and React dashboard.
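The first two features can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the project's implementation: the function names, the holdout sizes, and the seasonal-naive baseline are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], val_size: int, test_size: int
) -> Tuple[List[float], List[float], List[float]]:
    """Cut an ordered series into train/validation/test without shuffling."""
    if val_size < 0 or test_size <= 0 or val_size + test_size >= len(series):
        raise ValueError("holdout sizes must leave a non-empty training set")
    values = list(series)
    n_train = len(values) - val_size - test_size
    return (
        values[:n_train],
        values[n_train:n_train + val_size],
        values[n_train + val_size:],
    )


def seasonal_naive(history: Sequence[float], season: int = 12) -> float:
    """Baseline forecast: repeat the value observed one season earlier."""
    return history[-season] if len(history) >= season else history[-1]


def rolling_one_step(
    history: Sequence[float],
    holdout: Sequence[float],
    forecast: Callable[[Sequence[float]], float],
) -> List[float]:
    """Predict each holdout point from strictly earlier observations only."""
    seen = list(history)
    predictions = []
    for actual in holdout:
        predictions.append(forecast(seen))  # forecast is made before `actual` is revealed
        seen.append(actual)                 # only then does the true value join the history
    return predictions


# Toy monthly series: a purely periodic 12-month pattern repeated four times.
pattern = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0, -2.0, -1.0]
series = pattern * 4
train, val, test = chronological_split(series, val_size=6, test_size=6)
preds = rolling_one_step(train + val, test, seasonal_naive)
print(len(train), len(val), len(test))
print(preds == test)
```

Passing every model through the same `rolling_one_step` loop is what makes the comparison fair: each forecast sees exactly the same past and none of the future.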

Architecture

  1. statsmodels CO2 dataset
  2. Validation and resampling
  3. Chronological split
  4. Forecasting models
  5. Shared evaluation
  6. Anomaly analysis
  7. API and dashboard

Tech Stack

  • Python
  • pandas
  • statsmodels
  • scikit-learn
  • PyTorch
  • FastAPI
  • React
  • Recharts
  • Docker

Challenges & Trade-offs

  • 2,284 weekly rows become 526 monthly observations after resampling and interpolation.
  • Exponential Smoothing achieved MAE 0.237 and RMSE 0.298 on rolling one-step evaluation.
  • The LSTM was trained for only two epochs in a fast debug run and underperformed. This shows that complexity is not automatically better, but it is not a fair test of a fully tuned LSTM.
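For reference, the two reported metrics can be computed as in the sketch below. The helper names and toy numbers are illustrative and do not reproduce the project's results. RMSE squares the errors before averaging, so it is never smaller than MAE and reacts more strongly to a few large misses.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    """Guard against empty or misaligned inputs."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: the average size of the forecast errors."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: penalizes large errors more than MAE does."""
    _check(actual, predicted)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy one-step forecasts with a single large miss.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.0]
print(f"MAE={mae(actual, predicted):.3f} RMSE={rmse(actual, predicted):.3f}")
```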

Future Improvements

  • Run a fully tuned LSTM experiment separately from the fast debug pipeline.
  • Add probabilistic forecast intervals and backtesting windows.
  • Deploy the API and dashboard together for a public interactive demo.

Repository README, generated metrics, and public GitHub presentation reviewed June 13, 2026.