Alberta Wildfire Prediction System
A production-grade ConvLSTM deep learning system that predicts wildfire ignition likelihood across the Province of Alberta from geospatial and meteorological time-series data — deployed as a full TFX pipeline on GCP Vertex AI with 91% accuracy and AUC of 90%.
91%
Model Accuracy
AUC 90%
Predictive Power
Vertex AI
Production Deployment on GCP
The Problem
Wildfire ignition is spatially complex, temporally dynamic, and increasingly unpredictable under climate change
Alberta is one of Canada's most wildfire-prone provinces, with fire seasons growing longer, more intense, and more geographically widespread as climate change accelerates. Predicting where and when ignitions are likely to occur requires modeling the interaction of dozens of variables — vegetation type and dryness, topography, wind speed and direction, temperature, humidity, and historical fire patterns — across a province-wide spatial domain that spans hundreds of thousands of square kilometers. Traditional rule-based fire risk indices cannot capture the nonlinear spatial and temporal dynamics that drive ignition likelihood, and conventional ML approaches treat each location independently, missing the spatial propagation patterns that are critical for accurate province-wide prediction. The result is a chronic gap between what fire management agencies need — precise, forward-looking ignition risk maps — and what existing tools can reliably deliver.
The Solution
A ConvLSTM system that models wildfire ignition as a spatiotemporal prediction problem — deployed as a production TFX pipeline on GCP Vertex AI
The Alberta Wildfire Prediction System frames ignition likelihood as a spatiotemporal sequence prediction problem and addresses it with a ConvLSTM architecture that captures spatial patterns across the landscape and temporal dynamics across time in a single model. The model ingests geospatial and meteorological time-series inputs (weather variables, vegetation indices, topographic features, and historical fire data) and produces province-wide ignition likelihood predictions across Alberta. Trained and validated on historical wildfire records, the system achieved 91% accuracy, an AUC of 90%, and a recall of 87%. Accuracy and AUC reflect overall discriminative performance, while recall measures the share of true ignition events the system identifies. The full system was productionized as a TFX pipeline on GCP Vertex AI Pipelines, following the same production architecture as the Calgary Flood Prediction System, with automated retraining, model evaluation gating, and a Vertex AI serving endpoint for real-time inference.
Key Outcome
A province-wide wildfire ignition prediction system that delivers spatially explicit risk maps across Alberta at 91% accuracy — built on a ConvLSTM architecture that captures both spatial landscape patterns and temporal weather dynamics, and deployed as a fully automated production pipeline on GCP Vertex AI.
Technical Deep Dive
Architecture & Design
ConvLSTM Architecture & TFX Pipeline
ConvLSTM Architecture — Spatiotemporal Ignition Prediction
Input Layer 1
Weather Variables
Temp, humidity, wind speed & direction, precipitation
Input Layer 2
Vegetation Indices
NDVI, fuel moisture, dryness indicators from GEE
Input Layer 3
Topographic Features
Elevation, slope, aspect — geospatial rasters
Input Layer 4
Historical Fire Data
Past ignition locations, fire perimeters, burn scars
Core Architecture · ConvLSTM
Spatiotemporal Encoder
Convolutional filters capture spatial ignition patterns across the Alberta landscape · LSTM gates model temporal weather dynamics across time steps · Combined in a single ConvLSTM unit
Output
Province-Wide Ignition Likelihood Map
Per-grid-cell ignition probability across Alberta · Accuracy 91% · AUC 90% · Recall 87%
TFX Pipeline — GCP Vertex AI Pipelines
Stage 1 · ExampleGen
Data Ingestion
Ingests geospatial + meteorological time-series from GCS · Train/eval splits
Stage 2 · StatisticsGen
Statistics
Dataset stats & drift detection
Stage 2 · SchemaGen
Schema
Schema inference from training data
Stage 2 · ExampleValidator
Validation
Anomaly detection against schema
Stage 3 · Transform
Feature Engineering
Spatial normalization · Temporal sequence construction · Consistent train/serve preprocessing
Stage 4 · Tuner
Hyperparameter Tuning
Keras Tuner · Searches ConvLSTM filters, hidden units, learning rate, dropout · Best trial passed to Trainer
Stage 5 · Trainer
ConvLSTM Model Training
Spatiotemporal ignition model · Accuracy 91% · AUC 90% · Recall 87%
Stage 6 · Evaluator
Model Evaluation
TFMA · Accuracy, AUC, Recall thresholds · Blessing gate blocks underperforming models
Stage 6 · Resolver
Model Comparison
Champion vs. challenger · Auto-promotes on improvement
Stage 7 · Pusher
Model Serving — Vertex AI Endpoint
Blessed models pushed to Vertex AI · Real-time ignition likelihood inference endpoint
MLOps & CI/CD Layer
Pipeline Orchestration
Vertex AI Pipelines
Kubeflow DAG execution · Automated runs on new data · Full lineage tracking
Continuous Training
Automated Retraining
Triggered on new fire season data · Drift detection gates retraining
Model Registry
Vertex AI Model Registry
Versioned artifacts · Champion tracking · Rollback support
Metadata & Lineage
ML Metadata Store
Full artifact and execution logging · Reproducibility across runs
GCP Infrastructure
Model Inputs
Geospatial & Meteorological Features
The model ingests four input streams — meteorological variables (temperature, humidity, wind), vegetation indices including NDVI and fuel moisture sourced via Google Earth Engine, topographic rasters (elevation, slope, aspect), and historical fire records (ignition locations, perimeters, burn scars). All streams are spatially aligned to a province-wide grid covering Alberta before being passed to the ConvLSTM encoder.
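The alignment step described above can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing: the function name `stack_streams`, the 2x2 grid, and all values are hypothetical, and it assumes every stream has already been resampled to the same grid, with static layers such as elevation repeated at each time step.

```python
from typing import List

Grid = List[List[float]]  # one value per cell, indexed grid[row][col]


def stack_streams(dynamic: List[List[Grid]], static: List[Grid]) -> List[List[List[List[float]]]]:
    """Return a [time][row][col][channel] tensor.

    dynamic: one entry per time step, each a list of per-variable grids
             (for example temperature and NDVI).
    static:  at least one grid that does not change over time
             (for example elevation), repeated at every time step.
    """
    rows, cols = len(static[0]), len(static[0][0])
    tensor = []
    for step_grids in dynamic:
        channels = step_grids + static  # static layers are appended at every step
        for grid in channels:
            if len(grid) != rows or any(len(row) != cols for row in grid):
                raise ValueError("all streams must share the same grid shape")
        # Gather the value of every channel for each cell.
        frame = [[[grid[r][c] for grid in channels] for c in range(cols)]
                 for r in range(rows)]
        tensor.append(frame)
    return tensor


# Hypothetical 2x2 grid observed over two time steps.
temperature = [[[20.0, 21.0], [19.0, 22.0]], [[25.0, 26.0], [24.0, 27.0]]]
ndvi = [[[0.60, 0.50], [0.70, 0.40]], [[0.55, 0.45], [0.65, 0.35]]]
elevation = [[700.0, 720.0], [900.0, 950.0]]

dynamic = [[temperature[t], ndvi[t]] for t in range(2)]
tensor = stack_streams(dynamic, [elevation])
print(tensor[1][0][1])  # channels for time step 1, row 0, col 1
```

The resulting layout (time, row, column, channel) matches the sequence-of-frames input that a ConvLSTM encoder expects, with one channel per input variable.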
Core Architecture
ConvLSTM Spatiotemporal Encoder
The ConvLSTM architecture natively combines spatial and temporal modeling in a single unit — convolutional filters capture spatial ignition patterns across the Alberta landscape while LSTM gates model how weather and fuel conditions evolve over time. This unified encoding eliminates the need for a separate synchronization layer, as the architecture inherently handles spatiotemporal co-dependence across all input streams simultaneously.
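For reference, a commonly used formulation of a single ConvLSTM step is given below. The source does not state which variant the model uses, so this is an assumption: it is the form without peephole connections, where $*$ denotes convolution, $\odot$ the element-wise product, $X_t$ the input frame at step $t$, and $H_t$ and $C_t$ the hidden and cell states.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
\tilde{C}_t &= \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
H_t &= o_t \odot \tanh\left(C_t\right)
\end{aligned}
```

Because every gate is computed by convolution rather than by a dense matrix product, the states $H_t$ and $C_t$ keep the grid layout of the input, which is what lets the same unit carry both spatial and temporal structure.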
Data Ingestion & Validation
ExampleGen + Schema Validation
ExampleGen ingests geospatial and meteorological time-series from GCS and partitions the data into training and evaluation splits. StatisticsGen, SchemaGen, and ExampleValidator then compute statistics, infer the data schema, and detect anomalies, so that drift introduced by seasonal variation in weather data and fire records is flagged before it can silently degrade model performance.
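The idea behind schema inference and anomaly detection can be shown with a simplified stand-in. The sketch below is not the TFX or TensorFlow Data Validation API. The helper names, feature names, and values are hypothetical, and the "schema" is reduced to a per-feature value range learned from the training split.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]
Schema = Dict[str, Tuple[float, float]]


def infer_schema(train: List[Record]) -> Schema:
    """Infer a (min, max) range per feature from the training split."""
    schema = {}
    for name in train[0]:
        values = [record[name] for record in train]
        schema[name] = (min(values), max(values))
    return schema


def find_anomalies(batch: List[Record], schema: Schema) -> List[str]:
    """Report missing features and values outside the inferred range."""
    anomalies = []
    for i, record in enumerate(batch):
        for name, (lo, hi) in schema.items():
            if name not in record:
                anomalies.append(f"record {i}: missing feature '{name}'")
            elif not lo <= record[name] <= hi:
                anomalies.append(f"record {i}: {name}={record[name]} outside [{lo}, {hi}]")
    return anomalies


# Hypothetical training split and a new batch to validate.
train = [{"temp_c": 12.0, "humidity": 40.0}, {"temp_c": 31.0, "humidity": 15.0}]
new_batch = [
    {"temp_c": 25.0, "humidity": 30.0},  # within range
    {"temp_c": 58.0, "humidity": 20.0},  # temperature out of range
    {"temp_c": 20.0},                    # humidity missing
]

schema = infer_schema(train)
anomalies = find_anomalies(new_batch, schema)
for message in anomalies:
    print(message)
```

A production schema would also cover types, presence constraints, and distribution-level drift, but the control flow is the same: learn expectations from training data, then check each new batch against them before training proceeds.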
Feature Engineering
Transform — Spatial & Temporal Preprocessing
The Transform component applies spatial normalization across the province-wide grid and constructs temporal sequences from the multi-stream input data. The same transformation graph is used at both training and serving time, eliminating training-serving skew and ensuring the model receives identically preprocessed inputs whether it is training on historical fire seasons or serving real-time ignition predictions.
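A minimal sketch of the two operations named above, under stated assumptions: min-max scaling stands in for whatever normalization the pipeline applies, the window length and one-step-ahead label are illustrative choices, and all function names and values are hypothetical. The key point is that the statistics are fitted once on training data and then reused unchanged at serving time.

```python
from typing import List, Tuple


def fit_min_max(train_values: List[float]) -> Tuple[float, float]:
    """Compute normalization statistics from the training data only."""
    return min(train_values), max(train_values)


def normalize(value: float, stats: Tuple[float, float]) -> float:
    """Scale a value with previously fitted statistics."""
    lo, hi = stats
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


def make_sequences(features: List[float], labels: List[int],
                   window: int) -> List[Tuple[List[float], int]]:
    """Pair each run of `window` steps with the label of the following step."""
    return [(features[i:i + window], labels[i + window])
            for i in range(len(features) - window)]


# Hypothetical daily temperatures for one grid cell, with ignition labels.
train_temps = [10.0, 20.0, 30.0, 20.0, 10.0]
train_labels = [0, 0, 0, 1, 0]

stats = fit_min_max(train_temps)                     # fitted once, on training data
scaled = [normalize(v, stats) for v in train_temps]
sequences = make_sequences(scaled, train_labels, window=3)

# At serving time the same fitted statistics are reused.
serving_value = normalize(25.0, stats)
print(sequences)
print(serving_value)
```

Reusing `stats` at serving time is the property the Transform component guarantees by exporting one transformation graph for both paths.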
Tuning & Training
Tuner + Trainer — Optimized ConvLSTM
The Tuner component uses Keras Tuner to search across ConvLSTM filter counts, hidden units, learning rate, and dropout — passing the best trial directly to the Trainer. The Trainer then trains the full ConvLSTM model using the tuned configuration and transformed features, producing a model that achieved 91% accuracy, AUC of 90%, and recall of 87% on held-out Alberta fire season data.
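The search-then-train handoff can be illustrated with a small stand-in. This sketch does not use Keras Tuner: an exhaustive grid search replaces whatever search strategy the pipeline uses, the candidate values are hypothetical, and `toy_score` is a placeholder for the validation metric that a real trial would obtain by training a model.

```python
import itertools
from typing import Callable, Dict, List


def grid_search(space: Dict[str, List], evaluate: Callable[[Dict], float]) -> Dict:
    """Evaluate every combination in the space and return the best trial."""
    names = list(space)
    best_config, best_score = None, float("-inf")
    for combo in itertools.product(*(space[name] for name in names)):
        config = dict(zip(names, combo))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return {"config": best_config, "score": best_score}


def toy_score(config: Dict) -> float:
    """Placeholder objective; a real trial would train and validate a model."""
    return -(abs(config["filters"] - 32) / 64
             + abs(config["hidden_units"] - 128) / 256
             + abs(config["learning_rate"] - 1e-3) * 100
             + abs(config["dropout"] - 0.2))


# The four hyperparameters named in the text, with hypothetical candidates.
search_space = {
    "filters": [16, 32, 64],
    "hidden_units": [64, 128],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "dropout": [0.1, 0.2, 0.3],
}

best_trial = grid_search(search_space, toy_score)
print(best_trial["config"])  # this configuration would be handed to the Trainer
```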
Evaluation & Serving
Evaluator + Pusher — Gated Deployment
The Evaluator uses TFMA to enforce per-metric thresholds on accuracy, AUC, and recall, and only models that pass all three gates receive a blessing. The Resolver supplies the current production champion as the baseline for the champion-challenger comparison, and the Pusher deploys only blessed models that improve on that baseline to the Vertex AI serving endpoint, preventing silent degradation across fire seasons.
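The gating logic can be sketched as follows. This is an illustration of the decision rule rather than the TFMA configuration itself. The recall threshold of 0.87 comes from the text, while the accuracy and AUC thresholds, the choice of AUC as the comparison metric, and the champion's metrics are assumptions made for the example.

```python
from typing import Dict, Optional

# Recall threshold is stated in the text; the other two are hypothetical.
THRESHOLDS = {"accuracy": 0.90, "auc": 0.88, "recall": 0.87}


def is_blessed(metrics: Dict[str, float], thresholds: Dict[str, float]) -> bool:
    """A model is blessed only if it meets every metric threshold."""
    return all(metrics.get(name, 0.0) >= minimum
               for name, minimum in thresholds.items())


def should_push(candidate: Dict[str, float],
                champion: Optional[Dict[str, float]],
                thresholds: Dict[str, float],
                compare_on: str = "auc") -> bool:
    """Push only blessed candidates that beat the current champion."""
    if not is_blessed(candidate, thresholds):
        return False
    if champion is None:  # first deployment: nothing to compare against
        return True
    return candidate[compare_on] > champion[compare_on]


candidate = {"accuracy": 0.91, "auc": 0.90, "recall": 0.87}
champion = {"accuracy": 0.90, "auc": 0.89, "recall": 0.88}   # hypothetical
weak_recall = {"accuracy": 0.93, "auc": 0.92, "recall": 0.80}

print(should_push(candidate, champion, THRESHOLDS))
print(should_push(weak_recall, champion, THRESHOLDS))
```

The second candidate has higher accuracy and AUC than the champion but is still rejected, which is the point of gating on all three metrics rather than on a single headline score.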
Key Design Decisions
ConvLSTM natively models spatial propagation — not just point predictions
Many wildfire risk models treat each grid cell independently, missing the spatial propagation dynamics that drive ignition clustering across landscapes. ConvLSTM processes the entire spatial domain at each time step, so its convolutions capture how weather patterns, fuel conditions, and historical fire presence interact across neighboring cells. This is the core architectural choice that enables province-wide prediction rather than point-level risk scoring.
Recall optimized alongside accuracy to minimize missed ignitions
In wildfire prediction, a missed ignition (false negative) carries far greater operational cost than a false alarm. The evaluation gate enforces a recall threshold of 87% alongside accuracy and AUC — ensuring the system is tuned to identify true ignition events even under class imbalance, where non-ignition locations vastly outnumber actual fire events in the training data.
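A short worked example shows why accuracy alone is misleading under class imbalance. The counts below are hypothetical (10,000 grid cells, 100 of which ignite) and are not taken from the project's data.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all cells classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp: int, fn: int) -> float:
    """Share of true ignitions that the model identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical grid: 10,000 cells, 100 true ignitions.

# A trivial model that never predicts an ignition.
never_fire = {"tp": 0, "fp": 0, "tn": 9900, "fn": 100}

# A model that catches 87 ignitions at the cost of 500 false alarms.
tuned = {"tp": 87, "fp": 500, "tn": 9400, "fn": 13}

for name, counts in [("never_fire", never_fire), ("tuned", tuned)]:
    acc = accuracy(**counts)
    rec = recall(counts["tp"], counts["fn"])
    print(f"{name}: accuracy={acc:.4f} recall={rec:.2f}")
```

The trivial model scores higher on accuracy (0.99 versus roughly 0.95) while missing every ignition, which is why a recall threshold is enforced alongside accuracy and AUC.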
Google Earth Engine enables scalable province-wide vegetation data
Sourcing vegetation indices like NDVI and fuel moisture for a province the size of Alberta at the resolution required for meaningful spatial modeling would be prohibitive through conventional data pipelines. Google Earth Engine's planetary-scale geospatial compute allows these layers to be extracted, processed, and aligned to the model grid without local storage or processing constraints — a critical infrastructure decision for making province-wide prediction tractable.
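For readers unfamiliar with the index, NDVI is the normalized difference between near-infrared and red reflectance. The sketch below computes it over a small grid in plain Python. It does not call the Earth Engine API, and the band values and function names are hypothetical.

```python
from typing import List


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for cells with no signal
    return (nir - red) / total


def ndvi_grid(nir_band: List[List[float]], red_band: List[List[float]]) -> List[List[float]]:
    """Apply NDVI cell by cell to two aligned reflectance grids."""
    return [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]


# Hypothetical surface reflectance values on a 2x2 grid.
nir_band = [[0.50, 0.30], [0.45, 0.00]]
red_band = [[0.10, 0.30], [0.05, 0.00]]

index = ndvi_grid(nir_band, red_band)
print(index)
```

Values near 1 indicate dense green vegetation and values near 0 indicate bare or dry ground, which is why the index is useful as a fuel-condition input.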
Tech Stack
| Technology | Purpose |
|---|---|
| TensorFlow / Keras | ConvLSTM model architecture, training, and evaluation |
| TensorFlow Extended (TFX) | End-to-end ML pipeline — ExampleGen, Transform, Tuner, Trainer, Evaluator, Pusher |
| GCP Vertex AI Pipelines | Pipeline orchestration, automated retraining, and DAG execution |
| Vertex AI Training & Serving | Scalable model training and real-time inference endpoint |
| Vertex AI Model Registry | Versioned model artifacts, champion-challenger tracking, rollback |
| TF Model Analysis (TFMA) | Multi-metric model evaluation with blessing gate |
| Keras Tuner | Automated hyperparameter search within TFX Tuner component |
| Google Earth Engine | Province-wide vegetation indices and fuel moisture extraction |
| Google Cloud Storage | Raw data storage and TFX artifact repository |
| Python | Core language and pipeline orchestration |
Results & Metrics
What the system delivers
91%
Model Accuracy
Validated on held-out Alberta fire season data — province-wide ignition likelihood prediction
90%
AUC Score
Area under the ROC curve — strong discriminative power between ignition and non-ignition zones
Vertex AI
Production Deployment
Full TFX pipeline on GCP — automated retraining, model registry, and real-time serving endpoint
Province-wide ignition likelihood maps at 91% accuracy
The system produces per-grid-cell ignition probability maps spanning the entire Province of Alberta — not point predictions or regional aggregates. At 91% accuracy and AUC of 90%, the maps provide operationally reliable spatial risk intelligence for fire management agencies planning pre-season resource deployment and real-time response.
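AUC can be read as the probability that a randomly chosen ignition cell receives a higher predicted probability than a randomly chosen non-ignition cell. The sketch below computes it directly from that definition. The labels and scores are hypothetical, and the pairwise loop is chosen for clarity rather than efficiency.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical per-cell ignition labels and predicted probabilities.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1]

auc = roc_auc(labels, scores)
print(round(auc, 4))
```

Here five of the six ignition and non-ignition pairs are ordered correctly, giving an AUC of 5/6. Unlike accuracy, the value does not depend on a classification threshold.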
87% recall — minimizing missed ignition events
In wildfire prediction, failing to identify a true ignition event carries far greater operational cost than a false alarm. The model was optimized and evaluated with an explicit recall threshold of 87% — ensuring the system identifies the vast majority of actual ignition events even under the severe class imbalance that characterizes provincial fire datasets.
Fully automated production pipeline on GCP Vertex AI
The TFX pipeline runs end-to-end without manual intervention — ingesting new fire season data, validating schema, engineering features, tuning hyperparameters, training, evaluating against all three metric gates, and deploying only genuinely improved models to the serving endpoint. New data triggers a full retraining run automatically.
Scalable geospatial feature pipeline via Google Earth Engine
Province-wide vegetation indices, fuel moisture estimates, and land cover data are sourced and preprocessed at scale via Google Earth Engine — enabling the model to ingest high-resolution geospatial features across Alberta without local storage or processing constraints. This infrastructure decision makes provincial-scale real-time inference tractable.
Paper under review at the Canadian Journal of Forest Research
The system and its methodology are the subject of a submitted paper, "Development and Implementation of Wildfire Prediction System: Application for the Province of Alberta", currently under review at the Canadian Journal of Forest Research. The submission documents the academic basis of the approach alongside its operational deployment.