Calgary Flood Prediction System
A production-grade CNN+LSTM deep learning system that predicts flood hazard extent and depth across Calgary from rainfall data alone — deployed as a full TFX pipeline on GCP Vertex AI and stress-tested against climate change projections to 2100.
89%
Model Accuracy
<15cm
Avg. Training MSE
Vertex AI
Production Deployment on GCP
The Problem
Traditional flood modeling is too slow, too expensive, and too rigid for real-time climate adaptation
Conventional flood forecasting relies on coupled hydrologic-hydraulic simulation models that are computationally intensive, require extensive calibration data, and can take hours to generate a single scenario. For a city like Calgary — which experienced a flood in 2013 that displaced over 100,000 residents and caused nearly $5 billion in damages — this lag is not just inconvenient; it is a critical operational gap. As climate change drives rainfall patterns toward greater intensity and unpredictability, cities need flood prediction tools that are fast enough for real-time response, accurate enough for capital planning, and scalable enough to test hundreds of future climate scenarios — including projections to 2100 under high-emission pathways.
The Solution
A CNN+LSTM deep learning system — built, validated, and deployed as a production TFX pipeline on GCP Vertex AI
The Calgary Flood Prediction System is an end-to-end deep learning pipeline that predicts flood hazard extent and depth across Calgary's Bow and Elbow River watersheds using rainfall data as its only input. At its core is a CNN+LSTM architecture with a custom synchronization layer — the CNN captures spatial rainfall patterns and their interaction with topographic features, while the LSTM models how rainfall accumulates and translates into flooding over time. The synchronization layer accounts for the temporal lag between storm events and inundation, a critical innovation that improves real-world accuracy. The model was trained and validated against historical flood events including the 2013 disaster, achieving a Nash-Sutcliffe Efficiency (NSE) of 89% and an average training MSE below 15cm. The full system was then productionized as a TFX pipeline with all stages — data ingestion, schema validation, feature transformation, model training, evaluation, and serving — deployed and orchestrated on GCP Vertex AI Pipelines.
Key Outcome
A production-grade flood prediction system that generates high-resolution hazard maps from rainfall alone — validated against Calgary's historic floods, stress-tested against RCP 8.5 climate scenarios to 2100, and deployed as a fully automated TFX pipeline on GCP Vertex AI for real-time inference and continuous retraining.
Technical Deep Dive
Architecture & Design
TFX Pipeline & MLOps
TFX Pipeline — GCP Vertex AI Pipelines
Stage 1 · ExampleGen
Data Ingestion
Ingests rainfall time-series + geospatial flood data from GCS · Splits into train/eval sets
Stage 2 · StatisticsGen
Statistics
Computes dataset stats for drift detection
SchemaGen
Schema
Infers schema from training data
ExampleValidator
Validation
Detects anomalies against schema
Stage 3 · Transform
Feature Engineering
Consistent preprocessing for training + serving · Spatial normalization · Temporal windowing for LSTM
Stage 4 · Tuner
Hyperparameter Tuning
Keras Tuner integration · Searches CNN filter sizes, LSTM units, learning rate, dropout · Best trial passed to Trainer
Stage 5 · Trainer
CNN+LSTM Model Training
Convolutional spatial encoder + LSTM temporal modeler + synchronization layer · NSE 89% · MSE <15cm
Stage 6 · Evaluator
Model Evaluation
TFMA evaluation · NSE, MSE, spatial accuracy · Blessing gate — blocks underperforming models
Stage 7 · Resolver
Model Comparison
Compares new model against production baseline · Auto-promotes if performance improves
Stage 8 · Pusher
Model Serving — Vertex AI Endpoint
Blessed models pushed to Vertex AI Serving · Real-time inference endpoint for flood hazard prediction
MLOps & CI/CD Layer
Pipeline Orchestration
Vertex AI Pipelines
Kubeflow-based DAG execution · Automated pipeline runs on new data · Full lineage tracking
Continuous Training
Automated Retraining
Pipeline triggered on new rainfall/flood data · Schema drift detection gates retraining
Model Registry
Vertex AI Model Registry
Versioned model artifacts · Challenger vs. champion tracking · Rollback support
Metadata & Lineage
ML Metadata Store
Every artifact, execution, and evaluation logged · Full reproducibility across pipeline runs
GCP Infrastructure
Stage 1
Data Ingestion & Validation
ExampleGen ingests rainfall time-series and geospatial flood extent data from Google Cloud Storage and partitions it into training and evaluation splits. StatisticsGen, SchemaGen, and ExampleValidator then run in concert to compute dataset statistics, infer a data schema, and detect anomalies — blocking malformed or drifted data from reaching the training stage.
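Under the standard TFX component API, the wiring for this stage looks roughly like the sketch below; the bucket path, the choice of `ImportExampleGen`, and the component arguments are illustrative assumptions, not taken from the project's code.

```python
# Minimal TFX sketch of the ingestion + validation stages.
# The GCS path is a placeholder; splits default to train/eval.
from tfx import v1 as tfx

# Ingest serialized examples (rainfall series + flood extents) from GCS.
example_gen = tfx.components.ImportExampleGen(
    input_base='gs://<bucket>/flood-training-data')

# Compute per-feature statistics used for drift and anomaly detection.
statistics_gen = tfx.components.StatisticsGen(
    examples=example_gen.outputs['examples'])

# Infer a schema (types, ranges, presence) from the training statistics.
schema_gen = tfx.components.SchemaGen(
    statistics=statistics_gen.outputs['statistics'])

# Flag examples that violate the schema before they reach training.
example_validator = tfx.components.ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'])
```

Downstream components consume these via their `.outputs` channels, which is how the DAG in the diagram above is expressed in code.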
Stage 2
Feature Engineering
The Transform component applies spatial normalization and temporal windowing to prepare inputs for the CNN+LSTM architecture. Critically, the same transformation graph is applied at both training and serving time — eliminating training-serving skew and ensuring that the preprocessing the model was trained on is identical to what it receives during real-time inference.
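Both transformations are simple to state in isolation; the following is a plain NumPy illustration of the idea only. The production versions run inside the Transform graph, and the 24-step window length here is an assumed value.

```python
import numpy as np

def zscore(x):
    # Spatial normalization: standardize values to zero mean, unit variance.
    return (x - x.mean()) / x.std()

def window_series(series, window=24):
    # Temporal windowing: slice a rainfall series into overlapping
    # fixed-length sequences, one per LSTM training example.
    n = len(series) - window + 1
    return np.stack([series[t:t + window] for t in range(n)])

rainfall = np.arange(30, dtype=float)   # toy hourly rainfall series
windows = window_series(zscore(rainfall), window=24)
print(windows.shape)                    # (7, 24): 7 overlapping 24-step windows
```

Because the real pipeline expresses these same operations as TensorFlow ops inside Transform, they travel with the exported model instead of living in separate training and serving scripts.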
Stage 3
Hyperparameter Tuning
The Tuner component integrates Keras Tuner to search across CNN filter sizes, LSTM hidden units, learning rate, and dropout rate. The best hyperparameter trial is automatically passed to the Trainer component — removing manual tuning from the retraining loop and ensuring every pipeline run uses an optimized configuration.
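As a rough sketch of what such a search space can look like with KerasTuner: the ranges, input shape, and model skeleton below are illustrative assumptions, not the project's documented values.

```python
# Illustrative KerasTuner search space for the four hyperparameters named
# above; all ranges are assumptions.
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    model = tf.keras.Sequential([
        tf.keras.layers.Conv1D(
            filters=hp.Choice('cnn_filters', [32, 64, 128]),
            kernel_size=3, activation='relu', input_shape=(24, 1)),
        tf.keras.layers.LSTM(hp.Int('lstm_units', 32, 256, step=32)),
        tf.keras.layers.Dropout(hp.Float('dropout', 0.0, 0.5, step=0.1)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])),
        loss='mse')
    return model

tuner = kt.RandomSearch(build_model, objective='val_loss', max_trials=20)
```

In the TFX Tuner component this `build_model` lives inside a `tuner_fn`, and the winning trial's hyperparameters are emitted as an artifact the Trainer consumes.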
Stage 4
CNN+LSTM Training
The Trainer component trains the CNN+LSTM model using the tuned hyperparameters and transformed features. The CNN encodes spatial rainfall patterns and their interaction with topographic features, the LSTM captures temporal accumulation dynamics, and a custom synchronization layer accounts for the temporal lag between storm events and inundation — the key architectural innovation that drives the 89% NSE.
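The published materials do not include the model code, but one plausible Keras shape for the described architecture is sketched below. The grid size, layer widths, and especially the causal `Conv1D` standing in for the synchronization layer are assumptions for illustration, not the actual design.

```python
# Hypothetical Keras sketch of the CNN+LSTM shape described above.
import tensorflow as tf
from tensorflow.keras import layers

T, H, W = 24, 64, 64          # 24 time steps over a 64x64 rainfall grid

inputs = tf.keras.Input(shape=(T, H, W, 1))

# CNN spatial encoder, applied independently at every time step.
x = layers.TimeDistributed(
    layers.Conv2D(32, 3, activation='relu', padding='same'))(inputs)
x = layers.TimeDistributed(layers.MaxPooling2D(2))(x)
x = layers.TimeDistributed(layers.Flatten())(x)

# Stand-in for the synchronization layer: a causal 1D convolution along
# time can learn to re-weight (effectively shift) rainfall toward the
# delayed flood response.
x = layers.Conv1D(256, kernel_size=5, padding='causal', activation='relu')(x)

# LSTM temporal accumulator over the synchronized sequence.
x = layers.LSTM(128)(x)

# Decode to a per-cell flood depth map.
outputs = layers.Reshape((H, W, 1))(layers.Dense(H * W)(x))

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')
```

The essential property this sketch preserves is the ordering: spatial encoding per time step, then temporal alignment, then temporal accumulation, then decoding back to a depth grid.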
Stage 5
Evaluation & Model Blessing
The Evaluator uses TensorFlow Model Analysis (TFMA) to assess the trained model against NSE, MSE, and spatial accuracy thresholds. Only models that meet or exceed all thresholds receive a blessing. The Resolver component then compares the newly trained model against the current production champion — auto-promoting it only if performance genuinely improves.
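Of these metrics, NSE is the least standard outside hydrology: it measures how much better the model is than always predicting the observed mean. A minimal NumPy implementation with toy values:

```python
import numpy as np

def nash_sutcliffe(observed, simulated):
    # NSE = 1 - SSE / variance of observations.
    # 1.0 is a perfect fit; 0.0 means no better than predicting the mean.
    observed, simulated = np.asarray(observed), np.asarray(simulated)
    return 1.0 - np.sum((observed - simulated) ** 2) / np.sum(
        (observed - observed.mean()) ** 2)

obs = np.array([0.0, 0.2, 0.9, 1.4, 0.6])   # observed flood depths (m)
sim = np.array([0.1, 0.2, 0.8, 1.3, 0.7])   # predicted depths (m)
print(round(nash_sutcliffe(obs, sim), 3))   # 0.968
```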
Stage 6
Serving & Deployment
The Pusher component deploys blessed models to a Vertex AI Serving endpoint, making the flood prediction system available for real-time inference. Model versions are tracked in the Vertex AI Model Registry — enabling challenger vs. champion comparisons, versioned rollbacks, and full audit trails for every production deployment.
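A minimal sketch of the Pusher wiring, assuming upstream `trainer` and `evaluator` components and a placeholder serving path; the actual destination configuration is not documented here.

```python
# Illustrative TFX Pusher wiring; the serving directory is a placeholder.
from tfx import v1 as tfx

pusher = tfx.components.Pusher(
    model=trainer.outputs['model'],
    model_blessing=evaluator.outputs['blessing'],   # gate: only blessed models push
    push_destination=tfx.proto.PushDestination(
        filesystem=tfx.proto.PushDestination.Filesystem(
            base_directory='gs://<bucket>/serving-model')))
```

For direct Vertex AI deployment, TFX also ships a `tfx.extensions.google_cloud_ai_platform` Pusher variant that targets an endpoint rather than a filesystem path.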
Key Design Decisions
Synchronization layer resolves the temporal lag problem
Rainfall and flood inundation are not temporally aligned — there is a lag between when a storm occurs and when flooding peaks. Standard CNN+LSTM architectures ignore this lag, degrading prediction accuracy on fast-response urban catchments. The custom synchronization layer explicitly aligns rainfall inputs with flood response timing, and is the single most impactful architectural decision in reaching NSE 89%.
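The alignment idea can be illustrated outside the network: for paired rainfall and depth series, the response lag is the shift that maximizes their correlation. The toy NumPy sketch below shows the concept only; the production layer learns this alignment end to end rather than estimating it explicitly.

```python
import numpy as np

def estimate_lag(rainfall, depth, max_lag=12):
    # Return the shift (in time steps) that best aligns rainfall with
    # the flood response, by maximizing cross-correlation.
    best, best_corr = 0, -np.inf
    for lag in range(max_lag + 1):
        r = rainfall[:len(rainfall) - lag]
        d = depth[lag:]
        corr = np.corrcoef(r, d)[0, 1]
        if corr > best_corr:
            best, best_corr = lag, corr
    return best

rain = np.zeros(48);  rain[10] = 1.0    # unit storm pulse at t=10
depth = np.zeros(48); depth[16] = 1.0   # flood peak 6 steps later
print(estimate_lag(rain, depth))        # 6
```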
TFX Transform eliminates training-serving skew
One of the most common sources of production ML failures is a mismatch between how features are preprocessed during training versus serving. By using TFX Transform, the same TensorFlow graph is applied at both stages — guaranteeing that the model receives identically preprocessed inputs whether it is being trained on historical data or serving real-time rainfall forecasts.
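In TFX this guarantee comes from expressing all preprocessing as a single `preprocessing_fn`; a minimal sketch with illustrative feature names:

```python
# Sketch of a Transform preprocessing_fn; feature names are illustrative.
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    # Every op here is recorded in a TensorFlow graph that TFX attaches to
    # the exported model, so serving applies identical preprocessing.
    return {
        'rainfall_scaled': tft.scale_to_z_score(inputs['rainfall']),
        'flood_depth': inputs['flood_depth'],
    }
```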
Blessing gate prevents silent model degradation in production
In a continuously retrained system, a model that passes validation on aggregate metrics can still degrade on specific flood scenarios or spatial regions. The TFMA Evaluator enforces per-metric thresholds — NSE, MSE, and spatial accuracy — and only blesses models that meet all three. Combined with the Resolver's champion-challenger comparison, this ensures that only genuinely better models ever reach the production endpoint.
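A gating configuration of this kind can be expressed in TFMA roughly as follows; the metric shown and the threshold values are illustrative assumptions, since the project's exact cut-offs are not listed here.

```python
# Illustrative TFMA gating config; threshold values are assumptions.
import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='flood_depth')],
    slicing_specs=[tfma.SlicingSpec()],   # overall slice; add spatial slices as needed
    metrics_specs=[tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(
            class_name='MeanSquaredError',
            threshold=tfma.MetricThreshold(
                # Absolute gate: reject models above an illustrative MSE bound.
                value_threshold=tfma.GenericValueThreshold(
                    upper_bound={'value': 0.0225}),
                # Relative gate: require no regression vs. the champion.
                change_threshold=tfma.GenericChangeThreshold(
                    direction=tfma.MetricDirection.LOWER_IS_BETTER,
                    absolute={'value': 0.0}))),
    ])],
)
```

A model that fails any threshold in this config simply never receives a blessing artifact, so the Pusher downstream has nothing to deploy.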
Rainfall-only input enables real-time operational use
Traditional flood models require detailed soil moisture data, calibrated river gauges, and physics-based solver outputs — inputs that are often unavailable in real time or require hours to prepare. By designing the CNN+LSTM to operate on rainfall data alone, the system can generate flood hazard predictions as soon as a rainfall forecast is available, making it suitable for early warning systems and digital twin integration.
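Once deployed, generating a prediction reduces to a single endpoint call. The sketch below uses the Vertex AI Python SDK; the project, region, endpoint ID, and instance shape are all placeholders.

```python
# Illustrative real-time inference call against a deployed endpoint.
from google.cloud import aiplatform

aiplatform.init(project='<gcp-project>', location='us-central1')
endpoint = aiplatform.Endpoint('<endpoint-id>')

# Placeholder input: one window of forecast rainfall. The real instance
# must match the model's serving signature.
rainfall_window = [[0.0] * 64 for _ in range(24)]

response = endpoint.predict(instances=[{'rainfall': rainfall_window}])
flood_depth_map = response.predictions[0]
```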
Tech Stack
| Technology | Purpose |
|---|---|
| TensorFlow / Keras | CNN+LSTM model architecture, training, and evaluation |
| TensorFlow Extended (TFX) | End-to-end ML pipeline — ExampleGen, Transform, Tuner, Trainer, Evaluator, Pusher |
| GCP Vertex AI Pipelines | Pipeline orchestration, automated retraining, and DAG execution |
| Vertex AI Training & Serving | Scalable model training and real-time inference endpoint |
| Vertex AI Model Registry | Versioned model artifacts, champion-challenger tracking, rollback |
| TF Model Analysis (TFMA) | Multi-metric model evaluation with blessing gate |
| Keras Tuner | Automated hyperparameter search within TFX Tuner component |
| Google Cloud Storage | Raw data storage and TFX artifact repository |
| Python | Core language and pipeline orchestration |
Results & Metrics
What the system delivers
89%
Model Accuracy
Nash-Sutcliffe Efficiency (NSE) — validated against Calgary's historical flood events including the 2013 disaster
<15cm
Avg. Training MSE
Average mean squared error on flood depth prediction across training runs — well within operational tolerance
Vertex AI
Production Deployment
Full TFX pipeline deployed on GCP Vertex AI — automated retraining, model registry, and real-time serving endpoint
High-resolution flood hazard maps from rainfall alone
The system generates spatially explicit flood extent and depth predictions across Calgary's Bow and Elbow River watersheds using only rainfall as input — no soil moisture data, no river gauge calibration, no physics-based solver outputs required. This dramatically reduces the data burden for operational deployment.
Climate scenario stress-testing to 2100
The model was applied to RCP 8.5 rainfall projections for 2025, 2050, 2080, and 2100 — producing future flood hazard maps that reveal how high-risk zones expand and deepen under high-emission climate pathways. These outputs directly inform long-horizon infrastructure investment and capital planning decisions for the client.
Fully automated production pipeline on GCP
The TFX pipeline runs end-to-end without manual intervention — ingesting new data, validating schema, engineering features, tuning hyperparameters, training, evaluating, and deploying in a single automated DAG. New rainfall data triggers a retraining run; only models that improve on the production champion are promoted to the serving endpoint.
Validated against Calgary's historic flood events
The model was trained and validated on flood events from 2010–2020 including the catastrophic 2013 Southern Alberta flood. Successful replication of observed flood extent and depth from those events — at NSE 89% — confirms the system's reliability for both current operational use and future scenario projection.
Full reproducibility and lineage across every pipeline run
Every artifact, execution, transformation, and evaluation result is logged in the ML Metadata Store — creating a complete audit trail from raw data to deployed model. Any pipeline run can be fully reconstructed, any model version can be rolled back, and every prediction can be traced to the exact data and code that produced it.