Applied ML Systems · Climate Resilience & Infrastructure

Calgary Flood Prediction System

A production-grade CNN+LSTM deep learning system that predicts flood hazard extent and depth across Calgary from rainfall data alone — deployed as a full TFX pipeline on GCP Vertex AI and stress-tested against climate change projections to 2100.

Architecture CNN+LSTM · Deep Learning
Tech Stack
TensorFlow · TFX · GCP Vertex AI · Keras · GIS · Python

89%

Model Accuracy

<15cm

Avg. Training MSE

Vertex AI

Production Deployment on GCP

The Problem

Traditional flood modeling is too slow, too expensive, and too rigid for real-time climate adaptation

Conventional flood forecasting relies on coupled hydrologic-hydraulic simulation models that are computationally intensive, require extensive calibration data, and can take hours to generate a single scenario. For a city like Calgary — which experienced a flood in 2013 that displaced over 100,000 residents and caused nearly $5 billion in damages — this lag is not just inconvenient; it is a critical operational gap. As climate change drives rainfall patterns toward greater intensity and unpredictability, cities need flood prediction tools that are fast enough for real-time response, accurate enough for capital planning, and scalable enough to test hundreds of future climate scenarios — including projections to 2100 under high-emission pathways.

The Solution

A CNN+LSTM deep learning system — built, validated, and deployed as a production TFX pipeline on GCP Vertex AI

The Calgary Flood Prediction System is an end-to-end deep learning pipeline that predicts flood hazard extent and depth across Calgary's Bow and Elbow River watersheds using rainfall data as its only input. At its core is a CNN+LSTM architecture with a custom synchronization layer — the CNN captures spatial rainfall patterns and their interaction with topographic features, while the LSTM models how rainfall accumulates and translates into flooding over time. The synchronization layer accounts for the temporal lag between storm events and inundation, a critical innovation that improves real-world accuracy. The model was trained and validated against historical flood events including the 2013 disaster, achieving a Nash-Sutcliffe Efficiency (NSE) of 89% and an average training MSE below 15cm. The full system was then productionized as a TFX pipeline with all stages — data ingestion, schema validation, feature transformation, model training, evaluation, and serving — deployed and orchestrated on GCP Vertex AI Pipelines.
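The 89% figure is a Nash-Sutcliffe Efficiency. For readers unfamiliar with the metric, it compares model error against a predict-the-mean baseline; a minimal NumPy implementation (the arrays are made-up examples):

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1.0 is a perfect fit, 0.0 matches the mean baseline."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((observed - simulated) ** 2) / np.sum((observed - observed.mean()) ** 2)

obs = np.array([0.2, 0.5, 1.1, 0.9, 0.3])      # invented flood depths
print(nse(obs, obs))                            # 1.0 (perfect prediction)
print(nse(obs, np.full(5, obs.mean())))         # 0.0 (no better than the mean)
```

An NSE of 0.89 therefore means the model removes 89% of the squared error a mean-only predictor would make.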

Key Outcome

A production-grade flood prediction system that generates high-resolution hazard maps from rainfall alone — validated against Calgary's historic floods, stress-tested against RCP 8.5 climate scenarios to 2100, and deployed as a fully automated TFX pipeline on GCP Vertex AI for real-time inference and continuous retraining.

Technical Deep Dive

Architecture & Design

TFX Pipeline & MLOps

TFX Pipeline — GCP Vertex AI Pipelines

Stage 1 · ExampleGen

Data Ingestion

Ingests rainfall time-series + geospatial flood data from GCS · Splits into train/eval sets

Stage 2 · StatisticsGen

Statistics

Computes dataset stats for drift detection

SchemaGen

Schema

Infers schema from training data

ExampleValidator

Validation

Detects anomalies against schema

Stage 3 · Transform

Feature Engineering

Consistent preprocessing for training + serving · Spatial normalization · Temporal windowing for LSTM

Stage 4 · Tuner

Hyperparameter Tuning

Keras Tuner integration · Searches CNN filter sizes, LSTM units, learning rate, dropout · Best trial passed to Trainer

Stage 5 · Trainer

CNN+LSTM Model Training

Convolutional spatial encoder + LSTM temporal modeler + synchronization layer · NSE 89% · MSE <15cm

Stage 6 · Evaluator

Model Evaluation

TFMA evaluation · NSE, MSE, spatial accuracy · Blessing gate — blocks underperforming models

Stage 6 · Resolver

Model Comparison

Compares new model against production baseline · Auto-promotes if performance improves

Stage 7 · Pusher

Model Serving — Vertex AI Endpoint

Blessed models pushed to Vertex AI Serving · Real-time inference endpoint for flood hazard prediction

MLOps & CI/CD Layer

Pipeline Orchestration

Vertex AI Pipelines

Kubeflow-based DAG execution · Automated pipeline runs on new data · Full lineage tracking

Continuous Training

Automated Retraining

Pipeline triggered on new rainfall/flood data · Schema drift detection gates retraining

Model Registry

Vertex AI Model Registry

Versioned model artifacts · Challenger vs. champion tracking · Rollback support

Metadata & Lineage

ML Metadata Store

Every artifact, execution, and evaluation logged · Full reproducibility across pipeline runs

GCP Infrastructure

Cloud Storage (GCS) · Vertex AI Pipelines · Vertex AI Training · Vertex AI Serving

Stage 1

Data Ingestion & Validation

ExampleGen ingests rainfall time-series and geospatial flood extent data from Google Cloud Storage and partitions it into training and evaluation splits. StatisticsGen, SchemaGen, and ExampleValidator then run in concert to compute dataset statistics, infer a data schema, and detect anomalies — blocking malformed or drifted data from reaching the training stage.
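In TFX code, the ingestion and validation stages above wire together roughly as follows. This is a sketch using TFX's standard components; the CSV source type and GCS path are hypothetical placeholders, not the project's actual configuration.

```python
# Pipeline-definition sketch of Stage 1: ingestion + validation.
# The bucket path is a hypothetical placeholder.
from tfx import v1 as tfx

example_gen = tfx.components.CsvExampleGen(
    input_base="gs://example-bucket/calgary-rainfall")  # splits train/eval by default

statistics_gen = tfx.components.StatisticsGen(
    examples=example_gen.outputs["examples"])           # dataset stats for drift detection

schema_gen = tfx.components.SchemaGen(
    statistics=statistics_gen.outputs["statistics"])    # infers schema from training data

example_validator = tfx.components.ExampleValidator(
    statistics=statistics_gen.outputs["statistics"],
    schema=schema_gen.outputs["schema"])                # flags anomalies against the schema
```

Downstream components consume these outputs, so anomalous data never reaches the Trainer.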

Stage 2

Feature Engineering

The Transform component applies spatial normalization and temporal windowing to prepare inputs for the CNN+LSTM architecture. Critically, the same transformation graph is applied at both training and serving time — eliminating training-serving skew and ensuring that the preprocessing the model was trained on is identical to what it receives during real-time inference.
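The Transform code itself isn't shown here; as an illustration of the two named operations, a NumPy sketch of per-cell normalization and sliding-window construction for the LSTM, with invented window length and grid size:

```python
import numpy as np

def spatial_normalize(grids):
    """Z-score each grid cell against its dataset-wide mean and std."""
    mean = grids.mean(axis=0, keepdims=True)
    std = grids.std(axis=0, keepdims=True) + 1e-8   # avoid division by zero
    return (grids - mean) / std

def temporal_windows(series, window=12):
    """Slice a (T, H, W) rainfall stack into overlapping (window, H, W) LSTM inputs."""
    return np.stack([series[i:i + window] for i in range(len(series) - window + 1)])

rain = np.random.default_rng(0).random((100, 8, 8))  # 100 timesteps, 8x8 grid (invented)
windows = temporal_windows(spatial_normalize(rain), window=12)
print(windows.shape)  # (89, 12, 8, 8)
```

In the real pipeline these operations live inside a `preprocessing_fn`, so the identical TensorFlow graph runs at training and serving time.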

Stage 3

Hyperparameter Tuning

The Tuner component integrates Keras Tuner to search across CNN filter sizes, LSTM hidden units, learning rate, and dropout rate. The best hyperparameter trial is automatically passed to the Trainer component — removing manual tuning from the retraining loop and ensuring every pipeline run uses an optimized configuration.
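The tuner's actual search bounds aren't given here. As an illustration of the mechanics, a plain exhaustive search over a made-up version of that space (Keras Tuner itself uses smarter strategies such as random search or Hyperband rather than full enumeration):

```python
import itertools

# Illustrative search space; the real bounds are not published.
SEARCH_SPACE = {
    "cnn_filters": [8, 16, 32],
    "lstm_units": [32, 64, 128],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "dropout": [0.1, 0.3, 0.5],
}

def grid_search(objective):
    """Return the config minimizing `objective` (in practice: validation MSE)."""
    keys = list(SEARCH_SPACE)
    best_cfg, best_score = None, float("inf")
    for values in itertools.product(*SEARCH_SPACE.values()):
        cfg = dict(zip(keys, values))
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg

# Stand-in objective that prefers smaller models; a real trial trains and evaluates.
best = grid_search(lambda c: c["cnn_filters"] + c["lstm_units"])
```

In the pipeline, the winning configuration is emitted as a `best_hyperparameters` artifact that the Trainer consumes directly.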

Stage 4

CNN+LSTM Training

The Trainer component trains the CNN+LSTM model using the tuned hyperparameters and transformed features. The CNN encodes spatial rainfall patterns and their interaction with topographic features, the LSTM captures temporal accumulation dynamics, and a custom synchronization layer accounts for the temporal lag between storm events and inundation — the key architectural innovation that drives the 89% NSE.
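That description maps onto a compact Keras sketch. Layer sizes, grid resolution, and the use of `TimeDistributed` convolutions are assumptions for illustration, and the custom synchronization layer is omitted here:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(t=12, h=32, w=32, lstm_units=64):
    """CNN spatial encoder per timestep, LSTM over the sequence, dense depth decoder."""
    inp = layers.Input(shape=(t, h, w, 1))  # t rainfall grids of h x w cells
    x = layers.TimeDistributed(
        layers.Conv2D(16, 3, padding="same", activation="relu"))(inp)
    x = layers.TimeDistributed(layers.MaxPooling2D(2))(x)
    x = layers.TimeDistributed(layers.Flatten())(x)   # per-step spatial embedding
    x = layers.LSTM(lstm_units)(x)                    # temporal accumulation dynamics
    out = layers.Dense(h * w, activation="relu")(x)   # non-negative depth per cell
    return tf.keras.Model(inp, layers.Reshape((h, w))(out))

model = build_model()
```

The output is a full depth grid per forecast window, which is what makes the downstream hazard-map rendering a single inference call.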

Stage 5

Evaluation & Model Blessing

The Evaluator uses TensorFlow Model Analysis (TFMA) to assess the trained model against NSE, MSE, and spatial accuracy thresholds. Only models that meet or exceed all thresholds receive a blessing. The Resolver component then compares the newly trained model against the current production champion — auto-promoting it only if performance genuinely improves.
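The blessing gate can be expressed as a TFMA `EvalConfig`. In this sketch the metric class, label key, and threshold values are illustrative, not the project's actual configuration:

```python
# Illustrative TFMA blessing gate: a value threshold plus a change threshold
# against the baseline (champion) model.
import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key="flood_depth")],
    metrics_specs=[tfma.MetricsSpec(metrics=[
        tfma.MetricConfig(
            class_name="MeanSquaredError",
            threshold=tfma.MetricThreshold(
                # Absolute gate: error must stay under a fixed bound.
                value_threshold=tfma.GenericValueThreshold(
                    upper_bound={"value": 0.15}),
                # Relative gate: must not regress versus the champion.
                change_threshold=tfma.GenericChangeThreshold(
                    direction=tfma.MetricDirection.LOWER_IS_BETTER,
                    absolute={"value": -1e-10}))),
    ])],
    slicing_specs=[tfma.SlicingSpec()],
)
```

Only a model passing every configured threshold receives the blessing artifact the Pusher checks before deployment.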

Stage 6

Serving & Deployment

The Pusher component deploys blessed models to a Vertex AI Serving endpoint, making the flood prediction system available for real-time inference. Model versions are tracked in the Vertex AI Model Registry — enabling challenger vs. champion comparisons, versioned rollbacks, and full audit trails for every production deployment.
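Once pushed, the endpoint can be queried with the Vertex AI Python SDK. In this sketch the project, region, endpoint ID, and input shape are all hypothetical placeholders, and running it requires a live deployed endpoint plus GCP credentials:

```python
from google.cloud import aiplatform

# All identifiers below are hypothetical placeholders.
aiplatform.init(project="my-gcp-project", location="us-central1")
endpoint = aiplatform.Endpoint(endpoint_name="1234567890")

# One instance = one preprocessed rainfall window, matching the
# Transform output shape the model was trained on (placeholder zeros here).
rainfall_window = [[[0.0] * 32 for _ in range(32)] for _ in range(12)]
response = endpoint.predict(instances=[rainfall_window])
depth_map = response.predictions[0]  # predicted flood depth grid
```

Because preprocessing lives in the serving graph, the caller sends raw-shaped rainfall windows rather than hand-engineered features.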

Key Design Decisions

Synchronization layer resolves the temporal lag problem

Rainfall and flood inundation are not temporally aligned — there is a lag between when a storm occurs and when flooding peaks. Standard CNN+LSTM architectures ignore this lag, degrading prediction accuracy on fast-response urban catchments. The custom synchronization layer explicitly aligns rainfall inputs with flood response timing, and is the single most impactful architectural decision in reaching NSE 89%.
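The synchronization layer's internals aren't described here. To illustrate the alignment problem it addresses, a NumPy sketch that estimates the rainfall-to-inundation lag by maximizing lagged correlation on synthetic data (all numbers invented):

```python
import numpy as np

def estimate_lag(rain, depth, max_lag=24):
    """Return the timestep shift that best aligns rainfall with flood response."""
    best_lag, best_corr = 0, -np.inf
    for lag in range(max_lag + 1):
        r = rain if lag == 0 else rain[:-lag]    # rainfall leads...
        d = depth if lag == 0 else depth[lag:]   # ...flood depth follows
        corr = np.corrcoef(r, d)[0, 1]
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return best_lag

rng = np.random.default_rng(0)
rain = rng.random(200)
depth = np.roll(rain, 3)   # synthetic catchment: depth responds 3 steps later
depth[:3] = 0.0
lag = estimate_lag(rain, depth)
print(lag)  # 3
```

A learned layer can generalize this idea by letting the network weight each candidate offset per location instead of picking a single global shift.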

TFX Transform eliminates training-serving skew

One of the most common sources of production ML failures is a mismatch between how features are preprocessed during training versus serving. By using TFX Transform, the same TensorFlow graph is applied at both stages — guaranteeing that the model receives identically preprocessed inputs whether it is being trained on historical data or serving real-time rainfall forecasts.

Blessing gate prevents silent model degradation in production

In a continuously retrained system, a model that passes validation on aggregate metrics can still degrade on specific flood scenarios or spatial regions. The TFMA Evaluator enforces per-metric thresholds — NSE, MSE, and spatial accuracy — and only blesses models that meet all three. Combined with the Resolver's champion-challenger comparison, this ensures that only genuinely better models ever reach the production endpoint.
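A minimal pure-Python sketch of the two gates described above; the metric names and threshold values are illustrative, not the project's production settings:

```python
# Value gate: every metric must individually meet its threshold.
THRESHOLDS = {
    "nse": ("min", 0.85),          # higher is better
    "mse": ("max", 0.15),          # lower is better
    "spatial_acc": ("min", 0.80),  # higher is better
}

def blessed(metrics):
    """True only if ALL per-metric thresholds pass (no aggregate averaging)."""
    for name, (kind, limit) in THRESHOLDS.items():
        ok = metrics[name] >= limit if kind == "min" else metrics[name] <= limit
        if not ok:
            return False
    return True

def promote(challenger, champion):
    """Change gate: blessed AND a genuine improvement over the champion."""
    return blessed(challenger) and challenger["nse"] > champion["nse"]

champion = {"nse": 0.87, "mse": 0.12, "spatial_acc": 0.90}
better = {"nse": 0.89, "mse": 0.10, "spatial_acc": 0.92}
worse = {"nse": 0.90, "mse": 0.20, "spatial_acc": 0.92}  # higher NSE but fails MSE gate
```

Note that `worse` is rejected despite its higher NSE: the per-metric gate catches regressions that an aggregate score would hide.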

Rainfall-only input enables real-time operational use

Traditional flood models require detailed soil moisture data, calibrated river gauges, and physics-based solver outputs — inputs that are often unavailable in real time or require hours to prepare. By designing the CNN+LSTM to operate on rainfall data alone, the system can generate flood hazard predictions as soon as a rainfall forecast is available, making it suitable for early warning systems and digital twin integration.

Tech Stack

Technology · Purpose
TensorFlow / Keras · CNN+LSTM model architecture, training, and evaluation
TensorFlow Extended (TFX) · End-to-end ML pipeline — ExampleGen, Transform, Tuner, Trainer, Evaluator, Pusher
GCP Vertex AI Pipelines · Pipeline orchestration, automated retraining, and DAG execution
Vertex AI Training & Serving · Scalable model training and real-time inference endpoint
Vertex AI Model Registry · Versioned model artifacts, champion-challenger tracking, rollback
TF Model Analysis (TFMA) · Multi-metric model evaluation with blessing gate
Keras Tuner · Automated hyperparameter search within TFX Tuner component
Google Cloud Storage · Raw data storage and TFX artifact repository
Python · Core language and pipeline orchestration

Results & Metrics

What the system delivers

89%

Model Accuracy

Nash-Sutcliffe Efficiency (NSE) — validated against Calgary's historical flood events including the 2013 disaster

<15cm

Avg. Training MSE

Average mean squared error on flood depth prediction across training runs — well within operational tolerance

Vertex AI

Production Deployment

Full TFX pipeline deployed on GCP Vertex AI — automated retraining, model registry, and real-time serving endpoint

🌊

High-resolution flood hazard maps from rainfall alone

The system generates spatially explicit flood extent and depth predictions across Calgary's Bow and Elbow River watersheds using only rainfall as input — no soil moisture data, no river gauge calibration, no physics-based solver outputs required. This dramatically reduces the data burden for operational deployment.

🌡️

Climate scenario stress-testing to 2100

The model was applied to RCP 8.5 rainfall projections for 2025, 2050, 2080, and 2100 — producing future flood hazard maps that reveal how high-risk zones expand and deepen under high-emission climate pathways. These outputs directly inform long-horizon infrastructure investment and capital planning decisions for the client.

⚙️

Fully automated production pipeline on GCP

The TFX pipeline runs end-to-end without manual intervention — ingesting new data, validating schema, engineering features, tuning hyperparameters, training, evaluating, and deploying in a single automated DAG. New rainfall data triggers a retraining run; only models that improve on the production champion are promoted to the serving endpoint.

🏙️

Validated against Calgary's historic flood events

The model was trained and validated on flood events from 2010–2020 including the catastrophic 2013 Southern Alberta flood. Successful replication of observed flood extent and depth from those events — at NSE 89% — confirms the system's reliability for both current operational use and future scenario projection.

🔁

Full reproducibility and lineage across every pipeline run

Every artifact, execution, transformation, and evaluation result is logged in the ML Metadata Store — creating a complete audit trail from raw data to deployed model. Any pipeline run can be fully reconstructed, any model version can be rolled back, and every prediction can be traced to the exact data and code that produced it.