Adaptive models (simpler for simple cases, complex for tricky ones)
Standard Python debugging (just add a print)
Variable input sizes (sentences: 3 words vs. 50 words)
Small performance cost, huge flexibility gains
Use nn.Sequential for fixed patterns
def__init__(self, in_channels):# ...self.features = nn.Sequential( ConvBlock(in_channels, 32), ConvBlock(32, 64), ConvBlock(64, 128),# ... )self.classifier = nn.Sequential(# ... )def forward(self, x): x =self.features(x) x =self.classifier(x)return x
Use nn.Module for reusable blocks
class ConvBlock(nn.Module):def__init__(self, in_channels, out_channels):self.conv = nn.Conv2d(...)self.relu = nn.ReLU()self.pool = nn.MaxPool2d(...)
Workflow: Start explicit, then refactor
Inspecting Your Model
Basic inspection
# Structure overviewprint(model)# Counting and locating parameterstotal =sum(p.numel() for p in model.parameters())# Per layer (shows where each set lives)for name, param in model.named_parameters():print(f"{name}: {param.shape}")
Inspecting nested blocks
model.children(): top-level only
model.modules(): everything, including nested (think folder structure)
Note: the shape for, say, fc1.weight connecting 2,048 inputs to 512 outputs is reversed: (512, 2048) because each row = weights for one output neuron.
Debugging Shape Mismatches
Common error: “mat1 and mat2 shapes cannot be multiplied”
Two-step approach:
Check layer shape:print(model.fc1.weight.shape) - what does FC1 expect?
Trace shapes through forward: Print shapes at each step to see what it actually gets
Combine inspection (what layer expects) with shape tracing (what it gets) to quickly pinpoint the issue.
Module 4 Synthesis
From single neuron to CNN
Started: predicting delivery times
Now: convolutional neural networks
Built data pipelines, trained and evaluated models, inspected what’s happening
Lab 2: Model Debugging, Inspection, and Modularization
“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” — Brian Kernighan
CUE: START THE LAB HERE
Assignment 1: Overcoming Overfitting: Building a Robust CNN