Module 4 - Session 1: Convolutional Neural Networks

Module 4 Overview

What will we learn?

  • Convolutional layers: filters, patterns, and feature maps
  • Complete CNN architecture: convolution, pooling, fully connected layers
  • Training CNNs for image classification
  • Dynamic computation graphs in PyTorch
  • Modular architectures and code organization
  • Model inspection and debugging techniques

Session 1: Convolutional Neural Networks

What you’ll know by the end:

  • How convolutional filters detect patterns in images
  • The complete architecture of a CNN
  • How to train a CNN for multi-class image classification

The New Challenge

Butterfly house expansion

  • Classify flowers, insects, and small animals
  • Need to detect edges, textures, and patterns
  • Linear layers aren’t enough

Why Linear Layers Fall Short

Every pixel is independent

  • No spatial understanding
  • Can’t recognize patterns formed by neighboring pixels
  • Wings, antennae, eye spots are invisible

Convolutional Neural Networks

Inspired by biology

  • 1960s: Visual cortex neurons respond to specific patterns
  • CNNs mimic this with learnable filters
  • Filters scan images to extract features

How Filters Work

Source: https://dennybritz.com/posts/wildml/understanding-convolutional-neural-networks-for-nlp/

A 3×3 grid of numbers

  • Slide over the image
  • Multiply filter values with pixel values
  • Sum the results
  • This is convolution

See: Convolution Arithmetic for more details.
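The multiply-and-sum step above can be sketched in a few lines of plain Python. The patch and filter values here are made-up illustrations, not taken from the lecture:

```python
# One step of convolution: elementwise multiply a 3x3 image patch
# with a 3x3 filter, then sum the products. (Values are illustrative.)
patch = [
    [0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0],
]
vertical_edge_filter = [
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
]

result = sum(
    patch[i][j] * vertical_edge_filter[i][j]
    for i in range(3)
    for j in range(3)
)
print(result)  # 3.0: strong response to the left-to-right brightness change
```

Sliding this computation across every position of the image produces one feature map per filter.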

What Do Filters Detect?

  • Vertical edges
  • Horizontal edges
  • Textures and shapes
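To make edge detection concrete, here is a small sketch (with a hand-made filter and a toy image, both illustrative) that applies a vertical-edge filter across a whole image using `torch.nn.functional.conv2d`:

```python
import torch
import torch.nn.functional as F

# Tiny grayscale "image": dark left side, bright right side.
image = torch.zeros(1, 1, 5, 5)   # (batch, channels, height, width)
image[..., :, 3:] = 1.0           # bright columns on the right

# Hand-made vertical-edge filter, shape (out_channels, in_channels, 3, 3).
kernel = torch.tensor([[[[-1., 0., 1.],
                         [-1., 0., 1.],
                         [-1., 0., 1.]]]])

feature_map = F.conv2d(image, kernel)  # no padding: 5x5 input -> 3x3 output
print(feature_map.squeeze())
# Every output row is [0., 3., 3.]: high values exactly where the
# dark-to-bright vertical edge sits.
```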

Butterfly Example

Butterfly image passing through one filter of first layer.

Learning vs. Hand-Designing Filters

Different weights → different patterns

The Power of Hierarchical Feature Extraction

Source: Receptive Field in Deep Convolutional Networks | by Reza Kalantar | Medium

The image illustrates how a single “pixel” in a deep layer of a neural network can “see” a much larger portion of the original input image. This concept is called the Receptive Field. Because the orange area in Layer 2 was already looking at a larger area in Layer 1, the single pixel in Layer 3 is effectively “aware” of an even larger area in the original input.
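The growth of the receptive field is easy to compute: with stride 1 and no dilation, each extra 3×3 layer extends it by 2 pixels in every direction. A small sketch (the helper function is illustrative, not from the lecture):

```python
# Receptive field of stacked 3x3 convolutions (stride 1, no dilation):
# each additional layer adds (kernel_size - 1) = 2 pixels of context.
def receptive_field(num_layers, kernel_size=3):
    rf = 1
    for _ in range(num_layers):
        rf += kernel_size - 1
    return rf

for layers in (1, 2, 3):
    print(layers, receptive_field(layers))  # 1 -> 3, 2 -> 5, 3 -> 7
```

So a pixel three conv layers deep already summarizes a 7×7 region of the input, which is why deeper layers can detect larger, more complex patterns.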

Creating Convolutional Layers in PyTorch

nn.Conv2d(
    in_channels=3,      # RGB color channels
    out_channels=32,    # Number of filters
    kernel_size=3,      # 3×3 filter size
    padding=1,          # Preserve image size
    stride=1            # Step size
)

Number of parameters for this layer:

\[ \left[(\text{kernel\_size}^2 \times \text{in\_channels}) + \underbrace{1}_{\text{bias}}\right] \times \text{out\_channels} \]
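The formula can be checked directly against PyTorch for the layer defined above:

```python
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)

# Formula: (kernel_size^2 * in_channels + 1 bias) * out_channels
expected = (3 * 3 * 3 + 1) * 32                    # 896
actual = sum(p.numel() for p in conv.parameters())  # weights + biases
print(expected, actual)  # 896 896
```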

Output: Activation/Feature Maps

Here we see the out_channels=16 feature maps produced by the convolution, with high values wherever each filter activates (after ReLU()).

Pooling

Example: 28×28 feature map

  • After first pool: 14×14
  • After second pool: 7×7
  • Each pooling layer halves the spatial dimensions
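The shape progression above can be verified with `nn.MaxPool2d` (the channel count of 16 is illustrative):

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)  # halves height and width

x = torch.randn(1, 16, 28, 28)  # (batch, channels, 28x28 feature maps)
x = pool(x)
print(x.shape)  # torch.Size([1, 16, 14, 14])
x = pool(x)
print(x.shape)  # torch.Size([1, 16, 7, 7])
```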

Building a Complete CNN Architecture

A sequential conv → pool → conv → pool → flatten → fc → fc architecture

CNN Architecture Overview

Three main components:

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional layers → extract features
        # Pooling layers → reduce size
        # Fully connected layers → classify

Define the flow:

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.pool1(x)
        # ... more layers ...
        x = torch.flatten(x, 1)  # flatten all but the batch dimension
        x = self.fc(x)
        return x
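A complete, runnable version of this skeleton might look like the following. The layer sizes, the 3-channel 28×28 input, and the 10 output classes are illustrative assumptions, not the lab's actual configuration:

```python
import torch
import torch.nn as nn

class CNN(nn.Module):
    """Minimal conv-pool-conv-pool-flatten-fc-fc sketch (sizes illustrative)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(2)           # 28x28 -> 14x14
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2)           # 14x14 -> 7x7
        self.fc1 = nn.Linear(32 * 7 * 7, 64)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = torch.flatten(x, 1)                # keep the batch dimension
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

model = CNN()
out = model(torch.randn(4, 3, 28, 28))  # batch of 4 fake images
print(out.shape)  # torch.Size([4, 10]): one score per class per image
```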

The Snow Detector Problem

Husky misclassified as wolf

  • Model fixated on snow in background
  • Some neurons became “snow detectors”
  • Others relied on them (co-adaptation)

Regularization

Regularization in deep learning refers to techniques used to prevent models from overfitting to the training data. Overfitting occurs when a model learns not only the underlying patterns but also the noise in the data, resulting in poor performance on unseen data. Regularization methods add a form of constraint or penalty to the learning process, encouraging simpler models that generalize better.

Dropout

Dropout: Randomly turns off a fraction of neurons during training, forcing the network to develop redundant representations and making it less likely to rely too heavily on any single feature.
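A minimal sketch of this behavior with `nn.Dropout` (the tensor of ones is just a convenient illustration):

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)  # zero out ~50% of activations during training
x = torch.ones(1, 8)

drop.train()   # training mode: neurons randomly dropped
print(drop(x)) # surviving values are scaled by 1/(1-p) = 2.0

drop.eval()    # evaluation mode: dropout is a no-op
print(drop(x)) # all ones, unchanged
```

Note the rescaling during training: it keeps the expected activation magnitude the same, so nothing changes at evaluation time.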

Weight Decay

Weight Decay: Add a penalty to the loss function based on the size of the weights, encouraging them to be smaller and discouraging complex models.
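In PyTorch this penalty is usually applied through the optimizer rather than written into the loss by hand. A sketch (the model, learning rate, and decay value are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)

# weight_decay adds an L2 penalty on the weights via the optimizer update,
# shrinking them slightly at every step.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```

With SGD, `weight_decay` is equivalent to an L2 penalty on the loss; with Adam, `torch.optim.AdamW` applies a decoupled variant of the same idea.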

The two regularization techniques

| Feature | Dropout | Weight Decay |
| --- | --- | --- |
| Mechanism | Randomly deactivates neurons. | Penalizes large weight values. |
| Goal | Breaks co-dependency between neurons. | Keeps the model simple and less sensitive. |
| Active when? | Only during training. | During training (via the optimizer). |
| Analogy | A team where players are randomly benched so everyone learns to play every position. | A coach telling players not to over-commit to a single move so they stay balanced. |

Dataset Issue

If most wolf images in your dataset have snow and dog images don’t, that’s a dataset problem.

Solution

Get more representative data.

Lab 1: Building a CNN for Nature Classification

“If we want machines to think, we need to teach them to see.” — ImageNet Project launch

CUE: START THE LAB HERE

What’s Next?

In Session 2: PyTorch Techniques and Model Inspection we learn:

  • Dynamic computation graphs in PyTorch
  • Building modular architectures
  • Model inspection and debugging