Generative AI — Lecture Series

Neural Networks

Lecture 3 · The computational backbone of Generative AI

The Neuron

An artificial neuron computes a weighted sum of its inputs, adds a bias, and passes the result through a non-linear activation function:

y = σ(w₁x₁ + w₂x₂ + … + wₙxₙ + b)
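
The formula above can be sketched in a few lines of plain Python. This is only an illustration: the function names `sigmoid` and `neuron`, and the sample inputs, weights, and bias, are assumptions chosen for the example and do not come from the lecture.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Illustrative values: 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0, and sigmoid(0.0) = 0.5
y = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(y)  # 0.5
```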

Common Activations

Function   Formula                Use Case
Sigmoid    1 / (1 + e⁻ˣ)          Binary output, gates
ReLU       max(0, x)              Hidden layers (default)
GeLU       x·Φ(x)                 Transformers
Softmax    e^(xᵢ) / Σⱼ e^(xⱼ)     Probability over classes / tokens
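
As a rough sketch, the four activations can be written with only the standard library. Two details here are assumptions not stated in the lecture: Φ is taken to be the standard normal CDF (computed via `math.erf`), and the softmax subtracts the maximum input before exponentiating, a common numerical-stability trick that leaves the result unchanged.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def gelu(x):
    # Phi(x), the standard normal CDF, expressed through the error function
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * phi

def softmax(xs):
    # Shifting by the maximum avoids overflow in exp() without changing the output
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])  # entries are positive and sum to 1
```

Sigmoid, ReLU, and GeLU act on one value at a time, whereas softmax needs the whole vector because each output is normalized by the sum over all entries.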

Training Loop in Python

Python · PyTorch
import torch
import torch.nn as nn

# Simple two-layer network
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

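# `dataloader` is assumed to yield (X, y) batches:
# X of shape (batch, 784), y of integer class indices in [0, 10)
for epoch in range(10):
    for X, y in dataloader:
        pred = model(X)              # forward pass: raw logits, shape (batch, 10)
        loss = criterion(pred, y)    # cross-entropy against the integer labels
        optimizer.zero_grad()        # clear gradients left from the previous batch
        loss.backward()              # backpropagate to fill .grad on each parameter
        optimizer.step()             # apply the Adam update
    # Reports the loss of the last batch only, not the epoch average
    print(f"Epoch {epoch}: loss={loss.item():.4f}")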
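
The loop above iterates over a `dataloader` that the listing does not define. One way to try the loop out, offered purely as a sketch, is to build a loader from random tensors. The variable names `features` and `labels`, the dataset size of 512, and the batch size of 64 are all hypothetical choices, and the data is random noise, so the loss will not decrease meaningfully.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 512 random "images" flattened to 784 features,
# each paired with a random class label in [0, 10)
features = torch.randn(512, 784)
labels = torch.randint(0, 10, (512,))

dataloader = DataLoader(TensorDataset(features, labels), batch_size=64, shuffle=True)

# Each batch matches what the model and CrossEntropyLoss expect
X, y = next(iter(dataloader))
print(X.shape, y.dtype)  # torch.Size([64, 784]) torch.int64
```
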
⚠️ Common Pitfall

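PyTorch adds the gradients from each backward() call to the existing .grad tensors instead of overwriting them. If optimizer.zero_grad() is not called before backward(), gradients from earlier batches pile up and every update uses their running sum rather than the current batch's gradient, which is a frequent source of bugs.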
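
The accumulation can be seen on a single scalar parameter. This is a minimal sketch with made-up values (the tensor `w` and the function 3·w are illustrative only); it zeroes the gradient directly with `w.grad.zero_()`, which is the per-tensor counterpart of what `optimizer.zero_grad()` does for all parameters it manages.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

# First backward pass: d(3w)/dw = 3
(3 * w).backward()
first = w.grad.item()

# Second pass without zeroing: the new gradient is added to the old one
(3 * w).backward()
accumulated = w.grad.item()

# After zeroing, a further pass yields the fresh gradient again
w.grad.zero_()
(3 * w).backward()
fresh = w.grad.item()

print(first, accumulated, fresh)  # 3.0 6.0 3.0
```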