Diffusion models

Diffusion models have emerged as a powerful generative framework, blending data with noise and then reversing the process through a learned denoising model. This post explains the core mechanics, derivations, and loss functions that define diffusion models, providing a clear understanding of how they produce high-quality samples.
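As a quick taste of the forward process mentioned above, here is a minimal numpy sketch of the closed-form noising step and the quantity the simplified noise-prediction loss targets; the linear beta schedule, shapes, and helper names are illustrative assumptions, not the post's exact setup.

```python
import numpy as np

# Assumed linear noise schedule and the closed-form forward process
#   q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).
T = 1000
betas = np.linspace(1e-4, 0.02, T)      # noise schedule (illustrative values)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)         # cumulative products \bar{alpha}_t

def forward_noise(x0, t, rng=np.random.default_rng(0)):
    """Sample x_t from q(x_t | x_0) in one shot using the closed form."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

# The simplified training loss asks a network eps_theta(x_t, t) to predict eps:
#   L_simple = E[ || eps - eps_theta(x_t, t) ||^2 ]
x0 = np.random.default_rng(1).standard_normal((2, 8))   # toy "data"
xt, eps = forward_noise(x0, t=500)
print(xt.shape, eps.shape)
```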

Variational autoencoders

Variational autoencoders (VAEs) are probabilistic generative models that leverage latent variables to capture complex data distributions. This post explains the core concepts behind VAEs, focusing on how they model data through a simple latent prior and an approximate posterior, and how training maximizes the Evidence Lower Bound (ELBO).
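The sketch below shows the two ELBO terms for the common choice of a standard normal prior and a unit-variance Gaussian decoder; the encoder/decoder outputs are made up, and function names are purely illustrative.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def elbo(x, x_recon, mu, logvar):
    """Reconstruction log-likelihood (Gaussian, up to a constant) minus the KL term."""
    recon_loglik = -0.5 * np.sum((x - x_recon) ** 2, axis=-1)  # unit-variance decoder assumed
    return recon_loglik - gaussian_kl(mu, logvar)

# Toy check with made-up encoder/decoder outputs for a batch of 4 examples.
x = np.ones((4, 10))
print(elbo(x, x * 0.9, mu=np.zeros((4, 3)), logvar=np.zeros((4, 3))))
```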

Normalizing flows

Dive into the mechanics of normalizing flows, powerful probabilistic generative models that leverage bijective transformations to map latent distributions to data distributions. This post covers the theory, loss functions, and various types of flows including linear, coupling, autoregressive, residual, and multi-scale flows, highlighting their unique properties.
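For a flavor of the change-of-variables rule that underlies all of these flows, here is a minimal sketch with a single elementwise affine bijection; the scale and shift parameters are made up and in practice would be learned.

```python
import numpy as np

# One elementwise affine bijection z = (x - shift) * exp(-log_scale); illustrative parameters.
log_scale = np.array([0.3, -0.2, 0.1])
shift = np.array([1.0, 0.0, -0.5])

def forward(x):
    """Map data x to latent z and return log|det| of the Jacobian of x -> z."""
    z = (x - shift) * np.exp(-log_scale)
    log_det = -np.sum(log_scale)            # diagonal Jacobian, so the determinant is a product
    return z, log_det

def log_prob(x):
    """Change of variables: log p_x(x) = log p_z(z) + log|det dz/dx|, standard normal base."""
    z, log_det = forward(x)
    log_pz = -0.5 * np.sum(z**2 + np.log(2 * np.pi), axis=-1)
    return log_pz + log_det

print(log_prob(np.array([[1.2, 0.3, -0.4]])))
```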

Generative adversarial networks

Explore the family of generative models that use a discriminator as a training signal. This post explains how GANs work, delving into their loss functions, challenges in training, and advanced techniques such as the Wasserstein distance and conditional generation that improve their performance.
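Here is a small sketch of the standard discriminator loss and the non-saturating generator loss, written as binary cross-entropy on logits; the discriminator scores are made-up numbers standing in for real network outputs.

```python
import numpy as np

def bce(logits, labels):
    """Binary cross-entropy from logits (numerically stable form)."""
    return np.mean(np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits))))

# Made-up discriminator scores for a batch of real and generated samples.
d_real_logits = np.array([2.1, 1.5, 0.8])
d_fake_logits = np.array([-1.2, -0.3, 0.4])

# Discriminator: push real scores toward 1 and fake scores toward 0.
d_loss = bce(d_real_logits, np.ones(3)) + bce(d_fake_logits, np.zeros(3))

# Generator (non-saturating form): push fake scores toward 1.
g_loss = bce(d_fake_logits, np.ones(3))
print(d_loss, g_loss)
```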

Bayesian linear regression

The Bayesian approach leverages the full distribution over model parameters rather than a single point estimate, resulting in more robust predictions. This post explains the principles of Bayesian thinking, contrasts it with maximum likelihood estimation, and illustrates its application in linear regression, with a refresher on matrix operations and theoretical basics.
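As a concrete illustration, the sketch below computes the conjugate Gaussian posterior over the weights for a toy regression problem; the prior precision `alpha`, noise precision `beta`, and the synthetic data are assumptions chosen for the example.

```python
import numpy as np

# Conjugate Gaussian posterior over weights: p(w | X, y) = N(m_N, S_N).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.standard_normal(50)

alpha, beta = 1.0, 100.0                      # prior and noise precision (assumed)
S_N_inv = alpha * np.eye(3) + beta * X.T @ X  # posterior precision
S_N = np.linalg.inv(S_N_inv)
m_N = beta * S_N @ X.T @ y                    # posterior mean

print(m_N)                                    # close to true_w
# Predictive variance at a new input x*: 1/beta + x*^T S_N x*
x_star = np.array([1.0, 1.0, 1.0])
print(1.0 / beta + x_star @ S_N @ x_star)
```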

Loss functions

Loss functions are essential for training models, guiding the optimization process. This post explores how loss functions work from a probabilistic perspective, delves into maximum likelihood estimation, and connects this framework with methods such as least squares, KL-divergence, and cross-entropy.
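The sketch below makes that connection concrete: minimizing a Gaussian negative log-likelihood with fixed variance reduces to mean squared error (up to constants), and a categorical likelihood gives cross-entropy. The data and predictions are toy values for illustration.

```python
import numpy as np

# Gaussian likelihood with fixed variance -> mean squared error (up to constants).
def gaussian_nll(y, y_hat, sigma=1.0):
    return np.mean(0.5 * ((y - y_hat) / sigma) ** 2 + 0.5 * np.log(2 * np.pi * sigma**2))

# Categorical likelihood -> cross-entropy between one-hot targets and predicted probabilities.
def cross_entropy(targets_onehot, probs, eps=1e-12):
    return -np.mean(np.sum(targets_onehot * np.log(probs + eps), axis=-1))

y = np.array([1.0, 2.0, 3.0])
print(gaussian_nll(y, y + 0.1))                       # MSE plus a constant
probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
onehot = np.array([[1, 0, 0], [0, 1, 0]])
print(cross_entropy(onehot, probs))
```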

Optimization methods

The complex landscapes of loss functions require numerical methods to navigate them. This post explains various optimization methods, including stochastic gradient descent, momentum, and adaptive-moment methods such as Adam. You'll gain insights into their mechanics, advantages, and how they address the challenges of training deep learning models.
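To make the adaptive-moment idea concrete, here is a minimal sketch of a single Adam update applied to a toy quadratic objective; the hyperparameters and the objective are illustrative choices, not recommendations from the post.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient and its square,
    bias correction, then a per-parameter scaled step."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)               # bias-corrected first moment
    v_hat = v / (1 - beta2**t)               # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy quadratic objective f(w) = ||w||^2 with gradient 2w.
w = np.array([1.0, -2.0])
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.05)
print(w)   # approaches the minimum at the origin
```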

Backpropagation

Understanding backpropagation is crucial, as it computes the gradients needed to optimize model parameters. This post covers its theory, Jacobian matrices, and why proper initialization and residual connections are important for training deep neural networks effectively.
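For a hands-on feel, here is a minimal sketch of backpropagation through a two-layer ReLU network with a squared-error loss, applying the chain rule layer by layer as vector-Jacobian products; the shapes and random initialization are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5))              # batch of 4 inputs
y = rng.standard_normal((4, 2))              # targets
W1, W2 = rng.standard_normal((5, 8)) * 0.1, rng.standard_normal((8, 2)) * 0.1

# Forward pass.
h_pre = x @ W1
h = np.maximum(h_pre, 0.0)                   # ReLU
y_hat = h @ W2
loss = 0.5 * np.mean(np.sum((y_hat - y) ** 2, axis=-1))

# Backward pass: chain rule, layer by layer.
d_yhat = (y_hat - y) / x.shape[0]            # dL/dy_hat
dW2 = h.T @ d_yhat                           # dL/dW2
d_h = d_yhat @ W2.T                          # back through the second linear layer
d_hpre = d_h * (h_pre > 0)                   # back through the ReLU
dW1 = x.T @ d_hpre                           # dL/dW1
print(loss, dW1.shape, dW2.shape)
```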