How JPEG Compression Works

An interactive, visual guide to the algorithm behind the world's most widely used image format. Explore each step - from colour conversion and the DCT to quantization and Huffman coding - with live demos you can manipulate in real time.

This guide was originally built as a BSc Computer Science dissertation project at Manchester Metropolitan University by Sean O'Mahoney. The original tool let users personalise JPEG quantization tables and see the effect on image quality. This v2 adds a complete educational walkthrough of every step in the JPEG algorithm.

Before

Quantize

Click a cell to edit its value

Why 5 sliders?

The 8×8 quantization table has 64 values, one per spatial frequency. Rather than exposing all 64, they're grouped into 5 perceptual frequency bands along the zig-zag scan order - from lowest frequency (top-left: the DC component and gentle gradients) to highest (bottom-right: sharp edges and fine detail).
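
As a rough sketch of the grouping described above, the snippet below assigns each of the 64 table cells to one of 5 bands by its position in zig-zag order. The band boundaries in `BAND_ENDS`, and the idea that each slider sets its band's cells to a single value, are assumptions made for illustration; the tool's actual boundaries and slider behaviour are not stated here.

```python
# Zig-zag order: walk the anti-diagonals, alternating direction on each one.
ZIGZAG = sorted(
    ((r, c) for r in range(8) for c in range(8)),
    key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
)

# Hypothetical band boundaries: exclusive end index of each band in zig-zag order.
BAND_ENDS = (1, 6, 15, 28, 64)


def band_map():
    """Return an 8x8 grid giving the band index (0-4) of each table cell."""
    grid = [[0] * 8 for _ in range(8)]
    for index, (r, c) in enumerate(ZIGZAG):
        grid[r][c] = next(b for b, end in enumerate(BAND_ENDS) if index < end)
    return grid


def apply_sliders(sliders):
    """Build a table in which every cell takes its band's slider value (assumed behaviour)."""
    bands = band_map()
    return [[sliders[bands[r][c]] for c in range(8)] for r in range(8)]
```

With these assumed boundaries, band 0 holds only the DC cell and band 4 holds the 36 highest-frequency cells.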

This mirrors how the human visual system perceives images: we're highly sensitive to low-frequency changes but barely notice high-frequency loss (Wallace, 1991). You can also click individual cells in the table visualisation to fine-tune specific values.

About this quantization table

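The default luminance quantization table is the example table from Annex K of ITU-T Recommendation T.81 (the JPEG standard), which the standard describes as derived empirically from psychovisual threshold experiments. The table is an example rather than a requirement, but it is widely used; the Independent JPEG Group's (IJG) libjpeg, for instance, takes it as the base table for its quality setting. The table and its background are described in: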

Wallace, G.K. (1991). "The JPEG Still Picture Compression Standard." Communications of the ACM, 34(4), pp. 30-44. doi:10.1145/103085.103089

The table values represent the human visual system's sensitivity to different spatial frequencies - lower values (top-left) preserve low frequencies the eye notices most, while higher values (bottom-right) aggressively compress high frequencies that are less perceptible.

What happens at extreme values?

All values = 1 (minimum): Every DCT coefficient is divided by 1, so the only quantization loss is rounding each coefficient to the nearest integer. This does not make the image truly lossless: small rounding errors also occur in the floating-point DCT and in the RGB → YCbCr colour-space conversion. The result is near-lossless - visually indistinguishable from the original, but not bit-for-bit identical. The file will also be very large, since very few coefficients round to zero.

For bit-perfect lossless compression, you would need either a completely different format (e.g. PNG, TIFF, WebP-lossless) or the rarely supported JPEG lossless mode (ITU-T T.81 Annex H), which bypasses the DCT entirely and uses predictive coding instead.

All values = 255 (maximum): Every coefficient is divided by 255, so almost everything rounds to zero. Only the very largest coefficients survive - the image degrades into flat blocks of average colour with extreme blocking artefacts.

A value of 0 is invalid - it would mean dividing by zero, which is undefined. Baseline JPEG (ITU-T T.81) stores each quantization table value as an 8-bit integer and requires it to be in the range 1-255. Try dragging all sliders to the far left (≈1) or far right (≈255) to see the effect.
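
The two extremes can be checked with a few lines of arithmetic. This is a minimal sketch: the function name `quantize_all` and the sample coefficients are hypothetical, and Python's built-in `round` is used for "round to nearest" (its tie-breaking rule does not matter for these values).

```python
def quantize_all(coefficients, q):
    """Quantize every coefficient with the same table value q."""
    if not 1 <= q <= 255:
        raise ValueError("quantization values must be in the range 1-255")
    return [round(c / q) for c in coefficients]


# Hypothetical DCT coefficients, largest (DC) first.
coefficients = [-415.4, -30.2, 56.1, 12.7, -2.3, 0.8]

print(quantize_all(coefficients, 1))    # [-415, -30, 56, 13, -2, 1]
print(quantize_all(coefficients, 255))  # [-2, 0, 0, 0, 0, 0]
```

With q = 1 only the fractional parts are lost; with q = 255 everything except the largest coefficient collapses to zero.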

⚙️ Optional: Entropy Encoding

In the full JPEG standard, after quantization and zig-zag scanning, the data goes through entropy encoding - a lossless compression step. This is optional to visualise because it doesn't affect image quality, only file size.

The JPEG spec supports two methods:

  • Huffman coding (baseline, most common) - assigns shorter bit patterns to frequently occurring values
  • Arithmetic coding (optional, ~5-10% better compression) - was historically patent-encumbered, so rarely used
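
To make "shorter bit patterns for frequent values" concrete, here is a minimal sketch of textbook Huffman code-length assignment. It is a generic illustration rather than the JPEG procedure: baseline JPEG codes (run, size) symbols using tables stored in the file, with code lengths capped at 16 bits. The function name `huffman_code_lengths` is illustrative.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a textbook Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes all of their symbols one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


values = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
print(huffman_code_lengths(values))  # {0: 1, 1: 2, 2: 3, 3: 3}
```

Here the 16 values cost 28 bits in total, against 32 bits for a fixed 2-bit code.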

After

📖 How JPEG Compression Works

JPEG compression is lossy by nature - it permanently discards information the human eye is least likely to notice, and that information can never be recovered. Unlike lossless formats such as PNG or WebP-lossless, opening and re-saving a JPEG compounds the loss each time. Even at the highest quality settings, standard JPEG introduces small, irreversible changes - see the quantization step below for exactly where and why.

Can JPEG ever be truly lossless? Not with the normal DCT-based pipeline. Setting every quantization-table value to 1 reduces quantization loss to integer rounding, but further rounding errors still creep in from the floating-point DCT and the RGB → YCbCr colour-space conversion. The result is near-lossless - visually identical but not bit-for-bit perfect. A genuinely lossless JPEG mode does exist (ITU-T T.81 Annex H), but it uses an entirely different technique - predictive coding with no DCT - and is rarely supported by consumer software.

Each step below is expandable. Tap a header to open it.

1. Colour Space Conversion (RGB → YCbCr)

The image is converted from RGB to YCbCr (as defined in ITU-R BT.601), separating luminance (brightness) from chrominance (colour). Human eyes are far more sensitive to brightness changes, so colour data can be compressed more aggressively.
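
As a rough illustration of this step, the sketch below converts a single pixel using the full-range BT.601 coefficients given in the JFIF specification. The function name `rgb_to_ycbcr` is an illustrative choice and is not taken from the tool's source.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (full-range BT.601, as in JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to integers and clamp to the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255)
```

Greys land on Cb = Cr = 128, which is why the chroma channels of a greyscale image carry no information.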

[Panels: Original (RGB), Y (Luminance), Cb (Blue Chroma), Cr (Red Chroma)]

1b. Chroma Subsampling (optional)

Since human vision is less sensitive to colour detail than brightness, JPEG can downsample the chroma channels (Cb and Cr) before further processing. This exploits the same perceptual asymmetry as the YCbCr conversion, but at a spatial level (Wallace, 1991).

Common subsampling schemes:

  • 4:4:4 - no subsampling (full resolution for all channels)
  • 4:2:2 - chroma halved horizontally (50% colour data saved)
  • 4:2:0 - chroma halved in both directions (75% colour data saved, most common in photos)

This is specified in ITU-T T.81. The tool above processes the luminance (Y) channel only in greyscale mode, or all channels at full resolution in colour mode - equivalent to 4:4:4.
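
The subsampling schemes listed above can be sketched as simple averaging over neighbouring chroma samples. Averaging is an assumption here: the choice of downsampling filter is left to the encoder. The function names are illustrative, and the sketch assumes even plane dimensions.

```python
def subsample_422(plane):
    """Halve a chroma plane horizontally by averaging each pair of samples."""
    return [
        [round((row[x] + row[x + 1]) / 2) for x in range(0, len(row), 2)]
        for row in plane
    ]


def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 group."""
    height, width = len(plane), len(plane[0])
    return [
        [
            round((plane[y][x] + plane[y][x + 1]
                   + plane[y + 1][x] + plane[y + 1][x + 1]) / 4)
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


cb = [[10, 20, 30, 40],
      [30, 40, 50, 60]]
print(subsample_422(cb))  # [[15, 35], [35, 55]]
print(subsample_420(cb))  # [[25, 45]]
```

The 4:2:0 result keeps 2 of the original 8 chroma samples, which is the 75% saving quoted in the list.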

[Panels: 4:4:4 (no subsampling), 4:2:2 (horizontal half), 4:2:0 (quarter resolution)]

Subsampling is invisible in many photographs because human colour acuity is roughly half that of luminance acuity - we simply don't notice the missing chroma detail.

2. 8×8 Block Splitting

Each channel is divided into 8×8 pixel blocks (ITU-T T.81 §A.2). Every block is processed independently. Click on the image below to highlight a block and see its raw pixel values.
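
A minimal sketch of block splitting for one channel follows. When the width or height is not a multiple of 8, the partial blocks have to be completed somehow; this sketch repeats the last row and column, which is a common choice but an assumption here. The function name `split_into_blocks` is illustrative.

```python
def split_into_blocks(plane, size=8):
    """Split a 2-D channel into size x size blocks, in raster order."""
    height, width = len(plane), len(plane[0])
    padded_h = -(-height // size) * size   # round up to a multiple of size
    padded_w = -(-width // size) * size

    def sample(y, x):
        # Edge replication: coordinates past the border reuse the last sample.
        return plane[min(y, height - 1)][min(x, width - 1)]

    blocks = []
    for top in range(0, padded_h, size):
        for left in range(0, padded_w, size):
            blocks.append([[sample(top + r, left + c) for c in range(size)]
                           for r in range(size)])
    return blocks
```

For example, a 12×10 channel yields four blocks, three of which are partly filled with replicated edge samples.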

[Panels: Click to select an 8×8 block, Selected Block Values]

3. Level Shift (−128)

Each value is shifted from the range 0-255 to the range −128 to 127 by subtracting 128 (ITU-T T.81 §A.3.1). This centres the data around zero. Because a constant offset contributes only to the DC term of the DCT, the shift leaves the AC coefficients unchanged; its effect is to reduce the magnitude of the DC coefficient, which narrows the numeric range the DCT arithmetic has to handle.
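
The effect on the DC coefficient can be checked directly. With the DCT scaling used in T.81, the DC term of an 8×8 block equals the sum of its samples divided by 8, so subtracting 128 from every sample lowers the DC term by exactly 1024 and changes nothing else. The helper names and the flat test block below are illustrative.

```python
def level_shift(block):
    """Subtract 128 from every sample of an 8x8 block."""
    return [[value - 128 for value in row] for row in block]


def dc_coefficient(block):
    """DC term of the 8x8 DCT under T.81 scaling: sum of all samples / 8."""
    return sum(sum(row) for row in block) / 8


flat = [[200] * 8 for _ in range(8)]        # hypothetical flat block
print(dc_coefficient(flat))                 # 1600.0
print(dc_coefficient(level_shift(flat)))    # 576.0
```

Without the shift the DC term spans 0 to 2040; with it, the span is −1024 to 1016.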

[Panels: Before Shift (0-255), After Shift (−128 to 127)]

4. Discrete Cosine Transform (DCT)

The DCT converts pixel values into frequency coefficients. Top-left = average brightness (DC). Moving right/down = higher spatial frequencies.

The DCT is closely related to the Fourier transform - the foundational tool of signal processing. In 1965, Cooley & Tukey published the Fast Fourier Transform (FFT), an efficient algorithm for computing the Discrete Fourier Transform (DFT), which decomposes a signal into complex sinusoidal components (both sine and cosine).

In 1974, Ahmed, Natarajan & Rao introduced the DCT, which represents a real-valued signal such as pixel intensities using cosine terms only. The DCT is not simply the real part of the DFT: the DCT-II used by JPEG is equivalent, up to a phase factor and scaling, to the DFT of the signal extended with its mirror image, and that even symmetry is what cancels the sine terms. Fast DCT algorithms share the FFT's O(N log N) computational efficiency. This Fourier heritage is why DCT-family transforms appear across signal processing: audio compression (MP3 and AAC, which use the modified DCT), video codecs (H.264 and HEVC, which use integer approximations), medical imaging, and telecommunications.
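
For reference, a direct (unoptimised) forward 8×8 DCT can be written straight from the definition. This is a sketch using the scaling given in ITU-T T.81 Annex A; it evaluates the double sum literally rather than using a fast algorithm, and the function name `dct_8x8` is illustrative.

```python
import math


def dct_8x8(block):
    """Forward 8x8 DCT-II of a level-shifted block, evaluated from the definition."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for v in range(8):            # vertical frequency
        for u in range(8):        # horizontal frequency
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * c(u) * c(v) * total
    return coeffs
```

A flat block of value 72 produces a DC coefficient of 576 and AC coefficients that are zero up to floating-point error, matching "top-left = average brightness".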

[Panels: Shifted Block, DCT Coefficients; expandable view of all 64 DCT basis functions]

5. Quantization (The Lossy Step)

Each DCT coefficient is divided by the corresponding quantization table value and rounded to the nearest integer (ITU-T T.81 §A.3.4). This is where information is permanently discarded.

[Panels: DCT Coefficients, Quantization Table, Quantized Result]

Notice how many high-frequency coefficients become zero - these zeros are the primary source of compression.
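
The divide-and-round step can be sketched as follows, using the example luminance table from Annex K of ITU-T T.81. The coefficient values in the example are hypothetical, and Python's built-in `round` stands in for "round to nearest integer".

```python
# Example luminance quantization table from ITU-T T.81 Annex K.
LUMINANCE_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=LUMINANCE_TABLE):
    """Divide each DCT coefficient by its table entry and round to an integer."""
    return [[round(coeffs[v][u] / table[v][u]) for u in range(8)]
            for v in range(8)]


coeffs = [[0.0] * 8 for _ in range(8)]      # hypothetical DCT output
coeffs[0][0], coeffs[0][1], coeffs[7][7] = -415.38, -30.19, 20.0
quantized = quantize(coeffs)
print(quantized[0][0], quantized[0][1], quantized[7][7])  # -26 -3 0
```

The DC value −415.38 becomes −26, which de-quantizes to −416; that 0.62 difference cannot be recovered. The high-frequency value 20.0 is divided by 99 and disappears entirely.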

6. Zig-Zag Scan & Entropy Coding

The quantized block is read in a zig-zag pattern, grouping trailing zeros together. Then Run-Length Encoding compresses zero-runs, followed by Huffman coding (Huffman, 1952) or optionally arithmetic coding for the final bitstream.

The DC coefficient (top-left, index [0][0]) is special: it represents the average brightness of the entire block. In JPEG, DC values are encoded using DPCM (Differential Pulse Code Modulation, ITU-T T.81 §F.1.2.1) - only the difference from the previous block's DC is stored, since neighbouring blocks tend to have similar averages. The remaining 63 values are AC coefficients, encoded with run-length + Huffman as shown below.
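
The scan and the two kinds of coefficient handling can be sketched in a few lines. This is a simplified illustration: the (zero_run, value) pairs and the "EOB" marker stand in for JPEG's (run, size) symbols, runs longer than 15 are not split as the standard requires, and no Huffman bits are produced. All names are illustrative.

```python
# Zig-zag order: walk the anti-diagonals, alternating direction on each one.
ZIGZAG = sorted(
    ((r, c) for r in range(8) for c in range(8)),
    key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
)


def zigzag_scan(block):
    """Flatten an 8x8 block into a 64-value list in zig-zag order."""
    return [block[r][c] for r, c in ZIGZAG]


def dc_difference(dc, previous_dc):
    """DPCM: store only the change from the previous block's DC value."""
    return dc - previous_dc


def run_length_ac(stream):
    """Encode the 63 AC values as (zero_run, value) pairs, ending with 'EOB'."""
    pairs, run = [], 0
    for value in stream[1:]:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run:
        pairs.append("EOB")     # everything after the last non-zero value is zero
    return pairs


block = [[0] * 8 for _ in range(8)]          # hypothetical quantized block
block[0][0], block[0][1], block[2][0], block[0][2] = -26, -3, 2, 1
stream = zigzag_scan(block)
print(stream[:6])                # [-26, -3, 0, 2, 0, 1]
print(run_length_ac(stream))     # [(0, -3), (1, 2), (1, 1), 'EOB']
print(dc_difference(stream[0], previous_dc=-20))  # -6
```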

[Panels: Zig-Zag Traversal, 1D Stream; expandable Huffman encoding breakdown]

7. Reconstruction (Single Block)

The quantized coefficients are de-quantized (×table), passed through the Inverse DCT (Ahmed et al., 1974), then shifted by +128. The result is an approximation of the original - the difference is the compression error.
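
A minimal sketch of this single-block reconstruction is shown below. The inverse DCT is evaluated directly from its definition with the scaling given in ITU-T T.81 Annex A; the final clamp to 0-255 is included because approximate coefficients can push results slightly out of range. Function names are illustrative.

```python
import math


def dequantize(quantized, table):
    """Multiply each quantized coefficient by its table entry."""
    return [[quantized[v][u] * table[v][u] for u in range(8)] for v in range(8)]


def idct_8x8(coeffs):
    """Inverse 8x8 DCT, evaluated from the definition."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    return [
        [
            0.25 * sum(
                c(u) * c(v) * coeffs[v][u]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                * math.cos((2 * y + 1) * v * math.pi / 16)
                for v in range(8) for u in range(8)
            )
            for x in range(8)
        ]
        for y in range(8)
    ]


def reconstruct(quantized, table):
    """De-quantize, apply the IDCT, add 128, then round and clamp to 0-255."""
    spatial = idct_8x8(dequantize(quantized, table))
    return [[min(255, max(0, round(s + 128))) for s in row] for row in spatial]
```

For instance, a block whose only non-zero quantized value is a DC of −26, with a table entry of 16, reconstructs to a flat block of 76: −26 × 16 = −416, −416 / 8 = −52, and −52 + 128 = 76.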

[Panels: Original Block, Reconstructed, Error (amplified)]

🔄 JPEG Decoding - The Reverse

When your device opens a JPEG file, it reverses every encoding step. Here's the full pipeline in reverse, using the same block you selected above.

D1. Entropy Decoding → Quantized Coefficients

The Huffman (or arithmetic) coded bitstream is decoded back into integer values. The zig-zag order is reversed to reconstruct the 8×8 quantized coefficient matrix.
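
Assuming the bitstream has already been decoded into a DC difference and a list of (zero_run, value) pairs ending in "EOB" (a simplified representation chosen for this sketch, not the actual JPEG symbol format), rebuilding the 8×8 matrix might look like this. The function name `decode_block` is illustrative.

```python
# Zig-zag order: walk the anti-diagonals, alternating direction on each one.
ZIGZAG = sorted(
    ((r, c) for r in range(8) for c in range(8)),
    key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
)


def decode_block(dc_diff, previous_dc, ac_pairs):
    """Rebuild an 8x8 quantized block from a DC difference and AC run-length pairs."""
    stream = [previous_dc + dc_diff]            # undo the DPCM step
    for pair in ac_pairs:
        if pair == "EOB":
            break
        run, value = pair
        stream.extend([0] * run + [value])      # re-insert the skipped zeros
    stream.extend([0] * (64 - len(stream)))     # everything after EOB is zero

    block = [[0] * 8 for _ in range(8)]
    for value, (r, c) in zip(stream, ZIGZAG):   # reverse the zig-zag scan
        block[r][c] = value
    return block


block = decode_block(-6, -20, [(0, -3), (1, 2), (1, 1), "EOB"])
print(block[0][:3], block[2][0])  # [-26, -3, 1] 2
```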

[Panels: 1D Stream (from file), Quantized 8×8]

D2. De-Quantization (× Table)

Each quantized coefficient is multiplied by the corresponding quantization table value. This gives an approximation of the original DCT coefficients - the zeros remain zero, meaning the lost information is gone forever.

[Panels: Quantized, Quantization Table, Approximate DCT]

D3. Inverse DCT → Spatial Domain

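The IDCT converts the approximate frequency coefficients back into pixel-like values, nominally in the range −128 to 127. Because the coefficients are only approximations, some outputs can fall slightly outside that range.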

[Panels: Approx. DCT Coefficients, IDCT Output (−128 to 127)]

D4. Level Shift (+128) → Pixel Values

Adding 128 shifts values back to the nominal 0-255 range. Any value that still falls outside it is clamped to 0 or 255, giving the reconstructed pixel values.

[Panels: Before (+128), Reconstructed Block]

D5. YCbCr → RGB Conversion

Finally, the Y, Cb, and Cr channels are combined and converted back to RGB for display. The full image is reassembled from all decoded 8×8 blocks.
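
The inverse conversion for one pixel can be sketched with the full-range BT.601 coefficients given in the JFIF specification. The function name `ycbcr_to_rgb` is an illustrative choice.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit YCbCr pixel back to 8-bit RGB (full-range BT.601, as in JFIF)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    # Round to integers and clamp to the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (r, g, b))


print(ycbcr_to_rgb(255, 128, 128))  # (255, 255, 255)
print(ycbcr_to_rgb(76, 85, 255))    # (254, 0, 0)
```

The second example is pure red (255, 0, 0) after a rounded trip through 8-bit YCbCr: it comes back as (254, 0, 0), a small instance of the colour-conversion rounding error that keeps even a quantization table of all 1s from being lossless.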

[Panels: Original, Decoded]