
How can data be made smaller, and when does compression lose information?

Topic 2.2 Data Compression: compression reduces the number of bits used to store data; lossless compression preserves all information, while lossy compression discards some to save more space.

A focused answer to AP CSP Topic 2.2, covering why compression matters, lossless versus lossy compression, run-length encoding as a lossless example, the trade-offs of lossy compression for images and audio, and how to choose between them.

Generated by Claude Opus 4.8
9 min answer

Reviewed by: AI editorial process; not yet individually human-reviewed


Jump to a section
  1. What this topic is asking
  2. Why compress data
  3. Lossless compression
  4. Lossy compression
  5. Choosing between them
  6. Try this

What this topic is asking

The College Board (Topic 2.2) wants you to understand data compression: reducing the number of bits used to store or transmit data. You must distinguish lossless compression (the original data can be reconstructed exactly) from lossy compression (some data is permanently discarded to save more space), and reason about which to use for a given situation. Compression matters because smaller data is cheaper to store and faster to send.

Why compress data

Lossless compression

A simple lossless technique is run-length encoding, which replaces a run of repeated values with a count and the value. For example, the pixel run W W W W W can be stored as 5W. The encoded form takes less space and is fully reversible: from 5W you recover exactly W W W W W.
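The idea can be sketched in a few lines of Python. This is a minimal illustration, not a production encoder: the function names `rle_encode` and `rle_decode` and the sample pixel row (which extends the W W W W W run with a few extra pixels) are assumptions made for the example.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand each (count, value) pair back into the original run."""
    decoded = []
    for count, value in pairs:
        decoded.extend([value] * count)
    return decoded


# Hypothetical pixel row: five white pixels, two black, one white.
pixels = ["W", "W", "W", "W", "W", "B", "B", "W"]
encoded = rle_encode(pixels)
print(encoded)  # [(5, 'W'), (2, 'B'), (1, 'W')]

# Lossless: decoding gives back exactly the original row.
assert rle_decode(encoded) == pixels
```

The final assertion is the point of the example: whatever goes into `rle_encode` comes back unchanged from `rle_decode`.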

Lossy compression

The key fact is irreversibility: once data is thrown away, it is gone. Compress a photo aggressively and you cannot recover the original pixels.
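A toy example can make the irreversibility concrete. The sketch below assumes pixels are brightness values from 0 to 255 and uses a made-up `quantize` function that rounds each value down to a multiple of 64. Real image and audio formats use far more sophisticated methods, so treat this only as an illustration of discarding detail.

```python
def quantize(pixels, step=64):
    """Lossy step: round each brightness value down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]


# Hypothetical brightness values for six pixels.
original = [200, 201, 203, 198, 120, 121]
compressed = quantize(original)
print(compressed)  # [192, 192, 192, 192, 64, 64]

# Two different inputs produce the same output, so the output alone
# cannot tell us which original we started with.
print(quantize([200, 201]) == quantize([195, 250]))  # True
```

After quantizing, long runs of identical values appear, which makes the data easier to shrink, but the small differences between the original pixels are gone for good.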

Choosing between them

The decision is a trade-off between size and fidelity:

  • If the data must be restored exactly (text, contracts, code, medical records), use lossless.
  • If a much smaller size is the priority and a small loss of quality is acceptable (photos, music, video), use lossy.
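The two rules above can be written as a small decision function. This is only a sketch of the reasoning: the name `choose_compression` and its two yes/no parameters are assumptions made for illustration, and falling back to lossless when neither condition holds is a choice made here, not something the topic prescribes.

```python
def choose_compression(must_restore_exactly: bool, quality_loss_acceptable: bool) -> str:
    """Pick a compression type from the size-versus-fidelity trade-off."""
    if must_restore_exactly:
        return "lossless"  # text, contracts, code, medical records
    if quality_loss_acceptable:
        return "lossy"     # photos, music, video
    return "lossless"      # assumed safe default: keep all information


print(choose_compression(must_restore_exactly=True, quality_loss_acceptable=False))   # lossless
print(choose_compression(must_restore_exactly=False, quality_loss_acceptable=True))   # lossy
```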

Try this

Q1. Why is lossless compression required for a computer program's source code? [2 points]

  • Cue. Source code must be reconstructed exactly to run correctly; even one changed character could break it, so no information can be lost, which requires lossless compression.
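The exact-reconstruction requirement can be checked with Python's standard `zlib` module, which performs lossless compression. The sample "program" below is a made-up, highly repetitive snippet chosen so that it compresses well.

```python
import zlib

# Hypothetical source code: one line repeated 100 times.
source = ("print('hello')\n" * 100).encode("utf-8")

packed = zlib.compress(source)      # lossless compression
restored = zlib.decompress(packed)  # exact reconstruction

print(len(source), len(packed))     # original size vs compressed size, in bytes

# Every byte is identical, so the restored program would still run.
assert restored == source
```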

Q2. State one advantage lossy compression has over lossless compression for storing photos. [1 point]

  • Cue. Lossy compression can achieve much smaller file sizes than lossless compression, reducing storage and transmission costs.

Exam-style practice questions

Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.

AP 2021 (style), 1 mark, multiple choice.

A user wants to compress a text document containing the exact wording of a legal contract so that it can be perfectly restored. Which type of compression should be used, and why?

  (A) Lossy, because it produces the smallest file.
  (B) Lossless, because the original data must be reconstructed exactly.
  (C) Either, because both restore the data exactly.
  (D) Neither, because text cannot be compressed.

The answer is (B).

A legal contract must be restored exactly, so the compression must lose no information: that is lossless compression, which allows the original data to be perfectly reconstructed. (A) is wrong: lossy compression discards data and cannot perfectly restore a contract. (C) is wrong: only lossless compression restores the data exactly; lossy does not. (D) is wrong: text compresses well using lossless methods.

Markers reward matching the requirement "restore exactly" to lossless compression.

AP 2023 (style), 2 marks, free response (short).

A photo-sharing app must store millions of user photos using as little storage as possible, and small visual imperfections are acceptable. State which type of compression is appropriate and explain the trade-off involved.

A 2-point question on the lossy trade-off.

Point 1: Lossy compression is appropriate, because the app prioritizes minimizing storage and small imperfections are acceptable. Lossy compression can reduce file size much more than lossless by discarding data the human eye is unlikely to notice.

Point 2: The trade-off is that some information is permanently lost: the original photo cannot be reconstructed exactly, and aggressive compression can produce visible quality loss. The app trades perfect fidelity for greatly reduced storage. A common error is to claim that lossy compression can still restore the original, which it cannot.
