How does bias get into computing systems, and how can it be reduced?
Topic 5.3 Computing Bias: computing innovations can reflect existing human biases through biased data or design choices, and bias can be embedded intentionally or unintentionally.
A focused answer to AP CSP Topic 5.3, covering how bias enters computing systems through biased data and design, intentional versus unintentional bias, real effects on people, why biased data produces biased outputs, and how bias can be identified and reduced.
Reviewed by: AI editorial process; not yet individually human-reviewed
What this topic is asking
The College Board (Topic 5.3) wants you to understand computing bias: how computing systems can reflect and amplify human biases. Bias enters through biased data and through design choices, and it can be embedded intentionally or unintentionally. You need to explain why unrepresentative data produces biased outputs, give examples of the real harm bias causes, and describe how bias can be identified and reduced.
What computing bias is
How bias enters: data and design
A facial-recognition system trained mostly on one group's images works poorly for others: the data was unrepresentative, so the outputs are biased.
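To see why unrepresentative data leads to unequal results, here is a minimal Python sketch. It is not a real facial-recognition system: the groups "A" and "B", the single numeric feature, and every number are invented for illustration. The toy model learns one cut-off value from training data that is mostly from group A, then is tested on both groups.

```python
# Toy illustration only: every value below is invented for this sketch.
# Each example is (group, feature_value, true_label), where label 1 means
# "should be recognized" and label 0 means "should not be recognized".
training_data = [
    ("A", 7, 1), ("A", 8, 1), ("A", 9, 1), ("A", 8, 1),
    ("A", 1, 0), ("A", 2, 0), ("A", 3, 0), ("A", 2, 0),
    ("B", 4, 1), ("B", 0, 0),  # group B is barely represented
]

def learn_threshold(examples):
    """Learn a cut-off halfway between the average positive and negative value."""
    positives = [x for _, x, label in examples if label == 1]
    negatives = [x for _, x, label in examples if label == 0]
    mean_pos = sum(positives) / len(positives)
    mean_neg = sum(negatives) / len(negatives)
    return (mean_pos + mean_neg) / 2

def predict(x, threshold):
    """Predict 1 when the feature value is above the learned cut-off."""
    return 1 if x > threshold else 0

def accuracy(examples, threshold):
    """Fraction of examples the model labels correctly."""
    correct = sum(1 for _, x, label in examples if predict(x, threshold) == label)
    return correct / len(examples)

threshold = learn_threshold(training_data)

# Separate test sets for each group (also invented).
test_a = [("A", 8, 1), ("A", 7, 1), ("A", 2, 0), ("A", 3, 0)]
test_b = [("B", 4, 1), ("B", 3.5, 1), ("B", 0, 0), ("B", 1, 0)]

print("Accuracy for group A:", accuracy(test_a, threshold))
print("Accuracy for group B:", accuracy(test_b, threshold))
```

In this sketch the cut-off is pulled toward group A's values because group A dominates the training data, so the model is right every time for group A but only half the time for group B. No line of the code mentions either group when predicting; the unequal result comes entirely from the data.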
Intentional versus unintentional
Bias can be deliberately built in, but the Course and Exam Description (CED) stresses that it is most often unintentional. Developers with no intent to discriminate can still create biased systems by using data that carries historical inequalities. Because bias is often invisible to a system's creators, it must be actively looked for, not assumed absent.
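The idea that historical data can carry bias into a system without anyone intending it can be sketched in a few lines of Python. Everything here is hypothetical: the neighborhoods "north" and "south", the past hiring records, and the 0.5 cut-off are assumptions made up for this example, not a description of any real system.

```python
from collections import defaultdict

# Hypothetical past decisions: (neighborhood, was_qualified, was_hired).
# In these invented records, qualified applicants from "north" were usually
# hired, while equally qualified applicants from "south" usually were not.
history = [
    ("north", True, True), ("north", True, True), ("north", True, True),
    ("north", False, False),
    ("south", True, False), ("south", True, False), ("south", True, True),
    ("south", False, False),
]

def learn_hire_rates(records):
    """Learn, per neighborhood, the fraction of qualified applicants who were hired."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for neighborhood, qualified, was_hired in records:
        if qualified:
            total[neighborhood] += 1
            hired[neighborhood] += int(was_hired)
    return {n: hired[n] / total[n] for n in total}

def recommend(neighborhood, qualified, rates):
    """Recommend a qualified applicant only if similar past applicants were mostly hired."""
    # The 0.5 cut-off is an assumption chosen for this sketch.
    return qualified and rates.get(neighborhood, 0.0) >= 0.5

rates = learn_hire_rates(history)

# Two equally qualified applicants who differ only in neighborhood.
print("north applicant recommended:", recommend("north", True, rates))
print("south applicant recommended:", recommend("south", True, rates))
```

The developer never wrote a rule that treats one neighborhood worse than another. The program simply copies the pattern in past decisions, so the old inequality reappears in the new recommendations.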
Effects and mitigation
Biased systems make real decisions about people, in hiring, lending, policing and recognition, so bias can cause serious, scaled harm. To reduce it, developers can:
- Use more representative and diverse data.
- Test the system across different groups to detect unequal performance.
- Review design choices and involve diverse perspectives (linking back to collaboration).
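Testing across groups, the second step in the list above, can be sketched as a small Python check. The group names, the test results, and the allowed gap of 0.1 are all assumptions made up for this example; a real project would choose its own groups, measures and limits.

```python
from collections import defaultdict

# Hypothetical test results: (group, predicted_label, actual_label).
results = [
    ("group_1", 1, 1), ("group_1", 0, 0), ("group_1", 1, 1), ("group_1", 0, 0),
    ("group_2", 0, 1), ("group_2", 0, 0), ("group_2", 1, 1), ("group_2", 0, 1),
]

def accuracy_by_group(rows):
    """Return the fraction of correct predictions for each group separately."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in rows:
        total[group] += 1
        correct[group] += int(predicted == actual)
    return {group: correct[group] / total[group] for group in total}

def largest_gap(per_group):
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())

MAX_ALLOWED_GAP = 0.1  # assumed limit for this sketch, not an official standard

per_group = accuracy_by_group(results)
gap = largest_gap(per_group)

for group, acc in sorted(per_group.items()):
    print(f"{group}: {acc:.0%} correct")

if gap > MAX_ALLOWED_GAP:
    print(f"Warning: accuracy differs by {gap:.0%} between groups")
```

With these invented results the system is correct 75% of the time overall, which sounds acceptable, but the per-group breakdown shows it is always right for one group and right only half the time for the other. That is why the check reports each group separately instead of relying on a single overall score.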
Try this
Q1. How can unrepresentative training data cause computing bias? [2 points]
- Cue. A system learns patterns from its data; if the data over-represents some groups, the system performs better for them and worse for others, producing unfair, biased outputs.
Q2. Suggest one way developers can reduce bias in a computing system. [1 point]
- Cue. Use more representative and diverse data, test the system across different groups, or review design choices for fairness (any one).
Exam-style practice questions
Practice questions written in the style of College Board exam questions on this dot point, with worked answer explainers. The year tag is the paper they imitate, not the source.
AP 2022 (style), 1 mark, multiple choice. A facial recognition system performs much worse for some groups of people than others because it was trained mostly on images of one group. This is an example of:
(A) A hardware fault.
(B) Computing bias caused by unrepresentative training data.
(C) The digital divide.
(D) A network latency problem.
The answer is (B).
The system shows computing bias introduced by unrepresentative data: trained mostly on one group, it works poorly for others. (A) is wrong because nothing is broken; the system works as built, but unfairly. (C) is wrong because the digital divide is about unequal access to technology, not biased outputs. (D) is wrong because network latency has nothing to do with fairness.
Markers reward identifying that unrepresentative training data embeds bias into a system's outputs.
AP 2021 (style), 2 marks, free response (short). Explain how bias can be unintentionally introduced into a computing system, and suggest one way developers can reduce it.
A 2-point question on the source and mitigation of bias.
Point 1 (source): Bias is often introduced unintentionally through the data used to build or train a system. If the data over-represents some groups or reflects existing human prejudices, the system learns and reproduces those patterns even though no one set out to discriminate.
Point 2 (mitigation): Developers can reduce bias by using more representative, diverse data, by testing the system across different groups, or by reviewing design choices for fairness (any one is enough). Any valid source-and-mitigation pair earns the marks.
Related dot points
- Topic 5.1 Beneficial and Harmful Effects: computing innovations have both beneficial and harmful effects on society, economy and culture, and effects may be intended or unintended.
A focused answer to AP CSP Topic 5.1, covering how a single computing innovation can have both beneficial and harmful effects, intended versus unintended consequences, effects on individuals and society, and how to analyze an innovation's impact for the exam.
- Topic 5.2 The Digital Divide: the digital divide is the unequal access to computing devices and the Internet across groups, shaped by socioeconomic, geographic and demographic factors.
A focused answer to AP CSP Topic 5.2, covering what the digital divide is, the socioeconomic, geographic and demographic factors behind it, its effects on opportunity and equity, the difference between access and skills, and efforts to close it.
- Topic 2.3 Extracting Information from Data: information is extracted from data through processing, filtering, transforming and combining data sets, and correlation does not imply causation.
A focused answer to AP CSP Topic 2.3, covering the difference between data and information, processing data to find patterns and trends, filtering and transforming, metadata, combining data sets, and the limits of data including correlation versus causation.
- Topic 5.5 Legal and Ethical Concerns: computing raises legal and ethical issues including intellectual property, licensing, plagiarism, privacy and the responsible use and sharing of material and data.
A focused answer to AP CSP Topic 5.5, covering intellectual property and copyright, open-source and Creative Commons licensing, plagiarism, the ethics of using others' work, privacy of personal data, and the legal and ethical responsibilities of creators and users.
- Topic 5.4 Crowdsourcing: crowdsourcing uses the input of a large number of people, often via the Internet, to obtain ideas, services, content, funding or data.
A focused answer to AP CSP Topic 5.4, covering what crowdsourcing is, how the Internet enables it, examples (knowledge, funding, citizen science, mapping), the benefits of scale and diverse input, the risks of quality and reliability, and how it relates to other impacts.
Sources & how we know this
- AP Computer Science Principles Course and Exam Description — College Board (2025)