What Is a Pie Chart?

A pie chart is not merely a graphic device. It is a mathematical representation of compositional data, a closed system in which all parts sum to exactly 1 (or 100%) and every observation belongs to exactly one category. This ontological commitment shapes everything about how the chart should be constructed, interpreted, and reported. The researcher who uses a pie chart implicitly asserts that the phenomenon under study has a definite boundary, that its categories are mutually exclusive and collectively exhaustive, and that the relative magnitude of each category is more informative than its absolute count.

Every design choice in data visualization is a choice about which aspects of reality to encode and which to suppress. A pie chart foregrounds part-whole relationships while suppressing absolute magnitude, trends, and variability. The researcher must own that choice explicitly (Cairo, 2019).


Academic Use Note: Pie charts are best suited to four to six categories. Beyond six, a bar chart or frequency table is preferred in academic publication (APA Publication Manual, 7th ed., 2020). The chart is appropriate for nominal or ordinal categorical variables displaying relative frequency within a finite population N.


Fundamental Quantities
p_i = n_i / N (relative frequency of category i)
sum(p_i) = 1 (closure constraint: compositional data)
angle_i = p_i × 360° (central angle of slice i)


Interactive Statistical Instrument: Pie Chart Maker

Enter category names and frequencies, choose a style, and click Generate; the chart and full statistical output appear instantly on the right. APA 7th tip: use Standard style, white labels, and a bottom legend for journal and thesis submission; for black-and-white printing, assign greyscale hex values to each category.


Philosophy, Measurement and Statistics

The three panels below provide the complete philosophical, measurement-theoretical, and statistical foundations for responsible use of pie charts in academic research.

Philosophical Foundations

Ontology: What a Pie Chart Claims to Represent

Every data visualization carries an implicit ontological claim: a claim about what exists and how it is structured. A pie chart asserts that reality, at the moment of measurement, can be partitioned into a finite, mutually exclusive, and exhaustive set of categories. This is the ontology of a closed compositional system: the whole is exactly 1 (or 100%), and every observation belongs to exactly one category with no remainder.

This ontological commitment is significant. It means the researcher accepts that the phenomenon under study has a definite boundary, that categories do not overlap, and that all possible states have been enumerated. When any of these conditions cannot be met, the pie chart is the wrong representation. A researcher who forces ambiguous or open-ended data into a pie chart misrepresents the ontological structure of the phenomenon itself.

This framework draws directly from Aitchison's (1986) work on the simplex, the mathematical space of compositional data, where vectors of proportions summing to one occupy a constrained geometry fundamentally different from ordinary Euclidean space.

Epistemology: How the Chart Produces Knowledge

Epistemology asks how we come to know, and what justifies a knowledge claim. The pie chart produces knowledge through visual inference: the viewer perceives angular area and makes comparative judgments about relative magnitude. This process is subject to well-documented perceptual limitations (Cleveland and McGill, 1984): humans are significantly less accurate when judging angles and circular areas than when judging length along a common scale. This does not invalidate the pie chart, but it does define the epistemic boundary of what it can reliably communicate.

The epistemological status of a pie chart in academic research is therefore that of a descriptive claim, not an inferential one. It communicates the composition of a sample at a given point in time. It does not, by itself, test hypotheses, establish causality, or permit generalisation to a population. For those purposes, inferential statistics are required. The APA Publication Manual (7th ed., 2020) makes this distinction explicit: figures are used to communicate patterns and distributions, not to substitute for statistical analysis.

Otsuka (2022) locates statistical inference within epistemology by noting that statistical methods are, at their core, procedures for justifying hypotheses from data. The pie chart occupies the descriptive rather than inferential end of that spectrum, yet it remains epistemically important because accurate description is the precondition for valid inference.

Semiotics: The Sign System of the Circle

Semiotics (the study of signs and their meanings) provides a third lens for understanding what a pie chart does. Following Peirce's triadic sign model, the pie chart operates as an iconic sign: the visual form (a divided circle) resembles the structure of the data it represents (a divided whole). The proportionality between slice area and relative frequency is not arbitrary; it is motivated by structural analogy.

This iconic relationship is the source of both the pie chart's communicative power and its risk of misuse. When the structural analogy holds, the chart communicates efficiently. When it breaks down, when categories are not truly exhaustive, when the data are not truly proportional, or when visual distortions (such as 3D perspective) break the area-proportion correspondence, the semiotic contract is violated and the viewer is misled.

Cairo (2019) and Rettberg (2020) both argue that data visualization must be understood semiotically: every design choice is a choice about which aspects of reality to encode and which to suppress. Researchers who use pie charts are not passively reporting data; they are constructing a representation that foregrounds part-whole relationships while suppressing absolute magnitudes, trends, and variability.

Historical Origins: William Playfair (1801)

The pie chart was invented by William Playfair (1759-1823), a Scottish engineer and political economist. Playfair published what are believed to be the first pie charts in his Statistical Breviary (1801), using circular diagrams divided into segments to represent the proportions of territory held by various states and kingdoms. His earlier work, The Commercial and Political Atlas (1786), had already introduced the bar chart and the time-series line graph.

Playfair's motivation was explicitly epistemological: he argued that charts communicate relationships more efficiently than tables of numbers. In his own words, the graphical method allows "the eye to see at once what calculations would take weeks to perform." He was working against an 18th-century academic culture that considered illustrations inferior to prose and tabular data, a prejudice that persisted long enough that Playfair received little recognition in his own lifetime (Friendly, 2021; Spence and Wainer, 2001).

Friendly (2021) describes Playfair as the "Big Bang" of statistical graphics. His design choices, including proportional angles, colour coding, exterior labels, and a bounded circle representing a complete population, remain the standard today, more than two centuries later.

Sources: Aitchison (1986); APA (2020); Cairo (2019); Cleveland and McGill (1984); Friendly (2021); Otsuka (2022); Peirce (1931-1958); Playfair (1801); Rettberg (2020); Spence and Wainer (2001).

Measurement Theory and Variable Types

Stevens' Scales of Measurement (1946)

The legitimacy of any statistical chart depends on the measurement scale of the underlying data. Stevens (1946) identified four scales in ascending order of mathematical richness:

  • Nominal: Categories with no inherent order (e.g., academic strand, gender, region). Only frequency counts are meaningful.
  • Ordinal: Ordered categories with no equal intervals (e.g., Likert scale ratings, rankings). Frequency and rank are meaningful; arithmetic is not.
  • Interval: Equal intervals with no true zero (e.g., temperature in Celsius, year). Addition and subtraction are meaningful.
  • Ratio: Equal intervals with a true zero (e.g., income, height, count data). All arithmetic operations are meaningful.

The pie chart is appropriate for nominal and ordinal data where the researcher is representing relative frequency within a population. It is also appropriate for ratio-scale count data when the researcher's interest is specifically in part-whole composition rather than absolute magnitude.

Applying a pie chart to interval data without a meaningful zero, or to continuous variables that have been arbitrarily binned, risks producing a compositional representation of a non-compositional phenomenon. This is a category error in measurement theory.
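As a minimal sketch, the appropriateness rules above can be encoded as a simple decision helper (Python; `pie_chart_appropriate` is a hypothetical name, not from any library):

```python
def pie_chart_appropriate(scale, compositional=False):
    """Is a pie chart a legitimate choice for data on this measurement scale?

    `compositional` flags ratio-scale count data whose research interest is
    part-whole share rather than absolute magnitude.
    """
    scale = scale.lower()
    if scale in ("nominal", "ordinal"):
        return True           # relative frequency within a population
    if scale == "ratio":
        return compositional  # legitimate only as part-whole composition
    return False              # interval data lack a meaningful zero
```

For example, `pie_chart_appropriate("interval")` returns `False`, reflecting the category error described above, while `pie_chart_appropriate("ratio", compositional=True)` returns `True`.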

Relative Frequency and Proportionality

The fundamental quantity represented in a pie chart is relative frequency: the proportion of observations falling in each category, defined as p_i = n_i / N, where n_i is the count for category i and N is the total count. The angular area of each slice is proportional to p_i, and the sum of all p_i equals exactly 1.

This is not the same as probability, though the notation is shared. Relative frequency is a property of an observed sample; probability is a property of a model or a population. A pie chart reports what was observed; it does not, without additional inference, report what would be expected in the population.

p_i = n_i / N (relative frequency of category i)
sum(p_i) = 1 (closure constraint: compositional data)
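These quantities, together with the slice angles (angle_i = p_i × 360°), can be computed directly from raw counts. A minimal Python sketch (the function names are illustrative, not from any charting library):

```python
def relative_frequencies(counts):
    """Relative frequency p_i = n_i / N for each category count n_i."""
    N = sum(counts)
    if N == 0:
        raise ValueError("total count N must be positive")
    return [n / N for n in counts]

def slice_angles(counts):
    """Central angle of slice i in degrees: angle_i = p_i * 360."""
    return [p * 360.0 for p in relative_frequencies(counts)]

props = relative_frequencies([30, 20, 50])
# Closure constraint: the proportions of a composition sum to exactly 1.
assert abs(sum(props) - 1.0) < 1e-9
```

The final assertion is the closure constraint made executable: if the proportions do not sum to one (up to floating-point error), the data are not a closed composition and a pie chart is the wrong representation.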

Nominal vs. Ordinal Representation

When data are nominal, the order of slices in a pie chart is arbitrary. Convention suggests ordering slices from largest to smallest, beginning at the 12 o'clock position and moving clockwise, to facilitate comparison (APA, 2020). When data are ordinal, the slice order must preserve the natural rank order of the categories, even if this places smaller slices before larger ones.
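This convention can be sketched as a small Python helper (`order_slices` is a hypothetical name; the category labels are illustrative data):

```python
def order_slices(labels, counts, scale="nominal"):
    """Return (label, count) pairs in the order slices should be drawn,
    starting at 12 o'clock and moving clockwise."""
    pairs = list(zip(labels, counts))
    if scale == "nominal":
        # Nominal: order is arbitrary, so sort largest-to-smallest (APA, 2020).
        pairs.sort(key=lambda pair: pair[1], reverse=True)
    # Ordinal: preserve the natural rank order exactly as given.
    return pairs

order_slices(["STEM", "HUMSS", "ABM"], [120, 45, 80], scale="nominal")
# → [('STEM', 120), ('ABM', 80), ('HUMSS', 45)]
```

With `scale="ordinal"`, the input order is returned unchanged, so a Likert sequence such as Low, Mid, High keeps its rank order regardless of slice size.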

Sources: APA (2020); Stevens (1946); Aitchison (1986).

Statistical Background

Shannon Entropy (H)

Shannon Entropy measures the evenness or diversity of a distribution. Developed by Claude Shannon (1948) in his foundational paper on information theory, entropy quantifies the average amount of uncertainty or information in a random variable. In the context of a categorical distribution, higher entropy means the observations are more evenly spread across categories; lower entropy means they are concentrated.

H = -sum( p_i * ln(p_i) )
  • H = 0: all observations fall in one category (zero uncertainty, zero diversity)
  • H = ln(k): maximum diversity, all k categories equally distributed
  • Normalised H = H / ln(k) ranges from 0 to 1 for comparability across different k

Shannon's insight was that information and uncertainty are equivalent: a message that is certain carries no information, while a message from a perfectly even distribution carries the maximum possible information. Applied to social science data, high entropy means no single category dominates, which often signals a well-distributed population or a genuinely contested phenomenon.
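The definitions above can be made concrete with a short Python sketch (`shannon_entropy` is a hypothetical helper; zero-count categories are dropped, following the convention that 0 × ln 0 = 0):

```python
import math

def shannon_entropy(counts):
    """H = -sum(p_i * ln(p_i)), with normalised H_norm = H / ln(k)."""
    N = sum(counts)
    ps = [n / N for n in counts if n > 0]  # 0 * ln(0) contributes 0
    H = -sum(p * math.log(p) for p in ps)
    k = len(counts)
    H_norm = H / math.log(k) if k > 1 else 0.0  # scale to 0-1 across different k
    return H, H_norm

H, H_norm = shannon_entropy([25, 25, 25, 25])  # perfectly even: H = ln(4), H_norm = 1
```

A maximally concentrated distribution such as `[100, 0, 0]` yields H = 0, matching the boundary cases listed above.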

Herfindahl-Hirschman Index (HHI)

The HHI was developed independently by Orris Herfindahl (1950) and Albert Hirschman (1945) to measure the degree of concentration in an economic market. It is computed as the sum of squared market shares, expressed as percentages. In research, it measures the degree to which a categorical distribution is dominated by one or a few categories.

HHI = sum( s_i^2 ) where s_i = (n_i / N) * 100
  • Below 1,500: low concentration (diverse distribution)
  • 1,500 to 2,500: moderate concentration
  • Above 2,500: high concentration (one or few categories dominant)
  • HHI = 10,000: all observations in a single category (monopoly equivalent)

The squaring operation penalises large shares disproportionately: a category with 60% share contributes 3,600 to the HHI, while six categories each with 10% contribute only 600 combined. This makes the HHI sensitive to dominance in a way that simple percentages are not.
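A minimal Python sketch of the computation and the concentration bands above (`hhi` and `hhi_band` are hypothetical helper names):

```python
def hhi(counts):
    """Herfindahl-Hirschman Index: sum of squared percentage shares s_i = (n_i / N) * 100."""
    N = sum(counts)
    return sum(((n / N) * 100) ** 2 for n in counts)

def hhi_band(value):
    """Concentration band per the conventional HHI thresholds."""
    if value < 1500:
        return "low"
    if value <= 2500:
        return "moderate"
    return "high"

hhi([50, 30, 20])  # 2500 + 900 + 400 = 3800, a "high" concentration
```

The squaring is visible in the arithmetic: the 50% category alone contributes 2,500, more than the other two combined, and a single-category distribution reaches the monopoly ceiling of 10,000.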

When Not to Use a Pie Chart

  • Comparing values across two or more groups (use grouped bar chart)
  • Values are very close in magnitude (angles are perceptually indistinguishable)
  • More than six categories (use bar chart or frequency table)
  • Exact values matter more than proportional relationships
  • The data are continuous or interval-scale without a meaningful compositional interpretation
  • The research question concerns change over time (use line graph)

Cleveland and McGill (1984) established through controlled experiments that angle judgments (used in pie charts) are significantly less accurate than position-along-a-scale judgments (used in bar charts). This finding is foundational in data visualisation research and explains why bar charts are generally preferred when precise comparison is the goal.

Sources: APA (2020); Cleveland and McGill (1984); Herfindahl (1950); Hirschman (1945); Shannon (1948).