Epistemological & Theoretical Foundations
Stratified Random Sampling (StRS) is a probability sampling design in which the population is first partitioned into non-overlapping, exhaustive subgroups, called strata, and an independent probability sample is then drawn from each stratum. For the same total sample size, the resulting estimator is more precise than one based on simple random sampling (SRS) whenever stratum means differ and within-stratum units are relatively homogeneous.
Stratified random sampling is a method of selecting a probability sample from a population of N units by first partitioning the population into H mutually exclusive and collectively exhaustive strata of sizes N₁, N₂, …, N_H (where ΣN_h = N), and then independently drawing a probability sample of size n_h from each stratum h (where Σn_h = n), using any legitimate within-stratum design (most commonly, simple random sampling without replacement).
Source: Cochran, W.G. (1977). Sampling Techniques (3rd ed.). John Wiley & Sons, pp. 89–91; Kish, L. (1965). Survey Sampling. John Wiley & Sons, p. 75.
Figure: strata shown as coloured segments. Each coloured segment represents one stratum, and its width is proportional to stratum size N_h. Independent SRS is applied within each stratum, not across the combined population. This independence is what makes exact variance estimation tractable, a critical advantage over systematic sampling.
Historical Development
The theoretical foundations of stratified sampling were established by Jerzy Neyman in his landmark 1934 paper "On the Two Different Aspects of the Representative Method," published in the Journal of the Royal Statistical Society. Neyman's paper did two things of enduring importance: it placed stratified sampling within a rigorous probabilistic framework, and it derived the allocation formula that minimises the variance of the stratified estimator for a fixed sample size — what is now universally called Neyman optimal allocation. Before Neyman's contribution, the dominant approach was A.L. Bowley's proportional allocation (1926), which had sound practical credentials but no proof of optimality.
William G. Cochran's (1977) Sampling Techniques provided the definitive treatment of stratified sampling for the survey research era, establishing the comparison framework between SRS, stratified SRS, and systematic designs that remains standard in doctoral methodology coursework. Leslie Kish (1965) extended the framework to complex multi-stage designs and developed the design effect (DEFF) measure that quantifies the efficiency gain (or loss) relative to SRS, permitting researchers to communicate the practical value of stratification in terms directly interpretable by applied users.
The Core Mechanic: Why Stratification Reduces Variance
The fundamental insight is that the total variance in any population can be decomposed into two components: variance between strata (differences among stratum means) and variance within strata (differences among units sharing the same stratum). SRS averages across both. Stratified sampling eliminates the between-strata component from the sampling variance entirely, because the allocation across strata is fixed by design rather than left to chance. Only the within-stratum variance contributes to the sampling variance of the stratified estimator.
This is not merely a theoretical result — it is the practical argument for stratification. If you know that hospitals in your study vary dramatically by size (small, medium, large) and that your outcome variable (say, patient readmission rate) differs substantially across size categories, then a purely random SRS could, by chance, over-represent large hospitals. Stratified sampling guarantees that small, medium, and large hospitals are each represented in exactly the proportion intended, and the variance of the estimator reflects only the uncertainty about within-stratum variation — not the certainty about between-stratum differences.
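The decomposition can be checked numerically. The following is a minimal Python sketch on a made-up population of three hospital size groups (all values are hypothetical and chosen only for illustration). It splits the total sum of squares into a within-stratum part and a between-stratum part and shows that the two add up to the total.

```python
from statistics import fmean

# Hypothetical toy population: outcome values grouped by stratum.
population = {
    "small": [2.0, 3.0, 4.0],
    "medium": [6.0, 7.0, 8.0, 9.0],
    "large": [12.0, 14.0, 16.0],
}


def decompose(strata):
    """Split the total sum of squares into within- and between-stratum parts."""
    all_values = [y for values in strata.values() for y in values]
    grand_mean = fmean(all_values)
    total_ss = sum((y - grand_mean) ** 2 for y in all_values)
    within_ss = 0.0
    between_ss = 0.0
    for values in strata.values():
        stratum_mean = fmean(values)
        # spread of units around their own stratum mean
        within_ss += sum((y - stratum_mean) ** 2 for y in values)
        # spread of the stratum mean around the grand mean, weighted by stratum size
        between_ss += len(values) * (stratum_mean - grand_mean) ** 2
    return total_ss, within_ss, between_ss


total_ss, within_ss, between_ss = decompose(population)
print(f"total={total_ss:.2f} within={within_ss:.2f} between={between_ss:.2f}")
```

In this toy population almost all of the variation lies between strata, which is the situation in which stratification pays off most.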
Why Stratified Sampling Is Used
Variance Reduction
When strata means differ substantially and within-stratum units are relatively homogeneous, stratification reduces the sampling variance below what SRS can achieve for the same total sample size n. The efficiency gain is proportional to the between-strata variance — the more heterogeneous the strata means, the larger the gain.
Domain (Subgroup) Estimation
When reliable estimates are needed for specific population subgroups — gender, region, age band, institution type — stratification guarantees an adequate sample within each subgroup. With SRS, small subgroups may receive too few observations for precise estimation. Disproportionate stratified sampling resolves this by oversampling small but important strata.
Administrative Convenience
When a population is naturally organised into administrative units — schools within districts, wards within hospitals, branches within a company — it is often operationally efficient to treat each administrative unit as a stratum and sample independently within each. This matches the data collection infrastructure to the sampling design, reducing logistical error.
Exact Variance Estimation
Unlike systematic sampling, stratified sampling permits an exact, design-based unbiased estimator of the sampling variance. Because n_h ≥ 2 units are drawn independently within each stratum, within-stratum sample variances s_h² are computable and directly usable in the variance formula — no approximation or assumption about the frame ordering is required.
Protection Against Imbalanced Samples
SRS cannot guarantee that a rare but theoretically important subgroup will appear in the sample in adequate numbers. If 3% of a population are recent immigrants and your research question concerns their experience, an SRS of n = 200 yields an expected 6 cases, far too few for precise subgroup estimates, and the realised count varies by chance (the probability of drawing none at all is 0.97²⁰⁰ ≈ 0.2%). Stratification guarantees a pre-specified minimum n_h.
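How thin a rare subgroup can get under SRS is easy to quantify. The sketch below assumes the population is large enough that the number of subgroup members in the sample is approximately binomial (an assumption; the exact distribution under SRS without replacement is hypergeometric). It uses the 3% subgroup and n = 200 from the example.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k subgroup members in a sample of size n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 200, 0.03
expected_cases = n * p
p_zero = binom_pmf(0, n, p)
p_at_most_2 = sum(binom_pmf(k, n, p) for k in range(3))

print(f"expected cases: {expected_cases:.1f}")
print(f"P(no cases):    {p_zero:.4f}")
print(f"P(2 or fewer):  {p_at_most_2:.4f}")
```

The chance of missing the subgroup entirely is small, but the chance of ending up with only a handful of cases is not, which is the practical argument for fixing n_h by design.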
Cost Efficiency via Optimal Allocation
When data collection costs vary across strata — e.g., conducting face-to-face interviews in rural areas costs more per unit than in urban areas — Neyman-extended cost-optimal allocation directs more of the sample budget toward cheaper strata and less toward expensive ones, minimising total cost for a fixed target precision, or minimising variance for a fixed total cost (Cochran, 1977, pp. 96–98).
The Stratification Variable: The Most Consequential Design Decision
The efficiency gain from stratification depends entirely on the choice of stratification variable(s). The ideal stratification variable is one that (1) is known for every population element before sampling, (2) is strongly correlated with the outcome variable of interest, and (3) can be used to form groups with high within-stratum homogeneity. Stratifying on a variable uncorrelated with the outcome produces no variance reduction below SRS — it merely adds administrative complexity. Stratifying on a variable highly correlated with the outcome (e.g., stratifying a study of school achievement by prior year test score band) can yield very large efficiency gains. The rule of thumb is: a good stratification variable is one you would want to control for in the analysis anyway (Cochran, 1977, pp. 127–131; Kish, 1965, pp. 90–95).
Estimators, Allocation Methods & Variance Theory
Like SRS, and unlike systematic sampling, stratified sampling has both an unbiased point estimator and an exact, design-based unbiased variance estimator. This mathematical tractability, combined with the variance reduction from stratification, makes it the preferred design for precision-critical research.
1. Notation and Setup
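N = total population size; H = number of strata
N_h = population size of stratum h; ΣN_h = N
W_h = N_h / N (stratum weight)
n_h = sample size in stratum h; n = Σn_h
f_h = n_h / N_h (within-stratum sampling fraction)
Ȳ_h = population mean of stratum h; ȳ_h = sample mean of stratum h; Ȳ = Σ W_h Ȳ_h (population mean)
S_h² = (1/(N_h−1)) · Σᵢ(yᵢₕ − Ȳ_h)² (stratum population variance)
s_h² = (1/(n_h−1)) · Σᵢ(yᵢₕ − ȳ_h)² (stratum sample variance)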
All summations over h run from 1 to H unless otherwise noted.
The key convention: W_h = N_h/N are the population stratum weights, used for proper weighting in estimation.
2. The Stratified Estimator of the Population Mean
ȳ_st = Σ W_h · ȳ_h
E(ȳ_st) = Ȳ: the estimator is unbiased regardless of the allocation method used, provided within-stratum sampling is itself unbiased (e.g., SRS within each stratum).
Proof: E(ȳ_st) = Σ W_h · E(ȳ_h) = Σ W_h · Ȳ_h = (1/N)·Σ N_h·Ȳ_h = (1/N)·ΣΣ yᵢₕ = Ȳ. ✓
3. True Variance of the Stratified Estimator
V(ȳ_st) = Σ W_h² · (1 − f_h) · S_h²/n_h, where (1 − f_h) is the finite population correction (FPC).
This formula is exact, not an approximation, when SRS without replacement is used within each stratum.
When f_h = n_h/N_h is small (as it usually is in large populations), the FPC ≈ 1 and V(ȳ_st) ≈ Σ W_h² · S_h²/n_h.
Critical advantage over systematic sampling: This variance can be estimated unbiasedly from the data. Substitute s_h² for S_h² — no approximation required.
4. Unbiased Variance Estimator (from the sample)
v̂(ȳ_st) = Σ W_h² · (1 − f_h) · s_h²/n_h
Requires: n_h ≥ 2 in every stratum (at least two observations per stratum are needed to compute s_h²). This is the minimum requirement for variance estimation; single-unit strata are "collapsed" with adjacent strata before estimation.
E[v̂(ȳ_st)] = V(ȳ_st) — exactly unbiased. No assumptions about the population structure are required.
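The point estimator and its variance estimator translate directly into code. The following is a minimal sketch, assuming SRS without replacement within each stratum; the function name, the dictionary layout, and the sample data are illustrative choices, not part of any library.

```python
from statistics import fmean, variance


def stratified_estimate(samples, stratum_sizes):
    """Return (y_st, v_hat) for SRS without replacement within each stratum.

    samples:       dict mapping stratum -> list of sampled outcome values
    stratum_sizes: dict mapping stratum -> population size N_h
    """
    N = sum(stratum_sizes.values())
    y_st = 0.0
    v_hat = 0.0
    for h, values in samples.items():
        n_h = len(values)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least 2 sampled units")
        N_h = stratum_sizes[h]
        W_h = N_h / N              # stratum weight
        f_h = n_h / N_h            # within-stratum sampling fraction
        y_st += W_h * fmean(values)
        # statistics.variance uses the (n_h - 1) denominator, i.e. s_h squared
        v_hat += W_h**2 * (1 - f_h) * variance(values) / n_h
    return y_st, v_hat


# Hypothetical data for three strata.
sizes = {"A": 50, "B": 30, "C": 20}
samples = {
    "A": [10.0, 12.0, 14.0, 12.0],
    "B": [20.0, 24.0, 22.0],
    "C": [30.0, 34.0],
}
y_st, v_hat = stratified_estimate(samples, sizes)
print(f"y_st = {y_st:.3f}, v_hat = {v_hat:.4f}, SE = {v_hat ** 0.5:.3f}")
```

The function refuses strata with fewer than two sampled units rather than silently treating their variance as zero.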
5. Allocation Methods
The allocation of the total sample size n across the H strata is, after the choice of stratification variable, the most consequential design decision in stratified sampling. Four allocation strategies are in common use (equal, proportional, Neyman optimal, and cost-optimal), each with a different optimality criterion:
Equal allocation: n_h = n/H for every stratum.
Use case: When all H strata are of equal scientific interest, each stratum's estimate is needed at equal precision, and stratum sizes are similar.
Drawback: Inefficient when strata differ substantially in size, because large strata are undersampled and small strata are oversampled. Requires weighting by N_h/n_h to produce an unbiased population estimate.
Never EPSEM (equal probability of selection method) unless all N_h are equal.
Proportional allocation: n_h = n · W_h = n · N_h/N.
This produces an EPSEM design: every element has inclusion probability πᵢ = n/N = f regardless of stratum membership. The sample is self-weighting (no post-hoc weights needed).
V(ȳ_st,prop) = (1/n) · Σ W_h S_h² − (1/N) · Σ W_h S_h²
V(ȳ_st,prop) ≤ V(ȳ_SRS) whenever the stratum sizes N_h are large enough that terms of order 1/N_h are negligible; under that condition proportional allocation never performs worse than SRS (Cochran, 1977, pp. 104–105).
Gain over SRS (ignoring the FPC and terms of order 1/N_h): V(ȳ_SRS) − V(ȳ_st,prop) ≈ (1/n) · Σ W_h(Ȳ_h − Ȳ)² ≥ 0
Neyman (optimal) allocation: n_h = n · N_h S_h / Σ N_k S_k, with the sum taken over all strata k.
Larger strata (larger N_h) and more variable strata (larger S_h) receive proportionally larger samples.
V(ȳ_st,opt) = (1/n) · [Σ W_h S_h]² − (1/N) · Σ W_h S_h² ≤ V(ȳ_st,prop) ≤ V(ȳ_SRS)
Gain of Neyman over proportional: V(ȳ_st,prop) − V(ȳ_st,opt) = (1/n) · Σ W_h(S_h − S̄_W)² where S̄_W = Σ W_h S_h
This gain is large when the stratum standard deviations S_h differ substantially, which is the practical case for Neyman allocation.
Cost-optimal allocation: n_h ∝ N_h S_h / √c_h, where c_h is the per-unit cost of sampling in stratum h.
This minimises total survey cost for a fixed target variance, or equivalently minimises variance for a fixed total budget C = c₀ + Σ c_h n_h (where c₀ is a fixed overhead cost).
Logic: Expensive strata receive fewer units (high c_h → low n_h); cheap strata receive more. Also assigns more units to high-variance strata (high S_h → high n_h).
Reduces to Neyman allocation when all c_h are equal. Reduces to proportional allocation when all c_h and S_h are equal.
6. The Variance Inequality Chain
V(ȳ_st,opt) ≤ V(ȳ_st,prop) ≤ V(ȳ_SRS)
Equal allocation has no fixed place in this chain: it may be worse than SRS when stratum sizes differ greatly (large strata with heavy weights are undersampled, while the extra units spent on small strata buy little precision).
The gain from stratification is zero when all stratum means are identical (Ȳ_h = Ȳ for all h) — stratification on an unrelated variable adds complexity without reducing variance.
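The proportional and Neyman variance formulas, and the gain identity that links them, can be verified with a few lines of arithmetic. This sketch borrows the three-strata figures used in the allocation visualisation (N_h = 500, 300, 200; S_h = 10, 20, 30; n = 120) and treats the S_h as known, which is an assumption that rarely holds exactly in practice.

```python
N_h = [500, 300, 200]
S_h = [10.0, 20.0, 30.0]
n = 120

N = sum(N_h)
W = [size / N for size in N_h]

sum_W_S2 = sum(w * s**2 for w, s in zip(W, S_h))  # sum of W_h * S_h^2
sum_W_S = sum(w * s for w, s in zip(W, S_h))      # sum of W_h * S_h

# Variance under proportional and Neyman allocation (formulas from the text).
v_prop = sum_W_S2 / n - sum_W_S2 / N
v_opt = sum_W_S**2 / n - sum_W_S2 / N

# Gain of Neyman over proportional, computed from its own formula.
gain = sum(w * (s - sum_W_S) ** 2 for w, s in zip(W, S_h)) / n

print(f"V_prop = {v_prop:.4f}, V_opt = {v_opt:.4f}, gain = {gain:.4f}")
```

The difference between the two variances matches the separately computed gain term, as the identity requires.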
Visualising Allocation Methods: H = 3 Strata (N₁=500, N₂=300, N₃=200; S₁=10, S₂=20, S₃=30; c₁=1, c₂=2, c₃=4)
Bar lengths represent n_h for n = 120 total. Note how Neyman allocation gives Stratum 3, the smallest stratum but the most variable, the same sample size as Stratum 2 (about 42 units each, since N₂S₂ = N₃S₃ = 6,000) and more than Stratum 1 (about 35). Cost-optimal allocation then adjusts downward for Stratum 3 because it is also the most expensive to sample.
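The four allocations for this example can be reproduced with a short function. This is a sketch under stated assumptions: the function name and the `method` labels are illustrative, the results are left unrounded, and the cost-optimal case holds n fixed at 120 so that only the shape of the allocation (n_h proportional to N_h S_h / √c_h) is shown.

```python
from math import sqrt


def allocate(n, N_h, S_h=None, c_h=None, method="proportional"):
    """Return unrounded stratum sample sizes n_h that sum to n."""
    if method == "equal":
        shares = [1.0] * len(N_h)
    elif method == "proportional":
        shares = [float(size) for size in N_h]
    elif method == "neyman":
        shares = [size * sd for size, sd in zip(N_h, S_h)]
    elif method == "cost_optimal":
        shares = [size * sd / sqrt(cost) for size, sd, cost in zip(N_h, S_h, c_h)]
    else:
        raise ValueError(f"unknown method: {method}")
    total = sum(shares)
    return [n * share / total for share in shares]


# Values from the visualisation heading.
N_h = [500, 300, 200]
S_h = [10.0, 20.0, 30.0]
c_h = [1.0, 2.0, 4.0]

equal = allocate(120, N_h, method="equal")
proportional = allocate(120, N_h, method="proportional")
neyman = allocate(120, N_h, S_h=S_h, method="neyman")
cost_optimal = allocate(120, N_h, S_h=S_h, c_h=c_h, method="cost_optimal")

for name, result in [("equal", equal), ("proportional", proportional),
                     ("neyman", neyman), ("cost_optimal", cost_optimal)]:
    print(name, [round(x, 1) for x in result])
```

In a real design the unrounded values would still need to be rounded to integers, capped at N_h, and raised to at least 2 per stratum.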
7. Number and Boundaries of Strata
In general, increasing the number of strata H reduces variance — but with rapidly diminishing returns. Cochran (1977, pp. 127–130) demonstrates that with a uniformly distributed population and proportional allocation, most of the variance reduction achievable through stratification is captured with H = 4 to 6 strata. Beyond H = 6, additional strata produce minimal further variance reduction while substantially increasing design and operational complexity. The widely cited Cum √f rule (Dalenius and Hodges, 1959) provides an optimality condition for stratum boundary placement when the population distribution of the stratification variable is approximately known: boundaries are placed such that the cumulative square root of the frequency distribution is divided into H equal parts.
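The cum √f rule can be sketched as follows. This is a simplified reading of the rule, not a reference implementation: it bins the stratification variable into equal-width classes (the bin count `n_bins` is an arbitrary choice), accumulates the square roots of the class frequencies, and places each boundary at the upper edge of the first class where the cumulative total reaches k/H of the overall total. Interpolation within classes is omitted.

```python
from math import sqrt


def cum_sqrt_f_boundaries(values, n_strata, n_bins=50):
    """Approximate Dalenius-Hodges stratum boundaries for a list of numbers."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("the stratification variable has no spread")
    width = (hi - lo) / n_bins

    # Frequency of each equal-width class; the maximum goes in the last class.
    counts = [0] * n_bins
    for x in values:
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1

    # Cumulative sum of sqrt(frequency) at the upper edge of each class.
    cumulative = []
    running = 0.0
    for count in counts:
        running += sqrt(count)
        cumulative.append(running)

    step = running / n_strata
    boundaries = []
    for k in range(1, n_strata):
        target = k * step
        # First class whose cumulative sqrt(f) reaches the target.
        idx = next(i for i, value in enumerate(cumulative) if value >= target)
        boundaries.append(lo + (idx + 1) * width)
    return boundaries


# Illustrative input: a uniformly spread stratification variable.
uniform_values = list(range(1000))
boundaries = cum_sqrt_f_boundaries(uniform_values, n_strata=4)
print([round(b, 1) for b in boundaries])
```

For a uniform variable the boundaries land near the quartiles; for a skewed variable they shift so that the sparse upper tail is cut into wider strata.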
8. Post-Stratification
When stratum membership cannot be determined before sampling (for example, when age group or region of a respondent is unknown until the survey interview), the researcher can apply post-stratification: draw an SRS of size n from the full population, and after data collection, weight each observation by W_h / (n_h/n), which is proportional to the expansion weight N_h/n_h, where n_h is the realised sample count in stratum h. The post-stratified estimator is approximately unbiased with variance approximately equal to the stratified estimator variance plus a small additional component due to random variation in the realised n_h values (Cochran, 1977, pp. 134–135). Post-stratification is widely used in political polling and large-scale social surveys to align achieved sample distributions with known population benchmarks (census-based weights).
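A post-stratified mean is computed in the same way as the stratified mean, except that the stratum counts n_h are whatever the SRS happened to deliver. The sketch below is illustrative only: the function name, the stratum labels, and the data are hypothetical, and it assumes every stratum has at least one respondent.

```python
from statistics import fmean


def post_stratified_mean(observations, stratum_sizes):
    """observations: list of (stratum, y) pairs from one SRS.
    stratum_sizes: dict mapping stratum -> known population size N_h."""
    N = sum(stratum_sizes.values())
    by_stratum = {}
    for stratum, y in observations:
        by_stratum.setdefault(stratum, []).append(y)
    missing = set(stratum_sizes) - set(by_stratum)
    if missing:
        raise ValueError(f"no respondents in strata: {sorted(missing)}")
    # Weight each realised stratum mean by the known population share W_h.
    return sum(stratum_sizes[h] / N * fmean(ys) for h, ys in by_stratum.items())


# Hypothetical SRS in which the "old" stratum is over-represented by chance.
stratum_sizes = {"young": 600, "old": 400}
observations = [
    ("young", 10.0), ("young", 14.0),
    ("old", 20.0), ("old", 22.0), ("old", 24.0), ("old", 26.0),
]
raw_mean = fmean(y for _, y in observations)
ps_mean = post_stratified_mean(observations, stratum_sizes)
print(f"raw mean = {raw_mean:.3f}, post-stratified mean = {ps_mean:.3f}")
```

Because the realised sample holds too many "old" units relative to the population, the raw mean is pulled upward and the post-stratified mean corrects it.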
Stratified Sampling Simulator
Configure the number of strata, their sizes, variances, and the total sample size. Observe how three allocation methods distribute n_h across strata, watch the within-stratum sampling execute, and compare estimated variances in real time.
What the Simulator Demonstrates
Draw Sample: Displays each stratum's population as dots; selected units are highlighted in the stratum colour. Each stratum's n_h is independently drawn. The allocation table shows the exact n_h under the chosen method, the achieved ȳ_h, and each stratum's contribution to v̂(ȳ_st).
Allocation Method toggle: Switching between Proportional, Neyman, and Equal allocation updates n_h instantly. Notice that Neyman allocation concentrates observations in high-variance strata, potentially yielding a dramatically smaller SE(ȳ_st) than proportional allocation when stratum variances differ.
Run Simulation: Executes multiple independent stratified samples and plots the sampling distribution of ȳ_st. The spread of this distribution is the empirical standard error — compare it with the theoretical SE(ȳ_st) shown in the stats panel. Convergence between the two confirms the exactness of the design-based variance formula.
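The comparison between empirical and theoretical standard error can be reproduced offline. The following is a minimal Monte Carlo sketch, not the simulator's own code: the synthetic population, the seed, the proportional allocation, and the 2,000 replications are all arbitrary choices made for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pstdev, variance

rng = random.Random(2024)  # arbitrary seed, fixed for reproducibility

# Synthetic population: stratum -> (N_h, mean, standard deviation).
spec = {"A": (500, 50.0, 10.0), "B": (300, 80.0, 20.0), "C": (200, 120.0, 30.0)}
population = {
    h: [rng.gauss(mu, sd) for _ in range(size)] for h, (size, mu, sd) in spec.items()
}
n_h = {"A": 60, "B": 36, "C": 24}  # proportional allocation of n = 120
N = sum(len(units) for units in population.values())


def draw_estimate():
    """Draw one stratified sample (SRS without replacement) and return y_st."""
    return sum(
        len(units) / N * fmean(rng.sample(units, n_h[h]))
        for h, units in population.items()
    )


# Theoretical variance from the true stratum variances S_h^2
# (statistics.variance uses the N_h - 1 denominator, matching the definition).
theoretical_var = sum(
    (len(units) / N) ** 2 * (1 - n_h[h] / len(units)) * variance(units) / n_h[h]
    for h, units in population.items()
)
theoretical_se = sqrt(theoretical_var)

estimates = [draw_estimate() for _ in range(2000)]
empirical_se = pstdev(estimates)

print(f"theoretical SE = {theoretical_se:.3f}, empirical SE = {empirical_se:.3f}")
```

With a few thousand replications the two figures should agree to within a few percent; any persistent gap would point to a coding error rather than to the formula.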
Assumptions, Conditions & Limitations
Stratified sampling carries a specific and consequential set of assumptions. Five of these — exhaustive partitioning, known stratum sizes, adequate within-stratum sample sizes, the relevance of the stratification variable, and the independence of within-stratum sampling — require explicit justification and documentation in doctoral research.
Formal Assumptions
| Assumption | Technical Requirement | Violation Consequence | Diagnostic / Remedy |
|---|---|---|---|
| Exhaustive, Non-Overlapping Strata | Every element of the population belongs to exactly one stratum: writing U_h for the set of elements in stratum h and U for the whole population, ⋃U_h = U and U_h ∩ U_j = ∅ for h≠j | Elements counted in multiple strata produce biased estimates; elements in no stratum are excluded (coverage error) | Conduct a pre-sampling frame audit; resolve dual membership by assignment rules pre-specified in the protocol; track excluded elements as coverage error |
| Known Stratum Sizes N_h | N_h must be known for all h before sampling to compute weights W_h = N_h/N and allocation n_h | Unknown N_h requires estimation; estimated weights introduce bias in ȳ_st proportional to the estimation error in W_h | Use administrative records, census data, or pilot enumeration to determine N_h; document the data source and vintage of the N_h estimates |
| n_h ≥ 2 in Every Stratum | Minimum two sampled units per stratum are required to compute the within-stratum sample variance s_h² | Single-unit strata produce undefined s_h², so variance estimation fails; if s_h² is silently treated as 0, the variance is severely underestimated | Merge "thin" strata with adjacent strata (collapsed stratum method) before analysis; set minimum n_h = 2 as a hard constraint in allocation |
| Independent Sampling Across Strata | The sample drawn in stratum h must be statistically independent of the sample in stratum j ≠ h | Correlated selections (e.g., shared field worker who systematically selects similar units across strata) invalidate the variance formula and understate true uncertainty | Assign separate and independent randomisation procedures to each stratum; use audited PRNG with different seeds per stratum; document randomisation in the protocol |
| Stratification Variable Known Pre-Sampling | The variable used to assign units to strata must be known for every population element before the sample is drawn | If stratum membership is unknown until after contact, true stratified sampling is impossible; only post-stratification is feasible — with its additional variance component | Use administrative records, prior survey data, or observable proxies for stratum assignment; document the basis and date of stratum membership determination |
| SRS Within Strata (or Known Design) | The within-stratum design must be a legitimate probability design with known inclusion probabilities | Non-probability within-stratum selection (convenience, voluntary) invalidates design-based inference for the entire stratified estimator | Use randomised within-stratum selection; document the randomisation procedure; compute and report within-stratum inclusion probabilities (πᵢ given stratum h) |
Core Limitations
Stratified sampling requires a substantially richer sampling frame than SRS or systematic sampling. Not only must every population element be listed, but stratum membership must be recorded for each element before sampling commences. In many research contexts, this is operationally demanding: hospital databases may not categorise patients by the clinical variable of interest; school registers may not record the socioeconomic indicators needed to stratify by deprivation band; company employee lists may lack the departmental or seniority classifications required for domain analysis.
When stratum membership cannot be determined from the frame, researchers must either (a) conduct a preliminary screening survey to assign stratum membership — adding cost and time — or (b) fall back on post-stratification, which provides approximately equivalent statistical properties but adds an additional variance component and requires that stratum membership can be determined after contact (Cochran, 1977, pp. 134–135).
Doctoral researchers must therefore document not only the stratification variable chosen, but the source and quality of the stratum membership data, including its currency (how recently was the membership information updated?) and its accuracy (what is the error rate in stratum classification?).
When the allocation is disproportionate (either because Neyman or equal allocation is used, or because small strata were deliberately oversampled for domain estimation), the raw sample data cannot be directly averaged to produce an unbiased estimate of the population mean. Differential inclusion probabilities mean that units from oversampled strata are over-represented in the raw sample, so their contribution to the aggregate estimate must be down-weighted. Each unit receives the weight 1/πᵢ, which for SRS within strata equals N_h/n_h = 1/f_h, so units from heavily sampled strata carry smaller weights.
This weighting is theoretically straightforward but creates practical complications: (1) survey software packages that do not support complex survey design specifications may compute incorrect standard errors if treated as SRS data; (2) subgroup analyses must account for the sampling weights, or estimates will be biased toward oversampled strata; (3) the design effect (DEFF = V(ȳ_st) / V(ȳ_SRS at same n)) may be greater than 1 for disproportionate allocation applied to outcomes weakly correlated with the stratification variable — disproportionate allocation can be less efficient than SRS for outcomes not related to the stratification variable (Kish, 1965, pp. 94–95; Groves et al., 2009, pp. 101–104).
Doctoral researchers must specify all sampling weights in their data documentation (codebook), confirm that analysis software correctly applies the design weights, and report analyses with and without weights as a sensitivity check if the weighting effect is substantial.
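The effect of design weights is easy to see on a small example. The sketch below uses hypothetical data with a mildly disproportionate allocation; it attaches the base weight N_h/n_h to each record and compares the unweighted mean with the weighted mean, which coincides with the stratified estimator Σ W_h ȳ_h.

```python
from statistics import fmean

# Hypothetical frame sizes and sampled outcome values.
stratum_sizes = {"A": 50, "B": 30, "C": 20}
samples = {
    "A": [10.0, 12.0, 14.0, 12.0],
    "B": [20.0, 24.0, 22.0],
    "C": [30.0, 34.0],
}

# One (outcome, weight) record per sampled unit; weight = 1/pi_i = N_h / n_h.
records = [
    (y, stratum_sizes[h] / len(values))
    for h, values in samples.items()
    for y in values
]

unweighted_mean = fmean(y for y, _ in records)
weight_total = sum(w for _, w in records)
weighted_mean = sum(w * y for y, w in records) / weight_total

print(f"sum of weights  = {weight_total:.1f}")
print(f"unweighted mean = {unweighted_mean:.3f}")
print(f"weighted mean   = {weighted_mean:.3f}")
```

The weights sum to the population size N, a quick check worth running on any delivered weight variable.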
When a stratum contains only a single sampled unit (n_h = 1), the within-stratum sample variance s_h² is undefined — one cannot estimate variance from a single observation. This scenario, which Wolter (2007) calls the "lonely PSU" problem in cluster sampling, has the equivalent "lonely stratum unit" form in stratified sampling. It arises most commonly when strata are very small (N_h is small) and the allocation assigns n_h = 1 to save sample budget.
The standard remedy is the collapsed stratum estimator: pair adjacent strata (e.g., stratum 3 and stratum 4 are merged for variance estimation purposes, though the means are still estimated separately). This produces a conservative (upwardly biased) variance estimate for the merged pair. The degree of conservatism is bounded and predictable, making the collapsed stratum estimator methodologically preferable to any ad hoc variance imputation. The procedure, its rationale, and the stratum pairs involved must be documented in the analysis report (Cochran, 1977, pp. 136–138; Wolter, 2007, pp. 158–162).
Prevention is preferable to remedy: in the allocation phase, enforce a minimum n_h ≥ 2 across all strata. This constraint slightly reduces efficiency relative to unconstrained Neyman allocation but ensures variance estimability for every stratum. In large-scale government surveys, n_h ≥ 2 is a mandatory design requirement (Groves et al., 2009).
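For the case of exactly one sampled unit per stratum, the simplest form of the collapsed stratum estimator can be sketched as below. This is an illustration under stated assumptions rather than a full implementation: strata are collapsed in pairs, the finite population correction is ignored, and the paired strata are assumed to have similar expected totals (otherwise the estimate becomes very conservative). The function name and the data are hypothetical.

```python
def collapsed_stratum_variance(pairs, N):
    """Conservative variance estimate for y_st when every stratum has n_h = 1.

    pairs: list of ((N_a, y_a), (N_b, y_b)), one entry per collapsed pair,
           where N_a, N_b are stratum sizes and y_a, y_b the single sampled values
    N:     total population size
    """
    # For each pair, the squared difference of the two estimated stratum totals
    # estimates the pair's contribution to the variance of the estimated total.
    variance_of_total = sum(
        (N_a * y_a - N_b * y_b) ** 2 for (N_a, y_a), (N_b, y_b) in pairs
    )
    # Convert from the variance of the total to the variance of the mean.
    return variance_of_total / N**2


# Hypothetical design: four strata collapsed into two pairs.
pairs = [((40, 5.0), (60, 4.0)), ((50, 9.0), (50, 8.0))]
v_collapsed = collapsed_stratum_variance(pairs, N=200)
print(f"collapsed-stratum variance estimate = {v_collapsed:.4f}")
```

The upward bias comes from the squared difference between the true stratum totals within each pair, which is why pairing similar strata keeps the estimate from becoming needlessly large.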
If the stratification variable has low or zero correlation with the primary outcome variable, stratified sampling with proportional allocation provides no variance reduction over SRS — and with disproportionate allocation, may actively increase variance. The gain from stratification, as shown in the formula V(ȳ_SRS) − V(ȳ_st,prop) = (1/n)·Σ W_h(Ȳ_h − Ȳ)², is zero if and only if all stratum means Ȳ_h are equal to the population mean Ȳ. If the stratification variable is genuinely uncorrelated with the outcome, this condition holds, and stratification produces a design that is no more efficient than SRS while being more complex to implement and analyse.
In multi-outcome surveys — the norm in social science research — this is a ubiquitous practical dilemma: the stratification variable may be strongly correlated with one outcome (justifying stratification for that outcome) but weakly correlated with another (providing no efficiency benefit for that outcome). The researcher must prioritise the primary outcome for stratification purposes and acknowledge that estimates of secondary outcomes may not benefit from the stratification efficiency gain (Kish, 1965, pp. 90–95; Cochran, 1977, pp. 122–127).
Non-response within strata is more complex in stratified designs than in SRS, because it may differ systematically across strata. If response rates vary by stratum — a nearly universal finding in stratified surveys — the effective achieved sample in each stratum diverges from the planned n_h. If within-stratum non-response is Missing at Random (MAR) conditional on stratum membership, the stratified estimator remains approximately unbiased. If non-response is Missing Not at Random (MNAR) — i.e., the probability of non-response depends on the outcome variable even within stratum — the estimator will be biased regardless of how well the strata were constructed.
The recommended approach: compute within-stratum response rates; apply non-response weighting adjustment factors r_h = n_h,total/n_h,respondents within each stratum; document these adjustments in the analysis report; conduct sensitivity analyses to assess the potential magnitude of MNAR bias. The AAPOR (2016) response rate formulas must be calculated and reported separately for each stratum in a transparent survey methodology report (Little & Rubin, 2002; Groves et al., 2009, pp. 208–213).
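The within-stratum adjustment described above amounts to multiplying each base weight by r_h. A minimal sketch, assuming non-response is ignorable within strata (the MAR condition discussed earlier); the function name, stratum labels, and counts are hypothetical.

```python
def nonresponse_adjusted_weights(stratum_sizes, n_selected, n_respondents):
    """Return the final weight per responding unit in each stratum."""
    weights = {}
    for h, N_h in stratum_sizes.items():
        if n_respondents[h] == 0:
            raise ValueError(f"stratum {h!r} has no respondents")
        base_weight = N_h / n_selected[h]          # design weight 1/pi_i
        r_h = n_selected[h] / n_respondents[h]     # non-response adjustment factor
        weights[h] = base_weight * r_h
    return weights


stratum_sizes = {"urban": 600, "rural": 400}
n_selected = {"urban": 60, "rural": 40}
n_respondents = {"urban": 45, "rural": 20}

response_rates = {h: n_respondents[h] / n_selected[h] for h in n_selected}
final_weights = nonresponse_adjusted_weights(stratum_sizes, n_selected, n_respondents)

print(response_rates)
print(final_weights)
```

After adjustment, the respondents in each stratum again weight up to the stratum size N_h, which restores the intended balance between strata but cannot correct bias from MNAR non-response.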
Stratified Sampling vs. Other Probability Designs
Stratified random sampling is often the most statistically efficient probability design when strata can be meaningfully defined and the stratification variable is substantially correlated with the outcome. Understanding exactly where it excels and where it is outperformed is essential for methodologically justified design selection.
The Design Effect (DEFF) of Stratified Sampling
The design effect of any complex design relative to SRS is defined as DEFF = V(ȳ_complex) / V(ȳ_SRS,same n). For stratified sampling with proportional allocation: DEFF_st,prop = V(ȳ_st,prop) / V(ȳ_SRS) ≤ 1, with equality only when all strata means are identical. This means the effective sample size of a stratified sample is n_eff = n / DEFF_st ≥ n: the stratified sample of size n is statistically equivalent to an SRS of size n/DEFF_st, which exceeds n whenever stratum means differ. For Neyman allocation, DEFF is no larger and usually smaller. This is the fundamental argument for stratification: you get "more than n" in statistical efficiency terms (Kish, 1965, pp. 258–260).
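A worked DEFF calculation makes the "more than n" claim concrete. The sketch below assumes proportional allocation and strata large enough that the population variance splits approximately into Σ W_h S_h² plus Σ W_h(Ȳ_h − Ȳ)²; the finite population correction cancels in the ratio. The stratum means are hypothetical values added for illustration.

```python
W_h = [0.5, 0.3, 0.2]
S_h = [10.0, 20.0, 30.0]
mean_h = [50.0, 80.0, 120.0]  # hypothetical stratum means
n = 120

grand_mean = sum(w * m for w, m in zip(W_h, mean_h))
within = sum(w * s**2 for w, s in zip(W_h, S_h))
between = sum(w * (m - grand_mean) ** 2 for w, m in zip(W_h, mean_h))

# Ratio of V(prop) to V(SRS) at the same n, using S^2 ~ within + between.
deff_prop = within / (within + between)
n_eff = n / deff_prop

print(f"DEFF = {deff_prop:.3f}, effective sample size = {n_eff:.0f}")
```

With stratum means this far apart, a proportionally allocated sample of 120 carries roughly the information of an SRS three times as large.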
When Stratified Sampling Is the Optimal Choice
Substantive Heterogeneity Among Subgroups
When the population contains meaningfully different subgroups — by institution type, geographic region, demographic category, industry sector, or any variable known to correlate with the primary outcome — stratification is the statistically principled design. The larger the difference between stratum means (Ȳ_h − Ȳ), the larger the variance reduction, and the stronger the justification for stratification over SRS.
Mandatory Subgroup Estimates
When the research design requires separate reliable estimates for each subgroup — by regulatory requirement, multi-site mandate, or stakeholder accountability — stratified sampling is the only probability design that guarantees a minimum n_h for each subgroup by design. SRS cannot provide this guarantee: a rare subgroup comprising 5% of the population will yield an expected 0.05n observations, with substantial variance around this expectation.
Known, Stable Stratum Membership
Stratification is most effective — and its theoretical guarantees hold most cleanly — when stratum membership is known from reliable administrative records, is stable over the study period, and is verifiable for every element in the frame. Administrative databases in healthcare, education, and government contexts often provide exactly this, making them ideal settings for stratified sampling designs.
Cost Differentials Across Subgroups
When data collection costs vary substantially across subgroups — e.g., rural versus urban respondents, inpatient versus outpatient clinical records, international versus domestic firms — cost-optimal allocation makes stratified sampling uniquely able to minimise total survey cost for a target precision. No other common probability design incorporates cost differentials directly into the allocation formula.
Implementation Protocol for Doctoral Research
Rigorous implementation of stratified random sampling requires explicit documentation of every methodologically consequential decision: the stratification variable, the source and accuracy of stratum size data, the allocation method and its justification, and the within-stratum randomisation procedure. The following seven-step protocol meets the reporting standards of APA 7th Edition, STROBE, and CONSORT-equivalent guidelines.
Step 1: Define Population, Frame & Strata
Write inclusion/exclusion criteria. Obtain the frame. Identify the stratification variable and confirm it is recorded for every element. Document frame source, date, and N_h per stratum.
Step 2: Audit Frame for Errors
Check for duplicate records, elements belonging to multiple strata, and elements missing stratum classification. Resolve ambiguities using pre-specified rules before any sampling occurs.
Step 3: Determine n and Allocation
Compute total n using Cochran's formula. Select allocation method: proportional (EPSEM), Neyman (minimum variance), or cost-optimal. Enforce n_h ≥ 2 in all strata. Document choice with justification.
Step 4: Assign Sequential IDs Within Strata
Number elements 1 to N_h within each stratum. The numbering is the within-stratum frame. The ordering within each stratum does not affect the stratified estimator, because SRS is used within each stratum.
Step 5: Draw Independent SRS Within Each Stratum
For each stratum h, use a validated PRNG with a documented seed to select n_h units without replacement from the N_h elements. Record seeds, software version, and the full list of selected IDs per stratum.
Step 6: Contact, Collect & Handle Non-Response
Apply the pre-specified non-response protocol. Record response and refusal outcomes per stratum using AAPOR disposition codes. Compute within-stratum response rates. Apply non-response weighting adjustments if warranted.
Step 7: Estimate & Report
Compute ȳ_st = ΣW_h ȳ_h. Compute v̂(ȳ_st) = ΣW_h²(1−f_h)s_h²/n_h. Report allocation weights, response rates, and the stratified variance estimator; do not report the SRS standard error as a proxy.
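The within-stratum draw in the protocol can be sketched in a few lines. This is an illustration, not a validated production routine: it uses Python's built-in Mersenne Twister generator, one separately seeded generator per stratum, and hypothetical stratum names, frame sizes, allocation, and seed values.

```python
import random


def draw_stratified_sample(frame, allocation, seeds):
    """Draw an independent SRS without replacement in each stratum.

    frame:      dict mapping stratum -> list of unit IDs (1..N_h)
    allocation: dict mapping stratum -> n_h
    seeds:      dict mapping stratum -> integer seed to record in the protocol
    """
    selected = {}
    for h, ids in frame.items():
        if not 2 <= allocation[h] <= len(ids):
            raise ValueError(f"stratum {h!r}: need 2 <= n_h <= N_h")
        rng = random.Random(seeds[h])  # separate, documented generator per stratum
        selected[h] = sorted(rng.sample(ids, allocation[h]))
    return selected


# Hypothetical frame with sequential IDs inside each stratum.
frame = {
    "small": list(range(1, 501)),
    "medium": list(range(1, 301)),
    "large": list(range(1, 201)),
}
allocation = {"small": 60, "medium": 36, "large": 24}
seeds = {"small": 101, "medium": 202, "large": 303}

sample = draw_stratified_sample(frame, allocation, seeds)
for h, ids in sample.items():
    print(h, len(ids), ids[:5], "...")
```

Because the seeds are fixed and recorded, rerunning the function with the same frame reproduces the same selection, which is what makes the draw auditable.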
Allocation Method Selection Guide
| Research Condition | Recommended Allocation | Key Justification | Weighting Required? |
|---|---|---|---|
| No strong prior knowledge of S_h; equal strata importance | Proportional (Bowley) | EPSEM; self-weighting; no worse than SRS when strata are not very small; simple to communicate | No (self-weighting) |
| S_h known or estimable from pilot; single primary outcome | Neyman Optimal | Minimises V(ȳ_st) for fixed n; substantial efficiency gain when S_h vary | Yes (base weights N_h/n_h = 1/f_h) |
| Per-unit costs differ across strata; budget constraint binding | Cost-Optimal | Minimises cost for target precision; practically important in multi-site and international studies | Yes (base weights N_h/n_h = 1/f_h) |
| Guaranteed minimum precision per domain required | Equal or minimum-n_h constrained | Ensures each domain estimate meets precision target regardless of stratum size | Yes (disproportionate allocation) |
| S_h completely unknown; pilot data unavailable | Proportional (safe default) | No worse than SRS when strata are not very small; requires no prior variance estimates; robust to misspecification | No (self-weighting) |
Computing S_h for Neyman Allocation
The S_h Estimation Challenge in Practice
Neyman allocation requires S_h, the within-stratum population standard deviation, before the sample is drawn. In practice, S_h is rarely known exactly and must usually be estimated. Four approaches are in common use:
(1) Pilot survey: A small preliminary sample (n_h,pilot ≈ 20–30 per stratum) drawn before the main survey provides S_h estimates. Most reliable but adds time and cost.
(2) Prior survey data: Within-stratum variances from a previous survey of the same or similar population. Valid when the population is stable.
(3) Administrative records: When the outcome variable (or a proxy) is available for all N_h elements, the exact S_h can be computed. Increasingly feasible with linked administrative datasets.
(4) Range estimation: For bounded variables, S_h ≈ Range_h/4 (approximately). Crude but sometimes sufficient for allocation planning.
Cochran (1977, pp. 105–106) notes that moderately inaccurate S_h estimates still yield near-optimal allocations because the loss of efficiency from imprecise S_h estimates is generally small when the ratio of largest to smallest S_h is less than 3:1.
Reporting Requirements for Stratified Sampling in Peer-Reviewed Research
(a) Stratification variable: Identify the variable(s) used to define strata; justify its relevance as a correlate of the primary outcome variable; document the source, date, and accuracy of the stratum membership data.
(b) Stratum sizes: Report N, H, and all N_h values; document the administrative source from which N_h were obtained; note whether N_h were exact or estimated, and if estimated, describe the estimation method.
(c) Allocation method: Name the allocation method (proportional, Neyman, equal, or cost-optimal); provide the computed n_h for each stratum; justify the allocation choice in terms of the research objectives (precision per domain? minimum variance? cost efficiency?).
(d) Within-stratum randomisation: Specify the software package and version used; report the random seed(s) used for within-stratum selection; confirm SRS without replacement was applied within each stratum.
(e) Sampling weights: Identify every component of the final weight: the base weights w_h = N_h/n_h = 1/f_h, non-response adjustments, and any raking or calibration adjustments applied. Provide the weight variable in the dataset with documentation in the codebook.
(f) Variance estimation: Explicitly state that the stratified variance formula v̂(ȳ_st) = ΣW_h²(1−f_h)s_h²/n_h was used; name the survey analysis software (e.g., R survey package, Stata svyset, SAS PROC SURVEYMEANS, SPSS Complex Samples). Do not report standard errors computed assuming SRS — this is a prevalent and consequential error in applied research.
(g) Non-response by stratum: Report within-stratum response rates per AAPOR standards; document the non-response protocol; describe any non-response weighting applied and the assumptions underpinning it.
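Items (d) and (f) can be illustrated with a short sketch: a seeded SRS without replacement inside each stratum, followed by the stratified mean and its standard error from v̂(ȳ_st) = ΣW_h²(1−f_h)s_h²/n_h. The function names, data layout (dictionaries keyed by stratum), and example values are assumptions made for illustration; production analyses would normally use the survey software listed in the next section.

```python
import random
import statistics
from math import sqrt

def draw_stratified_sample(frame, n_h, seed):
    """Seeded SRS without replacement within each stratum.

    frame : dict mapping stratum label -> list of unit identifiers
    n_h   : dict mapping stratum label -> sample size for that stratum
    seed  : integer seed, to be reported so the draw can be reproduced
    """
    rng = random.Random(seed)
    return {h: rng.sample(units, n_h[h]) for h, units in frame.items()}

def stratified_mean_and_se(y_sample, N_h):
    """Stratified mean and its standard error, including the FPC.

    y_sample : dict mapping stratum label -> list of observed values
    N_h      : dict mapping stratum label -> stratum population size
    Each stratum needs at least two observations so that s_h^2 is defined.
    """
    N = sum(N_h.values())
    mean, var = 0.0, 0.0
    for h, y in y_sample.items():
        n = len(y)
        W = N_h[h] / N              # stratum weight W_h
        f = n / N_h[h]              # sampling fraction f_h
        mean += W * statistics.mean(y)
        var += W ** 2 * (1 - f) * statistics.variance(y) / n
    return mean, sqrt(var)

# Hypothetical two-stratum example.
frame = {"A": list(range(30)), "B": list(range(100, 120))}
sample_ids = draw_stratified_sample(frame, {"A": 3, "B": 2}, seed=2024)

observed = {"A": [1, 2, 3], "B": [10, 20]}
est_mean, est_se = stratified_mean_and_se(observed, {"A": 30, "B": 20})
```

With these made-up values the estimate is 0.6 × 2 + 0.4 × 15 = 7.2, and the estimated variance is 0.108 + 3.6 = 3.708. Pooling the five observations and treating them as an SRS would give a different mean and a different standard error, which is the error item (f) warns against.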
Survey Software Commands for Stratified Design Specification
| Software | Design Specification Syntax | Notes |
|---|---|---|
| R (survey package) | svydesign(ids=~1, strata=~stratum_var, weights=~wt, fpc=~N_h, data=df) | Lumley (2010); svymean(), svytotal() for estimates |
| Stata | svyset [pweight=wt], strata(stratum_var) fpc(N_h) | Then: svy: mean outcome_var; svy: proportion categorical_var |
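| SAS | PROC SURVEYMEANS DATA=df TOTAL=stratum_totals; STRATA stratum_var; WEIGHT wt; VAR outcome_var; RUN; | stratum_totals is a placeholder name for a data set holding N_h per stratum in a variable named _TOTAL_ (used for the FPC); outputs stratified estimates, SE, and CL automatically |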
| SPSS Complex Samples | CSPLAN … STRATA stratum_var / INCLPROB wt_var. | Followed by CSDESCRIPTIVES or CSTABULATE for analysis (CSSELECT draws the sample rather than analysing it) |
| Python (samplics) | TaylorEstimator(param="mean").estimate(y, stratum=strat_var, samp_weight=wt) | samplics library; supports Taylor linearisation for variance estimation |
Primary Scholarly References
All content in this reference is grounded in peer-reviewed foundational literature in survey sampling methodology. References are formatted per APA 7th Edition.
- Neyman, J. (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97(4), 558–625. [The foundational derivation of optimal (Neyman) allocation; establishes the probabilistic framework for stratified sampling and proves optimality over proportional allocation when stratum variances differ.]
- Cochran, W. G. (1977). Sampling techniques (3rd ed.). John Wiley & Sons. [Chapters 5 and 6 provide the definitive doctoral-level treatment of stratified sampling theory, all allocation methods, variance formulas, the gain from stratification, post-stratification, and comparison with SRS and systematic designs.]
- Kish, L. (1965). Survey sampling. John Wiley & Sons. [Chapter 3 covers stratified sampling design, EPSEM conditions, design effects, and the efficiency comparison framework between stratified and other probability designs.]
- Bowley, A. L. (1926). Measurements of the precision attained in sampling. Bulletin de l'Institut International de Statistique, 22, 1–62. [The originating paper for proportional allocation: establishes the equivalence between the stratified sample fraction and the population fraction within each stratum, laying the groundwork for Neyman's later optimal allocation proof.]
- Lohr, S. L. (2010). Sampling: Design and analysis (2nd ed.). Brooks/Cole. [Chapters 3–4 provide a rigorous and accessible treatment of stratified sampling, post-stratification, collapsing strata, and software implementation of complex stratified designs in R.]
- Wolter, K. M. (2007). Introduction to variance estimation (2nd ed.). Springer. [Chapters 3 and 6 cover variance estimation in stratified designs, the collapsed stratum estimator for sparse strata, jackknife and balanced repeated replication methods as alternatives to Taylor linearisation.]
- Groves, R. M., Fowler, F. J., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2009). Survey methodology (2nd ed.). John Wiley & Sons. [Total survey error framework applied to stratified designs; non-response within strata; frame coverage error and stratum membership misclassification; post-stratification calibration weighting.]
- Hansen, M. H., Hurwitz, W. N., & Madow, W. G. (1953). Sample survey methods and theory (Vols. 1–2). John Wiley & Sons. [Comprehensive design-based inference framework; mathematical proofs of the variance inequality chain for equal, proportional, and optimal allocation in the finite-population context.]
- Dalenius, T., & Hodges, J. L. (1959). Minimum variance stratification. Journal of the American Statistical Association, 54(285), 88–101. [Establishes the cumulative square root frequency rule for optimal stratum boundary determination when the population distribution of the stratification variable is known or estimable.]
- Särndal, C.-E., Swensson, B., & Wretman, J. (1992). Model assisted survey sampling. Springer. [Sections 3.5–3.7 provide advanced model-assisted theory for stratified designs; the GREG estimator under stratification; calibration estimators as extensions of post-stratification.]
- Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). John Wiley & Sons. [MCAR/MAR/MNAR taxonomy applied to within-stratum non-response; multiple imputation and maximum likelihood methods for non-response adjustment in stratified samples.]
- Lumley, T. (2010). Complex surveys: A guide to analysis using R. Wiley. [Practical implementation of stratified sampling designs using R's survey package; svydesign(), svymean(), svytotal(), and domain estimation with strata specification; variance estimation via Taylor linearisation.]
- American Association for Public Opinion Research. (2016). Standard definitions: Final dispositions of case codes and outcome rates for surveys (9th ed.). AAPOR. [Mandatory reference for within-stratum response rate computation and reporting; non-response classification codes applicable to stratified survey designs.]
Recommended Further Reading for Doctoral Candidates
- For the most rigorous mathematical treatment of optimal stratification boundaries: Dalenius, T. (1950). The problem of optimum stratification. Skandinavisk Aktuarietidskrift, 33, 203–213.
- For model-assisted post-stratification and calibration weighting in complex samples: Deville, J.-C., & Särndal, C.-E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87(418), 376–382.
- For applied implementation in Stata of stratified complex survey designs: StataCorp (current edition). Stata Survey Data Reference Manual, specifically the svyset and svy: mean documentation.
- For the design effect literature and its relationship to stratified versus cluster designs: Kish, L. (1987). Statistical Design for Research. Wiley, Chapter 4.