How to Calculate Sample Size for Your Study: A Practical Guide

Sample size is one of the few methodology decisions that can make or break a study before data collection begins. Too few participants and your study won't have the statistical power to detect the effect you're looking for, even if the effect is real. Too many and you've wasted time and resources collecting data you didn't need, possibly while exposing more participants to research procedures than necessary. Calculating the right sample size isn't a guess. It's a calculation based on what you're trying to find, how confident you want to be, and what kind of study you're running.


This guide walks through how to calculate sample size for both quantitative and qualitative research, with a concrete worked example using G*Power, sample size guidelines by study type, and what to do when you can't get the recommended number. For the broader methodology context, see our research methodology guide. For the foundational concepts of population and sample, see our companion guide on population vs sample in research.


Quick Answer: How Big Should Your Sample Be?

For quantitative studies. Run a power analysis. The calculation requires four inputs: effect size, statistical power (typically 0.80), alpha level (typically 0.05), and the type of statistical test you'll use. Free tools like G*Power do the math.

For qualitative studies. Plan for saturation, the point at which additional participants stop revealing new themes. For interview studies, this is typically 12 to 30 participants depending on sample homogeneity and topic complexity.

For mixed methods. Calculate each strand separately. The quantitative strand uses power analysis. The qualitative strand uses saturation logic.

The biggest mistake. Treating sample size as "more is always better." Underpowered studies waste resources without answering the question, but so do unnecessarily large ones. The right size is what your specific question requires.


Why Sample Size Matters

Sample size determines whether your study can answer the question you set out to ask. In quantitative research, sample size determines statistical power, which is the probability that your study will detect a real effect when one exists. A study with 80% power has an 80% chance of finding a true effect of the size you specified, and a 20% chance of missing it (a Type II error). Studies with low power produce non-significant results that don't actually demonstrate the absence of an effect. They demonstrate that the study wasn't capable of finding it.
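To make the 80%/20% split concrete, here is a small Monte Carlo sketch in Python (all numbers are illustrative): it simulates many two-group experiments in which a true medium effect (d = 0.5) exists, with 64 participants per group, and counts how often an independent-samples t-test reaches significance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_per_group, d, alpha, n_sims = 64, 0.5, 0.05, 2000

hits = 0
for _ in range(n_sims):
    control = rng.normal(0.0, 1.0, n_per_group)
    treatment = rng.normal(d, 1.0, n_per_group)  # a true effect of d = 0.5 exists
    _, p = stats.ttest_ind(treatment, control)
    if p < alpha:
        hits += 1

power_estimate = hits / n_sims  # close to 0.80 for this design
```

Roughly 80 percent of the simulated studies detect the effect; the remaining ~20 percent miss it even though the effect is real, which is exactly the Type II error rate that the power analysis controls.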


In qualitative research, sample size determines whether you've heard enough perspectives to understand the phenomenon you're studying. Too few participants and you risk drawing conclusions from idiosyncratic experiences. Too many and you're collecting interview data that won't add to your understanding while burning time and resources. The qualitative answer to "how many" is "enough to reach saturation," and the practical question is how to plan for that.


Reviewers screen for sample size adequacy in the methodology section. A quantitative sample too small to detect the predicted effect, or a qualitative sample that hasn't been justified relative to the research question, is one of the most common reasons for revise-and-resubmit decisions in peer review.


Calculating Sample Size for Quantitative Studies

For quantitative research, sample size is calculated through a power analysis. The power analysis takes the inputs you provide and tells you the minimum sample size needed to detect an effect of the size you specified, with the statistical confidence you specified. The four inputs are non-negotiable: every quantitative power analysis requires them.


The Four Inputs to a Power Analysis

Input | What it is | How to choose it
Effect size | The magnitude of the relationship you expect to find | Use prior research or pilot data; if neither is available, use Cohen's small (0.2), medium (0.5), or large (0.8) conventions
Statistical power (1 - β) | The probability of detecting an effect if one exists | Convention is 0.80; some fields require 0.90
Alpha level (α) | The probability of a false positive (Type I error) | Convention is 0.05; some fields use 0.01 for stricter standards
Statistical test | The specific test you'll use to analyze the data | Determined by your research question and data type (t-test, ANOVA, regression, chi-square, etc.)

Where Effect Size Estimates Come From

Effect size is the input most graduate researchers struggle with, because it requires you to estimate something you don't know yet (the size of the effect you're trying to detect). Three sources are commonly used.


  • Prior published research. If similar studies have been published, look up their effect sizes and use the average or median as your estimate. This is the strongest approach when good prior research exists.
  • Pilot data. If you've run a small pilot study, calculate the effect size from your pilot results. Pilot effect sizes are often unstable due to small samples, so treat them as ballpark estimates rather than precise values.
  • Cohen's conventions. When neither prior research nor pilot data is available, use Cohen's standard small (0.2), medium (0.5), or large (0.8) effect size conventions. The trade-off is that smaller assumed effect sizes require larger samples, so a small effect size assumption produces a more conservative (larger) sample size estimate.
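If you do have pilot data for a two-group comparison, Cohen's d is the difference between the group means divided by the pooled standard deviation. The sketch below uses hypothetical pilot results (the means, SDs, and group sizes are made up for illustration):

```python
import math

# Hypothetical pilot results: intervention vs. control on a continuous outcome
m1, s1, n1 = 78.4, 10.2, 15  # mean, SD, and n for the intervention group
m2, s2, n2 = 72.1, 11.5, 15  # mean, SD, and n for the control group

# Pooled standard deviation across the two groups
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Cohen's d: standardized mean difference
d = (m1 - m2) / sp  # about 0.58, a medium effect by Cohen's conventions
```

Remember that a pilot estimate like this is a ballpark figure, not a precise value; the small pilot sample makes it unstable.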

A common error is to assume a large effect size to keep the calculated sample size manageable. This produces a study that's underpowered if the actual effect is medium or small. The honest approach is to use the smallest effect size that would be theoretically or practically interesting, and recruit accordingly.


A Worked Example Using G*Power

Suppose you're planning an experiment comparing two groups (an intervention group and a control group) on a continuous outcome. You expect a medium effect size, you want 80% power, and you'll use a two-tailed alpha of 0.05. Here's how the calculation runs in G*Power, a free power analysis tool widely used in social science and health research.


  1. Open G*Power and select the test family. For a two-group comparison on a continuous outcome, choose "t tests" from the test family menu.
  2. Select the specific test. Choose "Means: Difference between two independent means (two groups)."
  3. Set the analysis type. Select "A priori: Compute required sample size – given α, power, and effect size."
  4. Enter the inputs. Effect size d = 0.5 (medium). Alpha = 0.05. Power = 0.80. Allocation ratio N2/N1 = 1 (equal groups).
  5. Read the output. G*Power calculates that you need 64 participants per group (128 total) to detect a medium effect with 80% power at α = 0.05.
  6. Adjust for expected attrition. If you expect 10% dropout, increase your recruitment target accordingly: 128 ÷ 0.90 ≈ 142.2, so recruit 143 participants total.

If you change any input, the required sample size changes. Increasing power to 0.90 raises the requirement to 86 participants per group. Decreasing the assumed effect size to 0.3 raises it to 176 per group. This sensitivity is the point: the power analysis tells you what your specific design requires given your specific assumptions. Document the inputs you used in your methodology section so reviewers can evaluate the choices.
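If you prefer to script the calculation rather than use the G*Power interface, the same numbers can be reproduced with the statsmodels library in Python. This is a sketch; rounding conventions and library versions can produce slightly different decimals before the ceiling step.

```python
import math
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# A priori sample size: d = 0.5, alpha = 0.05, power = 0.80, equal groups
n_per_group = math.ceil(analysis.solve_power(
    effect_size=0.5, alpha=0.05, power=0.80,
    ratio=1.0, alternative='two-sided'))  # 64 per group
total = 2 * n_per_group                   # 128 total

# Sensitivity: raising power or shrinking the assumed effect raises n
n_power_090 = math.ceil(analysis.solve_power(
    effect_size=0.5, alpha=0.05, power=0.90,
    ratio=1.0, alternative='two-sided'))  # 86 per group
n_d_03 = math.ceil(analysis.solve_power(
    effect_size=0.3, alpha=0.05, power=0.80,
    ratio=1.0, alternative='two-sided'))  # 176 per group

# Inflate recruitment for 10% expected attrition
recruit = math.ceil(total / 0.90)         # 143 participants
```

Scripting the calculation also makes it easy to report exact inputs in the methodology section and to rerun the analysis if a reviewer asks about a different effect size assumption.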


Calculating Sample Size for Qualitative Studies

Qualitative sample size doesn't follow a power analysis logic because qualitative research isn't trying to estimate effects in a population. The goal is to understand a phenomenon in depth, and the question is how many participants are needed to develop that understanding adequately. The concept that does this work in qualitative research is saturation.


What Saturation Means and How to Plan for It

Saturation is the point at which additional data collection stops yielding new information. In an interview study, you reach saturation when the next interview produces themes you've already heard, perspectives you've already documented, and explanations you've already coded. Continuing to interview past that point increases your data volume without increasing your understanding.


The challenge is that you can't predict exactly when saturation will arrive before you start collecting data. The honest answer in your methodology section is to specify a planned sample size range based on the literature for similar studies, commit to assessing saturation iteratively during data collection, and adjust upward if saturation hasn't been reached at the planned endpoint.
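One transparent way to operationalize "assessing saturation iteratively" is to log the codes each new interview contributes and stop after a pre-specified number of consecutive interviews that add nothing new. The sketch below is a toy illustration with made-up interview codes, not a substitute for qualitative judgment:

```python
# Hypothetical codes identified in each successive interview (illustrative only)
interview_codes = [
    {"cost", "visa", "housing"},
    {"cost", "language", "housing"},
    {"isolation", "visa"},
    {"cost", "isolation"},
    {"housing", "language"},
    {"visa", "cost"},
]
STOP_AFTER = 3  # pre-specified: stop after this many interviews with no new codes

seen, streak, stop_at = set(), 0, None
for i, codes in enumerate(interview_codes, start=1):
    new = codes - seen           # codes this interview adds to the codebook
    seen |= codes
    streak = 0 if new else streak + 1
    if streak >= STOP_AFTER and stop_at is None:
        stop_at = i              # saturation criterion met at this interview
```

A log like this, kept alongside the codebook, gives the methodology section concrete evidence for the claim that saturation was assessed during data collection rather than assumed afterward.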


Typical Qualitative Sample Sizes by Study Type

Qualitative study type | Typical sample size | What affects the range
In-depth interview study | 12 to 30 participants | Sample homogeneity, topic complexity, interview length
Focus group study | 3 to 6 groups of 6 to 10 participants | Homogeneity within groups, number of comparison conditions
Phenomenological study | 5 to 25 participants | Smaller samples support deeper interpretation; larger samples expand thematic range
Grounded theory study | 20 to 30 participants | Theoretical sampling continues until categories are saturated
Ethnographic study | Varies; depth and duration matter more than count | Months to years of fieldwork; participant count is secondary
Case study (single case) | 1 case studied in depth | Multiple data sources (interviews, observations, documents)
Multiple case study | 3 to 10 cases | Replication logic across cases; not statistical sampling

More homogeneous samples reach saturation faster. A study of first-year international graduate students at one US university typically reaches saturation around 12 to 20 interviews because the participants share important characteristics. A study of working adults from diverse backgrounds in multiple cities may take 30 or more interviews to reach saturation because the sample is more heterogeneous.


Sample Size by Study Type: Quick Reference

The table below summarizes typical sample size ranges across common study types. Treat these as starting points, not as substitutes for a proper power analysis or saturation assessment specific to your study.


Study type | Typical sample size | Sample size logic
Two-group experiment (medium effect) | ~64 per group, 128 total | Power analysis with d = 0.5, power = 0.80, α = 0.05
Survey for population estimate | 385 to 1,067 | Margin of error of 5% to 3% at 95% confidence for large populations
Multiple regression study | 10 to 20 per predictor variable | Rule of thumb; verify with power analysis for specific design
Structural equation modeling | 200 to 500 | Larger samples needed for stable parameter estimates
Pilot study | 10 to 30 | Preliminary feasibility, not effect size estimation
Interview study | 12 to 30 | Saturation point varies by sample homogeneity
Case study | 1 to 10 cases | Replication logic, not statistical generalization
Systematic review | All studies meeting inclusion criteria | Defined by search strategy, not pre-specified

What to Do When You Can't Get the Recommended Sample Size

Real-world constraints often prevent researchers from collecting the ideal sample. The right response depends on the study type and the size of the gap between what you can collect and what would be ideal.


  • Reframe the research question. If your sample is too small to detect a small effect, narrow the question to focus on a larger effect that your sample can detect. A small study that answers a focused question well is more valuable than a small study that fails to answer a broader question.
  • Use a more powerful design. Within-subjects designs (where participants serve as their own controls) typically require fewer participants than between-subjects designs to detect the same effect. Repeated measures designs increase power for the same sample size.
  • Use a more sensitive outcome measure. Continuous outcomes generally require smaller samples than dichotomous outcomes for equivalent statistical power. If your hypothesis allows it, prefer continuous measurement.
  • Acknowledge the limitation transparently. If you can't recruit the recommended sample size and can't change the design, name the limitation in your methodology and discussion sections. Reviewers are more confident in studies that acknowledge limitations honestly than in studies that pretend none exist.
  • Reframe as a pilot study. If your sample is too small to support definitive conclusions, frame the study explicitly as preliminary or exploratory. Pilot studies have legitimate scientific value when reported as such, including informing power analyses for future larger studies.

How to Justify Sample Size in Your Methodology Section

Reviewers expect to see sample size justified, not just stated. The methodology section needs to explain why you chose the sample size you did, not just report the number you ended up with. Strong justifications follow a predictable structure.


  1. State the sample size. Specify the number of participants enrolled, the number who completed the study, and any attrition.
  2. Cite the calculation method. For quantitative studies, name the power analysis tool (e.g., G*Power 3.1) and cite the source. For qualitative studies, cite the saturation literature for similar studies.
  3. Specify the inputs. For quantitative studies, report the assumed effect size, statistical power, alpha level, and statistical test. For qualitative studies, describe how saturation was assessed.
  4. Justify the inputs. Cite prior research, pilot data, or convention for the assumed effect size. Cite established standards for power and alpha (typically 0.80 and 0.05).
  5. Address attrition and missing data. If your final analytic sample is smaller than enrollment, explain why and discuss whether the loss was random or systematic.
  6. Discuss the limits. If your sample is smaller than ideal or doesn't perfectly fit the planned analysis, address the implications in the limitations section.

Common Sample Size Mistakes

The same problems appear in graduate research over and over. Knowing them in advance saves a round of revisions.


  • Assuming a large effect size to keep the sample manageable. This produces a study that's underpowered if the actual effect is medium or small. Use the smallest effect size that would be theoretically interesting, not the largest you can defend.
  • Running the power analysis after data collection. Post-hoc power analysis based on observed effect sizes provides no useful information about whether the original study was adequately powered. The power analysis must come before data collection.
  • Treating qualitative sample size as a number to hit rather than a process to follow. Qualitative sample size depends on saturation, which depends on the data. Specify a planned range and assess saturation iteratively, rather than committing to a fixed number before collecting data.
  • Confusing sample size for descriptive statistics with sample size for inferential statistics. A sample of 30 may be plenty to describe characteristics of a specific group; it's typically too small to support generalization to a larger population.
  • Forgetting attrition. Recruit enough participants to end up with the analytic sample your power analysis requires. If you expect 10% dropout, recruit 10% extra at minimum.
  • Ignoring clustering or nesting in the design. If participants are nested within groups (students within schools, employees within companies), the effective sample size for some analyses is smaller than the participant count. Multilevel power analysis is required.
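On the clustering point, a common first-pass correction (assuming roughly equal cluster sizes) multiplies the individual-level sample size by the design effect, 1 + (m − 1) × ICC, where m is the cluster size and ICC is the intraclass correlation. The cluster size and ICC below are hypothetical placeholders; a full multilevel power analysis is still the right tool for a real design.

```python
import math

n_individual = 128  # from an individual-level power analysis
cluster_size = 25   # e.g., students per school (hypothetical)
icc = 0.10          # intraclass correlation (hypothetical)

deff = 1 + (cluster_size - 1) * icc          # design effect = 3.4
n_required = math.ceil(n_individual * deff)  # 436 participants
```

Even a modest ICC of 0.10 more than triples the required sample here, which is why ignoring nesting is such a costly mistake.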

Frequently Asked Questions

How do I calculate sample size for my study?

For quantitative studies, run a power analysis. The calculation requires four inputs: effect size (the magnitude of the relationship you expect to find), statistical power (typically 0.80), alpha level (typically 0.05), and the type of statistical test you'll use. Free tools like G*Power do the math. For qualitative studies, plan for saturation, the point at which additional participants stop revealing new themes. Interview studies typically reach saturation at 12 to 30 participants depending on sample homogeneity and topic complexity.


What is a power analysis?

A power analysis is the formal calculation that determines how many participants a quantitative study needs. The calculation takes four inputs: the effect size you expect to find, the statistical power you want (typically 0.80, meaning an 80 percent chance of detecting a true effect), the alpha level you'll use (typically 0.05), and the specific statistical test you'll run. The output is the minimum sample size needed to detect an effect of the specified size with the specified confidence.


What sample size do I need for an interview study?

Interview studies typically reach saturation at 12 to 30 participants. The exact number depends on sample homogeneity, topic complexity, and the depth of each interview. More homogeneous samples reach saturation faster. A study of first-year graduate students at one university typically reaches saturation around 12 to 20 interviews. A study of working adults from diverse backgrounds may take 30 or more interviews. The methodology section should specify a planned range and describe how saturation was assessed during data collection.


What is statistical power?

Statistical power is the probability that a study will detect a real effect when one exists. A study with 80 percent power has an 80 percent chance of finding a true effect of the size specified in the power analysis, and a 20 percent chance of missing it (a Type II error). The conventional standard is 0.80, with some fields requiring 0.90 for higher confidence. Studies with low power produce non-significant results that don't actually demonstrate the absence of an effect; they demonstrate that the study wasn't capable of finding it.


What is effect size in a power analysis?

Effect size is the magnitude of the relationship between variables that the study aims to detect. Common effect size measures include Cohen's d for differences between means, r for correlations, and f for analysis of variance. Cohen's conventions classify effect sizes as small (d = 0.2), medium (d = 0.5), or large (d = 0.8). Effect size estimates for power analysis come from prior published research, pilot data, or Cohen's conventions when neither is available. Smaller assumed effect sizes require larger samples, so the choice has direct consequences for sample size requirements.


What is saturation in qualitative research?

Saturation is the point at which additional data collection stops yielding new information. In an interview study, saturation is reached when the next interview produces themes already heard, perspectives already documented, and explanations already coded. The concept replaces statistical sample size logic in qualitative research because the goal is depth of understanding rather than population estimation. Saturation can't be predicted exactly in advance; the methodology section should specify a planned sample size range and commit to assessing saturation iteratively during data collection.


What sample size do I need for a survey?

Survey sample size depends on the desired margin of error, confidence level, and population size. For a large population at 95 percent confidence, a margin of error of 5 percent requires approximately 385 respondents, and a margin of error of 3 percent requires approximately 1,067 respondents. The calculation assumes random sampling; convenience samples don't support the same kind of inference regardless of size. Online sample size calculators perform this calculation if you input the population size, confidence level, and desired margin of error.
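The figures above come from the standard large-population formula n = z²p(1 − p)/e², evaluated at the most conservative proportion p = 0.5. A quick sketch (rounding up, which lands one respondent above the commonly quoted 1,067 for the 3 percent case):

```python
import math

def survey_n(margin_of_error, z=1.96, p=0.5):
    """Minimum respondents for a large population at the given margin of error."""
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

n_5pct = survey_n(0.05)  # 385 respondents for a 5% margin of error
n_3pct = survey_n(0.03)  # 1068, i.e., the commonly cited ~1,067 ceiling-rounded
```

For smaller populations, a finite population correction shrinks these numbers; online calculators apply it automatically when you supply the population size.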


Can I run a power analysis after collecting data?

Post-hoc power analysis based on observed effect sizes provides no useful information about whether the original study was adequately powered. The calculation is mathematically circular: a study that found a large observed effect will compute as having had high power, while a study that found a small observed effect will compute as having had low power, even if the true effect is the same in both cases. The power analysis must be conducted before data collection, with effect size estimated from prior research or convention rather than from the study's own results.
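The circularity is easy to see numerically. Holding the sample size and alpha fixed, plugging in a large observed effect computes "high power" and a small observed effect computes "low power," regardless of the true effect. A sketch using statsmodels (the n of 30 per group is an arbitrary illustration):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group, alpha = 30, 0.05  # identical design in both calculations

# The only input that differs is the effect size observed after the fact
power_if_large = analysis.power(effect_size=0.8, nobs1=n_per_group,
                                alpha=alpha, ratio=1.0,
                                alternative='two-sided')  # comes out high
power_if_small = analysis.power(effect_size=0.2, nobs1=n_per_group,
                                alpha=alpha, ratio=1.0,
                                alternative='two-sided')  # comes out low
```

Because the observed effect drives the answer, the post-hoc calculation restates the p-value in different units rather than saying anything new about the study's design.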


What if I can't get the sample size my power analysis recommends?

Several strategies help when the recommended sample size isn't feasible. Reframe the research question to focus on a larger effect that your sample can detect. Use a more powerful design (within-subjects rather than between-subjects, repeated measures). Use a more sensitive outcome measure (continuous rather than dichotomous). If you can't change the design or the question, acknowledge the limitation transparently in the methodology and discussion sections. Reframing the study as preliminary or exploratory is also legitimate when the sample is too small to support definitive conclusions.


Professional Editing for Your Research Manuscript

Sample size justification is one of the methodology elements reviewers screen for first. A vague or incomplete sample size discussion can trigger a desk rejection or major-revisions decision before reviewers evaluate the rest of the study. Clear writing about the inputs to your power analysis, the source of your effect size estimates, and the limitations of your final sample is one of the strongest signals that the rest of the methodology is rigorous.


Editor World provides dissertation editing and academic editing services for researchers preparing theses, dissertations, and journal article submissions. Every editor is a native English speaker from the United States, the United Kingdom, or Canada, with an advanced degree in their field. Every document is reviewed by a real person, never by AI. To see who would be working on your manuscript, you can choose your own editor from the Editor World roster, or request a free sample edit of up to 300 words before committing to a full edit. Pricing is fully transparent through an instant price calculator that shows your exact cost before you commit.


A certificate of editing confirming human-only native English editing is available as an optional add-on for journal submissions where AI use must be disclosed. For more on research methodology and design, see our research methodology guide, population vs sample in research, quantitative vs qualitative research, and research bias guide.



This article was reviewed by the Editor World editorial team. Editor World, founded in 2010 by Patti Fisher, PhD, provides professional editing and proofreading services for graduate students, academics, and researchers worldwide. BBB A+ accredited since 2010 with 5.0/5 Google Reviews and 5.0/5 Facebook Reviews. More than 100 million words edited for over 8,000 clients in 65+ countries.