Descriptive Statistics: A Complete Guide with Examples

Quick Answer

What descriptive statistics are.
Numbers that summarize and describe the features of a dataset. They don't let you make claims beyond your sample, but they tell readers exactly who you studied, what you measured, and what you found.

The three categories.
Measures of central tendency (mean, median, mode). Measures of variability (range, standard deviation, variance, interquartile range). Measures of shape (skewness, kurtosis).

Why they matter.
Nearly every empirical research paper needs descriptive statistics. They come before any inferential test. They give context that makes the rest of your analysis meaningful.

What Descriptive Statistics Are

Descriptive statistics are the numbers that summarize what's in your data. The mean. The standard deviation. The percentage of women in your sample. The range of ages. The most common response to a survey item. They describe your sample but don't let you make claims beyond it.

Every research paper needs them. Without descriptive statistics, readers can't tell whether your sample is representative, whether your scores are clustered tightly or spread widely, or whether your data is shaped in ways that affect which tests you should run. Descriptive statistics come first in your results section for a reason. They set the stage for everything else.

Consider a real example. Fisher and Yao (2017) studied gender differences in financial risk tolerance using the Survey of Consumer Finances, a large nationally representative dataset. Before they ran any inferential analysis, they reported descriptive statistics for every variable: the mean age of respondents, the mean income and net worth, the proportion of men and women in different risk tolerance categories. Those descriptive statistics gave readers the context to interpret the inferential findings that followed. Without them, the analysis would have been impossible to evaluate.

The Three Categories

Descriptive statistics divide into three categories. Each one answers a different question about your data.

Central tendency answers the question: what's the typical value? The mean, median, and mode are the three measures of central tendency. Each one defines "typical" differently.

Variability answers the question: how spread out is the data? Range, interquartile range, variance, and standard deviation all measure spread in different ways.

Shape answers the question: what does the distribution look like? Skewness measures asymmetry. Kurtosis measures how heavy the tails are.

Measures of Central Tendency

Central tendency describes the center of your data. The three measures use different definitions of "center" and give different answers on the same data.

Mean

The arithmetic average. Add all values and divide by the count. The mean uses every data point in the calculation. It's the standard reporting choice for continuous, roughly symmetric data without extreme outliers.

In a study like Fisher and Yao (2017), the mean age of respondents and the mean income are descriptive statistics that establish who was in the sample. A paper studying retirement decisions in a 65-year-old population means something very different from a paper studying the same outcomes in a 25-year-old population.

Median

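The middle value when data is sorted from smallest to largest. With an even number of values, it's the average of the two middle values. The median depends only on the rank order of your data, not on how far the extreme values sit from the center. This makes it resistant to outliers, which is exactly why it's the right choice for skewed variables.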

Income is the classic example. Survey of Consumer Finances data, like the dataset used in Fisher and Yao (2017), includes a small number of very high earners whose values pull the mean far above what the typical household earns. The median household income gives a more accurate picture of the typical respondent. This is why government statistics agencies typically lead with median household income rather than mean household income.
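A minimal sketch of that effect, using Python's standard statistics module. The income figures are invented for illustration and are not taken from the Survey of Consumer Finances.

```python
from statistics import mean, median

# Hypothetical household incomes in dollars (illustrative only).
# The last value stands in for one very high earner.
incomes = [32_000, 41_000, 48_000, 52_000, 55_000,
           61_000, 67_000, 74_000, 2_500_000]

mean_income = mean(incomes)      # pulled upward by the single extreme value
median_income = median(incomes)  # the middle value of the sorted list

print(f"Mean:   ${mean_income:,.0f}")    # Mean:   $325,556
print(f"Median: ${median_income:,.0f}")  # Median: $55,000
```

One extreme household moves the mean to roughly six times the median, while the median still describes the household in the middle of the list.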

Mode

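The most frequent value. The mode is the only measure of central tendency that works for nominal data, meaning categories with no natural order. For variables like gender, political affiliation, or marital status, the mode tells you the most common response. Ordered categories, such as education level, can also be summarized with a median.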
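As a small illustration, the mode of a categorical variable can be found with the standard library. The responses below are made up for the example.

```python
from collections import Counter
from statistics import mode, multimode

# Hypothetical survey responses (illustrative only).
marital_status = ["married", "single", "married", "divorced",
                  "single", "married", "widowed"]

most_common = mode(marital_status)        # most frequent category
counts = Counter(marital_status)          # frequency of every category
share = counts[most_common] / len(marital_status)

print(most_common, f"{share:.0%}")        # married 43%

# When two categories tie, multimode returns all of them.
tied = multimode(["yes", "no", "yes", "no"])
print(tied)                               # ['yes', 'no']
```

Reporting the share alongside the modal category is usually more informative than the mode alone.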

When to use each

For continuous symmetric data: mean. For skewed data or outliers: median. For categorical data: mode. For deeper coverage of when each measure works best, including worked examples, see our guide to mean, median, and mode.

Measures of Variability

Variability describes how spread out your data is. Two samples can have the same mean and look completely different. One can be tightly clustered. The other can be widely scattered. Measures of variability capture that difference.
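A quick sketch of that point, with two invented samples that share a mean of 30 but differ sharply in spread:

```python
from statistics import mean, stdev

# Two hypothetical samples with the same mean (illustrative only).
tight = [28, 29, 30, 31, 32]
spread = [10, 20, 30, 40, 50]

for label, sample in (("tight", tight), ("spread", spread)):
    print(f"{label}: M = {mean(sample):.1f}, SD = {stdev(sample):.2f}")
# tight: M = 30.0, SD = 1.58
# spread: M = 30.0, SD = 15.81
```

The means are identical. Only the measure of variability reveals that the second sample is ten times as spread out.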

Range

The simplest measure. The largest value minus the smallest value. The range is easy to calculate but tells you almost nothing about how the data is distributed between those extremes. It's most useful for quickly summarizing the limits of your sample, like reporting that participants ranged in age from 18 to 75.

Interquartile range (IQR)

The range of the middle 50% of your data. Calculated as the third quartile (75th percentile) minus the first quartile (25th percentile). The IQR is resistant to outliers, which makes it the right partner for the median when describing skewed data.

For skewed financial data like net worth in the Survey of Consumer Finances, researchers like Fisher and Yao report both the median and the IQR. The IQR shows the spread of the middle half of households without being distorted by the small number of very wealthy outliers at the top of the distribution.
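The sketch below computes an IQR with statistics.quantiles on invented net worth values (in thousands of dollars). Quartiles can be defined in several ways; the "inclusive" method used here is one common convention, so other software may return slightly different cut points for the same data.

```python
from statistics import quantiles

# Hypothetical net worth values in thousands of dollars (illustrative only).
net_worth = [5, 12, 20, 35, 50, 80, 120, 200, 3000]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = quantiles(net_worth, n=4, method="inclusive")
iqr = q3 - q1
value_range = max(net_worth) - min(net_worth)

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")  # Q1 = 20.0, Q3 = 120.0, IQR = 100.0
print(f"Range = {value_range}")              # Range = 2995
```

The single very wealthy household inflates the range to 2,995 but leaves the IQR at 100, which is why the IQR is the better companion for the median here.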

Variance

The average of the squared differences from the mean. For a sample, the sum of squared differences is divided by n − 1 rather than n, which corrects for the fact that the mean was estimated from the same data. Variance captures spread but is in squared units, which makes it hard to interpret directly. If your data is in dollars, the variance is in squared dollars. This is why variance rarely appears in results sections, even though it underlies most inferential tests.

Standard deviation

The square root of the variance. Standard deviation brings the measure back to the original units of your data. If your data is in years, the standard deviation is in years. This is what you report alongside the mean in research papers.
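To make the relationship concrete, here is a short sketch on invented ages. It also shows the two variance conventions in Python's statistics module: pvariance divides by n, while variance divides by n − 1, the sample version that research papers usually report.

```python
from statistics import pvariance, stdev, variance

# Hypothetical ages in years (illustrative only).
ages = [22, 25, 30, 35, 38]

pop_var = pvariance(ages)    # sum of squared deviations / n
sample_var = variance(ages)  # sum of squared deviations / (n - 1)
sample_sd = stdev(ages)      # square root of the sample variance

print(f"Population variance: {pop_var:.1f} years squared")  # 35.6
print(f"Sample variance:     {sample_var:.1f} years squared")  # 44.5
print(f"Sample SD:           {sample_sd:.2f} years")  # 6.67
```

The standard deviation of about 6.7 years can be read directly against the ages themselves; the variance of 44.5 squared years cannot.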

For roughly normal data, the standard deviation has specific interpretations. About 68% of values fall within one standard deviation of the mean. About 95% fall within two. About 99.7% fall within three. This is the 68-95-99.7 rule, and it makes the standard deviation directly useful for thinking about how your data is distributed.
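The percentages in the rule can be checked against the normal distribution itself. This sketch uses statistics.NormalDist; the helper name share_within is just a label chosen for the example.

```python
from statistics import NormalDist

# A standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"Within {k} SD: {share_within(k):.1%}")
# Within 1 SD: 68.3%
# Within 2 SD: 95.4%
# Within 3 SD: 99.7%
```

These figures hold only to the extent that the data is approximately normal. For strongly skewed variables the rule can be badly off.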

When to use each

For symmetric continuous data: mean paired with standard deviation. For skewed data: median paired with interquartile range. Both pairings come up in papers like Fisher and Yao (2017) because the SCF includes both symmetric variables (age) and skewed variables (income, net worth). For deeper coverage of variability measures, including formulas and worked examples, see our guide to standard deviation and variance.

Measures of Distribution Shape

Shape describes whether your data is symmetric, skewed, or unusual in other ways. Two measures capture shape.

Skewness

A measure of asymmetry. Positive skew means the data has a long right tail (a few large values pulling the distribution rightward). Negative skew means a long left tail. Symmetric data has a skewness near zero.

Financial variables almost always show positive skew. Income, net worth, household wealth, and similar measures cluster around modest values with a small number of very large outliers stretching the tail. The skew of these variables matters for descriptive statistics because it tells you whether to report mean and standard deviation (for symmetric data) or median and IQR (for skewed data).
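For readers who want to see the sign of skewness computed, here is a minimal sketch of the moment-based formula (third central moment divided by the second central moment raised to the power 1.5). This is the uncorrected version; statistical packages often apply a small-sample adjustment, so their output may differ slightly. The function name and data are invented for illustration.

```python
from statistics import fmean

def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (no small-sample correction)."""
    center = fmean(values)
    deviations = [v - center for v in values]
    m2 = fmean([d ** 2 for d in deviations])  # second central moment
    m3 = fmean([d ** 3 for d in deviations])  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]      # one large value stretches the right tail
left_tailed = [-v for v in right_tailed]      # mirror image

print(skewness(symmetric))          # 0.0
print(skewness(right_tailed) > 0)   # True (positive skew)
print(skewness(left_tailed) < 0)    # True (negative skew)
```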

Kurtosis

A measure of how heavy the tails are. High kurtosis means extreme values are more common than they'd be in a normal distribution. Low kurtosis means the tails are lighter, so extreme values are rarer than in a normal distribution. Kurtosis is driven by the tails, not by how peaked or flat the center looks. The normal distribution has a kurtosis of 3, and many statistical packages report "excess kurtosis," which subtracts 3 to make the normal distribution's kurtosis equal to zero.
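A minimal sketch of excess kurtosis, computed as the fourth central moment divided by the squared second central moment, minus 3. Like most hand-rolled versions, it omits the small-sample corrections that statistical packages may apply. The function name and data are invented for illustration.

```python
from statistics import fmean

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (no small-sample correction)."""
    center = fmean(values)
    deviations = [v - center for v in values]
    m2 = fmean([d ** 2 for d in deviations])  # second central moment
    m4 = fmean([d ** 4 for d in deviations])  # fourth central moment
    return m4 / m2 ** 2 - 3

light_tails = [1, 2, 3, 4, 5]                    # evenly spread, no extreme values
heavy_tails = [-10, -1, 0, 0, 0, 0, 0, 1, 10]    # mostly central, two extreme values

print(f"{excess_kurtosis(light_tails):.2f}")  # -1.30
print(f"{excess_kurtosis(heavy_tails):.2f}")  # 1.41
```

Negative excess kurtosis indicates lighter tails than a normal distribution, and positive excess kurtosis indicates heavier tails.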

For most research papers, you don't need to report skewness and kurtosis numerically. A histogram or box plot shows the same information visually. But knowing the shape matters because it affects which tests you can validly run. Many statistical tests assume the data is roughly normal, and severe skew or extreme kurtosis violates that assumption.

Visualization Goes With Descriptive Statistics

Descriptive statistics and visualization work together. The summary statistics give you precise numbers. The visualizations show you the shape and spread that the numbers compress.

A histogram shows the distribution of a single continuous variable. A box plot shows the median, quartiles, and outliers compactly. A scatter plot shows the relationship between two continuous variables. These three visualizations cover most of what researchers need at the descriptive stage. For a full decision framework on choosing the right visualization for your data, see our guide to box plots, scatter plots, and choosing the right visualization.

A study like Fisher and Yao (2017) depends on inspecting distributions at the descriptive stage. The shape of variables like net worth and income tells the researcher which measures of central tendency and variability are appropriate. Without a histogram or box plot, the choice between mean and median is hard to defend.

Reporting Descriptive Statistics in APA Format

APA 7 has specific conventions for reporting descriptive statistics.

In the text

For a small number of variables, report descriptive statistics directly in the prose. Example:

Respondents ranged in age from 21 to 92 years (M = 49.4, SD = 17.2). Approximately 52% of the sample identified as female. The median household income was $52,000 (IQR = $28,000 to $98,000).

Note the mixed reporting. Symmetric variables (age) get mean and standard deviation. Skewed variables (income) get median and interquartile range. The proportion of women gets a percentage. APA conventions don't require one format for everything; they require the right format for each kind of variable.
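If you compute descriptive statistics in code, small helper functions can generate these strings so the prose and the analysis stay in sync. The sketch below is one possible approach, not an official APA tool. The function names and the data are hypothetical, and the italics on M, SD, and Mdn still have to be applied in the manuscript.

```python
from statistics import mean, median, quantiles, stdev

def apa_mean_sd(values, decimals=1):
    """Format a symmetric variable as 'M = ..., SD = ...'."""
    return f"M = {mean(values):.{decimals}f}, SD = {stdev(values):.{decimals}f}"

def apa_median_iqr_dollars(values):
    """Format a skewed dollar variable as a median with its interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return f"Mdn = ${median(values):,.0f} (IQR = ${q1:,.0f} to ${q3:,.0f})"

# Hypothetical data (illustrative only).
ages = [22, 25, 30, 35, 38]
incomes = [32_000, 41_000, 48_000, 52_000, 55_000,
           61_000, 67_000, 74_000, 2_500_000]

age_text = apa_mean_sd(ages)
income_text = apa_median_iqr_dollars(incomes)

print(age_text)     # M = 30.0, SD = 6.7
print(income_text)  # Mdn = $55,000 (IQR = $48,000 to $67,000)
```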

In tables

For studies with many variables, descriptive statistics belong in a table. Each row is a variable. Columns include M, SD, Mdn, range, and percentages where relevant. Tables let readers compare descriptive statistics across variables quickly without slowing down the prose.

Studies using large datasets like the Survey of Consumer Finances, including Fisher and Yao (2017), almost always present descriptive statistics in tables. With 20 or 30 variables in a typical financial study, prose reporting would be overwhelming.

Specific notation rules

Italicize statistical symbols: M, SD, Mdn, N, n.
Capitalize and italicize "Mode" when used as a measure: Mode = blue.
Drop leading zeros only on values that can't exceed 1 (proportions, correlations, p-values). Keep them on means, SDs, and other values that can exceed 1.
Report percentages with a percent sign: 52%, not 0.52 or .52.
For tables, include a note line below explaining abbreviations, sample sizes, and any unusual notation.
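The leading-zero rule is easy to get wrong by hand. Here is a minimal sketch of a formatter that applies it, under the assumption that the caller states whether the statistic is bounded by 1. The function name and parameters are invented for illustration.

```python
def apa_number(value, decimals=2, bounded_by_one=False):
    """Format a number, dropping the leading zero only for statistics
    that cannot exceed 1 in absolute value (proportions, correlations, p values)."""
    text = f"{value:.{decimals}f}"
    if bounded_by_one:
        if text.startswith("0."):
            text = text[1:]           # "0.034" -> ".034"
        elif text.startswith("-0."):
            text = "-" + text[2:]     # "-0.45" -> "-.45"
    return text

print(apa_number(0.034, decimals=3, bounded_by_one=True))  # .034  (a p value)
print(apa_number(-0.45, bounded_by_one=True))              # -.45  (a correlation)
print(apa_number(0.87))                                    # 0.87  (a mean that could exceed 1)
```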

Common Mistakes

Reporting only the mean for skewed data. If your data is skewed, the mean misrepresents the typical value. Use median and IQR instead, or report both.
Reporting mean without standard deviation. APA style calls for a measure of variability alongside every mean. A mean alone doesn't tell readers anything about variability.
Skipping descriptive statistics entirely. Some authors jump straight to inferential results. Reviewers will ask for descriptive statistics. You can't evaluate inferential findings without knowing what the data looked like.
Reporting descriptive statistics that don't match the table. If the text says M = 42.7 and the table says M = 42.1, reviewers will catch the discrepancy. Verify that prose and tables match before submission.
Using mean and SD on Likert scales without justification. Some fields accept this. Others require median for ordinal data. Check your discipline's conventions.
Treating outliers as errors automatically. An outlier in your data isn't necessarily a mistake. It might be a real but unusual case worth investigating. Don't remove outliers without thinking through what they represent.

When You're Ready for Inferential Statistics

Descriptive statistics summarize what's in your data. Inferential statistics let you make claims about a larger population based on your sample. Once your descriptive statistics are reported, the next stage is typically hypothesis testing, confidence intervals, or both.

A study like Fisher and Yao (2017) is a useful model for the full sequence. The authors began with descriptive statistics for every variable. Then they used inferential methods, including ordinary least squares regression and decomposition analysis, to test whether the descriptive differences between men and women in risk tolerance were explained by other variables in their model. The descriptive stage didn't disappear when the inferential analysis began. It anchored everything that followed.

Get Your Research Paper Reviewed

Descriptive statistics look simple but are easy to report incorrectly. Means without standard deviations. Medians without IQR. Italics on the wrong elements. Prose that doesn't match the tables. Reviewers and instructors catch these issues quickly.

Editor World's academic editing services include review of descriptive statistics reporting. Editors verify that the format is correct, the prose matches the tables, italics appear on the right elements, and the right measures of central tendency and variability are paired together. For graduate students preparing a thesis or dissertation, this kind of review catches the small errors that affect committee evaluations.

A free sample edit is available from any editor before you commit. Browse editor profiles by subject expertise to find someone whose background matches your field.

Frequently Asked Questions

What's the difference between descriptive and inferential statistics?

Descriptive statistics summarize what's in your data. They describe your sample but don't let you make claims beyond it. Inferential statistics let you draw conclusions about a larger population based on your sample. Means, standard deviations, medians, ranges, and percentages are all descriptive. T-tests, ANOVA, regression, and chi-square are inferential. Most research papers use both. Descriptive stats describe the sample. Inferential stats test the hypotheses.

Do I need to report all three categories of descriptive statistics?

Not always. Central tendency and variability are almost always reported. Shape (skewness and kurtosis) usually isn't reported numerically unless you have a specific reason. Most papers show shape through a histogram or box plot instead. The exception is when normality matters for a downstream test and you need to demonstrate that the data meets the assumption. In that case, report skewness and kurtosis with the relevant cutoffs.

Why pair mean with SD and median with IQR?

Different measures of central tendency pair naturally with different measures of variability. The mean uses every data point, so its companion measure of variability (the standard deviation) should also use every data point. The median is resistant to outliers, so its companion measure (the IQR) should also be resistant to outliers. Mixing the pairings, like reporting median with standard deviation, gives an inconsistent picture: the central tendency ignores outliers, but the variability measure doesn't.

How many decimal places should I use?

Two decimal places for most descriptive statistics. Three for very small values where two decimals would lose information. The general principle is to use one or two more decimal places than the original data. If your data is in whole numbers, report descriptive statistics to one or two decimals. If your data is already to two decimals, you can go to three. Don't report more precision than the measurement actually has.

Should I report descriptive statistics for every variable in my study?

Yes for variables that play a role in your analysis. Demographics (age, gender, education), key independent variables, key dependent variables, and any covariates all need descriptive statistics. Variables you collected but never used can be skipped or noted briefly. The principle is that readers should be able to understand who was in your sample and what their values looked like on the variables that matter to your conclusions.

Can I report descriptive statistics from someone else's dataset?

Yes. Secondary data analysis is common, and researchers using large public datasets like the Survey of Consumer Finances, the General Social Survey, or the National Longitudinal Survey of Youth all report descriptive statistics for the variables they analyze. The descriptive statistics describe your analytic sample, which may be a subset of the full dataset filtered by inclusion criteria. Document those criteria clearly so readers know which observations are included.

Page last reviewed: May 2026. Editor World, founded in 2010 by Patti Fisher, PhD, is a professional human-only writing, editing, and proofreading marketplace serving researchers and students worldwide. BBB A+ accredited since 2010 with 5.0/5 Google Reviews and 5.0/5 Facebook Reviews.