Skewness and Kurtosis: What Your Data's Shape Tells You

Skewness and kurtosis are the two statistics that describe the shape of a distribution. Skewness measures how asymmetric the distribution is. Kurtosis measures how heavy its tails are. Together they tell you whether your data behaves like a normal distribution or departs from it in ways that affect which statistical tests are appropriate.

This guide covers what skewness and kurtosis are, how to interpret their values, what real-world variables show extreme shape statistics, how to detect shape problems in your data, what to do when your data is heavily skewed or has heavy tails, and how to report shape statistics in APA format. For the broader framework that shape statistics sit within, see the complete guide to descriptive statistics. For the reference distribution against which shape statistics are usually compared, see the guide to the normal distribution and its properties.

Quick Answer

Skewness measures asymmetry. Positive skew means a long right tail. Negative skew means a long left tail. A perfectly symmetric distribution has a skewness of 0. As a rough guideline, absolute skewness above 1 indicates substantial asymmetry, and absolute skewness above 2 indicates severe skew.

Kurtosis measures tail weight. High kurtosis (leptokurtic) means heavier tails and more extreme values than a normal distribution. Low kurtosis (platykurtic) means lighter tails and fewer extreme values. The normal distribution has a kurtosis of 3, and "excess kurtosis" subtracts 3 so a normal distribution has excess kurtosis of 0.

When shape matters. Most parametric tests (t-tests, ANOVA, linear regression) assume approximate normality. Substantial skew or extreme kurtosis can violate that assumption. The Central Limit Theorem provides protection at larger sample sizes, but small-sample tests are sensitive to shape.

What to do when data is skewed. Visualize first. Then consider transformations (log, square root), non-parametric tests, or robust methods that don't rely on normality assumptions.

What Is Skewness?

Skewness is a single number that summarizes how symmetric or asymmetric a distribution is. A perfectly symmetric distribution, like the normal distribution, has a skewness of exactly zero. Distributions with a long tail extending to the right have positive skewness. Distributions with a long tail extending to the left have negative skewness. The further skewness is from zero, the more asymmetric the distribution.

Skewness is mathematically defined as the standardized third moment of the distribution, which sounds intimidating but means something simple. It captures how much weight is in one tail relative to the other. Most statistical software reports skewness automatically as part of any descriptive statistics output, so you rarely need to calculate it by hand.
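As a minimal sketch of that definition, the moment-based (population) skewness divides the third central moment by the standard deviation cubed. The function name `moment_skewness` and the sample values are illustrative assumptions. Many statistical packages apply a small-sample bias correction, so their output can differ slightly from this version.

```python
from statistics import fmean


def moment_skewness(values):
    """Population skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance (second central moment)
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]   # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]   # mirror image, so the sign flips

print(moment_skewness(symmetric))     # zero for a symmetric sample
print(moment_skewness(right_tailed))  # positive
print(moment_skewness(left_tailed))   # negative, same magnitude
```

Cubing the deviations keeps their sign, which is why a long right tail produces a positive value and a long left tail a negative one.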

Positive skew (right skew)

Positive skew means the right tail of the distribution is longer than the left. A few large values pull the distribution rightward. Most of the data clusters at lower values, with a small number of large values extending far to the right. The mean is pulled higher than the median, because the large values raise the mean but barely move the median.

Income is the classic example of positive skew. Most households earn modest amounts. A small number earn very high amounts. The distribution has a long right tail extending toward the highest earners, and the mean income exceeds the median income by a substantial margin. The Survey of Consumer Finances, the dataset used in Fisher and Yao (2017), shows positive skew in income, net worth, asset holdings, and almost every other financial measure.

Negative skew (left skew)

Negative skew means the left tail is longer than the right. A few unusually small values pull the distribution leftward. Most of the data clusters at higher values, with a small number of low values extending far to the left. The mean is pulled lower than the median.

Negative skew is less common in nature than positive skew, but it shows up in specific contexts. Examination scores on an easy test can be negatively skewed: most students score near the top, with a small number of low scorers in the tail. Age at death in a wealthy modern population is also negatively skewed: most people live to old age, with a small number of earlier deaths in the left tail.

Interpreting skewness values

Statistical software returns a skewness value as a number. The interpretation follows rough guidelines that researchers use as a heuristic:

Skewness near 0 (between roughly -0.5 and 0.5): approximately symmetric. The distribution is close enough to symmetry that the mean and median should give similar values.
Skewness between -1 and -0.5 or 0.5 and 1: moderately skewed. The asymmetry is noticeable but not severe. Parametric tests are usually still reasonable at moderate sample sizes.
Skewness beyond -1 or 1: substantially skewed. Mean and median diverge meaningfully. Consider whether parametric assumptions are met.
Skewness beyond -2 or 2: severely skewed. The distribution is far from normal. Transformations or non-parametric methods are usually more appropriate.

These guidelines are not hard cutoffs. They depend on sample size, the specific test being used, and how robust that test is to non-normality. A skewness of 0.8 in a sample of 5,000 is rarely a problem. The same skewness in a sample of 30 can be.
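The bands above can be written as a small lookup, shown here as a sketch only. The function name `describe_skewness` is hypothetical, and the handling of values that fall exactly on a boundary (0.5, 1, 2) is an assumption, since the guidelines are deliberately rough.

```python
def describe_skewness(skewness):
    """Map a skewness value to the rough bands in the text (heuristics, not cutoffs)."""
    size = abs(skewness)
    if size <= 0.5:
        return "approximately symmetric"
    direction = "positively" if skewness > 0 else "negatively"
    if size <= 1:
        return f"moderately {direction} skewed"
    if size <= 2:
        return f"substantially {direction} skewed"
    return f"severely {direction} skewed"


for value in (0.3, -0.8, 1.8, -2.6):
    print(value, "->", describe_skewness(value))
```

A label like this is a starting point for a closer look, not a decision rule. Sample size and the planned test still determine whether the skew matters.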

What Is Kurtosis?

Kurtosis is a single number that summarizes how heavy the tails of a distribution are. High kurtosis means extreme values (far from the mean) are more common than they would be in a normal distribution. Low kurtosis means extreme values are less common. Like skewness, kurtosis is reported automatically by most statistical software.

Kurtosis is mathematically defined as the standardized fourth moment of the distribution. The normal distribution has a kurtosis of exactly 3, which is the reference value. Most software reports "excess kurtosis," which subtracts 3 so that a normal distribution has an excess kurtosis of 0. Pay attention to which definition your software is using, because the same data can produce a kurtosis of 3 or an excess kurtosis of 0 depending on the convention.
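A minimal sketch of both conventions, assuming the moment-based (population) definition without any small-sample correction. The function names and the sample values are illustrative. The simulated normal draws use a fixed seed and should land close to the reference value of 3 (excess 0), though not exactly on it.

```python
import random
from statistics import fmean


def moment_kurtosis(values):
    """Plain kurtosis: fourth central moment / (second central moment ** 2)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2


def excess_kurtosis(values):
    """Excess kurtosis: plain kurtosis minus the normal reference value of 3."""
    return moment_kurtosis(values) - 3


heavy = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]   # two extreme values, one on each side
rng = random.Random(0)                        # fixed seed for a repeatable run
normal_draws = [rng.gauss(0, 1) for _ in range(100_000)]

print(moment_kurtosis(heavy), excess_kurtosis(heavy))                # above 3, above 0
print(moment_kurtosis(normal_draws), excess_kurtosis(normal_draws))  # near 3, near 0
```

Raising deviations to the fourth power makes the result dominated by the observations farthest from the mean, which is why the statistic tracks tail weight.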

Leptokurtic distributions (heavy tails)

Leptokurtic distributions have heavier tails than a normal distribution, and are often drawn with a sharper peak. Extreme values are more frequent than the normal distribution would predict. Excess kurtosis is positive.

Financial returns are the most-studied example. Stock returns, bond returns, and exchange-rate movements all show leptokurtic distributions. Most days have small returns, but extreme moves (large gains or losses) happen more often than a normal distribution would predict. This is one reason that risk models built on the assumption of normality (like the original Black-Scholes options-pricing assumptions) systematically underestimate the frequency of extreme events.

Platykurtic distributions (light tails)

Platykurtic distributions have lighter tails than a normal distribution, and are often drawn with a flatter peak. Extreme values are less common than the normal distribution would predict. Excess kurtosis is negative.

A uniform distribution (one where every value in a range is equally likely) is the classic platykurtic example. Real-world data is rarely strongly platykurtic, but bounded measurements like proportions or ratings on short scales can show light-tailed behavior at the extremes of their ranges.
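For the uniform case the kurtosis can be worked out directly. The derivation below assumes a continuous uniform distribution centered at zero on the interval from -a to a. Centering is only a convenience, since kurtosis does not change when the distribution is shifted or rescaled.

```latex
\begin{aligned}
X &\sim \mathrm{Uniform}(-a, a), \qquad \mathbb{E}[X] = 0 \\
\mathbb{E}[X^2] &= \frac{1}{2a}\int_{-a}^{a} x^2 \, dx = \frac{a^2}{3} \\
\mathbb{E}[X^4] &= \frac{1}{2a}\int_{-a}^{a} x^4 \, dx = \frac{a^4}{5} \\
\text{kurtosis} &= \frac{\mathbb{E}[X^4]}{\left(\mathbb{E}[X^2]\right)^2}
  = \frac{a^4/5}{a^4/9} = \frac{9}{5} = 1.8 \\
\text{excess kurtosis} &= 1.8 - 3 = -1.2
\end{aligned}
```

An excess kurtosis of -1.2 is therefore what a perfectly flat distribution produces, which gives a sense of scale for negative values seen in practice.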

Mesokurtic distributions

Mesokurtic distributions have tail weight similar to a normal distribution. The normal distribution itself is mesokurtic by definition. A distribution can be approximately mesokurtic without being normal: it just needs to have tails of similar weight to the normal.

Kurtosis versus excess kurtosis

The two conventions cause regular confusion. Plain kurtosis places the normal distribution at 3, so values above 3 indicate heavier tails than normal and values below 3 indicate lighter tails. Excess kurtosis subtracts 3 so the normal distribution sits at zero. Most modern software defaults to excess kurtosis. When you read a number like "kurtosis = 2.5," check which convention the source is using. A "kurtosis" of 2.5 in the plain definition is lighter-tailed than normal. An "excess kurtosis" of 2.5 is substantially heavier-tailed than normal. The same number describes opposite phenomena.

Real-World Examples of Skewed and High-Kurtosis Data

Knowing which kinds of variables tend to show extreme shape statistics helps you anticipate diagnostics before running them.

Variables that are typically positively skewed: income, household wealth, net worth, home values, company revenues, length of hospital stay, reaction times in raw form, counts of rare events, and most measures with a natural lower bound at zero but no upper bound.

Variables that are typically negatively skewed: test scores on easy assessments, age at death in modern wealthy populations, proportions of correct responses near a ceiling, and some satisfaction ratings where most respondents rate items positively.

Variables that often show high kurtosis: financial returns, insurance claim sizes, network packet sizes, and most variables where extreme events occur more often than a normal-distribution model would predict. Researchers studying financial decision-making, including Fisher and Yao (2017), routinely deal with variables that are simultaneously positively skewed and leptokurtic. The Survey of Consumer Finances includes some variables with skewness above 10 and excess kurtosis above 100, far outside any normality assumption.

For deeper coverage of the central-tendency measures that respond differently to skewed data, see Editor World's guide to mean, median, and mode. For the variability measures that pair with each, see the guide to standard deviation and variance.

How to Detect Skewness and Kurtosis in Your Data

Three diagnostic approaches work together to assess shape. Use at least two of them before drawing conclusions.

Visual inspection

A histogram is the first and most useful diagnostic. Plot your data and look at it. Symmetric data forms a balanced shape with similar tails on both sides. Positively skewed data has a peak on the left with a long tail trailing right. Negatively skewed data has the opposite pattern. Heavy-tailed data has more weight in the extremes than you would expect for the spread of the distribution.
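When no plotting library is at hand, a rough text histogram is enough for a first look. The sketch below uses equal-width bins. The function names, the bin count, and the waiting-time values are all illustrative assumptions, and the code assumes the data contains at least two distinct values.

```python
def bin_counts(values, bins=6):
    """Count observations in equal-width bins spanning the data's minimum to maximum."""
    low, high = min(values), max(values)
    step = (high - low) / bins          # assumes high > low
    counts = [0] * bins
    for x in values:
        index = min(int((x - low) / step), bins - 1)  # the maximum goes in the last bin
        counts[index] += 1
    return counts


def text_histogram(values, bins=6, width=40):
    """Return one line per bin, with bar length proportional to the bin's count."""
    low, high = min(values), max(values)
    step = (high - low) / bins
    counts = bin_counts(values, bins)
    scale = width / max(counts)
    return [
        f"{low + i * step:7.1f} to {low + (i + 1) * step:7.1f} | {'#' * round(c * scale)}"
        for i, c in enumerate(counts)
    ]


waits = [1, 1, 2, 2, 3, 3, 4, 5, 7, 9, 14, 30]   # hypothetical waiting times
print("\n".join(text_histogram(waits)))
```

For this sample most of the bars sit in the first bin and a single observation lands in the last one, which is the left-peak, right-tail pattern of positive skew.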

A box plot complements the histogram by showing the median, quartiles, and outliers explicitly. Asymmetric whisker lengths in a box plot are a clear visual signal of skew. A box plot with many outlier dots in one tail signals heavy tails on that side. For a full decision framework on which visualization fits which question, see Editor World's guide to box plots, scatter plots, and choosing the right visualization.
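The numbers behind a box plot can be computed directly, which helps when checking many variables at once. This sketch assumes the common 1.5 × IQR rule for flagging outliers and the "inclusive" quartile method from Python's `statistics` module. Plotting tools may use slightly different quartile definitions. The function name and the data are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Median, whisker lengths, and outlier counts using 1.5 * IQR fences."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in ordered if low_fence <= x <= high_fence]
    return {
        "median": median,
        "lower_whisker": q1 - inside[0],    # box edge down to lowest non-outlier
        "upper_whisker": inside[-1] - q3,   # box edge up to highest non-outlier
        "low_outliers": sum(x < low_fence for x in ordered),
        "high_outliers": sum(x > high_fence for x in ordered),
    }


stays = [1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 8, 21, 30]   # hypothetical hospital stays, days
summary = box_plot_summary(stays)
print(summary)
```

Here the upper whisker is longer than the lower one and both flagged outliers sit above the box, the numerical counterpart of the visual signals described above.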

Summary statistics

Skewness and kurtosis values quantify what the visualizations show. Use them alongside the visualizations, not as a replacement. A skewness value of 0.3 paired with a clearly skewed histogram may indicate a software calculation issue or a sample size too small for stable shape statistics.

The simplest summary diagnostic: compare the mean and median. If they're close, the distribution is approximately symmetric. If the mean is meaningfully higher than the median, the distribution is positively skewed. If the mean is meaningfully lower than the median, the distribution is negatively skewed. This rough check catches most major asymmetries even without computing the skewness statistic.
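A quick version of that comparison, sketched with made-up data. Dividing the gap by the standard deviation is an assumption added here so the result does not depend on the variable's units. The text itself only asks whether the mean and median are close.

```python
from statistics import fmean, median, pstdev


def mean_median_gap(values):
    """(mean - median) / standard deviation; the sign hints at the direction of skew."""
    return (fmean(values) - median(values)) / pstdev(values)


incomes = [28, 31, 35, 38, 42, 45, 51, 58, 75, 240]   # hypothetical incomes, $ thousands
scores = [55, 78, 88, 90, 92, 93, 95, 96, 97, 98]     # hypothetical easy-test scores

print(f"incomes: mean {fmean(incomes):.1f}, median {median(incomes):.1f}, "
      f"gap {mean_median_gap(incomes):+.2f}")
print(f"scores:  mean {fmean(scores):.1f}, median {median(scores):.1f}, "
      f"gap {mean_median_gap(scores):+.2f}")
```

The income sample has a mean well above its median (positive gap), and the test-score sample has a mean below its median (negative gap).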

Comparison to a reference distribution

A Q-Q plot compares your data's quantiles to those of a theoretical normal distribution. If the data is normal, points fall along a straight diagonal line. Skewed data produces a curved Q-Q plot. Heavy-tailed data produces a Q-Q plot with departures from the line at both tails. Q-Q plots show shape problems clearly and are a standard diagnostic in research workflows.
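The points on a Q-Q plot can be computed without a plotting library. The sketch below pairs each sorted observation with the normal quantile expected at its rank. It assumes a reference normal distribution fitted with the sample mean and standard deviation, and the plotting position (i + 0.5) / n, which is one common convention among several. The function name and data are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev


def normal_qq_points(values):
    """Pair each sorted observation with the normal quantile expected at its rank."""
    ordered = sorted(values)
    n = len(ordered)
    reference = NormalDist(fmean(ordered), pstdev(ordered))
    # (i + 0.5) / n keeps every probability strictly between 0 and 1.
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


waits = [1, 1, 2, 2, 3, 3, 4, 5, 7, 9, 14, 30]   # hypothetical right-skewed data
points = normal_qq_points(waits)
for expected, observed in points:
    print(f"expected {expected:7.2f}   observed {observed:5.1f}")
```

For this sample the smallest and largest observations sit above their expected normal values while the middle ones sit below, which is the curved pattern that positive skew produces.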

What to Do When Your Data Is Skewed

Substantial skew or extreme kurtosis doesn't automatically invalidate your analysis. The right response depends on the test you want to run, your sample size, and the severity of the shape problem.

Use the Central Limit Theorem at large sample sizes

Many parametric tests are robust to moderate non-normality when sample sizes are large. The Central Limit Theorem ensures that the sampling distribution of the mean approaches normality as the sample size grows. With samples above 30 or 50, mild to moderate skewness rarely causes problems for t-tests or simple ANOVA. With samples above several hundred, even substantial skewness is often acceptable for these tests.
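A small simulation illustrates the point. It draws from an exponential distribution (theoretical skewness of 2), then compares the skewness of the raw values with the skewness of means taken over samples of 50. The distribution, sample size, number of repetitions, and seed are all assumptions chosen for the illustration.

```python
import random
from statistics import fmean


def moment_skewness(values):
    """Population skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(42)  # fixed seed so the run is repeatable

# Raw draws from an exponential distribution.
raw = [rng.expovariate(1.0) for _ in range(20_000)]

# 5,000 sample means, each computed from 50 fresh exponential draws.
means_n50 = [
    fmean([rng.expovariate(1.0) for _ in range(50)])
    for _ in range(5_000)
]

raw_skew = moment_skewness(raw)
means_skew = moment_skewness(means_n50)
print(f"skewness of raw values:   {raw_skew:.2f}")
print(f"skewness of sample means: {means_skew:.2f}")
```

In theory the skewness of a mean of n independent draws shrinks by a factor of the square root of n, so means of 50 exponential draws should have skewness near 2 / √50, or about 0.28. The simulated values will vary around those targets.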

Apply transformations

When skewness is severe and sample sizes are not large enough to rely on the Central Limit Theorem, a transformation can bring the distribution closer to normality. The most common transformations for positively skewed data are the log transformation (best for data with strong right skew and all positive values), the square root transformation (milder, suitable for count data), and the Box-Cox and Yeo-Johnson transformations (more flexible; Box-Cox requires strictly positive values, while Yeo-Johnson also handles zeros and negative values).
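To see the effect of the two simplest transformations, the sketch below simulates strongly right-skewed, strictly positive data and measures skewness before and after. The log-normal data, the sample size, and the seed are assumptions for the illustration. Real data will not respond this cleanly.

```python
import math
import random
from statistics import fmean


def moment_skewness(values):
    """Population skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(7)  # fixed seed so the run is repeatable
raw = [rng.lognormvariate(0, 1) for _ in range(5_000)]   # positive, strong right skew
rooted = [math.sqrt(x) for x in raw]                     # milder correction
logged = [math.log(x) for x in raw]                      # stronger correction

for label, data in (("raw", raw), ("square root", rooted), ("log", logged)):
    print(f"{label:12s} skewness = {moment_skewness(data):.2f}")
```

The square root reduces the skew and the log reduces it further. In this simulation the log brings skewness close to zero because the data was generated as the exponential of a normal variable, which is the best case for a log transformation.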

Transformations change the units of your data, which has implications for interpretation. A regression coefficient on log income is interpreted as a percentage change, not a dollar change. Report transformed analyses clearly and interpret coefficients in the transformed scale when reporting results.
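The percentage interpretation can be made concrete. Assuming a model of the form log(y) = a + b·x with a natural log, a one-unit increase in x multiplies y by exp(b), so the implied percent change is 100 × (exp(b) − 1). The function name and the coefficient values below are hypothetical.

```python
import math


def percent_change_from_log_coefficient(b):
    """Percent change in y per one-unit increase in x when log(y) = a + b * x."""
    return 100 * (math.exp(b) - 1)


for b in (0.05, 0.30, -0.20):   # hypothetical regression coefficients
    change = percent_change_from_log_coefficient(b)
    print(f"b = {b:+.2f}  ->  {change:+.1f}% change in y")
```

For small coefficients the shortcut "100 × b percent" is close (0.05 gives about 5.1%), but it drifts for larger ones (0.30 gives about 35.0%, not 30%).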

Switch to non-parametric methods

Non-parametric tests don't assume normality. The Mann-Whitney U test replaces the independent-samples t-test. The Wilcoxon signed-rank test replaces the paired-samples t-test. The Kruskal-Wallis test replaces one-way ANOVA. These methods rank the data rather than working with the raw values, which makes them robust to skewness and outliers but at the cost of some statistical power.
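The reliance on ordering rather than raw values is easy to demonstrate. The sketch below computes only the Mann-Whitney U statistic by counting pairs. It does not compute a p-value, which statistical packages provide. The function name and the data are illustrative.

```python
def mann_whitney_u(sample_a, sample_b):
    """U for sample_a: number of (a, b) pairs with a > b, counting ties as one half."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


group_a = [12, 15, 18, 21, 40]
group_b = [9, 11, 14, 16, 17]
group_a_extreme = [12, 15, 18, 21, 4000]   # same ordering, far more extreme top value

print(mann_whitney_u(group_a, group_b))           # 20.0
print(mann_whitney_u(group_a_extreme, group_b))   # 20.0, unchanged by the outlier
```

Replacing 40 with 4000 would shift the mean of the first group dramatically, but the statistic stays the same because the ordering of the observations has not changed.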

Use robust methods

Modern robust methods, including trimmed means, M-estimators, and bootstrapping, provide alternatives that handle non-normal data without requiring transformation or rank-based analysis. These methods are increasingly available in standard statistical software and are worth considering when the data shape is genuinely problematic.
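Two of these methods are simple enough to sketch with the standard library. The example below shows a trimmed mean and a percentile bootstrap interval. The 10% trimming proportion, the 2,000 resamples, the seed, and the claim amounts are all assumptions for illustration, and the percentile interval is only one of several bootstrap interval types.

```python
import random
from statistics import fmean, median


def trimmed_mean(values, proportion=0.1):
    """Mean after dropping the given proportion of observations from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    return fmean(ordered[k:len(ordered) - k])


def bootstrap_ci(values, statistic, n_resamples=2_000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample with replacement, take empirical quantiles."""
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(statistic(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower_index = round((1 - level) / 2 * n_resamples)
    upper_index = round((1 + level) / 2 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


# Hypothetical insurance claim sizes with a heavy right tail.
claims = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9, 11, 12, 14, 19, 26, 48, 95, 310]

print(f"mean          {fmean(claims):.2f}")
print(f"trimmed mean  {trimmed_mean(claims):.2f}")
print(f"median        {median(claims):.2f}")
print("bootstrap 95% interval for the median:", bootstrap_ci(claims, median))
```

The ordinary mean is dominated by the two largest claims, while the trimmed mean and the median are not. The bootstrap interval expresses uncertainty about the median without assuming any particular distribution shape.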

Reporting Skewness and Kurtosis in APA Format

Most research papers don't report skewness and kurtosis values numerically. Visual diagnostics (histograms and Q-Q plots) and a brief written assessment are usually enough. Numerical shape statistics belong in the manuscript only when a specific decision depends on them.

When you do report numerical values, use APA notation conventions. Skewness and kurtosis are typically reported in a descriptive statistics table alongside the mean, standard deviation, and other summary measures, with statistical symbols such as M and SD set in italics.

For text reporting: "The dependent variable showed substantial positive skew (skewness = 1.8) and moderately heavy tails (excess kurtosis = 2.1)."

When the assumption check leads to a methodological decision, document the decision explicitly: "Because the dependent variable showed substantial positive skew, a log transformation was applied before running the regression analysis." Or: "Given the substantial skewness and heavy tails, a Mann-Whitney U test was used in place of the t-test."

Don't over-report. A paragraph on every shape statistic for every variable is excessive. A sentence or two confirming that shape was checked, with details only where it affected a decision, is usually enough.

Common Misunderstandings About Skewness and Kurtosis

"Skewness near zero means the data is normal." No. Skewness near zero means the data is symmetric. A symmetric distribution can still have heavy or light tails that distinguish it from normal. Check kurtosis too.
"High kurtosis means a tall, narrow distribution." Not exactly. Kurtosis measures tail weight, not peak height. Two distributions can have very different peak heights and similar kurtosis values. The visual signal of high kurtosis is the tails, not the peak.
"My data has skewness of 0.3, so I can't use a t-test." Skewness of 0.3 is small. The t-test is robust to small departures from normality, especially at sample sizes above 30. Don't abandon parametric methods over trivial shape statistics.
"Transforming the data fixes the analysis." Transformations can help with shape, but they change the meaning of the variable. A regression on log income answers a different question than a regression on raw income. Interpret transformed results in their transformed scale.
"Kurtosis equals 3 means heavy tails." Depends on the convention. Plain kurtosis of 3 is normal-distribution kurtosis (mesokurtic). Excess kurtosis of 3 is substantially heavier-tailed than normal. Always check which version your software reports.
"Removing outliers will fix the skewness." Outliers and skewness are related but not identical. Removing a few extreme values may reduce skewness measurements, but the underlying shape of the distribution may still be asymmetric. Don't treat outlier removal as a routine skewness fix without justifying it on substantive grounds.

When Professional Editing Helps

Shape statistics are one of the easiest places for small reporting errors to undermine reviewer confidence. Reporting kurtosis without specifying which convention is being used, claiming normality when the visualization clearly shows skew, applying a transformation without explaining the interpretation, and using parametric tests on severely non-normal data without justification are all common issues. Editor World's academic editing services include review of statistical notation, APA compliance, and the substantive accuracy of methodological claims. The same standard is applied across dissertation editing, journal article editing, and essay editing. 100% human editing, no AI at any stage. You choose your own editor from verified profiles, and a free sample edit is available before you commit. Browse available editors by subject expertise to find someone whose background matches your field.

Frequently Asked Questions About Skewness and Kurtosis

What is skewness?

Skewness is a single number that summarizes how symmetric or asymmetric a distribution is. A perfectly symmetric distribution, like the normal distribution, has a skewness of zero. Distributions with a long tail extending to the right have positive skewness. Distributions with a long tail extending to the left have negative skewness. The further the skewness value is from zero, the more asymmetric the distribution. Income, wealth, and most financial measures show positive skewness. Test scores on easy assessments often show negative skewness.

What is kurtosis?

Kurtosis is a single number that summarizes how heavy the tails of a distribution are. High kurtosis (leptokurtic) means extreme values are more common than they would be in a normal distribution. Low kurtosis (platykurtic) means extreme values are less common. The normal distribution has a kurtosis of 3, which serves as the reference value. Most modern software reports excess kurtosis, which subtracts 3 so the normal distribution sits at zero. Financial returns and insurance claim sizes are common examples of leptokurtic data.

What is the difference between skewness and kurtosis?

Skewness measures asymmetry. It tells you whether one tail of the distribution is longer than the other. Kurtosis measures tail weight. It tells you whether extreme values are more or less common than in a normal distribution. A distribution can be symmetric (skewness near zero) and still have heavy tails (high kurtosis). It can also be skewed but have approximately normal tail weight. Both statistics describe different aspects of distribution shape and are interpreted together when assessing departures from normality.

What is the difference between kurtosis and excess kurtosis?

Plain kurtosis places the normal distribution at 3, so values above 3 indicate heavier tails than normal and values below 3 indicate lighter tails. Excess kurtosis subtracts 3 so the normal distribution sits at zero. Most modern software defaults to excess kurtosis. When reading a kurtosis value, check which convention the source is using. A value of 2.5 means very different things in the two conventions. Plain kurtosis of 2.5 is lighter-tailed than normal. Excess kurtosis of 2.5 is substantially heavier-tailed than normal.

How do I interpret a skewness value?

Researchers use rough guidelines, not hard cutoffs. Skewness between -0.5 and 0.5 indicates approximate symmetry. Skewness between -1 and -0.5 or 0.5 and 1 indicates moderate asymmetry. Skewness beyond -1 or 1 indicates substantial skew. Skewness beyond -2 or 2 indicates severe skew where transformations or non-parametric methods are usually preferred. These thresholds depend on sample size and the test being run. A skewness value of 0.8 with 5,000 observations is rarely a problem. The same value with 30 observations can be.

Is my data too skewed to use a t-test?

Probably not. The t-test is robust to moderate skewness, especially at sample sizes above 30. The Central Limit Theorem makes the sampling distribution of the mean approximately normal as the sample size grows, even when the underlying data is not normal. Most published t-tests run on samples with some skewness without problems. Severe skewness (absolute skewness beyond 2) at small sample sizes is more concerning. In that case, consider a transformation, a non-parametric alternative like the Mann-Whitney U test, or a robust method that doesn't assume normality.

How do I fix skewed data?

Several options exist, depending on the severity and the analysis you want to run. The log transformation works well for strongly right-skewed data with all positive values. The square root transformation is milder and suitable for count data. The Box-Cox and Yeo-Johnson transformations are more flexible. Box-Cox requires strictly positive values, while Yeo-Johnson can also handle data with zero or negative values. Non-parametric tests (Mann-Whitney U, Kruskal-Wallis, Wilcoxon) work on ranked data and don't require normality. Robust methods like trimmed means and bootstrapping provide further alternatives. Each option has implications for how the results should be interpreted.

How do I report skewness and kurtosis in APA format?

Most research papers don't report skewness and kurtosis numerically. Visual diagnostics such as histograms and Q-Q plots, combined with a brief written assessment, are usually enough. When numerical reporting is needed, place the values in a descriptive statistics table alongside the mean and standard deviation, or report them in the text in a brief sentence. Always specify which kurtosis convention is being used (plain kurtosis or excess kurtosis). When a shape statistic leads to a methodological decision such as a transformation or a non-parametric test, document the decision explicitly.

Content reviewed by Editor World editorial staff. Editor World, founded in 2010 by Patti Fisher, PhD, graduate of The Ohio State University, provides professional editing and proofreading services for academic researchers, doctoral candidates, faculty, business professionals, and authors worldwide. BBB A+ accredited since 2010 with 5.0/5 Google Reviews and 5.0/5 Facebook Reviews. More than 100 million words edited for over 8,000 clients in 65+ countries. Stevie Award winner. Native English editors from the United States, the United Kingdom, and Canada with subject-matter expertise across the social sciences, the natural and physical sciences, medicine, engineering, computer science, and the humanities. 100% human editing, no AI at any stage. Less than 5% of applicants are accepted to the editor panel. Recommended by the Boston University Economics Department, University of San Diego, University of Michigan, UCLA, University of Missouri, and more.