Understanding QQ Plots

Nick Anderson
2 min read · Aug 27, 2021

Quantile-Quantile (QQ) plots can be difficult to understand, and trying to decipher what they’re telling you can be intimidating. This blog post is here to help.

To start, we need to know what a quantile is. A quantile is just a cut point that divides the sorted data into groups, usually groups that each hold the same share of the observations. For example, if our dataset is the whole numbers 1 through 12 and we want to cut it into 4 equal groups, we would make 3 cuts: after 3, 6, and 9.

1,2,3 — 4,5,6 — 7,8,9 — 10,11,12

As illustrated above, the number of cuts (quantiles) is always one fewer than the number of groups they create. With 3 quantiles we get 4 groups, with 4 quantiles we get 5 groups, and so on.
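The 1 to 12 example can be reproduced in a few lines of Python. This is only a minimal sketch using the standard library; the variable names are my own. Note that `statistics.quantiles` interpolates between neighbouring values, so its cut points fall between 3 and 4, between 6 and 7, and between 9 and 10 rather than exactly on 3, 6, and 9.

```python
import statistics

data = list(range(1, 13))  # the dataset 1..12 from the example
n_groups = 4

# Split the sorted data into equal-sized groups.
size = len(data) // n_groups
groups = [data[i * size:(i + 1) * size] for i in range(n_groups)]

# The cuts fall after the last value of every group except the final one.
cuts = [group[-1] for group in groups[:-1]]

print(groups)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(cuts)    # [3, 6, 9]

# The standard library interpolates between neighbouring values instead.
interpolated = statistics.quantiles(data, n=n_groups)
print(interpolated)  # [3.25, 6.5, 9.75]
```

Either way, 3 cut points produce 4 groups.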

So now that we understand what a quantile is, we can look to see what plotting them out can tell us.

For a normal curve, we cut the distribution into groups of equal probability, which means there is an equal chance of observing a value from each group. Groups towards the extremes must be wider than those in the middle, because a normal distribution has fewer observations out towards the edges and a wider range is needed to capture the same share of them.
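To see the widening groups in numbers, here is a small sketch that cuts a standard normal distribution into 8 equal-probability groups using `statistics.NormalDist` from the Python standard library. The choice of 8 groups and of a mean of 0 with a standard deviation of 1 is arbitrary, picked only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)
n_groups = 8  # arbitrary choice for illustration

# Cut points that leave probability 1/8, 2/8, ..., 7/8 to their left.
cuts = [standard_normal.inv_cdf(i / n_groups) for i in range(1, n_groups)]

# Widths of the six groups that lie between neighbouring cuts.
# (The two outermost groups stretch to infinity, so they have no finite width.)
widths = [right - left for left, right in zip(cuts, cuts[1:])]

for left, right, width in zip(cuts, cuts[1:], widths):
    print(f"{left:+.3f} to {right:+.3f}  width {width:.3f}")
```

Every group holds the same 12.5% of the probability, yet the printed widths grow as you move away from the centre.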

Plotting the QQ graph is the next step. A QQ graph has 2 axes: one for the quantiles of our dataset and one for the matching quantiles of a normal distribution. Each point pairs a quantile from our data with the corresponding normal quantile. We’re looking for a linear relationship in the points, which would suggest a normal distribution. (If our data’s quantiles line up with the normal distribution’s quantiles, it is reasonable to treat our data as approximately normally distributed.)
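The pairing of data quantiles with normal quantiles can be sketched as follows, again with only the standard library. The helper names `qq_points` and `straightness` are hypothetical, the plotting positions `(i - 0.5) / n` are one common convention among several, and the two samples are simulated purely for illustration. The correlation is used here as a rough numeric stand-in for judging by eye how straight the points are.

```python
import random
from statistics import NormalDist, mean

def qq_points(data):
    """Pair each sorted observation with its matching standard normal quantile."""
    ordered = sorted(data)
    n = len(ordered)
    normal = NormalDist()
    # Plotting positions (i - 0.5) / n are an assumed convention, not the only one.
    theoretical = [normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def straightness(points):
    """Pearson correlation of the QQ points; values near 1 mean a nearly straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in points)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(50, 10) for _ in range(500)]      # simulated normal data
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]   # simulated skewed data

r_normal = straightness(qq_points(normal_sample))
r_skewed = straightness(qq_points(skewed_sample))
print(f"normal sample: r = {r_normal:.3f}")
print(f"skewed sample: r = {r_skewed:.3f}")
```

The pairs returned by `qq_points` can be handed to any plotting library as x and y values. Ready-made versions also exist, for example `scipy.stats.probplot` in SciPy.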

If our data doesn’t match up on the QQ plot (the points don’t ‘hug’ the straight line of the normal quantiles), then we conclude that our data probably isn’t normally distributed, and we may need to take measures to adjust the data before fitting a model that assumes normality.

The biggest factor in deciding whether two datasets share a similar distribution, or whether a single dataset is normally distributed, is collecting enough data. The more data you have, the easier it will be to tell whether the distributions match.
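A small simulation can illustrate the point about sample size. In this sketch every sample really is drawn from a normal distribution, and for each sample size we record the least straight QQ plot seen over 200 repeats. The sample sizes, the number of repeats, the seed, and the helper names are all arbitrary choices of mine.

```python
import random
from statistics import NormalDist, mean

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def worst_qq_correlation(n, trials, rng):
    """Lowest QQ correlation seen across repeated normal samples of size n."""
    normal = NormalDist()
    theoretical = [normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    worst = 1.0
    for _ in range(trials):
        sample = sorted(rng.gauss(0, 1) for _ in range(n))
        worst = min(worst, correlation(theoretical, sample))
    return worst

rng = random.Random(1)  # fixed seed so the run is repeatable
results = {n: worst_qq_correlation(n, 200, rng) for n in (10, 100, 1000)}

for n, worst in results.items():
    print(f"n = {n:4d}: least straight QQ correlation = {worst:.3f}")
```

With only 10 observations, even truly normal data can produce a noticeably crooked QQ plot by chance, while with 1000 observations the points stay close to a straight line.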

To end, there are two big takeaways. The first is that QQ plots, while they may look daunting, are pretty simple and very useful when working with data. The second is that more is better when it comes to data: the more data you have, the more confident you can be in the conclusions you draw from it.
