Ready to Know Something about Distribution?
As you are likely aware, the word "distribution" usually means either the action of passing something to a group of people or the way in which people or things spread out in a place.
However, “distribution” is a special term in mathematics, and in statistics in particular. Try to guess its special meaning from the example below.
Imagine we want to know how many life insurance policies a UW professor owns. We send questionnaires to the professors we can reach and collect responses from those who are willing to participate in our small study. After collating the information, we can tell the most and the fewest policies a professor has, and we can count how many professors own each number of policies, i.e., the “distribution” of professors’ life insurance policies.
You may notice that, from a mathematical perspective, the meaning of a distribution leans towards “the way in which things spread out in a place”: the numeric answers we receive make up a distribution, and the distribution shows how often each value occurs.
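To make this concrete, here is a minimal Python sketch of how survey answers could be tallied into a distribution. The `responses` list is made-up illustration data, not the result of a real survey, and the variable names are only my own choices.

```python
from collections import Counter

# Hypothetical survey answers: the number of life insurance policies
# reported by each responding professor (made-up data).
responses = [0, 1, 1, 2, 1, 0, 3, 2, 1, 2, 1, 0]

# The distribution: how many professors own each number of policies.
distribution = Counter(responses)

# The fewest and the most policies owned by any respondent.
fewest = min(responses)
most = max(responses)

for policies in sorted(distribution):
    print(f"{policies} policies: {distribution[policies]} professors")
print(f"fewest = {fewest}, most = {most}")
```

Each printed line pairs a value with how often it occurs, which is exactly what the distribution records.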
But why do we need to define the special term “distribution”? In other words, what can we do with a mathematical distribution? Well, there are many things we can do with it. We can do some math to understand its features, just as we located the maximum and minimum in the previous example. We can create plots to visualize how the values are spread out. We can also use a distribution to forecast; for example, we can predict how many of a group of newly hired UW professors are likely to have three life insurance policies. All of this sounds cool, right?
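As a rough illustration of the forecasting idea, the sketch below scales the observed share of each answer up to a new group. The data, the group size of 60, and the helper name `expected_count` are all hypothetical, and the estimate assumes the new group resembles the surveyed one.

```python
from collections import Counter

# Hypothetical observed answers: policies owned per surveyed professor.
responses = [0, 1, 1, 2, 1, 0, 3, 2, 1, 2, 1, 0]
distribution = Counter(responses)


def expected_count(policies, group_size):
    """Estimate how many people in a new group own `policies` policies,
    assuming the new group behaves like the surveyed one."""
    share = distribution[policies] / len(responses)
    return share * group_size


# Forecast for a hypothetical group of 60 newly hired professors.
print(expected_count(3, 60))
```

This is the simplest possible forecast. With a small survey, the estimate can be quite far off, so treat it as a starting point rather than a prediction to rely on.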
What makes this idea more appealing is that, ever since statisticians recognized the importance and potential power of distributions, they have named a number of them, typically those with unique and distinguishable characteristics, because these distributions are invaluable to both theoretical study and real-world applications.
All of the named distributions are widely used in one or more settings, but one of them is probably the most commonly used: the Normal Distribution. Many of you may wonder, does that mean the other distributions are not “normal”? It’s not quite like that. Early statisticians noticed that many data sets take a similar form, so they decided to give that form a general name, i.e., the normal distribution.
Before we dive into the properties of a normal distribution, I guess those of you who like weight training have probably seen something similar to the image on the bottom left. As you can see, the wear is heaviest on the weight plates between 70 and 130 lbs. This makes a lot of sense, as most people pick weights in this range for their workouts. The wear becomes lighter as the weights move in either direction, lighter or heavier. Remarkably, there is barely any wear on the most extreme weights, below 30 lbs. and above 250 lbs., probably because very few people ever attempt those weights.
On the bottom right is a typical normal distribution. You may notice that the change in color saturation follows the same pattern as the change in wear. The curve itself is bell-shaped. The values spread out symmetrically from the center, which is also the average of all values. Most values cluster around the center, and they become rarer the farther they are from it.
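If you would like to check these properties with numbers, here is a small sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later). The mean of 100 and the standard deviation of 15 are arbitrary values chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

# Symmetry: points equally far from the center have the same density.
left = dist.pdf(85)
right = dist.pdf(115)

# The bell curve is highest at the center, which is the mean.
peak = dist.pdf(100)

# Share of values that fall within one standard deviation of the center.
within_one_sd = dist.cdf(115) - dist.cdf(85)

print(f"density at 85: {left:.5f}, at 115: {right:.5f}, at 100: {peak:.5f}")
print(f"share within one standard deviation: {within_one_sd:.3f}")
```

The two side densities match, the center is the highest point, and roughly 68% of the values lie within one standard deviation of the mean, whichever mean and standard deviation you pick.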
Now you have learned some basic concepts of distributions. Want to know more? Click on the link to watch a video that talks about the different types of distributions out there ("Probability: Types of Distributions")!
Please feel free to comment below if you have any questions or have anything that you would like to share. Looking forward to hearing your thoughts!

Hi Huanhuan, thank you for posting this! It's great to learn some statistical stuff. Just curious, you mention that most values cluster around the centre point. Would it be possible to quantify the share of values in that region? Thank you!
Hi! Thank you for your comment. Yes, about 68% of values fall within one standard deviation of the center, so roughly 2/3! Hope this helps.
Hi Huanhuan, thank you for preparing this blog! The topic of distribution looks very interesting. Could you please give some examples of other things that follow a normal distribution?
Thank you Jing! Many things follow, or roughly follow, a normal distribution. For example, height, shoe size, assignment grades, etc.
Hi Huanhuan, thank you for the quick introduction to probability distributions. I was looking at the normal distribution plot. Is it correct that the left tail represents the smallest values in a distribution, while the right tail represents the largest values?
No problem Tutu! And yes, your interpretation is correct!
Hello! Just want to let you know that I really enjoyed reading through this. I see you mention that statisticians name certain distributions if they have distinctive characteristics. That said, other than the normal distribution, could you give an example of an existing named distribution that can be easily identified?
Hi Fanfan, it is very encouraging to hear that you like this short blog. Another example I can think of is the so-called Exponential Distribution. Its curve starts at its highest point and slopes steadily downward, getting closer and closer to zero.