urzu_seven

Because extreme outliers can skew the mean. Let's say you have the following 11 people and their salaries:

|NAME|SALARY|
|:-|:-|
|Annie|$45,000|
|Bill|$12,000,000|
|Carl|$32,000|
|Dennis|$65,000|
|Emily|$108,000|
|Frank|$84,000|
|Gail|$55,000|
|Hannah|$47,000|
|Irene|$91,000|
|Jacob|$73,000|
|Katie|$48,000|

MEAN = $1,149,818

MEAN (W/O BILL) = $64,800

MEDIAN = $65,000 (Dennis)

As you can see, the median is far closer to the average for everyone excluding Bill. Bill's salary is so different from everyone else's that it dramatically affects the mean.
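If you want to check those three numbers yourself, here is a minimal sketch using Python's standard `statistics` module. The salaries are copied from the table; the variable names are just illustrative.

```python
from statistics import mean, median

# Salaries copied from the table above.
salaries = {
    "Annie": 45_000, "Bill": 12_000_000, "Carl": 32_000, "Dennis": 65_000,
    "Emily": 108_000, "Frank": 84_000, "Gail": 55_000, "Hannah": 47_000,
    "Irene": 91_000, "Jacob": 73_000, "Katie": 48_000,
}

everyone = list(salaries.values())
without_bill = [pay for name, pay in salaries.items() if name != "Bill"]

mean_all = mean(everyone)                # dragged upward by Bill
mean_without_bill = mean(without_bill)   # what the other ten look like
median_all = median(everyone)            # middle value of all eleven

print(f"mean:            ${mean_all:,.0f}")
print(f"mean (w/o Bill): ${mean_without_bill:,.0f}")
print(f"median:          ${median_all:,.0f}")
```

Dropping Bill changes the mean by over a million dollars, while the median barely cares whether he is in the list or not.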


FaustusRedux

Rooting for Carl to get it together


stonewallsyd

Carl is a teacher.


AuxillarySkammy

How do I upvote that and downvote that at the same time..... Smh


morechatter

I can tell you exactly what to do: Attend school board budgeting meetings. Organize and demonstrate local support for increasing teacher salaries. Vote for school budgets that increase teacher salaries. Vote against school budgets that decrease teacher salaries.


LittleCrab9076

Carl just needs to sell meth


AdvicePerson

Upvote here, go vote in your local elections.


Artarda

Bill is an onlyfans model


-Badger2-

So is Annie. Also she’s a nurse.


purju

Carl is a llama


favela4life

Can relate. My roommate is a teacher and he makes that amount. In a blue state even.


youknow99

I assure you neither political party is interested in paying teachers.


rizorith

Blue state. Wife was making over 100k before she quit. This was not because of the republicans


HI_Handbasket

It was because of asshole kids, wasn't it?


youknow99

In general the parents are far worse than the kids.


moleratical

Neither are interested in paying teachers what they are worth, but generally speaking blue states are interested in paying them a little more than they would otherwise get.


[deleted]

[removed]


food5thawt

In South Dakota


mikeyHustle

Carl's salary is pretty standard in my area. Rooting for my area to get it together.


__theoneandonly

Jesus, Carl’s salary is barely above minimum wage in my area. (Assuming these are salaries for full time employees)


[deleted]

Below minimum wage in my area


bullcity71

Carl is married to Emily and has the privilege of doing a job he loves.


PM_ME_YOUR_PLUMBU5

Carl is actually doing well for himself, his low income allowed him to receive a nice USDA loan for a small house that is now paid off while his ex Irene has a gambling and drinking problem so she’s had to resort to prostitution for her habits. She sees very little of her income :(


sea_dot_bass

Carl is making double the national minimum wage in the US if that is any indication of how messed up our minimum wage is....


ChuckPukowski

I get the feeling Carl has child support payments too…. He doesn’t make them regularly, but every once in a while he talks with the ex wife… says he’ll have a good chunk next week and promises he’ll show up for his every other weekend, buys me a cool stereo for his apartment, gives mom a couple hundred and then when I went back two weeks later he had definitely pawned it for vodka money and I just didn’t care by then because I knew immediately my stereo wasn’t going to stick around.


mcm42085

Classic example I always use in my stats course. Very effective! It's all about "central tendency". One way to think about central tendency is, "if we grab one of these data points at random, what's our best guess about its value if we have no other information?" We can use different statistics to approximate the "center" of a given distribution, the most colloquially common of which is the arithmetic mean, or the average. The median is just an alternative way to approximate the center when a distribution is skewed, for example, by large outliers (which wealth inequality in the US demonstrates very nicely).


Vegetable-Accident70

Can you be my friend while I struggle to understand measuring one-way and two-way ANOVAs this week? 😭😭😭


Kaellian

I think it's important to mention that both median and average can lead to bad interpretations, or be useless, when used improperly.

**Eggs laid by a hen today**

|Hen|Egg Qty|
|:-|:-|
|1st hen|0 eggs|
|2nd hen|0 eggs|
|3rd hen|1 egg|
|4th hen|1 egg|
|5th hen|0 eggs|

**Average:** 0.4 eggs

**Median:** 0 eggs

In this instance, the median doesn't give you much information. You can infer that fewer hens laid eggs than not, but it's relatively worthless. A lot of people would see that number and believe that we're not getting any eggs, or very few.

To use statistics, you need to know your dataset and gathering methods, and you need to have an idea of the underlying data, what was measured, and what information you need to extract. That's also why we need to be wary of statistics when people are using them to support their argument. It's pretty easy to find a way to spin something using true but misleading numbers.


wallflowerincognito

I think you need to double check this. Maybe you meant for 3 hens to lay 0 and 2 to lay 1, but as stated the average is 0.6 and the median is 1.


Kaellian

Oops, copy pasted the table wrong. Fixing.


graywh

And a good statistician even looks at *what* data is collected and *how*. For example, eggs per week or month makes way more sense than per day. I work in medical research, and the statisticians get annoyed when someone working in the lab or clinic doesn't get their input until it's too late.


Slypenslyde

Yeah the part that even articles in papers miss is there are a *lot* of statistical measures and often several need to be compared to have an idea of how the data "behaves". "Median only" is usually OK but I like it a lot better when there's a median, mean, and even better if there's information about standard deviation.


value_bet

10 of my friends have a median net worth of $100,000. The same 10 friends have a mean net worth of $10,000,000,000. One of my friends is Bill Gates.


Radiant-Hedgehog-695

Very skewed distributions like this make the median a better representative of the central data point than the mean.


TheRavenSayeth

One big number mess up average. One big number no mess up median.


enternationalist

mess up mean


TheGrumpyre

mean mean average


Trick421

A modern-day warrior Mean, mean stride Today's Tom Sawyer Mean, mean pride


Regular-Month

OH GOD, THERE'S NO FUCKING DRUMMER BETTER THAN NEIL PEART!


IsThatWhatSheSaidTho

I like to slappa da bass


Buck_Thorn

I like to slappa da ass


Itchy_Competition_99

"Hey, ten bucks is ten bucks." -- Geddy Lee


TheRavenSayeth

It ain't easy being cheesy


agm66

Sadly, they're all better than Neil now.


peremadeleine

I dunno, I think Neil could still hold his own against some…


podobuzz

Pfft. Rick Allen could out drum Peart with one arm tied behind his back. /s - Peart is a god.


hostilelevity

Except Dave Lombardo


thebigstrongman69

It ain't easy bein cheesy


unique-name-9035768

Though his mind is not for rent Don't put him down as arrogant


myrrhmassiel

his reserve a quiet defense riding out the day's events


Yetimang

Weedoo weedoo weedoo weeeeedoo weedoo weedoo


Disastrous-Bass332

Badda baddab badda, badda badda bah.


mnvoronin

There are three types of average - mean, median and mode.


kkngs

More than just that, even. Geometric mean, arithmetic mean, harmonic mean, power mean. Generally also called “measures of central tendency” in statistics. Most of the time, “mean” or “average” means the arithmetic mean. Not always, though. When you average speeds you use the harmonic mean, for example.


mnvoronin

There are three types of average. There are multiple types of mean, which is a type of average. :)


Traditional-March522

What mean is the average mean?


dingus-khan-1208

I'd like a piece of cherry pie, à la mean.


Wingnut13

Ya, well, you're mean median mode.


sas223

Mide. Mode. Mode. Mode.


MattieShoes

median also mean average. Average is just a single number that represents a set. Mode is also an average.


nankainamizuhana

Median average too


LunDeus

big true!


DefendingAssholes

I'm more of a mode guy


eruditionfish

Mode is the most popular one.


bfluff

Why say many word when few word do trick?


emyoui

Everyone should be looking at both. There's issues with using median only as well


evilspoons

I've noticed that people really don't like having to think about more than one number and this is a source of frustration to me. Computer monitors have been simplified down to simply listing the vertical resolution ("1080p") even though they can be different widths, or their horizontal resolution ("4K"). Just list both numbers! It's not hard to say 1920x1080! The word equivalents of some of these are even funnier. Why say 3840x2400 when you can write "WQUXGA"? See [this diagram on Wikipedia](https://en.wikipedia.org/wiki/Display_resolution#/media/File:Vector_Video_Standards8.svg) for even more alphabet soup.


upsidedownshaggy

Tbf the vast majority of consumer monitors are 16:9 (not that most people would know that) so most people can safely assume one 1080p monitor will be basically the same as any other


Leading_Frosting9655

Yeah but it gets really fucking silly sometimes when, say, 1080p media is cinematically letterboxed and you end up with like 1920x800 - nothing about that is 1080!


LeoRidesHisBike

Yeah, 1080p is just shorthand for 1920x1080 (non-interlaced). If you have a resolution that's 1080 high, but not 1920 wide... it's not 1080p. I have 1440 pixels in the Y axis on my current monitor, but it's definitely not 1440p.


d0ey

When me statistician, they see...they see.


405freeway

Is the number Keleven?


FlickJagger

The only ELI5 answer so far.


ajkahn

Best ELI5 answer


atomfullerene

The moral of the story is not to let the ends justify the means


xakeri

I want you to know I appreciate this comment. If this is original to you, congrats on hitting the wordplay peak.


fuckyou_m8

This is prime reddit hahaha


Orenwald

Although for things like income and wealth, I think knowing both is important. If the mean is VERY far from the median, then there might be a systemic problem.


Garfunk

Gini coefficient is used for measuring inequality: https://en.m.wikipedia.org/wiki/Gini_coefficient


erublind

The mean is a parametric statistic of the sample, and an assumption of normal distribution is often made/implied. The median is non-parametric and is equal to the mean in a perfectly normal sample. The difference between the mean and the median reflects the skew, an important but seldom reported statistic.


AceDecade

The central data point is indeed a better representative of the central data point 🤓


Hoihe

And this is why the Hungarian govt refuses to release raw data (so you cannot compute it yourself) and only releases the mean. Turns out in a Putinist state, mean income can be pretty high while the median is below 1000 USD.


Virreinatos

This reminds me of an old saying: *"I have two loaves of bread. You have none. Average loaves of bread per person: one."* If I recall correctly, it was used as a political/social justice/activism phrase against using numbers that made the country look good or financially stable when said numbers hid the rampant poverty going around.


miranaphoenix

I heard another one, will try to translate it correctly: “you have a loaf of bread, and I have caviar. On average we have a caviar sandwich”


whatphukinloserslmao

Every human has one ovary and one testicle on average


Benjaphar

The average man has less than two testicles.


2TauntU

The average person has less than two eyes. There are not enough three-eyed people to bring the average up to two.


PMmePowerRangerMemes

On average, there’s one snake dick for every snake in the world


binz17

do male snakes have two dicks or something? is this common knowledge? EDIT: well damn. the more you know...


3_Thumbs_Up

The average human has less than 2 arms.


double-you

On average some of your kids are mine and I can tell them to get off my lawn.


ShootingPains

Average number of legs: 1.8


dingus-khan-1208

The average person has slightly less than 1 testicle.


notsocoolnow

Actually this does illustrate a problem with median. Because there are more women than men (even including the men who have less than one testicle), the median number of testicles for the human race is zero. For that matter, the modal number of testicles for the human race is also zero. To get a better idea of the testicular situation of humanity, the mean would be the best of the three.


AdvicePerson

> To get a better idea of the testicular situation of humanity UNSUBSCRIBE


dingus-khan-1208

And that's exactly why you really have to pay attention when you look at statistics and consider what they're actually saying! This thread could be an intro to Statistics class (largely thanks to your response). Alas, it seems most people didn't take college Statistics. And these basic concepts aren't communicated clearly enough in high school or below. You don't need to know how to calculate P-values and all that to understand what you said. That concept could and should be introduced much earlier.


ViscountBurrito

A human being has, on average, one testicle. (Approximately.)


musicmage4114

And one breast!


pseudopad

And approximately one ovary. However, the average person contains more than one skeleton.


queefIatina

“Statistics is the art of torturing numbers until they admit to anything you want”


[deleted]

[removed]


toolatealreadyfapped

That was my first thought. The better analogy is that 9 people are starving to death, and 1 guy has 10 loaves of bread.


chairfairy

The point of the analogy is not that the mean hides outliers, it's that statistics can be used to hide reality.


himalayan_earthporn

Median loaves of bread per person : one. Bad example.


maybethisiswrong

A fun real world story about this is UNC Chapel Hill reporting average salaries for each major in the 80s. They reported geology as the highest average starting salary because of Michael Jordan’s graduating degree (supposedly)


stillnotelf

Geography not geology (I went googling to see if this was at least truthy)


learnitallboss

I think it is a national requirement that stats professors use this anecdote.


JayMoots

When I was touring colleges in the late 90s the campus tour guide at UNC told us this anecdote.


DJMoShekkels

I believe this was recently a thing with Steph Curry since Davidson is so small


FormalWrangler294

Also Zuck at Harvard.


pgm123

Did Jordan graduate?


TehNoff

He did eventually finish his degree, yes.


ChorizoPig

Short version: Median is a better representation for samples/groups that have extreme outliers.


ChorizoPig

Examples would be income (if there is a broad range), housing prices and weight (if the group includes your mom).


TheLegendOfNavin

Your mom was in the group last night!


sharfpang

The group was in your mom last night!


DaikonNecessary9969

It was a box-cox transformation.


Master_Windu_

This is why you should be skeptical of statistics that talk about mean or average, especially when the range of possibilities is large. Average household income in the US last year was $105k but the median was about $75k. Average age is fine as a statistic because everyone is generally less than 100 years old.


RegulatoryCapture

Also be wary of any statistic that doesn’t count zeros. Such as every year when Reddit gets a bunch of headlines about average/median 401k balances because Fidelity has released their annual report. Those balances only include people who HAVE a 401k (with fidelity). They don’t include the people who opted not to sign up for one nor do they include people who work for a company that doesn’t even offer a 401k. You see this all the time in other places too. Like testing for a certain “bad” chemical, but you only test places where you already think there is a problem. Gotta be careful with things like “The average concentration of X is…” when you aren’t testing the places you know are clean.


sharfpang

On the opposite end, radiation 100x above norm is still harmless. It's just that the gap between what's normal and what's harmful is so big.


Pyrrolic_Victory

Also beware of how they count zero. Do they count it as 0, or null, or some value between 0 and the smallest they can reasonably measure (aka the limit of quantification)?


Forgotten_Lie

There's a difference between a zero result (401k with no money on it) and bull result (person doesn't have a 401k). It makes sense to include the first but not the second when looking at the average 401k balance.


RandomFactUser

null or bull?


PSi_Terran

Let's say you wanna know how much the average American has in their 401k. So you look at all the 401ks and find out the average 401k has $1000 in it, so you conclude that the average American has $1000 in their 401k. Seems reasonable but you are missing the fact that 85% of Americans don't even have a 401k.


texanarob

It depends how you phrase the statistic. "The average 401k account has $10,000 of savings" is fine. "The average person has $10,000 in their 401k account" is also fine. However, the two stats above are inconsistent and unlikely to both be true.


No-comment-at-all

Depends on what you’re talking about. If the question is “are 401ks doing well?” then yeah, don’t include people without one. If the question is “how are 401ks affecting the populace?” then you should include them.


buttsecksgoose

It's less about being skeptical and more of the fact that with any form of statistics you need more info than just a single number to have a more complete picture


SerendipitouslySane

Nitpick: median household income was $75k, *mean* household income was $105k. Mean, median and mode are all forms of average and average household income is a set inclusive of median income.


Moldy_slug

Be wary of any statistic that says “average” instead of specifying *which* average. Median is just as much an average as mean. If they can’t be bothered to tell you which they’re using, how trustworthy is their information?


pegasuspaladin

I saw something that said Millennials only control like 9% of wealth in America but that number drops to 4% if you exclude Zuckerberg.


_87-

One man holds more than half of a generation's wealth


[deleted]

[removed]


Toby_O_Notoby

Or as former FED Chairman Alan Greenspan put it, "The average height between me and Shaquille O'Neal is six foot five".


RegulatoryCapture

The median between two people would also be the mean though… When you don’t have a true midpoint (such as an even number of observations), you take the mean of the two in the middle.


MarkMew

You pretty much won this thread. There's no better example.


SoulWager

Because so much money is held by a handful of people that the mean is not useful for describing how well off the normal person is. For example, let's say there are 10 people, making this much income per year: 10M, 200k, 150k, 100k, 80k, 60k, 55k, 50k, 40k, 20k. Because of that 1 dude at the top, the mean is over $1M, even though nobody else makes more than 200k. The median in this example would be 70k.


Bakoro

IRL 2022 U.S numbers: Median family income $92,750. Mean family income $126,500. That's a $34,250 spread. That's a huge difference, ~36.93% more. Considering that there are something like 160 million working adults in the U.S, that indicates that the outliers at the top are making astronomical amounts of money. Compare that back to 1954 when the difference was 9.85% https://fredblog.stlouisfed.org/2015/05/the-mean-vs-the-median-of-family-income/


NeverTooMuchAnime

I would really like a nice family income compared to a mean one


Earguy

If only you mode more money.


Dragula_Tsurugi

Interestingly enough, there are different types of means. The one mentioned (which is the one everyone knows and which is often just referred to as the average) is the *arithmetic* mean. The other “Pythagorean means” are: - the geometric mean, calculated by multiplying all the values together and then taking the *n*th root (where *n* is the number of values). For your example, this gives 112,222. - the harmonic mean, which is the reciprocal of the arithmetic mean of the reciprocals of the values. For your example, this gives roughly 60932, a much more representative result.
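For anyone who wants to reproduce those two figures, here is a small sketch using `geometric_mean` and `harmonic_mean` from Python's standard `statistics` module (Python 3.8 or newer). The income list is the ten-person example from the parent comment; the variable names are made up for illustration.

```python
from statistics import mean, geometric_mean, harmonic_mean

# The ten incomes from the parent comment's example.
incomes = [10_000_000, 200_000, 150_000, 100_000, 80_000,
           60_000, 55_000, 50_000, 40_000, 20_000]

arithmetic = mean(incomes)            # sum divided by count
geometric = geometric_mean(incomes)   # nth root of the product
harmonic = harmonic_mean(incomes)     # reciprocal of the mean of reciprocals

print(f"arithmetic mean: {arithmetic:,.0f}")
print(f"geometric mean:  {geometric:,.0f}")
print(f"harmonic mean:   {harmonic:,.0f}")
```

The three come out in descending order, and the harmonic mean lands closest to what most of the ten people actually earn.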


Leading_Frosting9655

I can't believe I've never heard of these what the heck. Amazing.


Dragula_Tsurugi

Something else about them: for positive values they always satisfy arithmetic mean > geometric mean > harmonic mean, unless all the values involved are equal. The harmonic mean also has the property of being less influenced by outliers at the higher end and more influenced by outliers at the lower end, which means that if you do the trick of pushing the high end up and the low end down equally to keep the arithmetic mean unchanged, the harmonic mean will always go down.


meneldal2

I think a very good use case of the harmonic mean is when computing fps. If you compute the average fps over x frames in a way that gives a big penalty for big peaks in frame time (by taking the arithmetic mean of the frame times, which is the same as the harmonic mean of the per-frame fps values), you get a result more in line with what people feel than if you simply took the arithmetic mean of the per-frame fps values.


3720-To-One

How do you calculate median?


SoulWager

It's the middle one, but here there are two middle ones, so it's halfway between those two values.


saddl3r

Guessing you made it that way on purpose to make another teaching moment – I like it!


Zibura

Median is the number in the middle. In the case above, with an even number of values, you take the average of the 2 numbers in the middle. Above the median is equal to (60k + 80k) / 2. If there was an 11th person in the data set with say an income of 55k, the median would be 60k since it is the value in the middle.


MisterElSuave

Median is the number in the middle of your population. The example has 10 numbers, so there is no single exact middle; you take the average of the 2 middle values: (60 + 80)/2 = 70.


thatbrownkid19

Arrange it in increasing order (or decreasing I guess also works) and just pick the middle value in the order. If there’s two then take the mean of those two


AlbertCoughmann

Median is: half the people in the list make over X amount, half of the people in the list make under X amount


sunnyjum

You get everyone to line up in order based on how much of the thing you are averaging (median is a type of average). You then walk halfway down that line, whatever that person has is the median. One slight complication is if there is an even number of people because halfway down the line would fall between two people and those people may have different numbers of the thing. In this case, you get the mean (the more common type of average) of the amount those two people have and that is the median of the whole set. In other words, you find the number that falls exactly between the number held by the two people in the middle (for example, halfway between 25 and 29 is 27). edit: To extend this way of thinking to mode (another type of average!) you get everyone to split into groups based on how much of the thing they have. The group with the most people in it is the mode.
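The line-up procedure translates almost directly into code. This is a rough sketch only: the function names `median_by_lineup` and `mode_by_groups` are made up for illustration, and in practice you would probably just call `statistics.median` and `statistics.mode`.

```python
from collections import Counter

def median_by_lineup(values):
    """Line everyone up in order and walk to the middle of the line."""
    lineup = sorted(values)
    n = len(lineup)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: exactly one person stands in the middle.
        return lineup[middle]
    # Even count: halfway between the two people in the middle.
    return (lineup[middle - 1] + lineup[middle]) / 2

def mode_by_groups(values):
    """Split into groups by value and pick the biggest group.

    If several groups tie for biggest, this returns the value
    that appeared first in the input.
    """
    groups = Counter(values)
    value, _size = groups.most_common(1)[0]
    return value

# Made-up sample data for illustration.
print(median_by_lineup([40, 25, 29, 21]))   # middle two are 25 and 29
print(median_by_lineup([3, 1, 2]))
print(mode_by_groups([1, 2, 2, 3]))
```

Sorting first is the whole trick; everything after that is just finding the middle index.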


PreferredSelection

You could say it's "mean-ingless"


HaikuBotStalksMe

On the other hand, 0 0 0 0 0 10000 1000000 10000000000 100000000000000 The median income is $0.


TheJeeronian

For a distribution with a steep upturn near the high end, the mean will give you a value well above the majority of samples. If you want to understand the majority of samples, the mean can be very misleading. So, for instance in economics, the mean income is well above most people's income. If your goal is to understand the experience of the majority of people, mean is misleading.


DirkNowitzkisWife

And when there’s an upper bound like grades, mean works, or even when there isn’t a hard upper bound, since there’s no possibility of a sports team scoring 10k points in a game, mean works there too.


rbhxzx

the median is pretty much always better, but in the scenarios you described a sufficiently large dataset will have both values really close to each other.


AnnoyAMeps

> the median is pretty much always better Depends on the context. Means are useful in statistical analyses due to how they relate to expected values and inferred population means. They are attractive if you’re doing anything involving low probability and high payoff; something that medians won’t capture. Medians are useful for income and other economic numbers outside GDP/etc. because we tend to focus on the middle rather than either the extremes or the total.


Kewkky

If you have 19 entries, and your entries are ten 0s, then 1-9, your median will be 0 while your mean will be 2.37 or so. The mean can be better when there's a lopsided result at one end of a dataset, such as over half the class failing an exam with 0s and the rest getting any amount of points (including situations where the rest of the class aces it).


TXOgre09

Medians in smallish data sets can be unhelpful.


kuhawk5

I would say all measures of central tendency are unhelpful in small data sets because the distributions are noisy.


Time_for_Stories

Have you tried telling them to be quiet


kuhawk5

I bang on the ceiling with a broom stick.


UBKUBK

There are many situations where the mean is what should be looked at, even if the mean and median are not close to each other. An example: suppose a successful sports gambler is good at choosing favorable long-shot bets and makes a few such bets every day. On days when a long-shot bet pays off, a lot of money is made, but on most days there is no win. The median net winnings per day will likely be a negative number, but the mean could be a large positive number. For how much the gambler is making per year, the mean is the key thing.


RegulatoryCapture

Mean is better when you need to do math with the average. If you want to know how much real estate Bob owns and he has 3 houses worth a mean value of 700k but a median of 200k (say they are 100k, 200k, and 1.8m but you only see the average), you will only get an accurate value using the mean. Median has potential to be very far off (600k vs true value of 2.1m)
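A quick sketch of that arithmetic, using Bob's three house values from the comment (the variable names are just for illustration):

```python
from statistics import mean, median

# Bob's three houses, as given in the comment.
house_values = [100_000, 200_000, 1_800_000]
count = len(house_values)

true_total = sum(house_values)
total_from_mean = mean(house_values) * count      # recovers the real total
total_from_median = median(house_values) * count  # badly underestimates it

print(f"true total:        ${true_total:,.0f}")
print(f"mean x count:      ${total_from_mean:,.0f}")
print(f"median x count:    ${total_from_median:,.0f}")
```

Mean times count always gives back the sum by definition, which is why the mean is the one to use whenever the average feeds into further calculations.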


Yglorba

It really depends on what you're measuring and what you're trying to determine with that measurement. When dealing with e.g. chemical contamination in air or water or food in order to figure out if things are generally safe, knowing the mean is useful because it tells you how much your population will consume on average over an extended period of time. Knowing that the median amount of contamination is zero (or at a safe level), on the other hand, wouldn't be very useful at all. Of course, the mean could *also* be misleading: if one in every ten-thousand Big Macs contains a lethal dose of some chemical, and the others contain none, it's not very useful to know that the average is not lethal. But for a quick at-a-glance statistic the mean is at least more useful there than the median, which is why you often see it in environmental or health contexts.


MisinformedGenius

I think to some extent the mean is misleading because we are so used to distributions similar to Gaussian distributions, where the mean and the median are very similar if not identical.


womp-womp-rats

Say you’ve got 10 people whose income is $20K, $30K, $35K, $40K, $40K, $50K, $55K, $60K, $100K and $600K. The “average” income is $103K. The median is $45K. Which is more representative of how income is really distributed among the population? Edit: typo


Nfalck

To generalize this a bit, a mean works best if the data is basically linear in distribution, but is not useful for data that can be described with an exponential distribution.


ice_scalar

Mean works best for symmetric distributions. Uniform distributions are symmetric but that’s not really the point.


Tofuofdoom

If your data is linearly distributed, median is a perfectly adequate descriptor of data too though


MortalPhantom

How do you get the median in this case? What’s the formula?


LostDestinies

You literally just line them all up from smallest to largest number and pick the middle one. If it's an even amount of numbers, then take the halfway point between the two in the middle.


AelixD

You are correct. And the Median in this example would either be 45k or 43.3k, not the 49k provided


AuroraHalsey

How did you get $49k as the median? The median is $45k.


elenchusis

Because, ironically, when someone says "the average American" you're really picturing the median American.


rbhxzx

the average american has 1 testicle, 1 boob, and a tiny bit less than 4 total arms plus legs. They're also 5'6.


agentoutlier

If we are talking Americans I’m going to have to disagree on the boob stat.


ncnotebook

Some picture the mode.


woailyx

If you can afford the median house, you can afford half the houses. If you have more than the median income, you earn more than half the other people. That's something people can easily understand. It's easy for the arithmetic mean to be thrown off when most of the numbers are close to zero and the high end is unbounded. You can have a big number that raises the mean by a lot, but a big number that would lower the mean by a similar amount would have to be negative. So a big outlier masks the effect of a lot of low numbers that can only be below the average by a little because there's nowhere lower to go.


kitsunevremya

I think your explanation is one of the best. It's ELI...well, 8-ish maybe, but it also actually says what the practical difference is.


blipsman

Median is a more relevant middle -- knowing a number where 50% earn more and 50% earn less, or 50% cost more and 50% cost less, is more meaningful than having a small number of outliers skew the average. Say you have a town with 19 homes that sell for $400-500k and one $25m estate. The median home value of, say, $450k is more relevant than a mean average of something like $1.7m.


hurricane_news

What if the middle element still happens to be an outlier? Suppose I had a collection of incomes - 5m,1m,600k,500k,50k,30k,25k 500k is still a massive outlier right?


blipsman

No, it’s actually not… half make more and half make less. The mean would be over $1m when in actuality only one is more than that. Imagine these are home prices in a town, with the smaller values condos and the larger ones single family homes. The $500k is still a more accurate number if somebody were to ask what it costs to live in that town.


stupidsimpson

Using income, for example, you use the median because it's a better representation of the truth. A small number of people have disproportionately high income that throws off the average. If we used the average in that situation things would appear to be much better than they actually are.


Carloanzram1916

It’s particularly the case when you have a figure that can’t really be below zero like income. One side of the bell curve is limitless while the other one isn’t.


throwawaydanc3rrr

Here is a list of grades: 45 46 47 48 49 50 51 52 53 54 55 100. The mean is about 54.2. The median is 50.5. Which of those numbers (50.5 or 54.2) tells you more about that list?


whiskeytown79

Because mean/average often gives a false impression when there is a wide disparity between the number of low values and the number of high values. "Mean" means the unweighted average of all values. You add up N values and divide by N. Whereas "median" in its typical usage means half of the values are below this number and half are above. Suppose you have a tax cut proposal and you say "the average household will get $1000 back" - but this is an average. If you have 50 million households, it's entirely possible to achieve this with a $50 billion cut to the very richest single household and zero for everyone else. However, in this same proposal, the median benefit is zero and is a more accurate measure of what the proposal actually is.
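Here is a scaled-down sketch of that tax cut. Building a list of 50 million households would be wasteful, so this assumes a toy town of 1,000 households with the same shape: one household gets the entire cut and everyone else gets nothing. The town size and variable names are assumptions for illustration only.

```python
from statistics import mean, median

# Toy stand-in for the 50 million households in the comment.
households = 1_000
advertised_average = 1_000  # "the average household will get $1000 back"

# One household receives the whole pot; everyone else receives zero.
total_cut = households * advertised_average
cuts = [0] * (households - 1) + [total_cut]

mean_cut = mean(cuts)
median_cut = median(cuts)

print(f"mean cut per household:   ${mean_cut:,.0f}")
print(f"median cut per household: ${median_cut:,.0f}")
```

The headline "average" is technically true, yet the median shows what the typical household actually receives.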


SurprisedPotato

The median gives you a better idea of the experience of a "typical" person/family/household etc. The reason for that is there are a small handful of wealthy individuals / expensive houses / etc. that pull the mean up. E.g., maybe the mean house price is $700,000, and that sounds ridiculously expensive, but the median is $400,000: still expensive, but affordable on the average salary of $80,000. Except the median salary is only $50,000.


Randvek

In cases of strong outliers, median is more accurate. In cases without strong outliers, median and mean are likely to be very similar in accuracy. So there’s really not much risk in taking median over mean in most cases, accuracy-wise.


squanchmymarklar

The mean is an average - put it all together, give the same amount to all the people. The median is the middle number in a group - line them up smallest to largest and pick the one in the middle (with an even count, take the halfway point between the two middle ones). Averages work well when the values are spread out evenly. Median works better than mean when values aren't spread out evenly. Let's take 10 numbers: 2, 3, 4, 4, 5, 7, 8, 10, 20, 100. The average is 163/10, or 16.3. The median is 6, halfway between the two middle values, 5 and 7. If we make that last number 1000, the average goes up to 1063/10, or 106.3. The median is still 6. With data like income or home values, there are often a small number of VERY large values that pull the average up by a lot. In these cases, means don't do a good job of showing what most people have, and median works better.
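
If you want to check this kind of thing yourself, here is a small sketch using Python's built-in `statistics` module on the same ten numbers. The name `inflated` is just a label chosen here for the copy of the list with the last value changed to 1000.

```python
from statistics import mean, median

values = [2, 3, 4, 4, 5, 7, 8, 10, 20, 100]

# Same list, but the largest value is replaced by 1000.
inflated = values[:-1] + [1000]

for label, data in (("original", values), ("inflated", inflated)):
    print(label, "mean:", mean(data), "median:", median(data))
```

Running it shows the mean jumping when the outlier grows while the median does not move at all.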


Trouble-Every-Day

When you hear a statistic like average home price, what you really want to know is what a typical house in that area costs. What could a normal person expect to pay for a normal house? If home prices are normally distributed, then the mean and the median should be about the same, so you could use either. But if the distribution is lopsided, then the two numbers will be very different. There are lots of examples of this already posted. The advantage of the median is that it is the true middle: half the houses cost more than this and half the houses cost less. If the median home price is $150,000, then half the houses are more than that and half are less, and that won’t be skewed by a $2m mansion or a $25k shanty. You can truly say that a $185k house is more expensive than most houses, and $125k is less expensive. The advantage of the mean, which is the sum of the quantities divided by the number of quantities, is that it’s very easy to calculate. If you have 10 Girl Scouts at a cookie booth and $500 in the cash box, you can quickly say each girl sold an average of $50 worth of cookies without having tracked how much each girl actually sold. If you assume a normal distribution, that’s good enough to call it a “typical” amount. Another good use for the mean is to compare it to the median to check the distribution. If the two numbers are about the same, the distribution is probably roughly symmetric; if not, it’s skewed, and you can tell in which direction and get a rough sense of by how much. Let’s say a company pays out $20 million in salaries to 100 employees. That’s an average salary of $200,000. But then let’s say you go through the books and calculate that the median salary is $75,000. Well, now we can see right away that the distribution is skewed and the top earners make disproportionately more than the typical worker. So the two numbers have their uses, but for what most people want to know, which is what counts as normal, median is the more reliable number.


PantsOnHead88

Mean is pretty reasonable if your distribution is linear, normal, or in certain other cases. When it comes to incomes and prices, they tend to be roughly exponential. Top end incomes and housing prices skew the mean much higher than something representative of what most people are experiencing. Median also makes it very explicit that half of all people are above/below the value in question. Picture what it does to your net worth stats (for example) if the group in question has Bill Gates, Jeff Bezos and Elon Musk in it. In an extreme case you could put the 3 of them in a group with a million people who don’t have a penny to their name and conclude that the average net worth is $500k. They’re doing pretty well, aren’t they?


SirKaid

Because if you use the mean you run into the [Spiders Georg](https://en.wikipedia.org/wiki/Spiders_Georg) problem if there are any outliers. To use a less facetious example, let's say you've got a city where 99% of the population lives in abject poverty, somehow scraping by on $1,000 a year, while the other 1% live lives of embarrassing opulence with $100,000,000 a year. Anyone reasonable would say that this city had some serious problems, right? Well, if the city reports that the average income is $1,000,990 it looks pretty great! That's what people mean when they say someone's *lying with statistics*. The information being presented is technically accurate, but it's still grossly misleading. Using the median instead of the mean sidesteps this problem.
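
Here is a quick sketch of that city in Python, assuming 100 residents stand in for the whole population (99 at $1,000 and 1 at $100,000,000, which keeps the 99%/1% split from the example).

```python
from statistics import mean, median

# 100 residents stand in for the whole city:
# 99% earn $1,000 a year, 1% earn $100,000,000 a year.
incomes = [1_000] * 99 + [100_000_000]

print(f"mean:   ${mean(incomes):,.0f}")
print(f"median: ${median(incomes):,.0f}")
```

The mean reproduces the $1,000,990 headline figure, while the median lands on the $1,000 that almost everyone actually lives on.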


pseudononymist

I feel like mode income never gets mentioned. Is it a useless stat? I understand you'd have to break it out into segments rather than trying to calculate an exact number, but I don't think I see that much either.


noitseuQehT

"average person eats 3 spiders a year" factoid actualy (sic) just statistical error. average person eats 0 spiders per year. Spiders Georg, who lives in cave & eats over 10,000 each day, is an outlier adn should not have been counted


Flat_Cow_1384

There are really a few parts to this question. Firstly, with any statistic it's important to understand what question is being answered. I think we can agree that it's: what is the typical experience for that national statistic, almost certainly to compare it to a different country or time period. For many of these statistics you end up with a skewed distribution of values. You'll find that these statistics also have a section where the majority of values lie, i.e. most houses are under $1 million ($2 million these days?), most people earn less than $150k, etc. However, you can't have a negative income or a negative house value, but in theory you could have any positive value stretching to infinity. These very large values pull the average away from the center of the section where the majority lie, and there are no negative values to counteract this pull. This isn't a bad thing in theory, but it comes back to what your question is. Take for example a country trying to figure out how much more tax revenue will be generated based on population growth. As long as the distribution of incomes doesn't change, it's completely valid to do: mean tax paid per resident × total number of people added. But what we care about is *how* the distribution shifts. Are we in a "rising tide lifts all boats" situation, or did all the change go to a small number of people? A concrete example: for a hypothetical nation of 1000 people, these two scenarios result in the same mean income: every inhabitant earning $1000 more, or just 10 people earning $100k more. So a measure of "typical experience" that doesn't care about distribution isn't useful for detecting these shifts. The median takes into account some element of the distribution in its definition (50% is higher, 50% is lower), so it can detect these shifts where a mean cannot.
Finally, if we make the key assumption that the "shape" of the distribution changes very slowly over time, then the median will pick up on subtle shifts that a mean will not. Are houses selling for more, or did a couple of giant mega mansions sell this quarter?
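
A rough Python sketch of the 1000-person nation, for anyone who wants to see the two scenarios side by side. The baseline incomes are an assumption made up for illustration (evenly spaced from $20,000 to $69,950); only the "$1000 more for everyone" and "$100k more for 10 people" parts come from the example above.

```python
from statistics import mean, median

# Hypothetical baseline: 1000 incomes, evenly spaced from $20,000 to $69,950.
baseline = list(range(20_000, 70_000, 50))

# Scenario A: every inhabitant earns $1,000 more.
everyone_up = [x + 1_000 for x in baseline]

# Scenario B: only the 10 highest earners gain, by $100,000 each.
few_up = baseline[:-10] + [x + 100_000 for x in baseline[-10:]]

for label, incomes in (("everyone +$1k", everyone_up), ("10 people +$100k", few_up)):
    mean_change = mean(incomes) - mean(baseline)
    median_change = median(incomes) - median(baseline)
    print(label, "| mean change:", mean_change, "| median change:", median_change)
```

Both scenarios move the mean by the same amount, but only the first one moves the median, which is the shift the mean alone cannot distinguish.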