The Cellar


HungLikeJesus 12-16-2011 10:00 AM

Interesting "Laws"
 
Stigler's Law of Eponymy

From Wikipedia:
Quote:

Stigler's law of eponymy is a process proposed by University of Chicago statistics professor Stephen Stigler in his 1980 publication "Stigler’s law of eponymy." In its simplest and strongest form it says: "No scientific discovery is named after its original discoverer." Stigler named the sociologist Robert K. Merton as the discoverer of "Stigler's law", consciously making "Stigler's law" exemplify Stigler's law.

HungLikeJesus 12-16-2011 10:06 AM

Benford's Law
From Wikipedia:

Quote:

Benford's Law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 about 30% of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than 5% of the time. This distribution of first digits is the same as the widths of gridlines on the logarithmic scale.


This counter-intuitive result has been found to apply to a wide variety of data sets, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). It tends to be most accurate when values are distributed across multiple orders of magnitude.


The graph (below) shows Benford's law for base 10. There is a generalization of the law to numbers expressed in other bases (for example, base 16), and also a generalization to second digits and later digits.
It is named after physicist Frank Benford, who stated it in 1938, although it had been previously stated by Simon Newcomb in 1881.
http://upload.wikimedia.org/wikipedi...nforda.svg.png
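
For reference, the percentages in the quote match the usual statement of the law, P(d) = log10(1 + 1/d) for leading digit d. That formula is not in the quoted text; it is the standard form, and the function name below is only for illustration. A minimal sketch:

```python
import math

def benford_probability(digit, base=10):
    """Expected share of numbers whose leading digit is `digit` (1 .. base - 1)."""
    if not 1 <= digit < base:
        raise ValueError("digit must be between 1 and base - 1")
    # log_base(1 + 1/digit), written with natural logs so any base works
    return math.log(1 + 1 / digit) / math.log(base)

for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
```

This gives about 30.1% for a leading 1 and about 4.6% for a leading 9, in line with the "about 30%" and "less than 5%" figures in the quote. Passing base=16 gives the base-16 generalization the quote mentions.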

Sundae 12-16-2011 11:59 AM

Sod's Law.
It's like ten thousand spoons, when all you need is a knife.

infinite monkey 12-16-2011 12:07 PM

1 Attachment(s)
Hahhahaahaa...brother just sent me this today.

Sundae 12-16-2011 12:17 PM

Synchronicity!
It must mean something.

Probably being deluged by spoons.

HungLikeJesus 12-16-2011 12:38 PM

Pareto principle
From Wikipedia:

Quote:

The Pareto principle (also known as the 80–20 rule, the law of the vital few, and the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes.

Business-management consultant Joseph M. Juran suggested the principle and named it after Italian economist Vilfredo Pareto, who observed in 1906 that 80% of the land in Italy was owned by 20% of the population; he developed the principle by observing that 20% of the pea pods in his garden contained 80% of the peas.

It is a common rule of thumb in business; e.g., "80% of your sales come from 20% of your clients". Mathematically, where something is shared among a sufficiently large set of participants, there must be a number k between 50 and 100 such that "k% is taken by (100 − k)% of the participants". The number k may vary from 50 (in the case of equal distribution, i.e. 100% of the population have equal shares) to nearly 100 (when a tiny number of participants account for almost all of the resource). There is nothing special about the number 80% mathematically, but many real systems have k somewhere around this region of intermediate imbalance in distribution.
Quote:

Due to the scale-invariant nature of the power law relationship, the relationship applies also to subsets of the income range. Even if we take the ten wealthiest individuals in the world, we see that the top three (Warren Buffett, Carlos Slim Helú, and Bill Gates) own as much as the next seven put together.
...
In the systems science discipline, Epstein and Axtell created an agent-based simulation model called SugarScape, from a decentralized modeling approach, based on individual behavior rules defined for each agent in the economy. Wealth distribution and Pareto's 80/20 Principle became emergent in their results, which suggests that the principle is a natural phenomenon.

In health care in the United States, it has been found that 20% of patients use 80% of health care resources.

Several criminology studies have found that 80% of crimes are committed by 20% of criminals.
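
The "k% is taken by (100 − k)% of the participants" statement can be checked on any list of shares. A rough sketch, assuming the shares are non-negative numbers; the name pareto_k and the sample lists are made up for illustration, and the answer is only approximate for small groups:

```python
def pareto_k(shares):
    """Return k such that about k% of the total is held by the top (100 - k)% of participants."""
    ranked = sorted(shares, reverse=True)
    n = len(ranked)
    total = sum(ranked)
    if n == 0 or total <= 0:
        raise ValueError("need at least one participant and a positive total")
    held = 0
    for i, share in enumerate(ranked, start=1):
        held += share
        # Stop at the first i where the top i participants' fraction of the
        # total is at least the fraction of participants left below them.
        if held * n >= total * (n - i):
            return 100 * (n - i) / n

print(pareto_k([1] * 10))              # everyone holds the same amount
print(pareto_k([16] * 20 + [1] * 80))  # 20 large holders, 80 small ones
print(pareto_k([100] + [0] * 99))      # one participant holds everything
```

The three sample lists give k = 50, 80 and 99, which matches the range the quote describes: 50 for equal shares, close to 100 when a tiny number of participants hold nearly everything.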

ZenGum 12-16-2011 05:53 PM

85% of ....

You know the rest. :)

Lamplighter 12-16-2011 06:52 PM

Quote:

Originally Posted by HungLikeJesus (Post 780792)
Benford's Law
From Wikipedia:

Quote:

This counter-intuitive result has been found to apply to a wide variety of data sets, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). It tends to be most accurate when values are distributed across multiple orders of magnitude.
OK, pls explain to me why this is "counter-intuitive"?
And what difference would it make if it was distributed across orders of magnitude?

If I am going to count anything, I start with "1". Therefore in a set of any size,
there will always be a larger number of "1's" than of "2's" than of "3's"...etc.
And to point out the obvious, in small sets, there may not even be a "9" or "0"

In the examples of the quote above,
a small town might have street addresses of 100's, 200's... to 700's, but no 800's or 900's. etc.

I guess I'm not getting Benford's idea of the whole thing. :neutral:

HungLikeJesus 12-20-2011 02:43 PM

As a real-world example, Lamp, let's say that we looked at the number of views for threads in one Cellar forum - Nothingland, for example. Would you expect the first digit to have a uniform distribution, or would you expect it to follow the distribution indicated by Benford's Law?

Lamplighter 12-20-2011 04:54 PM

HLJ, by the same "reasoning" I had above.
The first post (or first view) in the thread must be "1".
and there might or might not be a second ("2")
..If there is a "2" there might or might not be a third ("3")
..If there is a .... and so on up to "N"

That is, two "1's" must occur (1 and 10) before there can be two "9's" as the first digit.

So the probability at any given test of the number of posts is going to be higher for "1's" than any other digit, etc.
Therefore in repeated measurements, the distribution of digits will not be equal.

With respect to the "distribution indicated by Benford's Law", my example might or might not be the same.
But as in most treatises on Statistics, "The derivation is left to the reader" ;)

HungLikeJesus 12-20-2011 05:14 PM

OK - now someone just has to do the analysis.

ZenGum 12-20-2011 05:56 PM

Some things - street numbers, numbers of posts in a thread - lend themselves to a natural explanation of the preponderance of lower first digits. This is because they are built in a series - you can't have post 3 without post 2, but you can have post 2 with no post 3. So there will be more 2-post threads than 3-post threads.

Interestingly, though, it works just as well with things like river lengths and mountain heights, despite the fact that you CAN have a 3 mile long river without having a 2 mile long river. Weirder still, it holds up just as well no matter what units you measure in. Feet, meters, inches, whatever.
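
One way to see the unit claim in action, as a sketch only: the "river lengths" below are hypothetical values spread evenly on a log scale over four orders of magnitude (an assumption, not real data), and the helper names are made up. The same list is then converted from yards to feet and metres.

```python
from collections import Counter

def first_digit(x):
    """Leading significant digit of a positive number."""
    return int(f"{x:.15e}"[0])

def digit_shares(values):
    """Fraction of values starting with each digit 1..9."""
    counts = Counter(first_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

# Hypothetical lengths in yards: 4000 values evenly spaced on a log scale,
# from 1 yard up to just under 10,000 yards.
lengths_yards = [10 ** (4 * k / 4000) for k in range(4000)]
lengths_feet = [3 * y for y in lengths_yards]          # 1 yard = 3 feet
lengths_metres = [0.9144 * y for y in lengths_yards]   # 1 yard = 0.9144 m

for name, data in [("yards", lengths_yards),
                   ("feet", lengths_feet),
                   ("metres", lengths_metres)]:
    shares = digit_shares(data)
    print(name, [round(shares[d], 3) for d in range(1, 10)])
```

For data like this the three rows come out nearly identical, with roughly 30% of values starting with 1 in every unit. Data bunched into a narrow range would not behave this way.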

Lamplighter 12-20-2011 06:04 PM

Z, what is the difference between counting posts in a thread,
and measuring height or length of a natural object ?
It's not like counting live and dead cats in a box.

ETA: Above, I said:
...two "1's" must occur (1 and 10) before there can be two "9's" as the first digit.

But in fact,
...eleven "1's" (as the first digit) must occur (1,10,11,12,...and 19)
before there can be two "9's" as the first digit.

Sorry, but I'm just not seeing the significance of Benford's Law.
I must be misinterpreting something or other ???
.

HungLikeJesus 12-20-2011 06:25 PM

One condition of Benford's Law is that
Quote:

It tends to be most accurate when values are distributed across multiple orders of magnitude
If you make a list of consecutive numbers covering a large range (e.g. from 1 to 999) and count how many start with 1, 2, 3, etc. you'll find that there are an equal quantity starting with each digit (111, in this example). So wouldn't you expect that a list of measurements or values would have this same uniform distribution?

I would.
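
The count for 1 to 999 is easy to check directly. A quick sketch (the helper name is made up):

```python
from collections import Counter

def first_digit_counts(upper):
    """Count leading digits over the consecutive integers 1..upper."""
    return Counter(int(str(n)[0]) for n in range(1, upper + 1))

counts = first_digit_counts(999)
for digit in range(1, 10):
    print(digit, counts[digit])
```

Every digit comes out at 111. The even split depends on the list stopping just short of a power of ten: stop at 1,999 instead and the 1s jump to 1,111 while every other digit stays at 111.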

Lamplighter 12-20-2011 06:46 PM

HLJ and Z, you guys are talking to a dummy here... or a stubborn jackass.
I still don't see the difference.

I can argue that if we were measuring "a single" river,
the probability of leading digits = 1 would be skewed,
because only few rivers are 1 mile or 1,000 miles in length
compared to the number of rivers of 9 or 90 or 900 miles.
But that's a function of our definition of a "river" compared with a brook or stream.

I'll stop now, but I'm hoping someone will continue this discussion.
I'm willing to believe there is significance to this law... I just don't see it yet. :(
.

ZenGum 12-20-2011 09:19 PM

Hey, UT, how hard/easy would it be to analyse the cellar threads in terms of the number of posts? Then analyse that data in terms of the first digit? We could check this law on ourselves.

Lamplighter, remember that we are only focusing on the first digit. Let's take the number of posts in a thread as an example. To keep it simple I'll pretend that threads can't have more than 999 posts, but I'll explain later how to deal with the fact that they can.

Any thread with 1, 10, 11, 12, 13, 14, ... 19, 100, 101, 102 ... etc goes in the "starts with a 1" category.

Threads with 2, 20, 21, 22 ... 29, 200, 201 ... etc will go in the "starts with a 2" category.

We could continue this all the way to 9, and the possibility of any starting digit seems equal.

BUT! In reality, many threads have only a single post, or just a handful. Many struggle into the teens or twenties before they die. Fewer make it into the 30s and 40s, still fewer into the 80s and 90s. This means that there will be more thread totals starting with a 1 than any other digit.

The same pattern happens when we consider the 100s and 200s and so on.

And if we want to go past 999 posts, the same pattern will apply. 1,000 to 1,999 all start with 1, and so on. It is the same pattern as before.

In a sentence: thread post counts will usually start with lower digits because threads die before they can get to the higher digits.


Well, that is how it is for things like thread post counts. Here, they grow from one upwards without missing a step. You have to go through 1 to get to 2, and you might stop along the way, which is why there are more 1s than 2s. You have to go through the teens before you get to the 20s, and you might stop on the way, so again there are more 1s than 2s.
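
The "threads die along the way" argument can be turned into a toy model. This is a sketch under a strong assumption that is not from the thread: after every post there is a fixed chance (99% here, picked arbitrarily) that another post follows. The function name and parameters are hypothetical.

```python
def first_digit_shares(p_continue, max_posts=10_000):
    """First-digit shares of final thread lengths under a fixed continue probability."""
    shares = {d: 0.0 for d in range(1, 10)}
    for n in range(1, max_posts + 1):
        # Probability the thread stops at exactly n posts:
        # it continued n - 1 times, then died.
        p_stop_at_n = (p_continue ** (n - 1)) * (1 - p_continue)
        shares[int(str(n)[0])] += p_stop_at_n
    return shares

shares = first_digit_shares(p_continue=0.99)
for digit in range(1, 10):
    print(digit, f"{shares[digit]:.1%}")
```

Under that assumption roughly a third of threads end on a post count starting with 1 and about 5% on one starting with 9. The shape is skewed toward low digits, but this toy model is not claimed to reproduce Benford's percentages exactly.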

However, the case with things like river lengths is different, or at least it seems different to me. You can have a 2 mile river without there being a 1 mile river, so there is no risk of "stopping along the way". So the frequency of 1s and 2s in things like this is ... umm ... not explained in the way it is for thread post totals.

In fact, I cannot explain it and have never heard of a good explanation. It just is. And you'd think that changing the units of measurement - yards to feet, for example - should shift the results, since a 1 yard river is a 3 foot river ... but it doesn't, since all those 0.34 yard rivers are now 1.02 foot rivers.

It's freaking weird, now that I come to think of it.

tw 12-20-2011 10:54 PM

Nine is a most frequent digit whenever I buy gas. Nine appears more often than any other number in the price. Today, it was $3.299 per gallon. Nine gallons is a typical fillup. When they ask me how much, I say, "Give me the whole nine yards".

Whenever I buy gas, nine times out of ten, even the weather is nice. Change one letter and another nine appears.

Good weather always leaves me feeling on cloud nine. How can this be? Well, I always avoid one - the loneliest number.

ZenGum 12-20-2011 11:17 PM

:lol: are you :rasta:?

Remember, though, it is the first digit the law applies to.

(That makes it sound like a rude hand gesture, doesn't it?)

tw 12-20-2011 11:26 PM

Quote:

Originally Posted by ZenGum (Post 781828)
(That makes it sound like a rude hand gesture, doesn't it?)

Nine.

classicman 12-21-2011 12:24 AM

Nein!

Clodfobble 12-21-2011 08:36 AM

Quote:

Originally Posted by ZenGum
However, the case with things like river lengths is different, or at least it seems different to me. You can have a 2 mile river without there being a 1 mile river, so there is no risk of "stopping along the way". So the frequency of 1s and 2s in things like this is ... umm ... not explained in the way it is for thread post totals.

In fact, I cannot explain it and have never heard of a good explanation. It just is. And you'd think that changing the units of measurement - yards to feet, for example - should shift the results, since a 1 yard river is a 3 foot river ... but it doesn't, since all those 0.34 yard rivers are now 1.02 foot rivers.

I have two things to put forth on this. First, consider that all units of measurement were created by people for their usefulness. Yards vs. feet is not such a big difference after all. Miles is getting closer to causing problems, but still these are units that people chose for a reason. Subconsciously we prefer things that are measured in less than 10, or in multiples of ten, because they are easier for our brains to keep track of, so the units we derive are going to reflect that preference.

Second, a river can't get to 3 miles without first getting past 2 miles, and there is a statistical probability of all the things that cause rivers to be diverted, blocked, or run out of water. Think of it more like a series of coin tosses. The probability of flipping all heads gets less and less the more flips you require (yes I understand each flip is independent, but considering the probability from the beginning before you start flipping). The probability that a wellspring's water will go 10 feet without a problem? Pretty high. The probability that it can go 1 mile without encountering a boulder or a beaver dam? Less likely, but still pretty good. The probability that it can go 3 miles without such a problem? Even less. The problem is that you can't just have the third mile of a river without having the first and second miles. The nature of measurement means you must always start at 1.

And anyway, here is an example that doesn't fit: adult male heights, measured in feet. You're going to have a huge frequency of 5s and 6s, and almost zero prevalence of 1s. While there is a certain probability on any given day of your life that you might be maimed and lose your legs, the chances are small and the majority of individuals make it to the 5-6 foot range. If you were to consider the final height of every person born, not just those that make it to adulthood, then you'd have to count all those short people who die in childhood and you might very well get the same distribution. But only in countries with a reasonably high infant mortality rate; in the US the distribution would still be radically skewed towards 5s and 6s.

ZenGum 12-21-2011 06:36 PM

First two paragraphs have me thinking hard....

Third one ... that is covered in the bit about this law working best for measurements scattered over several orders of magnitude, using power laws. It doesn't work for measurements around a tight bell curve.

Pete Zicato 12-22-2011 09:57 AM

Cole's Law


Quote:

Thinly sliced cabbage.

infinite monkey 12-22-2011 10:04 AM

lol @ pete.

This thread must be where all the smart people hang out. I'm going down the street. My head hurts. ;)

HungLikeJesus 12-22-2011 11:28 AM

1 Attachment(s)
I did a quick analysis of the page views of the Image of the Day forum for the last year. Here is the distribution of first digits. I think the analysis would have been better if I included a longer period of time.

All of the 8s are 800 to 899; all of the nines except two are 900 to 999. It may be too small of a distribution, because 90% of the values are between 500 and 4,000.
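
For anyone wanting to repeat this on another forum, the tally can be sketched as below. The function name is hypothetical and the sample_views numbers are made up purely to show the call; they are not real Cellar data. The Benford column uses the standard formula log10(1 + 1/d).

```python
import math
from collections import Counter

def compare_to_benford(counts):
    """Return {digit: (observed_share, benford_share)} for positive integer counts."""
    positive = [c for c in counts if c > 0]   # zero has no leading digit
    if not positive:
        raise ValueError("need at least one positive count")
    tally = Counter(int(str(c)[0]) for c in positive)
    return {
        d: (tally[d] / len(positive), math.log10(1 + 1 / d))
        for d in range(1, 10)
    }

# Made-up view counts, for illustration only.
sample_views = [812, 1540, 960, 2310, 1275, 530, 3890, 1105, 745, 1999]
for digit, (observed, expected) in compare_to_benford(sample_views).items():
    print(digit, f"observed {observed:.0%}", f"Benford {expected:.0%}")
```

With only a narrow range of values, as in the 500 to 4,000 case described above, the observed column should not be expected to track the Benford column closely.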

Lamplighter 12-22-2011 12:31 PM

Imma gonna guess...

What if views of IOD are bimodally distributed ( popular vs not-so-popular )

If so, the population of IOD's with less than 1000 views might follow Benford's Law
And, the population of IOD's with more than 1000 views might also follow Benford's Law

So if the graph were drawn with 2 cycles (1-100-1000),
there would be two peaks (bimodal) at the 1's,
each falling off and following the Benford distribution after the 1's.

Otherwise, the IOD's would have to be assumed to be equally popular,
and then the distribution doesn't follow the prediction.

HungLikeJesus 12-22-2011 12:38 PM

Maybe roughly the same number of people look at the IOTD each day. I should probably have picked a different forum to get a wider range. Or maybe most of the views are due to spiders and robots.

ZenGum 12-22-2011 05:32 PM

Note this is page views not number of posts. It is still an interesting result.

Clod, your second paragraph has me persuaded. I think. It feels like when you're wrestling with the anthropic principle, that ... wait does this really work? moment. I think it does, as of right now. Thank you.

HungLikeJesus 01-19-2012 02:04 PM

Quote:

Originally Posted by HungLikeJesus (Post 780792)

Hey, I just noticed that they left off consideration of the digit 0 (and the digit ţ)

Pete Zicato 01-19-2012 02:20 PM

Here they are HLJ!

Half of all people are below average.

Kaa's Law: In any sufficiently large group of people most are idiots.

ZenGum 01-19-2012 07:59 PM

Quote:

Originally Posted by HungLikeJesus (Post 789241)
Hey, I just noticed that they left off consideration of the digit 0 (and the digit ţ)

No number starts with a zero.

Or else they all do. 0001, 002, etc. We take the first significant figure.

What about 0.005, you ask?

5 x 10^-3
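
Taking the first significant figure can be sketched like this, assuming plain Python numbers (a ZIP code kept as a string would need its own handling); the function name is hypothetical:

```python
def first_significant_digit(x):
    """Leading nonzero digit of x, read off its scientific-notation form."""
    if x == 0:
        raise ValueError("zero has no leading significant digit")
    # Scientific notation puts the first significant digit first,
    # e.g. 0.005 formats as 5.000000000000000e-03.
    return int(f"{abs(x):.15e}"[0])

print(first_significant_digit(0.005))        # 5
print(first_significant_digit(int("0001")))  # 1
print(first_significant_digit(1999))         # 1
```

Leading zeros never count, whether they sit in front of an integer or after a decimal point.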

HungLikeJesus 01-19-2012 08:03 PM

Well, they mentioned addresses, and sometimes those start with 0. And some Zip codes in the US start with 0.

ZenGum 01-19-2012 08:06 PM

You need to convert your zip codes to metric and write them in scientific notation, then. :D

Beest 01-25-2012 07:32 AM

Dunning-Kruger Effect

Quote:

The Dunning–Kruger effect is a cognitive bias in which the unskilled suffer from illusory superiority, mistakenly rating their ability much higher than average, while the highly skilled underrate their own abilities. This bias is attributed to a metacognitive inability of the unskilled to recognize their mistakes.[1]

Kruger and Dunning proposed that, for a given skill, incompetent people will:
  1. tend to overestimate their own level of skill;
  2. fail to recognize genuine skill in others;
  3. fail to recognize the extremity of their inadequacy;
  4. recognize and acknowledge their own previous lack of skill, if they can be trained to substantially improve.

Dumb people who know a little think they are experts and are certain they are right; experts who know a lot realise how much they don't know and are less sure.
Quote:

Studies on the Dunning–Kruger effect tend to focus on American test subjects. Similar studies on European subjects show marked muting of the effect[citation needed]; studies on some East Asian subjects suggest that something like the opposite of the Dunning–Kruger effect operates on self-assessment and motivation to improve:
:p:

glatt 01-25-2012 07:40 AM

So was Dunning the expert, and Kruger the imbecile, or vice versa? I want to know the story behind the naming of that one.

Clodfobble 01-25-2012 08:05 AM

Either way I think Kruger-Dunning would have sounded much better.

footfootfoot 01-25-2012 10:38 AM

Gravity.
Not just a good idea; it's the law.

plthijinx 01-25-2012 04:13 PM

boy did i work too hard yesterday on my project bid drawings. when i read this:

Quote:

It tends to be most accurate when values are distributed across multiple orders of magnitude
i saw this:
Quote:

It tends to be most accurate when values are disrupted across multiple personality disorders of magma.



i'm not going to work 18 hours in a day again for a while. i hope.

BigV 01-25-2012 08:31 PM

Quote:

Originally Posted by ZenGum (Post 789305)
No number starts with a zero.

Or else they all do. 0001, 002, etc. We take the first significant figure.

What about 0.005, you ask?

5 x 10^-3

orly? What about the product of

5 x 0 = ?

ZenGum 01-26-2012 05:42 AM

DOES NOT COMPUTE! DOES NOT COMPUTE! *head explodes*

BigV 01-26-2012 09:13 AM

heh....

Or, you could just say, "I misspoke." Please don't explode, you're far too entertaining to be spent in one burst of fireworks (pig that special you don't eat all at once...)

:)

classicman 01-26-2012 10:25 AM

If - No number starts with a zero.

Then - 5 x 0 = ? is not possible :)

tw 01-26-2012 11:35 PM

Quote:

Originally Posted by ZenGum (Post 789305)
No number starts with a zero.

In computers, all positive numbers start with zero. Negative numbers start with one.
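
This holds for fixed-width two's complement, where the leading bit is the sign bit. A small sketch, assuming an 8-bit width for readability (the helper name is made up):

```python
def twos_complement_bits(value, width=8):
    """Bit string of `value` in fixed-width two's complement."""
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {width} signed bits")
    # Masking maps negative values onto their two's complement bit pattern.
    return format(value & ((1 << width) - 1), f"0{width}b")

print(twos_complement_bits(5))    # 00000101
print(twos_complement_bits(-5))   # 11111011
print(twos_complement_bits(0))    # 00000000
```

Zero also starts with a 0 bit in this scheme, so the leading bit separates negative from non-negative rather than positive from negative.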

toranokaze 08-30-2012 08:39 PM

A woman who writes a song called irony, but has no concept of the word; now that is ironic.


All times are GMT -5.