The Cellar  

Old 12-20-2011, 04:54 PM   #1
Lamplighter
Person who doesn't update the user title
 
Join Date: Jun 2010
Location: Bottom lands of the Missoula floods
Posts: 6,402
HLJ, by the same "reasoning" I had above.
The first post (or first view) in the thread must be "1",
and there might or might not be a second ("2").
..If there is a "2", there might or might not be a third ("3").
..If there is a ... and so on up to "N".

That is, two "1's" must occur (1 and 10) before there can be two "9's" as the first digit.

So the probability at any given test of the number of posts is going to be higher for "1's" than any other digit, etc.
Therefore in repeated measurements, the distribution of digits will not be equal.

With respect to the "distribution indicated by Benford's Law", my example might or might not be the same.
But as in most treatises on Statistics, "The derivation is left to the reader."
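
For reference, the distribution usually quoted for Benford's Law gives the probability of leading digit d as log10(1 + 1/d). Here is a minimal Python sketch that tabulates it; the function name benford_expected is only an illustrative choice, not something from the thread.

```python
import math

def benford_expected():
    """Return {digit: probability} for leading digits 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for digit, p in benford_expected().items():
    print(f"{digit}: {p:.3f}")
```

The shares fall from roughly 30% for a leading 1 to under 5% for a leading 9, so any comparison with the post-count argument above would be against those figures.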
Old 12-20-2011, 05:14 PM   #2
HungLikeJesus
Only looks like a disaster tourist
 
Join Date: Feb 2007
Location: above 7,000 feet
Posts: 7,208
OK - now someone just has to do the analysis.
__________________
Keep Your Bodies Off My Lawn

SteveDallas's Random Thread Picker.
Old 12-20-2011, 05:56 PM   #3
ZenGum
Doctor Wtf
 
Join Date: Oct 2007
Location: Badelaide, Baustralia
Posts: 12,861
Some things (street numbers, numbers of posts in a thread) lend themselves to a natural explanation of the preponderance of lower first digits. This is because they are built in a series: you can't have post 3 without post 2, but you can have post 2 with no post 3. So there will be more 2-post threads than 3-post threads.

Interestingly, though, it works just as well with things like river lengths and mountain heights, despite the fact that you CAN have a 3 mile long river without having a 2 mile long river. Weirder still, it holds up just as well no matter what units you measure in. Feet, meters, inches, whatever.
__________________
Shut up and hug. MoreThanPretty, Nov 5, 2008.
Just because I'm nominally polite, does not make me a pussy. Sundae Girl.
Old 12-20-2011, 06:04 PM   #4
Lamplighter
Person who doesn't update the user title
 
Join Date: Jun 2010
Location: Bottom lands of the Missoula floods
Posts: 6,402
Z, what is the difference between counting posts in a thread,
and measuring height or length of a natural object ?
It's not like counting live and dead cats in a box.

ETA: Above, I said:
"...two "1's" must occur (1 and 10) before there can be two "9's" as the first digit."

But in fact,
...eleven "1's" (as the first digit) must occur (1, 10, 11, 12, ... and 19)
before there can be two "9's" as the first digit.
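
The corrected count can be checked by brute force. The sketch below (function names are illustrative) counts upward from 1 and stops as soon as two numbers have started with 9, then reports how many started with 1 by that point.

```python
from collections import Counter

def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])

def counts_when_second_nine_appears():
    """Count upward from 1 and stop once two numbers have started with 9."""
    seen = Counter()
    n = 0
    while seen[9] < 2:
        n += 1
        seen[leading_digit(n)] += 1
    return n, seen

stopped_at, seen = counts_when_second_nine_appears()
print(stopped_at, seen[1], seen[9])  # prints: 90 11 2
```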

Sorry, but I'm just not seeing the significance of Benford's Law.
I must be misinterpreting something or other ???
.
Old 12-20-2011, 06:25 PM   #5
HungLikeJesus
Only looks like a disaster tourist
 
Join Date: Feb 2007
Location: above 7,000 feet
Posts: 7,208
One condition of Benford's Law is that
Quote:
It tends to be most accurate when values are distributed across multiple orders of magnitude
If you make a list of consecutive numbers covering a large range (e.g. from 1 to 999) and count how many start with 1, 2, 3, etc., you'll find that an equal number start with each digit (111, in this example). So wouldn't you expect that a list of measurements or values would have this same uniform distribution?

I would.
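
The 111 figure is easy to confirm with a short tally. This sketch (the helper name leading_digit_tally is made up for illustration) also shows that the even split depends on where the list stops: ending at 199 instead of 999 gives a very lopsided count.

```python
from collections import Counter

def leading_digit_tally(upper):
    """Tally the first digit of every integer from 1 to upper inclusive."""
    return Counter(int(str(n)[0]) for n in range(1, upper + 1))

full = leading_digit_tally(999)
partial = leading_digit_tally(199)
print(sorted(full.items()))     # every digit appears 111 times
print(sorted(partial.items()))  # 111 ones, 11 of every other digit
```

So the uniform result holds for a range that ends just short of a power of ten, which real measurements have no particular reason to do.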
Old 12-20-2011, 06:46 PM   #6
Lamplighter
Person who doesn't update the user title
 
Join Date: Jun 2010
Location: Bottom lands of the Missoula floods
Posts: 6,402
HLJ and Z, you guys are talking to a dummy here... or a stubborn jackass.
I still don't see the difference.

I can argue that if we were measuring "a single" river,
the probability of leading digits = 1 would be skewed,
because only a few rivers are 1 mile or 1,000 miles in length
compared to the number of rivers of 9 or 90 or 900 miles.
But that's a function of our definition of a "river" compared with a brook or stream.

I'll stop now, but I'm hoping someone will continue this discussion.
I'm willing to believe there is significance to this law... I just don't see it yet.
.
Old 12-20-2011, 09:19 PM   #7
ZenGum
Doctor Wtf
 
Join Date: Oct 2007
Location: Badelaide, Baustralia
Posts: 12,861
Hey, UT, how hard/easy would it be to analyse the Cellar threads in terms of the number of posts? Then analyse that data in terms of the first digit? We could check this law on ourselves.

Lamplighter, remember that we are only focusing on the first digit. Let's take the number of posts in a thread as an example. To keep it simple I'll pretend that threads can't have more than 999 posts, but I'll explain later how to deal with the fact that they can.

Any thread with 1, 10, 11, 12, 13, 14, ... 19, 100, 101, 102 ... etc. goes in the "starts with a 1" category.

Threads with 2, 20, 21, 22 ... 29, 200, 201 ... etc will go in the "starts with a 2" category.

We could continue this all the way to 9, and every starting digit seems equally likely.

BUT! In reality, many threads have only a single post, or just a handful. Many struggle into the teens or twenties before they die. Fewer make it into the 30s and 40s, still fewer into the 80s and 90s. This means that there will be more thread totals starting with a 1 than any other digit.

The same pattern happens when we consider the 100s and 200s and so on.

And if we want to go past 999 posts, the same pattern will apply. 1,000 to 1,999 all start with 1, and so on. It is the same pattern as before.

In a sentence: thread post counts will usually start with lower digits because threads die before they can get to the higher digits.
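
The "threads die along the way" argument can be sketched numerically. The model below is an assumption made purely for illustration: after every post the thread dies with the same fixed probability (p_die, an invented parameter), so the chance of a thread ending at exactly n posts is p_die * (1 - p_die)**(n - 1). Real threads will not follow this exactly.

```python
def leading_digit_shares(p_die=0.05, max_posts=10_000):
    """Share of threads by first digit of their final post count.

    Hypothetical model: a constant chance p_die of dying after each post,
    so P(final count = n) = p_die * (1 - p_die) ** (n - 1).
    """
    shares = {d: 0.0 for d in range(1, 10)}
    for n in range(1, max_posts + 1):
        prob = p_die * (1 - p_die) ** (n - 1)
        shares[int(str(n)[0])] += prob
    return shares

shares = leading_digit_shares()
for d, s in shares.items():
    print(f"{d}: {s:.3f}")
```

Under this assumption the shares shrink steadily from digit 1 down to digit 9, because every post count starting with a given digit is smaller than the matching count starting with the next digit up, and smaller counts are always more likely.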


Well, that is how it is for things like thread post counts. Here, they grow from one upwards without missing a step. You have to go through 1 to get to 2, and you might stop along the way, which is why there are more 1s than 2s. You have to go through the teens before you get to the 20s, and you might stop on the way, so again there are more 1s than 2s.

However, the case with things like river lengths is different, or at least it seems different to me. You can have a 2 mile river without there being a 1 mile river, so there is no risk of "stopping along the way". So the frequency of 1s and 2s in things like this is ... umm ... not explained in the way it is for thread post totals.

In fact, I cannot explain it and have never heard of a good explanation. It just is. And you'd think that changing the units of measurement - yards to feet, for example, should shift the results, since a 1 yard river is a 3 foot river ... but it doesn't, since all those 0.34 yard rivers are now 1.02 foot rivers.

It's freaking weird, now that I come to think of it.
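
The unit-change point can at least be demonstrated, if not explained. The sketch below invents a set of "lengths in yards" spread evenly on a logarithmic scale over six orders of magnitude (an assumption, not real river data), converts them to feet, and compares the first-digit shares of both lists with log10(1 + 1/d). All names are illustrative.

```python
import math

def first_digit(x):
    """Leading significant digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def digit_shares(values):
    """Fraction of values starting with each digit 1..9."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return [c / len(values) for c in counts[1:]]

# Hypothetical lengths: evenly spaced on a log scale from 1 up to 1,000,000 yards.
n = 60_000
yards = [10 ** (6 * k / n) for k in range(n)]
feet = [3 * y for y in yards]

benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
rows = zip(digit_shares(yards), digit_shares(feet), benford)
for d, (py, pf, pb) in enumerate(rows, start=1):
    print(f"{d}: yards {py:.3f}  feet {pf:.3f}  Benford {pb:.3f}")
```

This only shows that data spread evenly across whole orders of magnitude keeps the same first-digit shares when every value is multiplied by 3; it says nothing about why real rivers would be spread that way.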

Last edited by ZenGum; 12-20-2011 at 11:19 PM.
Old 12-20-2011, 10:54 PM   #8
tw
Read? I only know how to write.
 
Join Date: Jan 2001
Posts: 11,933
Nine is a most frequent digit whenever I buy gas. Nine appears more often than any other number in the price. Today, it was $3.299 per gallon. Nine gallons is a typical fillup. When they ask me how much, I say, "Give me the whole nine yards".

Whenever I buy gas, nine times out of ten, even the weather is nice. Change one letter and another nine appears.

Good weather always leaves me feeling on cloud nine. How can this be? Well, I always avoid one - the loneliest number.
Old 12-20-2011, 11:17 PM   #9
ZenGum
Doctor Wtf
 
Join Date: Oct 2007
Location: Badelaide, Baustralia
Posts: 12,861
are you ?

Remember, though, it is the first digit the law applies to.

(That makes it sound like a rude hand gesture, doesn't it?)
Old 12-21-2011, 12:24 AM   #10
classicman
barely disguised asshole, keeper of all that is holy.
 
Join Date: Nov 2007
Posts: 23,401
Nein!
__________________
"like strapping a pillow on a bull in a china shop" Bullitt
Old 12-21-2011, 08:36 AM   #11
Clodfobble
UNDER CONDITIONAL MITIGATION
 
Join Date: Mar 2004
Location: Austin, TX
Posts: 20,012
Quote:
Originally Posted by ZenGum
However, the case with things like river lengths is different, or at least it seems different to me. You can have a 2 mile river without there being a 1 mile river, so there is no risk of "stopping along the way". So the frequency of 1s and 2s in things like this is ... umm ... not explained in the way it is for thread post totals.

In fact, I cannot explain it and have never heard of a good explanation. It just is. And you'd think that changing the units of measurement - yards to feet, for example, should shift the results, since a 1 yard river is a 3 foot river ... but it doesn't, since all those 0.34 yard rivers are now 1.02 foot rivers.
I have two things to put forth on this. First, consider that all units of measurement were created by people for their usefulness. Yards vs. feet is not such a big difference after all. Miles is getting closer to causing problems, but still these are units that people chose for a reason. Subconsciously we prefer things that are measured in less than 10, or in multiples of ten, because they are easier for our brains to keep track of, so the units we derive are going to reflect that preference.

Second, a river is less likely to reach 3 miles than to stop at 2 miles, because there is some probability of all the things that cause rivers to be diverted, blocked, or run out of water. Think of it more like a series of coin tosses. The probability of flipping all heads gets less and less the more flips you require (yes, I understand each flip is independent, but I'm considering the probability from the beginning, before you start flipping). The probability that a wellspring's water will go 10 feet without a problem? Pretty high. The probability that it can go 1 mile without encountering a boulder or a beaver dam? Less likely, but still pretty good. The probability that it can go 3 miles without such a problem? Even less. The problem is that you can't just have the third mile of a river without having the first and second miles. The nature of measurement means you must always start at 1.

And anyway, here is an example that doesn't fit: adult male heights, measured in feet. You're going to have a huge frequency of 5s and 6s, and almost zero prevalence of 1s. While there is a certain probability on any given day of your life that you might be maimed and lose your legs, the chances are small and the majority of individuals make it to the 5-6 foot range. If you were to consider the final height of every person born, not just those that make it to adulthood, then you'd have to count all those short people who die in childhood and you might very well get the same distribution. But that would only happen in countries with a reasonably high infant mortality rate; in the US the distribution would still be radically skewed towards 5s and 6s.
Old 12-21-2011, 06:36 PM   #12
ZenGum
Doctor Wtf
 
Join Date: Oct 2007
Location: Badelaide, Baustralia
Posts: 12,861
First two paragraphs have me thinking hard....

Third one ... that is covered in the bit about this law working best for measurements scattered over several orders of magnitude, using power laws. It doesn't work for measurements around a tight bell curve.
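
The tight-bell-curve case is easy to put numbers on. The sketch below assumes, purely for illustration, that adult male height is normally distributed with a mean of 5.75 feet and a standard deviation of 0.25 feet (made-up round figures, not survey data) and works out what share of heights start with each digit.

```python
from statistics import NormalDist

# Assumed, illustrative figures only: height ~ Normal(mean 5.75 ft, sd 0.25 ft).
heights = NormalDist(mu=5.75, sigma=0.25)

def share_with_leading_digit(d):
    """Probability that a height in feet falls in [d, d + 1), i.e. starts with digit d."""
    return heights.cdf(d + 1) - heights.cdf(d)

for d in range(1, 10):
    print(f"{d}: {share_with_leading_digit(d):.4f}")
```

With the values packed into well under one order of magnitude, practically everything starts with a 5 or a 6, which is nothing like the gently falling shares Benford's Law describes.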
Old 12-22-2011, 09:57 AM   #13
Pete Zicato
Turns out my CRS is a symptom of TMB.
 
Join Date: Jan 2010
Location: Chicago suburbs
Posts: 2,916
Cole's Law


Quote:
Thinly sliced cabbage.
__________________


Talk nerdy to me.
Old 12-22-2011, 10:04 AM   #14
infinite monkey
Person who doesn't update the user title
 
Join Date: Mar 2011
Posts: 13,002
lol @ pete.

This thread must be where all the smart people hang out. I'm going down the street. My head hurts.
Old 12-22-2011, 11:28 AM   #15
HungLikeJesus
Only looks like a disaster tourist
 
Join Date: Feb 2007
Location: above 7,000 feet
Posts: 7,208
I did a quick analysis of the page views of the Image of the Day forum for the last year. Here is the distribution of first digits. I think the analysis would have been better if I had included a longer period of time.

All of the 8s are 800 to 899; all of the 9s except two are 900 to 999. It may be too narrow a distribution, because 90% of the values are between 500 and 4,000.
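
For anyone who wants to repeat this kind of check, here is a rough sketch of the tally. The function name and the sample list are hypothetical; the numbers are placeholders, not the actual page views from the forum.

```python
import math
from collections import Counter

def first_digit_report(counts):
    """Compare observed leading-digit shares of positive integers with Benford's Law.

    Returns {digit: (observed_share, benford_share)} for digits 1..9.
    """
    tally = Counter(int(str(c)[0]) for c in counts if c > 0)
    total = sum(tally.values())
    if total == 0:
        raise ValueError("need at least one positive count")
    return {
        d: (tally[d] / total, math.log10(1 + 1 / d))
        for d in range(1, 10)
    }

# Placeholder numbers for illustration only; NOT the real page-view data.
sample_views = [512, 1340, 875, 2210, 960, 1875, 3020, 640, 1105, 999]
for d, (observed, expected) in first_digit_report(sample_views).items():
    print(f"{d}: observed {observed:.2f}  Benford {expected:.2f}")
```

With values confined mostly to 500 through 4,000, less than one order of magnitude, a poor match to the Benford column would not be surprising.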
[Attached image not preserved]