What the lotto can teach us about probability.*

I once heard a passed-along story about a person who was put in charge of buying all the lotto tickets for a workplace lotto pool.  In case you are not aware, the way this works is that everyone puts in money (usually equal amounts) and a large number of lotto tickets are bought.  Any winnings are split among those who contributed (in equal shares if everyone contributed equally, or otherwise in proportion to the contributions).

In this particular story, the individual in charge went to the local store and purchased the tickets.  When filling out the tickets (do you remember when you had to do them by hand with pencil?) he chose combinations with consecutive values.  When he brought back tickets with combinations like 12-13-14-15-16-17, his coworkers were outraged.  “How could he possibly have chosen combinations like this, which aren’t as random as other combinations and thus have less chance to win?” they thought.  He was shunned and not allowed to participate in the lotto pool again.

Unfortunately, this would be the common reaction in our generally statistically ignorant society.  What they fail to realize is that those numbers are just as likely to come up as any other combination of 6 “random” numbers.

Allow me to prove to you that not only was this man’s method just as good as picking any 6 “random” numbers, but actually BETTER, if you can believe it, than the alternative.

As humans, we make associations, and generally those associations are good.  From a very young age, we associate the smell of food cooking as good but touching a hot stove in an attempt to sneak a taste as bad.  In this case, however, the association of a sequence of numbers with non-randomness is harmful to thinking logically about the situation.

The lotto machine doesn’t know what numbers are written on the balls.  It shoots 6 balls out of its chute, and if the numbers on those balls match your numbers, you win.  After the first number comes up, say a 7, a ball adjacent to 7 (either 6 or 8) is just as likely to appear next as any other remaining number, and the same holds for every number after that.

Look at it another way. Let’s replace the number you choose with a color, and then color one of the balls with each color.

Thus, instead of picking between numbers 1-10 for a lotto, let’s say you have to choose between: Red, Yellow, Blue, Orange, White, Green, Pink, Black, Purple, and Brown.

Would it be just as likely that Orange, Pink, and Brown came up as Red, Yellow, and Blue?  The answer is yes.  The difference here is that our brain doesn’t see an irrelevant ordering of the possible outcomes; it just sees a list of 10 colors.
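To make the color argument concrete, here is a small sketch that lists every possible draw. It assumes 3 balls are drawn without replacement from the 10 colors above and that order doesn't matter; the names `COLORS`, `all_draws`, and `chance_of` are made up for this illustration.

```python
from fractions import Fraction
from itertools import combinations

COLORS = ["Red", "Yellow", "Blue", "Orange", "White",
          "Green", "Pink", "Black", "Purple", "Brown"]

# Assumption: 3 balls are drawn, no ball can come up twice, order is ignored.
all_draws = list(combinations(COLORS, 3))

def chance_of(pick):
    """Probability that the draw matches `pick` exactly (order ignored)."""
    matches = [draw for draw in all_draws if set(draw) == set(pick)]
    return Fraction(len(matches), len(all_draws))

print(len(all_draws))                           # number of possible draws
print(chance_of(["Red", "Yellow", "Blue"]))     # the "in order" looking pick
print(chance_of(["Orange", "Pink", "Brown"]))   # the "random" looking pick
```

Under those assumptions there are 120 possible draws, and both picks come out to 1/120.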

So we have established that the odds are the same.  But didn’t I claim that the man’s method was actually better than what most people do?  Here goes.

Random numbers chosen by humans aren’t ever truly random.  If you don’t believe me, try the following.  Write down 8 random numbers from 1-100.  It’s okay…I’ll be waiting for you to scroll down when you are done.

Got them? Good.  Now, look through them and determine how many are odd.  We would expect 4 to be even and 4 to be odd, but is that the case?  Unlikely.  Also, given that there are 25 prime numbers from 1-100, we would expect 1/4th, or 2 out of your 8 numbers, to be prime.  I’ll bet you chose more than 2 primes.  For reference, the primes up to 100 are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97.
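If you would rather check the prime count than take my word for it, here is a quick sketch using a basic sieve. The function name `primes_up_to` is just something picked for this example.

```python
def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0] = False
    if n >= 1:
        is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Cross off every multiple of i, starting at i*i.
            for multiple in range(i * i, n + 1, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]

primes = primes_up_to(100)
picks = 8

print(len(primes))                 # how many primes from 1 to 100
print(picks * len(primes) / 100)   # expected primes among 8 truly random picks
```

This gives 25 primes and an expected count of 2.0, matching the numbers above.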

For more on why this is the case, check out this cool site: http://scienceblogs.com/cognitivedaily/2007/02/is_17_the_most_random_number.php

Anyway, what we have determined is that people are more likely to choose a combination like 7, 11, 25, 31, 36, 41 when choosing “random” numbers, than a combination like 1, 2, 3, 4, 5, 6.  Some would say 1, 2, 3, 4, 5, 6 is the type of combination of numbers only an idiot would choose.

Unfortunately, what people really think…watch me!

In reality, however, keep in mind that if more than one person holds a winning ticket with all the numbers matched, the prize is split among all the winners.  If every combination has the same odds of winning but UNequal chances of being chosen by humans, and thus a different chance of being split, then based on a comparison of expected values it is actually the BETTER choice to pick number combinations that other people would not pick.
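Here is a rough sketch of that expected value comparison. Everything numeric in it is hypothetical: it assumes a 6-of-49 lotto, a made-up jackpot, and a fixed, known number of other tickets sharing your combination (in real life that number is random and unknown). The function `expected_payout` is invented for this illustration.

```python
from math import comb

def expected_payout(jackpot, p_win, others_with_same_combo):
    """Expected jackpot winnings from one ticket when the prize is split evenly."""
    return p_win * jackpot / (1 + others_with_same_combo)

# Assumptions for illustration only.
P_WIN = 1 / comb(49, 6)    # every combination is equally likely to be drawn
JACKPOT = 10_000_000       # hypothetical prize

popular = expected_payout(JACKPOT, P_WIN, others_with_same_combo=9)
unpopular = expected_payout(JACKPOT, P_WIN, others_with_same_combo=0)

print(f"popular combination:   {popular:.4f}")
print(f"unpopular combination: {unpopular:.4f}")
```

The chance of winning is identical in both cases; only the size of your slice changes, which is the whole argument.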

The man was 100% in the right, but in the minority of those who realized it.

Think about that the next time you are choosing those 6 magic digits that stand between you and a life of luxury.

* All of this is a moot point, however, when the truly wise realize the expected value of the lotto yields a payoff percentage of only about 61% and that the best choice one can make, statistically, is to not play the lotto at all.

My most recent Scrabble results, now based on a sample size of 14 games.  Results going strong. If you are not sure what this is all about, check out the first post in this blog.

The Statistics of Snow Days

The other day, one of my colleagues sent out an email with the following message:

“Ray is better at statistics than I am, but it seems that out of the 19 school days so far for the month of January, our last full/regular day was back on Jan 14th.  Due to weather and exams, we  have had only 8 normal days of school this month.

Yikes!”

It made me wonder, “what is the likelihood of having a January like this?”  So as I sit here, home, on another snow day, I am going to find out.

First, let’s revisit the month:

Jan 3: Full Day

Jan 4: Full Day

Jan 5: Full Day

Jan 6: Full Day

Jan 7: Snow Day

Jan 10: Full Day

Jan 11: Full Day

Jan 12: Snow Day

Jan 13: 90 Minute Delay

Jan 14: Full Day

Jan 18: Snow Day

Jan 19: 90 Minute Delay + Half Day: Exams

Jan 20: Half Day: Exams

Jan 21: Snow Day

Jan 22: Half Day: Exams

Jan 24: Half Day: Exams

Jan 25: Half Day: Exams

Jan 26: Early Dismissal

Jan 27: Snow Day

Jan 28: Half Day: PD

Jan 31: Full Day

Second, we need to know the probability of having no school, an early dismissal, or a 90 minute delay on any given January day in CT.  Unfortunately, I couldn’t find any such data from any decent sources, so our probabilities will have to be hypothetical estimates.

Based on previous years, I believe a good estimate is that for every 16 days of school in January we should expect about 2 90 minute delays, 1 school cancellation, and 1 early dismissal.  Thus:

P(90 minute delay) = 2/16 = 1/8 = 0.125

P(School Cancellation) = 1/16 = 0.0625

P(Early Dismissal) = 1/16 = 0.0625

P(Full Day) = 12/16 = 3/4 = 0.75

Now, even though they were half days, the exam days and the PD day were planned, so for purposes of rating the frequency of snow, we will treat those days as full days (with the exception of the 90 minute delay exam day).  That gives us the following tally for the month:

2 90 minute delays

5 school cancellations from snow

1 early dismissal

12 full days

Based on our probabilities above, however, we would have expected January to look like:

0.125*20 = 2.5 90 minute delays

0.0625*20 = 1.25 school cancellations

0.0625*20 = 1.25 early dismissals

0.75*20 = 15 full days.

On first inspection, although the number of 90 minute delays and early dismissals is actually less than what we might expect for a typical January, the number of snow days is significantly larger.

Now for the actual statistical jargon. For those not interested, jump down to below the line.

Performing a Chi-Squared Test for Univariate Categorical Data with:

Ho: January is a typical month

versus

Ha: At least one of the ways in which the day can be disrupted is significantly different than what is expected for January

we have a Chi-Squared Test Statistic Value of ((2-2.5)^2)/2.5 + ((5-1.25)^2)/1.25 + ((1-1.25)^2)/1.25 + ((12-15)^2)/15 = 0.1 + 11.25 + 0.05 + 0.6 = 12 with 3 degrees of freedom.  This corresponds to a P-value of between .005 and .01.
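For anyone who wants to reproduce the test statistic, here is a minimal sketch using the observed counts and estimated probabilities from above. The helper `chi_sq_upper_tail_3df` is something written for this post, not a library function; it uses the closed form of the chi-squared upper tail that only applies when there are exactly 3 degrees of freedom.

```python
import math

observed = {"90 minute delay": 2, "cancellation": 5,
            "early dismissal": 1, "full day": 12}
probability = {"90 minute delay": 0.125, "cancellation": 0.0625,
               "early dismissal": 0.0625, "full day": 0.75}

n_days = sum(observed.values())
expected = {kind: p * n_days for kind, p in probability.items()}

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

def chi_sq_upper_tail_3df(x):
    """P(chi-squared with 3 degrees of freedom >= x), closed form for 3 df only."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi_sq_upper_tail_3df(chi_sq)
print(round(chi_sq, 4), round(p_value, 4))
```

The P-value it prints should land between .005 and .01, in line with what a chi-squared table gives.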

The key feature, of course, is the number of full day school cancellations (5 versus the expected 1.25).  Considering a binomial situation where every day is considered either a “success” (no school) or a “failure” (some part of a school day), the probability of exactly this many snow days is:

P(X = 5 Snow Days) = 20!/(5!(15!))*(0.0625^5)*(0.9375^15) = .0056, and adding the much smaller chances of 6 or more snow days gives P(X > or = 5 Snow Days) = .0067
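Here is a short sketch of the binomial calculation, using the same 20 days and 0.0625 chance of a cancellation per day. The function names `binom_pmf` and `binom_at_least` are made up for this example. It computes both the chance of exactly 5 snow days and the chance of 5 or more, since the two are easy to mix up.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n_days, p_snow_day = 20, 0.0625

exactly_five = binom_pmf(5, n_days, p_snow_day)
five_or_more = binom_at_least(5, n_days, p_snow_day)

print(round(exactly_five, 4))
print(round(five_or_more, 4))
```

Exactly 5 works out to about 0.0056, and 5 or more to about 0.0067. Either way it is well under 1%.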

____________________________________________________________________

Therefore, we can say with AT LEAST 99% certainty, that this is a statistically significantly atypical January.

Furthermore, the likelihood of seeing this many snow days in January is only about 6 or 7 out of 1,000, or a little over half of a percent.

The Best Calculus Formula Sheet Ever!

Some of you may be familiar with the famous mathematical surface, the Mobius Strip.  For those who are unaware, the Mobius Strip is a shape made from a flat strip, like a strip of paper, that is given a half twist and connected at its ends so that the front side of the original connects to the back side of the original and vice versa.  Because a single line drawn along the surface will cover both “sides” of the original strip before connecting back to itself, this is considered a single-sided object.

Here’s a video for the visual learners out there.

Now let’s say the following happens (as it did in the story a fellow math teacher told me, which prompted this blog entry): you are preparing for a Calculus (or any other) exam and your teacher says you can use all the notes/formulas that you can fit on one side of one half of a piece of paper.

Want to get to use twice as many notes as everyone else?  Follow these simple steps.

Step 1:

Cut an 8.5 x 11 inch piece of paper the long way.

Step 2:

Put your notes on both sides of this strip.

Step 3:

Twist to make a Mobius Formula Sheet.

Step 4:

Print out information from the net about Mobius Strips just in case your teacher doesn’t believe you that this is a one sided formula sheet.

I know that this is supposed to be a blog for my own thoughts, but this was just too awesome not to include here.

10/10 on Entry #1 - BRAVO

Is the macro that you used similar to the macro that you wrote and modified for my differentiation tool?

How often do you and Karen play Scrabble? Define 'rabid' that is...

What is your prediction for color in say, one month? One year?

The macro is similar. The one main difference is that it has a predefined number of gradients.

We average one full game every three days.  Sad, I know, but it is super easy to play a game over the iPhone, taking a turn whenever you have a spare moment.  Is that rabid…I don’t know…I lack a degree in “rabidology”.

My predictions are as follows:

The darkest colors on the diagram showing frequency of tiles will focus most on the double and triple words since we fight tooth and nail for these.

The darkest colors on the diagram that counts the values of the tiles played will be highest on double and triple letters.  Because we don’t often leave a word multiplier within a square or two of word branches, and it is hard to make long words with J, Q, Z, and X, we typically end up throwing x’s on a triple letter space right next to an o (making ox) or things like that.

Finally, I think the corners beyond the triple word scores will be barren deserts of nothingness.

The Gamble

So, two days ago in class a student told me, with 100% certainty, that we would be in school the following day.  I tried to convince him that the predictions called for a large amount of snow, and that it was quite doubtful we would have school.  We agreed on a friendly wager.

Yesterday various parts of CT saw anywhere between 21 and 30 inches of accumulation.

Here are the spoils for winning the bet:

The Mathematical and Artistic Beauty of Scrabble

Ahh…First entry…Here goes.

(For those who don’t want to get into any pure mathematics…skip to the good stuff…the paragraph before the first video.)

Consider the function f(x) = x^2 + 0.25.

Now consider a situation where you put a number in the function and then calculate the function value.  Now you take that function value, put it into the function and calculate a new function value.  Repeat doing this.

f(1) = 1.25

f(1.25) = 1.8125

f(1.8125) = 3.53515625

… and so on…

Continue to do this forever unless one of the following things stops you:

a.) You hit a value that keeps returning itself as a function value.

ex.) f(0.5) = 0.5…so every time you put 0.5 into the function, it gives itself back.

b.) You reach a point where a repeating cycle of numbers occurs.

c.) Your values go asymptotic to a certain value.

d.) Your values diverge off to infinity.

Now…let’s say you consider a number line from 1 to 100. Ignore those values that don’t go off to infinity.  Every other possible starting value (those that go off to infinity) gets a different color depending on how many times it must be put into the function before growing past a certain point, say 1,000,000,000.  Perhaps those that diverge more slowly get a lighter color along the spectrum, a yellow perhaps.  Those that diverge quite quickly get a darker color, say brown.
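As a rough sketch of the counting idea, the snippet below repeatedly feeds a starting value back into f(x) = x^2 + 0.25 and counts how many steps it takes to pass 1,000,000,000. The function name `escape_time` and the `max_steps` cutoff are choices made for this illustration; the cutoff is only there so that values which never escape don't loop forever.

```python
def f(x):
    return x * x + 0.25

def escape_time(x0, threshold=1_000_000_000, max_steps=1000):
    """Count how many times x0 must go through f before exceeding threshold.

    Returns None if the value has not escaped after max_steps applications.
    """
    x = x0
    for step in range(1, max_steps + 1):
        x = f(x)
        if x > threshold:
            return step
    return None

# Slow escapers would get the lighter colors, fast escapers the darker ones.
for start in (0.5, 1, 2, 10, 100):
    print(start, escape_time(start))
```

The starting value 0.5 never escapes (it just keeps returning 0.5), while larger starting values pass the threshold in fewer and fewer steps.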

This concept of using colors to represent mathematical patterns is not new.  Benoit Mandelbrot, a famous mathematician, has a geometrical image called the Mandelbrot Set named after him.  Behold:

Another fascinating display of the color patterns that emerge within mathematics is one I stumbled upon from another blog by a very talented “Mathemusician” named Vi Hart.  Check out this video of hers investigating visual patterns in things like Prime Numbers and Pascal’s Triangle:

So now it is my turn for a little fun with colors.

A little background info about my wife and me: We LOVE Scrabble.  We are rabid.  Seriously, we trash talk the hell out of each other as we play games on our iPhones.  She is hardcore and our games are never one-sided.

I had the thought: “What would a board, similar to the square of primes that Vi drew, look like if it displayed a history of our games?”  Would beautiful patterns emerge or would it be a garbled, chaotic mess?  Here’s what I did:

Step 1- Create a Microsoft Excel Workbook with separate sheets for each of our games.  They look like this:

The top left board is where I input what actually happened in our game.  The bottom left contains a 0 in any spot where a tile was not played and a 1 where a tile was played.  The top right contains the value associated with the tile placed there.

Step 2- Create a sheet to summarize all our games.  That looks like this:

Nothing is displayed on this sheet in the top left corner.  The other two sum up the results of all the grids in the game sheets.

Step 3- Notice the color.  No, I was not nostalgic for Autumn.  The varying shades represent the magnitude of that spot: either the number of times a tile was played there (bottom left) or the combined value of the tiles played there (top right).  A macro handles the coloring and can be run again when additional data (games) are entered.
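For readers without Excel, the summing and shading idea can be sketched in a few lines of Python. This is not the macro I used; it is a stand-in under simple assumptions: each game is a grid of numbers of the same size (a tiny 3 x 3 grid here instead of a full board), and a cell's shade is just its total scaled into a fixed number of levels. The names `summarize` and `shade_level` are invented for this example.

```python
def summarize(games):
    """Add up equally sized grids cell by cell."""
    rows, cols = len(games[0]), len(games[0][0])
    totals = [[0] * cols for _ in range(rows)]
    for grid in games:
        for r in range(rows):
            for c in range(cols):
                totals[r][c] += grid[r][c]
    return totals

def shade_level(value, max_value, levels=5):
    """Map a cell total to a shade from 0 (lightest) to levels - 1 (darkest)."""
    if max_value == 0:
        return 0
    return round(value / max_value * (levels - 1))

# Toy data: 1 where a tile was played, 0 where it was not.
game_1 = [[0, 1, 0], [1, 1, 1], [0, 0, 0]]
game_2 = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]

totals = summarize([game_1, game_2])
darkest = max(max(row) for row in totals)
shades = [[shade_level(v, darkest) for v in row] for row in totals]

print(totals)
print(shades)
```

Swapping the 0/1 grids for grids of tile values gives the second kind of summary, the combined value of tiles played in each spot.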

Right now I only have 5 games, a sample size too small to produce any statistically significant data.  As this “experiment” continues, however, I will be entering the results of more games hoping to see if any visible patterns emerge.

I’ll leave you today with a quote.

The mathematician’s patterns, like the painter’s or the poet’s, must be beautiful; the ideas, like the colours or the words, must fit together in a harmonious way. Beauty is the first test: there is no permanent place in the world for ugly mathematics.

- G. H. Hardy