Informed Choices in Data Science
This is a blog post I wrote for a private audience while learning about Big Data and Machine Learning. Reproducing the article below:
[Circa: April 2015] There is an interesting problem circulating on the web lately. It appeared in a 5th-grade math paper at a school in Singapore. Why am I talking about a 5th-grade math problem here? Let’s find out.
Below is the problem as it appeared in the exam.
I hope you have read the problem by now. So let me ask you a question: when is Cheryl’s birthday?
I hope everyone remembers the three-door problem, famously known as the Monty Hall Problem. One door has a Ferrari behind it and the other two don’t. You have to choose a door, and if the door you chose has the car behind it, you get the car. You can read about it here.
In the problem above, we need to combine two pieces of information, the day and the month, each held by a different person: Albert has the month and Bernard has the day. We have to do this through informed choices (Bayes’ theorem) by listening to the conversation between them. We start with assumptions and, as the conversation progresses, we update our probabilities.
What do Albert and Bernard each know?
Albert: the month. In all, there are four distinct months [May, June, July, August], so Albert knows one of these.
Bernard: the day. In all, there are six distinct days [14, 15, 16, 17, 18, 19], so Bernard knows one of these.
This is the information we have at the start.
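To make the starting point concrete, here is a minimal Python sketch of it. The ten candidate dates are taken from the widely circulated statement of the puzzle (an assumption on my part, since the exam image is not reproduced in this text), and the names CANDIDATES, distinct_months and distinct_days are just illustrative.

```python
# The ten candidate dates from the widely circulated puzzle statement.
# Assumption: this matches the exam image, which is not reproduced here.
CANDIDATES = [
    ("May", 15), ("May", 16), ("May", 19),
    ("June", 17), ("June", 18),
    ("July", 14), ("July", 16),
    ("August", 14), ("August", 15), ("August", 17),
]


def distinct_months(dates):
    """Return the months in order of first appearance."""
    return list(dict.fromkeys(month for month, _ in dates))


def distinct_days(dates):
    """Return the sorted distinct day numbers."""
    return sorted({day for _, day in dates})


print(distinct_months(CANDIDATES))  # Albert was told one of these
print(distinct_days(CANDIDATES))    # Bernard was told one of these
```

Keeping the dates as (month, day) pairs means every later step is just a filter over this one list.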
Albert: I don’t know when Cheryl’s birthday is, but I know that Bernard does not know too.
In his first sentence Albert says he doesn’t know the birthday, and we agree with him: he knows only the month, and in the list of 10 dates given to us each month has more than one date, so there is no way Albert can know the birthday. He then adds, “but I know that Bernard does not know too.” Bernard could know the complete date from the day alone only if Cheryl had told him 18 or 19, because each of those days appears just once in the list (June 18 and May 19). For Albert to be certain that Bernard does not know, his month must not contain either of those days, which rules out May and June. Listening to that, Bernard updates his information and is left with July and August. Since he already knows the day, he can now work out the full date.
Possible Month: [July, August]
Possible Date: [14, 15, 16, 17]
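This first update can be sketched in a few lines of Python. As before, the ten dates come from the widely circulated puzzle statement (an assumption here), and after_albert_first_statement is a hypothetical helper name, not something from the original solution.

```python
from collections import Counter

# Assumption: the ten dates from the widely circulated puzzle statement.
CANDIDATES = [
    ("May", 15), ("May", 16), ("May", 19),
    ("June", 17), ("June", 18),
    ("July", 14), ("July", 16),
    ("August", 14), ("August", 15), ("August", 17),
]


def after_albert_first_statement(dates):
    """Drop every month that contains a day appearing only once overall."""
    day_counts = Counter(day for _, day in dates)
    # A day that occurs once would let Bernard know the date immediately,
    # so Albert's month cannot contain such a day.
    risky_months = {month for month, day in dates if day_counts[day] == 1}
    return [(month, day) for month, day in dates if month not in risky_months]


remaining = after_albert_first_statement(CANDIDATES)
print(sorted({month for month, _ in remaining}))  # ['August', 'July']
print(sorted({day for _, day in remaining}))      # [14, 15, 16, 17]
```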
Bernard: At first I didn’t know when Cheryl’s birthday is, but now I know.
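Bernard says that he did not know the exact date at first, but that he knows it now. This lets Albert (and us) update again: among the dates still possible in July and August, Bernard can be certain only if his day appears exactly once. Day 14 appears in both July and August, so it is ruled out, leaving July 16, August 15 and August 17. In the original puzzle, Albert then replies that he now knows the birthday too. Since Albert knows only the month, that is possible only if his month has a single date left. August still has two (15 and 17), while July has only one, so the birthday is 16th July.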
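The whole chain of updates can be written as three successive filters. This is only a sketch under the same assumption as before (the ten dates from the widely circulated puzzle statement), with solve as an illustrative name; it also uses Albert's final remark from the original puzzle, that he now knows the date too.

```python
from collections import Counter

# Assumption: the ten dates from the widely circulated puzzle statement.
CANDIDATES = [
    ("May", 15), ("May", 16), ("May", 19),
    ("June", 17), ("June", 18),
    ("July", 14), ("July", 16),
    ("August", 14), ("August", 15), ("August", 17),
]


def solve(dates):
    """Apply the three statements in order and return each intermediate list."""
    # Statement 1 (Albert): "I know that Bernard does not know."
    # Albert's month contains no day that is unique in the full list.
    day_counts = Counter(day for _, day in dates)
    risky_months = {month for month, day in dates if day_counts[day] == 1}
    step1 = [(m, d) for m, d in dates if m not in risky_months]

    # Statement 2 (Bernard): "Now I know."
    # Bernard's day must be unique among the dates that survived step 1.
    day_counts = Counter(day for _, day in step1)
    step2 = [(m, d) for m, d in step1 if day_counts[d] == 1]

    # Statement 3 (Albert): "Then I also know."
    # Albert's month must be unique among the dates that survived step 2.
    month_counts = Counter(month for month, _ in step2)
    step3 = [(m, d) for m, d in step2 if month_counts[m] == 1]

    return step1, step2, step3


step1, step2, step3 = solve(CANDIDATES)
print(step2)  # [('July', 16), ('August', 15), ('August', 17)]
print(step3)  # [('July', 16)]
```

Each filter only uses what the speaker could know at that point, which is why the counts are recomputed on the shrinking list rather than on the original ten dates.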
I read this problem and solution here: theguardian
I thought I would give it a different dimension. I hope you enjoyed it. Do let me know in the comments if you have any alternate thoughts.