Today, we talk about Model Selection and Cross-Validation, two of the most important topics in this course.
We will start with a simple thought experiment that demonstrates the core intuition behind model selection and cross-validation.
Suppose we have a bent coin, so that the probability of landing heads is no longer necessarily equal to 0.5. Let us say that we flip this bent coin 10 times and we get 4 heads.
⚫ ◯ ◯ ⚫ ◯ ◯ ◯ ⚫ ⚫ ◯ $\LARGE \left( \frac{4}{10} \right)$
In the figure above, ⚫ denotes a coin toss that resulted in heads, while ◯ denotes a coin toss that resulted in tails.
If you wanted to estimate the true heads-probability of the coin, a reasonable guess would be $\frac{4}{10}$. This guess is unbiased in the sense that if you repeated the whole experiment (flipping the coin 10 times and taking the fraction of heads) many times, you would make lots of different guesses, but the average of all your guesses would be the true heads-probability.
(In general, an estimator is unbiased if its expected value equals the quantity you're trying to estimate.)
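The unbiasedness claim can be checked with a small simulation. The sketch below is illustrative only: the true heads-probability `p_true = 0.4`, the number of repetitions, and the helper name `estimate_heads_probability` are all assumptions chosen for this example, not values from the lecture.

```python
import random

def estimate_heads_probability(p_true, n_flips, rng):
    """Flip a coin with heads-probability p_true n_flips times; return the fraction of heads."""
    heads = sum(rng.random() < p_true for _ in range(n_flips))
    return heads / n_flips

rng = random.Random(0)   # fixed seed so the run is repeatable
p_true = 0.4             # assumed true heads-probability (illustration only)
n_repeats = 100_000      # how many times we repeat the 10-flip experiment

# Each guess comes from one independent 10-flip experiment.
guesses = [estimate_heads_probability(p_true, 10, rng) for _ in range(n_repeats)]
average_guess = sum(guesses) / n_repeats

print(f"individual guesses vary, e.g. {guesses[:5]}")
print(f"average of all guesses: {average_guess:.3f} (true value: {p_true})")
```

Individual guesses can only take the values 0, 0.1, ..., 1, so any single guess may be far from `p_true`; it is their average over many repetitions that should settle near it.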
Now suppose we have three bent coins. Each has a different "bend", i.e. a different heads-probability. We want to find the coin with the highest heads-probability.
Suppose we flip each coin 10 times, during which coin 1 turns up heads 4 times, coin 2 turns up heads 6 times, and coin 3 turns up heads 5 times.
**Coin 1:** ⚫ ◯ ◯ ⚫ ◯ ◯ ◯ ⚫ ⚫ ◯ $\LARGE \left( \frac{4}{10} \right)$
**Coin 2:** ⚫ ◯ ⚫ ◯ ◯ ⚫ ⚫ ⚫ ◯ ⚫ $\LARGE \left( \frac{6}{10} \right)$ ← best performer
**Coin 3:** ◯ ⚫ ◯ ⚫ ◯ ◯ ◯ ⚫ ⚫ ⚫ $\LARGE \left( \frac{5}{10} \right)$
Coin 2 did the best. If you had no other information, clearly you'd pick coin 2. But how well do you expect coin 2 to perform in the future? Would the observed performance of the best coin (6/10 in this case) be an unbiased estimate of the true heads-probability of the coin that we pick?
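One way to probe this question empirically is to repeat the "flip every coin, keep the winner" procedure many times and record two numbers each time: the winner's observed fraction of heads and the winner's true heads-probability. The sketch below does this under assumed conditions; the true probabilities in `true_probs` are hypothetical (the lecture does not specify them), as are the helper names.

```python
import random

def flip_fraction(p, n_flips, rng):
    """Fraction of heads in n_flips tosses of a coin with heads-probability p."""
    return sum(rng.random() < p for _ in range(n_flips)) / n_flips

def run_selection(true_probs, n_flips, rng):
    """Flip every coin n_flips times and pick the one with the most heads.

    Returns (observed fraction of the winner, true probability of the winner).
    """
    observed = [flip_fraction(p, n_flips, rng) for p in true_probs]
    # max() returns the first maximal element, so ties go to the lowest index.
    best = max(range(len(true_probs)), key=lambda i: observed[i])
    return observed[best], true_probs[best]

rng = random.Random(0)            # fixed seed so the run is repeatable
true_probs = [0.45, 0.50, 0.55]   # hypothetical heads-probabilities of the three coins
n_repeats = 20_000

results = [run_selection(true_probs, 10, rng) for _ in range(n_repeats)]
mean_observed = sum(obs for obs, _ in results) / n_repeats
mean_true = sum(true for _, true in results) / n_repeats

print(f"average observed fraction of the selected coin: {mean_observed:.3f}")
print(f"average true probability of the selected coin:  {mean_true:.3f}")
```

If the winner's observed fraction were an unbiased estimate of the winner's true heads-probability, the two printed averages would agree up to simulation noise.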
Before we answer that, let's perform another experiment.
Now suppose we have 1000 coins. We flip each of them 10 times. It happens that coin 249 does the best, with a total of 9 heads.
**Coin 1:** ◯ ⚫ ◯ ⚫ ⚫ ⚫ ◯ ⚫ ⚫ ◯ $\LARGE \left( \frac{6}{10} \right)$
$\vdots$
**Coin 249:** ⚫ ⚫ ⚫ ⚫ ⚫ ⚫ ⚫ ◯ ⚫ ⚫ $\LARGE \left( \frac{9}{10} \right)$ ← best performer
$\vdots$