Bayes’ Theorem: A Deep Dive into Statistics

So, I pondered various ways to introduce the concept of statistics, which I’m currently delving into alongside Data Science concepts.

Here is my current repo where I explore different statistical methods for working with datasets in Python: View Repo on GitHub.

However, I was afraid that listing out the benefits of statistics might be a bit boring. I believe that most of us with a basic high school education have encountered concepts such as mean, median, and mode at some point in our lives and understand, to a certain degree, the importance of statistics.

What I find most intriguing is not the nitty-gritty of the math itself, but rather the practical applications of statistics within the field of Machine Learning (ML). Exciting, right? So, I decided to focus on one thing that I find super interesting per post.

For this post, that’s Bayes’ Theorem, named after the mathematician Thomas Bayes. It gives us a way to update a prediction whenever new information comes in.

Bayes’ Theorem Summary:

  1. P(A): How likely A is to be true (What we thought before) – The initial probability of our idea or hypothesis.
  2. P(B): How likely B is to happen (The evidence) – The chance of seeing the evidence.
  3. P(A|B): How likely A is to be true after considering the evidence (What we think now) – Our updated belief based on the evidence.
  4. P(B|A): How likely the evidence is to show up if our idea is true (The connection) – The chances of seeing the evidence when our idea is correct.

In simple words, Bayes’ Theorem helps us adjust our initial thinking (P(A)) based on new facts. It weighs how likely those facts are if our idea is true (P(B|A)) against how likely we’d see those facts at all (P(B)), and the result is our updated belief (P(A|B)).

P(A|B) = P(B|A) × P(A) / P(B)
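As a minimal sketch, the formula can be written as a few lines of Python. The function name `bayes_posterior`, its argument names, and the numbers in the example call are my own illustrative choices, not something from a library or a real dataset.

```python
def bayes_posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B).

    prior      -- P(A), what we believed before seeing the evidence
    likelihood -- P(B|A), chance of the evidence if our idea is true
    evidence   -- P(B), overall chance of seeing the evidence
    """
    if not 0 < evidence <= 1:
        raise ValueError("P(B) must be greater than 0 and at most 1")
    return likelihood * prior / evidence


# Made-up numbers purely for illustration:
# P(A) = 0.3, P(B|A) = 0.8, P(B) = 0.5
updated_belief = bayes_posterior(prior=0.3, likelihood=0.8, evidence=0.5)
print(f"P(A|B) = {updated_belief:.2f}")
```

With these made-up inputs the belief moves from 0.30 up to 0.48, because the evidence is more likely when the idea is true (0.8) than it is overall (0.5).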

Here’s a breakdown of how it works:

Imagine a situation where there’s a rare disease, let’s call it “Unicornitis,” that affects only 1 in 10,000 people. You are worried about having this rare disease because you have some symptoms.

You decide to take a medical test for Unicornitis, and this test is known to be pretty accurate: if you have Unicornitis, it correctly detects the disease 99% of the time, and if you don’t, it correctly says you’re healthy 99% of the time.

Prior Knowledge (Before the Test): Before taking the test, your prior probability of having Unicornitis is based on the general population’s rate, which is 1 in 10,000, or 0.01%.

Test Result: After taking the test, it comes back positive, indicating that you have Unicornitis. I’m really sorry to hear that. 😔

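Now, with this new information, you might think you have a 99% chance of having Unicornitis, right? Not quite. That 99% is P(B|A), the chance of a positive test if you have the disease. What you actually want to know is P(A|B), the chance you have the disease given a positive test, and that also depends on how rare the disease is in the first place.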

Recap: In Bayes’ Theorem, we use new information (the evidence) to turn our prior knowledge into posterior knowledge, instead of just relying on the prior. The result can be very different from what intuition suggests, because the formula weighs the evidence against how common the thing was to begin with. Other evidence, such as days of exposure, age, or weight, can be folded in the same way, so as these variables change, we have a principled way to keep our estimate up to date.

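So, your chances of having Unicornitis differ a lot from that 99%. With the numbers above, a positive test puts them at only about 1%, because the disease is so rare that false positives far outnumber true positives. That’s good news. Yay!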
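To make the Unicornitis example concrete, here is a small Python sketch that plugs the numbers from the story into Bayes’ Theorem. The variable names are my own, and I am assuming “99% accurate” means the test is right 99% of the time both for people who have the disease and for people who don’t, as described above.

```python
prevalence = 1 / 10_000                 # P(A): prior chance of having Unicornitis
sensitivity = 0.99                      # P(B|A): positive test if you have it
specificity = 0.99                      # negative test if you don't have it
false_positive_rate = 1 - specificity   # P(B|not A): positive test if you don't

# P(B): overall chance of a positive test = true positives + false positives
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

# P(A|B): chance of having Unicornitis given a positive test
p_disease_given_positive = sensitivity * prevalence / p_positive

print(f"P(positive test)          = {p_positive:.6f}")
print(f"P(Unicornitis | positive) = {p_disease_given_positive:.4f}")
```

The result is just under 1%. One way to see why: out of 1,000,000 people, 100 have Unicornitis and 99 of them test positive, while about 9,999 of the 999,900 healthy people also test positive. Only 99 of the roughly 10,098 positive results belong to people who are actually sick.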

This ability to update models with new information is especially crucial in the healthcare industry when dealing with patient diagnoses and in weather forecasting.
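Updating with new information can be sketched as a loop in which each posterior becomes the prior for the next piece of evidence. The example below reuses the Unicornitis numbers and imagines repeated positive tests. The `update` helper is hypothetical, and the sketch assumes each test result is independent of the others given your true health status, which real medical tests may not satisfy.

```python
def update(prior: float, sensitivity: float = 0.99, specificity: float = 0.99) -> float:
    """Return the updated belief after one positive test result."""
    true_positive = sensitivity * prior
    false_positive = (1 - specificity) * (1 - prior)
    return true_positive / (true_positive + false_positive)


belief = 1 / 10_000      # start from the general population rate
history = [belief]       # keep every belief so we can see it change

for test_number in range(1, 4):
    belief = update(belief)   # yesterday's posterior is today's prior
    history.append(belief)
    print(f"After positive test {test_number}: {belief:.4f}")
```

Under these assumptions the belief climbs from about 1% after the first positive test to roughly 50% after the second and about 99% after the third, which shows how each new piece of evidence shifts the estimate.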

Practical Applications:

Another practical application of Bayes’ Theorem in the field of AI can be seen in predicting what you’ll watch on Netflix.

Ever noticed how Netflix suggests shows or movies you might like? That’s AI in action, and the idea behind Bayes’ Theorem is a good way to picture the magic.

Prior Knowledge: Netflix knows what shows and movies you’ve watched before and what other viewers with similar tastes have watched. It has a library of data.

New Choice: When you pick a movie or show, a recommender can apply Bayes’ Theorem to guess what else you might enjoy watching, based on your history and what people like you have watched.

Bayes’ Theorem in Action: It combines what it knows about you (the prior knowledge) with your current choice (the new information) to make predictions. That’s why you often see recommendations for shows that are similar to what you’ve watched before.
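Here is a toy sketch of that idea in Python. It is not how Netflix actually works: the genres, the probabilities, and the “space documentary” pick are all hypothetical numbers I made up to show how a prior over a viewer’s tastes could be updated after one new choice.

```python
# Hypothetical prior: how likely this viewer is to prefer each genre,
# based on their history and on viewers with similar tastes.
prior = {"sci-fi": 0.5, "comedy": 0.3, "drama": 0.2}

# Hypothetical likelihoods: P(picks a space documentary | prefers this genre)
likelihood = {"sci-fi": 0.6, "comedy": 0.1, "drama": 0.3}

# Top of the Bayes fraction for each genre: P(B|A) * P(A)
unnormalized = {genre: likelihood[genre] * prior[genre] for genre in prior}

# P(B): overall chance of seeing this pick, summed over all genres
evidence = sum(unnormalized.values())

# P(A|B): updated belief about the viewer's preferred genre
posterior = {genre: value / evidence for genre, value in unnormalized.items()}

for genre, probability in sorted(posterior.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{genre:<7} {probability:.2f}")
```

After the new pick, the belief that this viewer prefers sci-fi rises from 0.50 to about 0.77, so a recommender built this way would lean toward suggesting more sci-fi.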

If you’re interested in delving deeper into Bayes’ Theorem, you can check out this amazing video created by one of my favorite YouTube channels, 3Blue1Brown: Bayes’ Theorem Video.
