Bayes' Theorem underlies a lot of modern thinking about data-driven marketing analytics. The purpose of this post is to build a quick understanding of a few basic concepts around this topic.
When we talk about Bayes' Theorem, we are talking about the probability of something occurring given the evidence. We go into a situation with a prior understanding of the way things are, and we use evidence to update our beliefs based on what reality has shown us.
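For reference, the theorem can be written out as below. The symbols are notation chosen for this sketch rather than anything fixed by the post: H is the hypothesis (for instance, "I am sick") and E is the evidence we observe.

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)}
```

Here P(H) is the prior belief, P(H | E) is the updated belief after seeing the evidence, and the denominator adds up every way the evidence could have shown up, whether or not the hypothesis is true.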
For example, let's say there is a 1% chance of being sick on a given day. My dog predicts sickness with 85% accuracy by coming up and sitting on my lap in the morning. Given these two statements, we can use basic probability theory to say that the probability that I am sick and my dog predicts it is P(Dog Predicts and Sick) = 0.01 × 0.85 = 0.0085, or 0.85%, which is our true positive outcome. We can take the complement of our facts to say that the probability that I am not sick and my dog sits on my lap anyway is P(Dog Predicts and Not Sick) = 0.198, or 19.8%, which is our false positive outcome. Note that 0.198 = 0.99 × 0.20, so this figure corresponds to the dog sitting on my lap on about 20% of the mornings when I am healthy.
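The two figures above can be reproduced with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the 20% lap-sitting rate on healthy days is an assumption, chosen because it is the rate that yields the 0.198 figure (0.99 × 0.20).

```python
# Numbers from the dog example.
p_sick = 0.01                # prior: chance of being sick on a given day
p_lap_given_sick = 0.85      # dog sits on my lap when I am sick
p_lap_given_healthy = 0.20   # assumption: the rate implied by 0.198 / 0.99

p_healthy = 1 - p_sick

# Joint probabilities: both things happen on the same day.
true_positive = p_sick * p_lap_given_sick          # sick AND dog on lap
false_positive = p_healthy * p_lap_given_healthy   # healthy AND dog on lap

print(f"true positive:  {true_positive:.4f}")   # 0.0085
print(f"false positive: {false_positive:.4f}")  # 0.1980
```

Both values are probabilities of two things happening together, which is why each one multiplies a base rate (sick or healthy) by how often the dog reacts in that case.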
So now let us say I don't yet have any symptoms of being sick, but my dog, being the especially attentive individual he is, decides to sit on my lap in the morning and take care of me. What is the probability that I am sick that day? We divide the probability of the true positive outcome by the sum of the probabilities of all positive outcomes, true and false. This gives us P(Sick | Dog Predicts) = 0.0085 / (0.0085 + 0.198), which is 0.041, or 4.1%.
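The division above generalizes to any prior and any pair of rates. A minimal sketch follows, where the function name `posterior` and its parameter names are hypothetical, and the 0.20 false alarm rate is the value implied by the 0.198 figure rather than something stated outright:

```python
def posterior(prior, likelihood, false_alarm_rate):
    """Return P(hypothesis | evidence) using Bayes' Theorem.

    prior            -- P(hypothesis) before seeing the evidence
    likelihood       -- P(evidence | hypothesis is true)
    false_alarm_rate -- P(evidence | hypothesis is false)
    """
    true_positive = prior * likelihood
    false_positive = (1 - prior) * false_alarm_rate
    # True positives as a share of all positives.
    return true_positive / (true_positive + false_positive)


p_sick_given_lap = posterior(prior=0.01, likelihood=0.85, false_alarm_rate=0.20)
print(f"{p_sick_given_lap:.3f}")  # 0.041
```

Keeping the three inputs as separate parameters makes it easy to see how the answer moves if the dog were more selective or if sickness were more common.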
This does not seem very high. But remember what we said about prior understandings of the way things are. Our prior probability of being sick on a given day was 1%. With our evidence, we have updated that probability to 4.1%. There is still a large probability that we are not sick and our dog is just overreacting, but our estimate of the chance of being sick has increased roughly fourfold.
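One way to make the size of the update concrete is to count days instead of multiplying probabilities. The sketch below imagines 10,000 mornings; the day count and variable names are illustrative choices, and the 20% lap-sitting rate on healthy days is again the assumption implied by the 0.198 figure.

```python
days = 10_000

sick_days = days * 1 // 100                   # 1% of days: 100
healthy_days = days - sick_days               # 9,900

lap_and_sick = sick_days * 85 // 100          # dog reacts on 85% of sick days: 85
lap_and_healthy = healthy_days * 20 // 100    # assumed 20% of healthy days: 1,980
lap_days = lap_and_sick + lap_and_healthy     # every morning the dog sits on my lap

prior = sick_days / days
posterior = lap_and_sick / lap_days

print(f"sick on {lap_and_sick} of {lap_days} lap mornings ({posterior:.1%})")
print(f"update factor: {posterior / prior:.1f}x")
```

Out of 2,065 mornings with the dog on my lap, only 85 are sick days, which is why the posterior stays small even though it is about four times the prior.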