Bayes: A complex idea in simple words
As you know, probability theory gives us a rigorous way to reason about uncertainty, provided the statistics it works with are based on accurate measurement.
Named after its inventor, the 18th-century Presbyterian minister Thomas Bayes, Bayes’ theorem is a method for calculating the validity of beliefs (hypotheses, claims, propositions) based on the best available evidence (observations, data, information). Here’s the most dumbed-down description: Initial belief plus new evidence = new and improved belief
Here’s a fuller version: The probability that a belief is true given new evidence equals the probability that the belief is true regardless of that evidence times the probability that the evidence is true given that the belief is true divided by the probability that the evidence is true regardless of whether the belief is true. Got that?
The basic mathematical formula takes this form: P(B|E) = P(B) × P(E|B) / P(E), with P standing for probability, B for belief and E for evidence. P(B) is the probability that B is true, and P(E) is the probability that E is true. P(B|E) means the probability of B if E is true, and P(E|B) is the probability of E if B is true.
If you didn't manage to grasp the definition's meaning on your first attempt, try running through it in your head one more time.
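To make the formula concrete, here is a minimal Python sketch of it. The function name `posterior` and its parameter names are made up for illustration, and the numbers in the usage line are arbitrary, not taken from the example that follows.

```python
def posterior(prior, likelihood, evidence):
    """Return P(B|E) = P(B) * P(E|B) / P(E).

    prior      -- P(B), belief before seeing the evidence
    likelihood -- P(E|B), probability of the evidence if the belief is true
    evidence   -- P(E), probability of the evidence overall
    """
    if evidence <= 0:
        raise ValueError("P(E) must be positive")
    return prior * likelihood / evidence


# Arbitrary illustrative inputs: P(B) = 0.2, P(E|B) = 0.5, P(E) = 0.25
print(round(posterior(prior=0.2, likelihood=0.5, evidence=0.25), 4))
```

The guard on `evidence` is only there because dividing by a zero P(E) has no meaning: you cannot condition on evidence that never occurs.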
Meanwhile, let's use an example to see how the theorem works. An advertisement:
| |Anti-virus 1|Anti-virus 2|Anti-virus 3|
|Overall malware detection according to Virus Bulletin (May 1998 - December 2009)|75%|97%|94%|
And the choice looks to be pretty obvious, right?
Let's alter the text from the post we've quoted above:
Assume that, before running any scan, we estimate the probability that our system is infected to be 1%.
Let’s say we also evaluate the reliability of our anti-virus to be around 99%. Question: if the anti-virus reports that the system is infected, how likely is it to be really infected?
Now Bayes’ theorem displays its power. Most people assume the answer is 99%, or close to it. That’s how reliable the scan is, right? But the correct answer, yielded by Bayes’ theorem, is only 50%.
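To solve for P(B|E), you plug the data into the right side of Bayes’ equation. P(B), the probability that your system was infected prior to getting scanned, is 1%, or .01. P(E|B), the probability that an infection is detected if the system really is infected, is 99%, or .99. Multiplying P(B) by P(E|B) gives .0099, the probability of a true positive.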
What about the denominator, P(E)? Here is where things get tricky. P(E) is the probability of testing positive whether or not your system is infected. In other words, it includes false positives as well as true positives.
To calculate the probability of a false positive, you multiply the rate of false positives, which is 1%, or .01, by the share of systems that haven't been infected, .99. The total comes to .0099. Yes, your terrific, 99%-accurate test yields as many false positives as it does true positives.
Let’s finish the calculation. To get P(E), add true and false positives for a total of .0198. Dividing the true positives, .0099, by .0198 gives .5. So once again, P(B|E), the probability that your system is really infected, is 50%.
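The arithmetic above can be checked with a few lines of Python. This is only a sketch: the variable names are invented here, and it assumes that "99% reliable" means both a 99% detection rate and a 1% false-positive rate, as the calculation in the text implies.

```python
prior_infected = 0.01        # P(B): system infected before any scan
detection_rate = 0.99        # P(E|B): alert raised when really infected (assumed)
false_positive_rate = 0.01   # P(E|not B): alert raised on a clean system

# True positives: infected AND flagged.
true_positives = prior_infected * detection_rate
# False positives: clean AND flagged.
false_positives = (1 - prior_infected) * false_positive_rate
# P(E): any alert, whether justified or not.
p_alert = true_positives + false_positives
# P(B|E): really infected, given that the scan raised an alert.
p_infected_given_alert = true_positives / p_alert

print(round(true_positives, 4), round(false_positives, 4),
      round(p_alert, 4), round(p_infected_given_alert, 4))
# prints: 0.0099 0.0099 0.0198 0.5
```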
Let's put it in simpler terms. If you scanned your system only once and the scan raised an alert, the chance that your computer is really infected is 50-50. That may sound weird, but it is correct: infections are rare to begin with, so false alarms on clean systems are just as common as genuine detections. But everything changes if you run the scan again.
If you go through the test again, you can reduce your uncertainty enormously because the probability of your system being infected, P(B), is now 50% rather than 1%. If the second scan also comes up positive, Bayes’ theorem tells you that the infection probability is now 99%, or .99. As this example shows, iterating Bayes’ theorem can yield extremely precise information.
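The repeated update can be sketched in Python as well. The helper `update` is hypothetical, and it assumes that each scan is an independent test with the same 99% detection rate and 1% false-positive rate. That is a simplification, since two runs of the same anti-virus on the same system are unlikely to be fully independent.

```python
def update(prior, detection_rate=0.99, false_positive_rate=0.01):
    """Return the infection probability after one more positive scan."""
    true_pos = prior * detection_rate
    false_pos = (1 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)


belief = 0.01  # prior probability of infection before any scan
for scan in (1, 2):
    # Yesterday's posterior becomes today's prior.
    belief = update(belief)
    print(f"after positive scan {scan}: {belief:.2f}")
# prints:
# after positive scan 1: 0.50
# after positive scan 2: 0.99
```

The key design point is that the output of one update is fed back in as the prior of the next, which is all that "iterating Bayes’ theorem" means here.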
Conversely, when the scans come back clean, the probability of your system being infected decreases with every subsequent scan because, on the one hand, an anti-virus receives information about previously unknown malware with each subsequent update, and on the other hand, the probability that you regularly download unknown malware is luckily low.
Don't neglect your system security and run anti-virus scans regularly. Bayes’ theorem really works!