Hi Everyone,
Well, I got myself into this by offering to explain why my probability math is correct in response to Fundy's interpretation of annual bleed risk over on the "Valve Selection" forum area in the thread titled
"Anyone wish they had chose the other option?" :
http://www.valvereplacement.org/forums/showthread.php?41059-Anyone-wish-they-had-chose-the-other-option/page3
First of all, I feel I owe a bit of introduction to support my claim that I do actually know at least a bit of what I'm talking about. I am NOT a statistician or math major, but I am an engineer with both an MS and a BS degree from MIT. I did help teach some courses there during my senior and graduate years, but that was many moons ago. I do still work as a research scientist, and interact with math majors and other scientists (yes, I even know some NASA rocket scientists) on a frequent basis. So, while I think I still remember how to explain things reasonably clearly, if I come across as pedantic I assure you it is not intentional, and I'm only trying to help.
Because so many of the medical papers you will read involve statistical analysis of data and present probabilities of risk in the conclusions, it is easy to misinterpret these papers. Sometimes even the authors are wrong, although this is not usually the case if you stick to peer-reviewed journal articles.
Because this is a topic so near and dear to our hearts (pun intended), I am willing to risk public humiliation and ridicule trying to explain some of the key points here.
To begin, I want to follow up on the example that Fundy gave in the other thread. He considered a woman planning to have 4 children, where each child has a 25% risk of inheriting a disease, and concluded that there was a 100% probability that 1 of the 4 children would have the disease. In his words:
"In the 25% chance of a baby with disease. If the woman said she expected to have 4 children and another woman says 8 children in the 25% chance per child scenario. Each child is a 25% chance no matter how many children are born.
But in the first case 4 x 25% would indicate that after four children there is a 100% statistical probability of having one child with the disease. Meaning she should expect to have 3 disease free and one with the disease. "
My response to that was:
"If each child has a 25% chance (0.25 probability) of having a disease. Then the probability of each child NOT having the disease is 0.75
If you expect to have 4 children then the probability of having 4 children ALL without the disease is 0.75*0.75*0.75*0.75 = 0.316 or about 31.6% probability that all 4 children will be disease free, or about 68.4% probability that AT LEAST one of the children will have the disease, never 100%."
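For anyone who would rather check this arithmetic in code than by hand, here is a minimal Python sketch of the same calculation. The function name and the choice of 4 and 8 children (the two family sizes in Fundy's example) are just my own illustration, and it assumes each child's outcome is independent of the others.

```python
def prob_at_least_one(p_disease, n_children):
    """Probability that at least one of n_children has the disease,
    assuming each child independently has probability p_disease."""
    # Probability that every child is disease free
    p_all_free = (1.0 - p_disease) ** n_children
    # "At least one" is the complement of "none"
    return 1.0 - p_all_free


for n in (4, 8):
    p = prob_at_least_one(0.25, n)
    print(f"{n} children: P(at least one with disease) = {p:.4f}")
```

The result gets closer to 1 as the number of children goes up, but it never reaches 100%. Multiplying 4 x 25% gives the average number of affected children you would expect (1), which is a different quantity from a probability.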
By way of proof for this case, I have prepared an Excel spreadsheet that shows each and every possibility in this scenario, which you can download from this link:
http://www.wstco.com/ValveReplacementInfo/4KidsWith25PercentChanceEach.xls
To represent the 25% chance of disease for each child, I assigned 4 possible states for each child; A, B, C, and D, where "A","B" & "C" represent a disease-free outcome for that child, and "D" represents the child having the disease. Thus, there is a 25% probability for each individual child that he/she will have the disease.
The spreadsheet presents all possible permutations/outcomes of 4 children having 4 possible states each. There are 256 permutations, or possible outcomes for this proposed situation (4 to the power of 4). I color coded the spreadsheet so each permutation will be easier to track.
Each of these 256 outcomes has a 1 in 256 chance of happening if no child is yet conceived.
If you go through all 256 possible permutations and count how many there are with NO child having state "D", you will get 81 such possible outcomes. I have made this a bit easier by adding the columns to the right of the permutation matrix, one titled "Are ALL children disease free".
If there are 81 cases out of 256 possible outcomes where ALL children (4 out of 4) are disease free, this is a probability of 81/256 = .3164 or 31.64%. Please notice that this number agrees with the number I gave in the other thread.
If you again go through all 256 possible permutations and count how many there are where AT LEAST one child had the disease (state "D") you will get 175 such outcomes out of the total possible 256 outcomes. So, 175/256 = .6836 or 68.36%. Again, this agrees with the number I presented for this case in the other thread.
The sum of all the possibilities DOES equal 1 since 0.3164 + 0.6836 = 1.00
Notice that the two possibilities above are mutually exclusive, and do cover all the situations. Either all children are disease-free or there are 1 or more children with the disease. There are no other logical options. 81 disease free outcomes plus 175 with some disease outcomes makes up the full 256 set of permutations.
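If you don't want to count rows by hand, the same brute-force tally can be sketched in a few lines of Python. This is not the spreadsheet itself, just my own small reconstruction of the idea: it uses the same four states per child ("A", "B", "C" disease free, "D" diseased), and the variable names are only for illustration.

```python
from itertools import product

STATES = "ABCD"      # A, B, C = disease free; D = has the disease
NUM_CHILDREN = 4

# Every possible outcome: one state per child, 4**4 = 256 in total
outcomes = list(product(STATES, repeat=NUM_CHILDREN))

# Count outcomes where no child is in state D, and where at least one is
all_free = sum(1 for outcome in outcomes if "D" not in outcome)
at_least_one = sum(1 for outcome in outcomes if "D" in outcome)

total = len(outcomes)
print(f"total outcomes:       {total}")
print(f"all disease free:     {all_free}  ({all_free / total:.4f})")
print(f"at least one disease: {at_least_one}  ({at_least_one / total:.4f})")
```

Because every one of the 256 outcomes is equally likely, dividing each count by 256 gives the probability directly.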
A different question, not specifically given before, is "What is the likelihood that EXACTLY ONE child will have the disease, and the other 3 will be free of it?". Counting the number of permutations that have exactly 1 child with the disease gives 108 such outcomes. So the probability of EXACTLY ONE child having the disease out of 4 is 108/256 = 0.4219 or about 42.2%. This case is NOT mutually exclusive from the case of AT LEAST ONE child having the disease, but rather it is one component of that greater aggregate possibility. Also, note that this is the probability of ANY SINGLE ONE out of the 4 being the lone unlucky child, not the probability of specifically child #1 or specifically child #2 having the disease - those would be back to the 25% probabilities we started with.
How many cases are there where EXACTLY 2 children out of 4 have the disease? There are 54 such cases, giving a probability of 54/256 = 0.2109 or about 21.1%
There are 12 cases where EXACTLY 3 children out of 4 have the disease, giving a probability of 12/256 = 0.0469 or 4.7%
There is exactly 1 case where all 4 children out of 4 have the disease, giving a probability of this outcome of 1/256 = 0.0039 or about 0.4%
Once again, the total of all the possible mutually-exclusive outcomes is:
81 none of the 4 children has disease
108 where 1 child out of 4 has disease
54 where 2 children out of 4 have the disease
12 where 3 children out of 4 have the disease
1 where all 4 children have the disease
---------------------
256 total possible outcomes
Notice that the case of AT LEAST ONE child having the disease given earlier (0.6836) DOES equal the sum of the 4 more restrictive cases where EXACTLY 1, 2, 3 or 4 children have the disease ( 0.4219 + 0.2109 + 0.0469 + 0.0039 = 0.6836). It may be confusing that I am adding these probabilities here, but adding is valid because these four cases are mutually exclusive (only one of them can be true for any given family), and together they make up the AT LEAST ONE case. They are not sequential probabilities of 4 things happening in a row, which is where you would multiply instead.
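The breakdown by exact number of affected children can also be checked in code. The sketch below is my own illustration (not part of the spreadsheet): it tallies the 256 outcomes by how many children are in state "D", and compares each count against the standard binomial counting formula, C(4, k) ways to choose which k children are affected times 3^(4-k) ways for the rest to be disease free.

```python
from collections import Counter
from itertools import product
from math import comb

STATES = "ABCD"      # A, B, C = disease free; D = has the disease
NUM_CHILDREN = 4

# Tally all 256 outcomes by the number of children in state D
tally = Counter(
    outcome.count("D") for outcome in product(STATES, repeat=NUM_CHILDREN)
)
total = sum(tally.values())

for k in range(NUM_CHILDREN + 1):
    # Closed-form count: choose which k children are affected,
    # then 3 disease-free states for each of the remaining children
    formula = comb(NUM_CHILDREN, k) * 3 ** (NUM_CHILDREN - k)
    print(f"exactly {k} affected: {tally[k]:3d} outcomes "
          f"(formula gives {formula}), probability {tally[k] / total:.4f}")

# The "exactly 1, 2, 3 or 4" cases together make up "at least one"
at_least_one = sum(tally[k] for k in range(1, NUM_CHILDREN + 1))
print(f"at least one affected: {at_least_one} of {total}")
```

The brute-force counts and the formula should agree line by line, which is the same cross-check the spreadsheet provides.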
Hopefully this helps prove that the math I used in the other thread is correct and does give the correct result. By going through each of the individual permutations, this is a brute-force way of getting the same answer, but perhaps it is easier to see that it is correct.
I'll try to monitor this thread frequently throughout the weekend to get into more explanations, or other case examples if I can. But I will be spending a lot of time up in the attic this weekend to get some air-sealing and insulation done before the colder Winter weather really hits us here.