Suppose I want to evaluate the probability, given the state of my knowledge, of a certain proposition X. I say that learning the additional fact that some random guy entirely unknown to me, John Smith, asserts that X is true should increase my assessment of the probability that X is true.
Reductio: We know from Bayes’ Theorem that the probability of a hypothesis, given certain evidence, equals the probability of the evidence given the hypothesis, times the probability of the hypothesis simply, divided by the probability of the evidence simply. In other words,
1. P(h | e) = ( P(e | h) x P(h) ) / P(e)
2. Let the hypothesis be that X is true, and the evidence that John Smith says that X is true. Let us further assume that Smith’s assertion does not change the probability of the hypothesis. In other words,
3. P(h | e) = P(h)
4. From 1 & 3, P(h) = ( P(e | h) x P(h) ) / P(e)
5. Dividing both sides by P(h), we get
6. 1 = P(e | h) / P(e)
7. So, P(e) = P(e | h). That is, the probability of the evidence equals the probability of the evidence given the hypothesis.
9. From 2 & 7, the probability of John Smith saying that X is true equals the probability of Smith saying that X is true given that X is true. In other words, Smith is equally likely to assert X whether or not X is true.
10. Since we know nothing about Smith, we must judge him evidentially as a random person, and the same about the proposition since its content was unstated.
11. Thus, on average, a given random person has the same probability of asserting a given random fact regardless of whether it is true or false.
12. Now, with regard to observable facts, no one holds (11). For instance, no one would say that the probability of a random person’s asserting “the sky is blue” is independent of the sky’s actually being blue.
13. So in order for (11) to be true on average, the class of unobservable facts must have a negative correlation between truth and assertion. First, this is highly dubious in itself. Second, it would not be sufficient even if true, simply because most things people say are about observable facts.
14. Therefore, the second assumption in (2) is false.
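Steps 1 through 7 can be checked numerically. The sketch below is only an illustration: the function name `posterior` and the probabilities 0.3, 0.2 and 0.4 are assumptions chosen for the example, not values from the argument. It shows that the probability of h is left unchanged exactly when Smith is equally likely to assert X whether or not X is true, and rises when he is more likely to assert it if it is true.

```python
from fractions import Fraction

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem: P(h | e) from P(h), P(e | h) and P(e | not h)."""
    # Total probability of the evidence, P(e).
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

prior = Fraction(3, 10)  # illustrative P(h), an assumption for this sketch

# Smith asserts X equally often whether or not X is true: no update.
unchanged = posterior(prior, Fraction(1, 5), Fraction(1, 5))

# Smith asserts X more often when X is true: the probability rises.
raised = posterior(prior, Fraction(2, 5), Fraction(1, 5))

print(unchanged, raised)
```

With these made-up inputs the first case returns the prior itself, and the second returns something larger.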
This can also be shown from a related probability formula. The probability of A given B equals the probability of both A and B, divided by the probability of B. That is,
P(A | B) = P(A & B) / P(B)
But if P(A | B) = P(A), then
P(A) x P(B) = P(A & B)
But the probability of two events both occurring equals the product of their individual probabilities only when the two are statistically independent, as with two separate coin tosses. So this would imply that what people say about things has no statistical connection to the truth of those things, which again is impossible. Thus, it is necessary to say that the mere assertion of a fact constitutes evidence for the truth of that fact.
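The link between P(A | B) = P(A) and the product rule can be seen on a small joint table. The table below is hypothetical (the four probabilities are assumptions made up for the sketch); A stands for "X is true" and B for "Smith asserts X".

```python
from fractions import Fraction as F

# Hypothetical joint distribution over (X is true?, Smith asserts X?).
joint = {
    (True, True): F(3, 10),
    (True, False): F(2, 10),
    (False, True): F(1, 10),
    (False, False): F(4, 10),
}

p_true = sum(p for (x, _), p in joint.items() if x)      # P(A)
p_assert = sum(p for (_, s), p in joint.items() if s)    # P(B)
p_both = joint[(True, True)]                             # P(A & B)
p_true_given_assert = p_both / p_assert                  # P(A | B)

# The product rule holds exactly when conditioning on B leaves P(A) unchanged.
independent = p_both == p_true * p_assert
print(p_true, p_true_given_assert, independent)
```

In this table assertion and truth are correlated, so the product rule fails and P(A | B) differs from P(A).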
One last argument: there are many things that we hold simply on the basis of someone’s assertion. We would not do this unless we thought that the assertion was evidence for its own truth.
…learning the additional fact that some random guy entirely unknown to me, John Smith, asserts that X is true should increase my assessment of the probability that X is true.
This can’t possibly hold on subjects for which you know yourself to be more knowledgeable than the general population (or than whatever sub-population John Smith is to be drawn from). For example, I know what I’m eating right now; John Smith can assert I’m eating X all he wants but my subjective probabilities regarding what I’m eating aren’t budging.
And more abstractly, even if your setup is that we’re not supposed to know anything about the proposition a priori, one might still judge oneself to be more knowledgeable than the average population on ‘most’ or on a sufficient range of topics that your proposition is not likely to hold for them ‘on average’ – and thus, doesn’t hold.
Your proof essentially argues that P(h|e)=P(h) can’t be true ‘on average’ or in the generic situation. I guess what I’m saying is there are two directions in which this can be untrue and which direction is appropriate depends on where you deem your knowledge in relation to ‘John Smith’.
To your last paragraph, I’ll just point out that the fact that people do something ‘many’ times doesn’t mean it’s a generalizable rule. There is a selection bias here. Yes it’s true that there are ‘many’ situations in which John Smith’s assertion e increases our P(h|e), but these would be precisely those situations in which we judge John Smith to be a reliable, trustworthy witness worth listening to on the subject. When and where we don’t, we don’t modify our P(h) – and rightly so.
This can’t possibly hold on subjects for which you know yourself to be more knowledgeable than the general population (or than whatever sub-population John Smith is to be drawn from). For example, I know what I’m eating right now; John Smith can assert I’m eating X all he wants but my subjective probabilities regarding what I’m eating aren’t budging.
The argument makes no assumption about whether I am more knowledgeable than John Smith. Therefore it holds in either case (unless it is simply invalid, in which case it doesn’t hold in any case–but you would need to show that it is invalid by pointing to a particular step and explaining the mistake).
If I am more knowledgeable than Smith regarding the subject matter in question, all that follows is that my assertion to the contrary is stronger evidence than his assertion. It does not follow that his assertion is not evidence at all. Similarly, in your example, the fact that you can see what you are eating and Smith cannot merely points to the fact that the evidence of your senses is stronger evidence than his opinion. It does not mean that his opinion is not evidence at all. If you fail to adjust your subjective probability based on his opinion, then you are simply choosing to ignore some of the available evidence and thereby increasing the margin of error in your estimation of the probability. The fact that the margin will be very small compared to the strength of evidence on your side (i.e. your senses) does not mean that it is not an error nonetheless.
A real-life example: I have in the past taught algebra to high school students. Occasionally, one of them would point out that I had made a careless mistake while writing on the board. This assertion manifestly constituted evidence that I had in fact made a mistake–and this despite the fact that I knew myself to be more knowledgeable than any of them. It would have simply been foolishness and arrogance on my part to claim that their opinion therefore was irrelevant.
And more abstractly, even if your setup is that we’re not supposed to know anything about the proposition a priori, one might still judge oneself to be more knowledgeable than the average population on ‘most’ or on a sufficient range of topics that your proposition is not likely to hold for them ‘on average’ – and thus, doesn’t hold.
The above applies here as well.
To your last paragraph, I’ll just point out that the fact that people do something ‘many’ times doesn’t mean it’s a generalizable rule. There is a selection bias here. Yes it’s true that there are ‘many’ situations in which John Smith’s assertion e increases our P(h|e), but these would be precisely those situations in which we judge John Smith to be a reliable, trustworthy witness worth listening to on the subject. When and where we don’t, we don’t modify our P(h) – and rightly so.
Again, having prior reason to think Smith reliable simply increases the strength of his evidence. It does not mean that it does not constitute evidence even before we have that reason. My argument did not depend on Smith’s reliability, so again, either this is not relevant or you need to show in what step the argument makes a mistake. In fact, the argument assumed that we have no knowledge of Smith’s reliability. This shows that Smith’s opinion constitutes evidence unless we have positive evidence against his reliability. We do not need evidence for his reliability.
The argument makes no assumption about whether I am more knowledgeable than John Smith. Therefore it holds in either case
I don’t understand. Unless I’m really missing something, the fact that you didn’t specify whether you are more knowledgeable (etc) doesn’t imply it “holds in either case”. My very point is that it holds in some cases and not in others, thus can’t be taken as a general rule.
unless it is simply invalid, in which case it doesn’t hold in any case
There is ambiguity here. Your proposition above was completely general. I think it is invalid. That doesn’t mean I think it “doesn’t hold in any case”. Indeed, I can certainly imagine it holding for some people, sometimes. But it doesn’t hold always, which is what would be necessary for your proposition to be true – thus it isn’t.
If I am more knowledgeable than Smith regarding the subject matter in question, all that follows is that my assertion to the contrary is stronger evidence than his assertion.
But the fact that your knowledge is stronger evidence means that his assertion may be discounted, ignored, or perhaps even taken as contraindicative, as the case may warrant. The issue is whether his assertion must necessarily make you alter your “P(h|e)” in his direction – and it need not. I can envision populations and topics for which a random person from that population asserting X would prompt me to decrease my subjective probability of X.
You’ve mentioned more than once that I need to show which step of your proof went wrong. I thought it was clear that I objected to the (implied) line of reasoning that went “P(h|e)=P(h) is false, therefore we must increase our subjective probability”. You set up the idea that assumption (2) (i.e. (3)) leads to a contradiction of sorts, but there’s a gap in the proof, because even if “P(h|e)=P(h), always” is false that doesn’t mean P(h|e)>P(h) uniformly. Could have P(h|e)<P(h) in some cases. Could be that all combos are possible and it depends on the situation (which is my view).
Similarly, in your example, the fact that you can see what you are eating and Smith cannot merely points to the fact that the evidence of your senses is stronger evidence than his opinion. It does not mean that his opinion is not evidence at all.
This begs the question of what exactly you can possibly mean by “evidence”. In my view, a stranger chosen at random has for all intents zero in the way of meaningful evidence regarding what I’m eating. What evidence do you think he can have?
If you fail to adjust your subjective probability based on his opinion, then you are simply choosing to ignore some of the available evidence
I’m not ‘ignoring’ it, I’m assigning to it the appropriate weight, which in this case is zero.
and thereby increasing the margin of error in your estimation of the probability.
I know what I’m eating. I assign a subjective probability of 1 to that fact. Some random stranger IM’s me and tells me I’m eating something else. I do not lower my subjective probability regarding what I’m eating. You think I have now “increased the margin of error” in my estimation of what I’m eating? I disagree.
A real-life example: I have in the past taught algebra to high school students. Occasionally, one of them would point out that I had made a careless mistake while writing on the board. This assertion manifestly constituted evidence that I had in fact made a mistake
This is a real-life example illustrating that there can be cases where a random person’s assertion can force you to alter your priors. But I never disputed that. What I dispute is that this can be extrapolated to a general rule, that all random assertions must change your priors on all subjects.
In fact, the argument assumed that we have no knowledge of Smith’s reliability.
No, the setup implies that we have some knowledge of Smith’s reliability – we presumably know the population he is ‘randomly’ drawn from, whether you intend that to be ’society at large’ or some subset. Meanwhile, we also have some knowledge about our reliability. This enables comparison, at least sometimes, and that is exactly my point.
This shows that Smith’s opinion constitutes evidence unless we have positive evidence against his reliability.
Go back to my eating example. Here is my evidence against the reliability of a random asserter: he’s saying I’m eating raspberries, I believe otherwise, and I am in a better position to judge. Thus, I justifiably reject his assertion. Indeed, the fact that he’s asserting that I’m eating raspberries when I very well know that I’m not constitutes evidence against his reliability and possibly his sanity, or at least his seriousness.
best
I don’t think you’re understanding the point of the argument. The claim is about the relevance of assertion precisely as such, not about all of the other factors that may constitute additional evidence in a particular case. Thus, although it is true that we could stipulate a scenario in which some particular assertion results in a lower conditional probability, this is irrelevant. The argument does not deny it.
The essential point is this: more often than not a person is more likely to assert something given that it is true, than they are to assert it simply speaking. Is this the point you are disputing?
Concerning the raspberry IM scenario: I’d like to address that in some detail, so I’ll do a post on it.
You’re absolutely right that I could be missing something. What you seem to be saying here is that you intended your proposition to mean something like: ‘on average’ or ‘lacking other info’, for the average h the random assertion should increase your P(h) in its direction.
Even if I accept this, it’s not useful as a guide to action in any particular situation. And the problem is that your original proposition was stated as a “should”: “learning the additional fact that some random guy entirely unknown to me, John Smith, asserts that X is true should increase my assessment of the probability that X is true.” My very point is that I don’t think it should, necessarily. Even if you are right about the probabilities involved once you average over all h’s and e’s, the fact will always remain that in any given situation fitting your proposition, you will be in a position to (1) know what your hypothesis h is (it won’t be ‘random’), (2) judge your knowledge on that subject, and (3) know what population John Smith is drawn ‘randomly’ from (even if that’s just ‘all of humanity’). So there can be times when you may justifiably give zero or even negative weight to JS’s assertion.
Ok, you now ask, but won’t random assertions be true ‘more often than not’? The problem is that this suffers from huge ambiguity in the implied measure: what is ‘often’, how are assertions counted? I can think of at least two ways:
1) count all assertions that all people actually make (in some time period, say) and count the fraction which are true. This method would end up counting numerous trivial assertions such as “I’m hungry”, “it’s a nice day out”, etc and I suspect even the most dishonest, stupid person in the world would have more assertions on the ‘true’ side of his ledger than not.
2) count all assertions that are theoretically possible to make (i.e. count all possible values of X for your ‘a certain proposition X’), count what fraction of humans when forced to assert something about X would say something true, and average the results for all X.
What you want me to say, I guess, is that because people make true assertions ‘more often than not’ as reckoned by method (1) – which I do not deny – then your original proposition is true. I don’t think so, because method (2) is far more relevant to the setup of your proposition, and by method (2), I have no confidence whatsoever that the average/random person will tend to make true statements about the average/random proposition X.
Most people in the world would not be able to make a true assertion about quantum field theory if their life depended on it, for example. The vast majority of possible propositions X fall into that category. So this really depends on what sort of propositions or subjects we are talking about, perhaps.
But that’s my point.
Best,
Even if I accept this, it’s not useful as a guide to action in any particular situation. And the problem is that your original proposition was stated as a “should”: “learning the additional fact that some random guy entirely unknown to me, John Smith, asserts that X is true should increase my assessment of the probability that X is true.” My very point is that I don’t think it should, necessarily. Even if you are right about the probabilities involved once you average over all h’s and e’s, the fact will always remain that in any given situation fitting your proposition, you will be in a position to (1) know what your hypothesis h is (it won’t be ‘random’), (2) judge your knowledge on that subject, and (3) know what population John Smith is drawn ‘randomly’ from (even if that’s just ‘all of humanity’). So there can be times when you may justifiably give zero or even negative weight to JS’s assertion.
The presence of your point (2) is concerning to me. If you still think that your own level of knowledge of the subject can affect whether JS’s assertion constitutes positive evidence, then you don’t understand the mathematical aspect of the argument, and that needs to be settled before we can dispute the philosophical aspect.
Note that this part of the argument, the mathematical part, applies in every case whatsoever, regardless of what the proposition is or who JS is: As I showed in the post, Bayes’ theorem tells us that whether JS’s assertion constitutes positive or negative evidence depends only on this value:
(Probability that JS would assert X, given that X is true) divided by (Probability that JS would assert X, given that X is false)
If the value of this expression is greater than 1, then JS’s assertion increases the probability of X. If the value is 1, the probability of X is unchanged. If the value is less than 1, the probability of X is decreased. Now, my level of knowledge in no way affects the value of this expression, since it concerns JS and JS alone. It follows that my level of knowledge cannot change the direction of the evidence that JS’s assertion constitutes.
What my knowledge can do is affect by how much the probability of X is changed, not the direction of the change. This is because the absolute quantity of the change depends on the ratio of the certainty of my knowledge to JS’s. But the direction must remain unchanged; if the value of the above expression is greater than 1, JS’s assertion must increase the probability of X unless I assign a probability of 1 to my own belief (which is effectively saying that my own belief is infinitely more certain than JS’s). But assigning a probability of 1 to my own belief is simply making a claim of infallibility, and in that case there’s not much point arguing.
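A rough sketch of this point in odds form: the posterior odds equal the prior odds times the ratio R, so R alone fixes the direction of the change while the prior fixes its size. The function name `update` and the particular priors and ratios are assumptions picked for illustration.

```python
from fractions import Fraction as F

def update(prior, ratio):
    """Return P(X | assertion) given P(X) and R = P(assert | X) / P(assert | not X)."""
    posterior_odds = prior / (1 - prior) * ratio
    return posterior_odds / (1 + posterior_odds)

# Illustrative priors and ratios, not taken from the discussion.
for prior in (F(1, 2), F(1, 100), F(999, 1000)):
    for ratio in (F(2), F(1), F(1, 2)):
        after = update(prior, ratio)
        direction = "up" if after > prior else "down" if after < prior else "same"
        print(f"prior={float(prior):.3f}  R={float(ratio):.1f}  "
              f"posterior={float(after):.4f}  ({direction})")
```

For any prior strictly between 0 and 1, the direction printed depends only on whether R is above, equal to, or below 1. A prior of exactly 1 is excluded here, since the odds are then undefined.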
Does this make sense? If not, then this is what we need to discuss before getting back to the post, because thus far it is just a matter of understanding the mathematics.
Let’s write your expression as P(A|X)/P(A|~X) = R. You speak as if this R is to be treated as an a priori constant. But surely in any given situation it depends on (1) X and (2) the population ‘randomly’ drawn from to make the assertion A.
Perhaps things are arranged such that I know very little about X and/or very little about whether the population is likely to speak the truth about X. Very well, in such cases I am agnostic and probably would generally consider the ‘random’ statement to be positive evidence.
But what if I do believe I know something about X and/or about the population’s knowledge of X relative to mine? In that case my knowledge of X affects my personal estimate of R. In particular, I could indeed decide that R <=1, and thus – as you state – the ‘random’ assertion A is negative or at best neutral evidence. So that’s how my knowledge of X can enter into this equation. This dependence is of course hidden by your symbols and abstractions but it is there.
Moreover, my estimate of R not only depends on X but is nonconstant over time – it could be affected by my hearing the proposition A. After all, if I know (or at least believe w.p. 1) I’m eating chocolate, the fact that JS asserts I’m eating raspberries gives me some new information about JS: that he may, by my lights, be delusional, or a liar, or something. Thus, even if my prior estimate of R had been >1 beforehand, hearing an assertion I believe and know to be false coming ‘randomly’ lowers my estimate of R to something less than 1, probably. Again: my relative knowledge to JS affects how I interpret his ‘random’ assertion. Which is exactly my point.
A persistent problem here is that with all these uses of ‘probability’, there are a lot of ambiguous measures flying around, and this allows for easy confusion and/or sleight of hand. This is why concrete examples (i.e. “what I’m eating”) are useful to think about & address.
Sonic Charmer (I’m assuming you’re also the last “Anonymous” poster), you seem to be consistently failing to distinguish the probability of something simply speaking, and the probability given a certain supposition.
If I am sure I am eating chocolate, and somebody says I am eating raspberries, it is indeed far more probable that he is joking, lying, or crazy than that I am eating raspberries.
Nonetheless, on the supposition that I somehow am making a mistake about what I am eating (and the probability of this mistake is never mathematically 0) it is more probable that he would make this assertion than on the (admittedly more likely) supposition that I am not mistaken. Thus the original argument follows.
To restate the particular example without mathematics: It is extremely likely that he is joking, lying, or crazy; nonetheless, there is a small possibility that I am hallucinating and he is correcting my misjudgment. On account of this small possibility, there is a slightly greater likelihood that I am hallucinating, than there was before he said anything.
you seem to be consistently failing to distinguish the probability of something simply speaking, and the probability given a certain supposition.
I don’t understand. What is “the probability of something simply speaking”?
Nonetheless, on the supposition that I somehow am making a mistake about what I am eating
Why would one make that supposition? I understand it is a possibility, with probability > 0. But it is not a supposition anyone would make and act on.
It is extremely likely that he is joking, lying, or crazy; nonetheless, there is a small possibility that I am hallucinating and he is correcting my misjudgment.
That probability is already accounted for in the fact that I place a probability of 1-epsilon on the notion that I’m eating chocolate. It’s why there’s an “epsilon”. The fact that JS randomly tells me I’m eating raspberries doesn’t necessarily give me any new information that merits increasing that epsilon. It might, if I consider (based on my knowledge & my knowledge of his population) JS’s assertion to merit additional weight. I may or may not do that. As I have said. best
For example, in the extreme, suppose I believe myself to live among people who are primarily liars/tricksters. Prior to hearing JS’s assertion, I thought I was eating chocolate w.p. 1-epsilon, the ‘epsilon’ accounting for the a priori probability that I’m wrong (one possibility being that I’m eating raspberries instead). But then a ‘random’ person tells me I’m eating raspberries.
If I were agnostic or had no knowledge about the population which said this, I’d probably increase my raspberry probability. But since I believe the population JS came from are primarily liars/tricksters, I now believe it is less likely than I previously thought that I’m eating raspberries. So I knock the raspberry probability down a smidge and my subjective probability that I’m eating chocolate increases accordingly.
This is an extreme limiting case but it illustrates why all directions are possible, including away from the assertion. It all depends on the proposition X and/or the population making the assertion A. In general, of course, the situation is not so much going to be that ‘people are tricksters’ as that ‘people have no useful information to give me on the subject’, and their assertion gives one no reason to adjust one’s priors either way.
Let me give you some further examples to think about:
-You are a mathematician, have worked 30 years on research and reading and talking to other researchers, and based on the sum of this work you have formed a subjective probability, say .6, on whether some unproven statement, say the Poincare Conjecture, is true. Then you are told a world lottery was conducted, a random person was chosen and compelled to answer the question “Is the Poincare Conjecture true?” either yes or no, and that person chose “no”. Do you now find the Poincare Conjecture less likely based on this? (This illustrates primarily why your knowledge matters.)
-You currently place a tiny probability on the possibility that the tenets of Nazism are true (morally, ethically, scientifically). You regularly get ‘random’ opinions on this, the vast majority of the assertions you hear support your view that Nazism is untrue (with the very occasional exception), and so on average your probability doesn’t change – stays tiny. But then, you travel back in time to 1930s Nazi Germany, and are bombarded with assertions that Nazism is indeed true. According to your proposition, you’d have to now increase your estimate of the truth of Nazism based on hearing so many pro-Nazi assertions. (This primarily illustrates why the population matters.)
In your examples, you are not talking about a “random person”, but a person whom you are supposing to be likely to lie, or whom you suppose to have absolutely no relevant idea on the matter.
The fact that in a particular case you may know that someone is more than 50% likely to make a false statement, does not mean that in general the principle claimed by bunthorne is invalid, as you asserted. It simply means that in particular cases there are more relevant/important factors than the fact that overall people make more true statements–it does not mean that that factor is irrelevant.
In your examples, if the answers add new information, then you should change your estimate. If they add no new information, you should not.
So for example, in the first case, if to begin with, you know how many people in the Western world believe the Poincare Conjecture to be true, but don’t know it overall, you should lower the probability (granted, by a very very small amount), as you should raise it if the answer were positive (there is a small possibility, for example, that the answer was given by a person who knows the true answer). (Granted, there could also be other factors that haven’t been mentioned, which would affect in other ways what happens to the probability.)
In the second case, if you didn’t already know what those persons’ opinion was, then, yes, you should increase your estimate. If you already knew beforehand what they say, then you gain no new information, and should not increase your estimate.
To summarize the essential points, without the detailed mathematics:
(1) Overall, people make true statements more often than they make false ones.
(2) The first principle is relevant to deciding whether what someone said is true.
Bunthorne proved the second proposition from the first one. (Strictly he used an equivalent, since if people overall made false statements more often than true ones, then they would not be more likely to make a statement given that it were true, but less.)
Sonic Charmer is disputing the truth of the second point, on the grounds that in some cases, I become more sure that something is false, because someone says it (or remain equally sure/unsure). The response to this objection by Bunthorne and me is that the objection only shows that in some cases there are weightier principles than this general principle (or theoretically, principles that exactly counterbalance it), but that does not alter the relevance of the principle itself.
Is this a more or less accurate simplified presentation of the dispute?
I think we can all agree that if a probability of 1 is assigned to any opinion, no new information can change it. But I think that this almost never, if ever, obtains, and so I am assuming that there is always some probability, however small, that a given opinion is false.
It is true that knowledge is required to determine the two factors of the relevant ratio, namely the probability of JS asserting X given that it is true divided by the probability of JS asserting X given that it is false. The question is, what sort of knowledge is required to determine this. The examples you (SC) are giving strongly imply that you are thinking that knowledge such as your personal beliefs and arguments (e.g. regarding the Poincare conjecture) can tend to decrease the ratio. More importantly, it seems that you think that if we know that JS has a high probability of asserting something false regarding a particular subject matter, this raises the chance of his R (the ratio mentioned above) being equal to or less than 1. Your earlier example of quantum field theory, for instance, strongly implies this, as does the Poincare conjecture example. You are right in thinking that if a random person is compelled to make a statement about quantum field theory, the statement will very likely be false. The mistake is in thinking that if the statement is very likely false, then it should not or at least is less likely to constitute positive evidence.
I have to go now, but I will post a mathematical example later in which we can calculate the exact Bayesian probabilities, and in which we will see that a person’s assertion can raise the probability of a hypothesis even if we know beforehand that the person’s assertion has an arbitrarily high (say 99.9%) probability of being false. And this even if we further know that we ourselves have a much more probable opinion to begin with.
Here’s the example I mentioned in my last comment:
Part 1: Suppose someone randomly chooses a number from 1 to 100 and writes it on a slip of paper, which he places face down on a table.
Given only this information, what is the probability of the hypothesis that the number on the paper is 47 (I’ll call this hypothesis H)? Clearly, P(H) = 0.01, since each of the hundred possible numbers is equally likely.
Part 2: Now suppose I turn my back. While I do this, another party, A, who has zero knowledge of the number, flips a coin, which I assume is fair, i.e. P(heads)=0.5, P(tails)=0.5. If the coin lands heads, he looks at the slip of paper and says aloud the number on it. I’ll neglect, to make the math cleaner, the minuscule probability of error in A’s sight or in my hearing; this probability does exist, but by following the math you can see that it does not matter to the argument. If the coin lands tails, A does not look at the slip. Instead, he chooses a random number from 1 to 100 and speaks it aloud. Thus, I know the number spoken by A in either case, but I do not know whether he read it from the paper or chose it randomly.
Suppose I hear A say the number 52; I’ll call this evidence E1. What now is P(H | E1), that is, what is the probability that the number on the paper is 47, given that A said 52? Applying Bayes’ theorem, we get
P(H | E1) = ( P(E1 | H) x P(H) ) / P(E1) = [ ( 0(0.5) + 0.01 (0.5) ) x 0.01 ] / 0.01 = 0.005
We derive P(E1 | H) by the following reasoning: Given H, that the number is 47, the probability of A looking at the paper and saying 52 is 0. But he looks at the paper with probability 0.5. This gives us the term 0(0.5). Also with probability 0.5, A chose his number at random, in which case the probability of him saying 52 is 0.01. This gives us the term 0.01(0.5). Hence, P(E1 | H) = 0(0.5) + 0.01(0.5).
Hence P(H | E1) = 0.005. But as we said, P(H) = 0.01. Thus, A’s statement that the number is 52 cuts the probability of the number being 47 in half (unsurprisingly, since A knows the number half the time).
It follows that we know with probability .995 that the number is not 47. So we should have a quite strong opinion that the number is not 47.
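For anyone who wants to check the Part 2 arithmetic, here is a minimal sketch in Python. The helper name `posterior` and the use of exact fractions are my own choices for illustration; the numbers (100 slips, a fair coin, A saying 52) are the ones from the example.

```python
from fractions import Fraction

def posterior(prior, like_h, like_not_h):
    """Bayes' theorem for a binary hypothesis H."""
    evidence = like_h * prior + like_not_h * (1 - prior)
    return like_h * prior / evidence

N = 100                      # numbers 1..100
p_look = Fraction(1, 2)      # A reads the paper on heads
prior_47 = Fraction(1, N)    # P(H): the number is 47

# P(A says 52 | number is 47): only possible if A guessed at random.
like_h = (1 - p_look) * Fraction(1, N)
# P(A says 52 | number is not 47): given not-47 the number is 52 with
# probability 1/99, in which case a look yields "52"; otherwise a guess.
like_not_h = p_look * Fraction(1, N - 1) + (1 - p_look) * Fraction(1, N)

post = posterior(prior_47, like_h, like_not_h)
print(post, float(post))
```

The denominator works out to exactly 0.01, so the posterior is 1/200 = 0.005, matching the figure above.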
Part 3: Now suppose another party, B, follows a procedure similar to A’s, except that B chooses a random number from 1 to 1000, and only looks at the paper if that number happens to be 1000; otherwise, he speaks aloud a random number from 1 to 100. Thus, the only difference between A and B is that A looks at the paper half the time, while B only looks one time in one thousand. It is far more probable that B spoke a random number than that he read the number from the paper.
Now, assume that I hear B say the number 47. How should I judge this? Before I consider his statement as evidence, as I showed above, I can say with probability .995 that the number is not 47. In addition, I also know a priori that B has a very high chance of speaking falsely, since 999 times out of 1000 he simply speaks a random number, which only has a 0.01 chance of actually being the number on the paper.
So, I know two things before taking B’s evidence into account: I have strong reason to believe that his particular claim is false, and furthermore strong reason to believe in general that he will make false claims because of the method he uses to choose the number he speaks. Thus, it seems that B’s statement, having an approximately 99% chance of falsehood, does not constitute positive evidence.
But consider what happens when we apply Bayes’ theorem. Note that our P(H) now takes A’s evidence into account, and so is 0.005 rather than 0.01. Let us call B’s statement, that the number is 47, E2:
P(H | E2) = ( P(E2 | H) x P(H) ) / P(E2) = [ ( 0.01(0.999) + 1(0.001) ) x 0.005 ] / 0.01 = 0.005495.
Thus, B’s statement increases P(H) from 0.005 to 0.005495, and this despite the high probability of B’s statement being false.
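The two updates can also be carried out over the full distribution of possible numbers. The sketch below is only an illustration (the `update` function and its parameter names are hypothetical, not part of the original example). One thing it brings out: once A's statement has been taken into account, the probability of hearing B say 47 is slightly below 0.01 (it is 0.009995), so the exact posterior comes out near 0.005498 rather than 0.005495. The direction of the change, and hence the argument, is unaffected.

```python
from fractions import Fraction

N = 100  # numbers 1..100

def update(dist, said, p_look):
    """Posterior over the hidden number after a speaker says `said`.

    With probability p_look the speaker reads the paper and reports it
    truthfully; otherwise he says a uniformly random number from 1..N.
    Returns the new distribution and the probability of the statement.
    """
    unnorm = {}
    for n, p in dist.items():
        like = (1 - p_look) * Fraction(1, N)   # random guess hits `said`
        if n == said:
            like += p_look                     # or he looked and it matches
        unnorm[n] = like * p
    total = sum(unnorm.values())
    return {n: v / total for n, v in unnorm.items()}, total

prior = {n: Fraction(1, N) for n in range(1, N + 1)}
after_a, p_e1 = update(prior, said=52, p_look=Fraction(1, 2))
after_b, p_e2 = update(after_a, said=47, p_look=Fraction(1, 1000))

print(float(after_a[47]))  # P(H | E1)
print(float(p_e2))         # P(E2), given E1
print(float(after_b[47]))  # P(H | E1, E2)
```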
Now, if you follow the mathematics here, it is evident that B’s statement will always constitute positive evidence, that is, increase P(H), no matter how improbable we make it that he looks at the paper. As the probability of B’s looking at the paper goes to 0, the probability of his statement being false goes to 0.99. By increasing the range of numbers used to write on the paper in the first place, we can make the probability of B’s statement being false arbitrarily large.
Thus, a statement that has 0.99999 (or arbitrarily higher) probability of being false still constitutes positive evidence for what it claims.
If the example is understood, the reason for this becomes clear. Since the only factor that determines whether B’s statement constitutes positive evidence is the ratio of the probability that he makes the statement given that it is true, to the probability that he makes the statement given that it is false, it does not matter in itself how likely his statement is to be false. On the assumption that his statement is not based on the reality to which it refers (this corresponds to the case where he does not look at the paper but instead speaks a randomly chosen number), evidently the truth of the proposition does not make a difference to his asserting the proposition, and so the ratio would be 1, that is, constitute exactly zero evidence.
But this is where it is easy to make a mistake. There is always a chance, however small, that B’s statement is in fact based on the reality to which it refers. Since I have assumed that B always tells the truth given that he knows it, this chance, however small, will always increase the ratio to a value higher than 1, meaning that it constitutes positive evidence.
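To see how the ratio behaves as B looks at the paper less and less often, here is a small sketch. The function name `speaker_stats` is made up for illustration, and it assumes, as the example does, that B reports truthfully whenever he looks and otherwise names a uniformly random number.

```python
from fractions import Fraction

def speaker_stats(p_look, n_numbers):
    """Return (probability the statement is false, likelihood ratio).

    The ratio is P(says k | number is k) / P(says k | number is not k).
    """
    guess = (1 - p_look) * Fraction(1, n_numbers)  # random guess names k
    p_false = (1 - p_look) * Fraction(n_numbers - 1, n_numbers)
    ratio = (p_look + guess) / guess
    return p_false, ratio

results = [speaker_stats(Fraction(1, 10**k), 100) for k in range(1, 7)]
for p_false, ratio in results:
    print(float(p_false), float(ratio))
```

The probability of a false statement climbs toward 0.99 while the ratio shrinks toward 1, but it stays above 1 for every nonzero chance of looking. Only at `p_look = 0` does it equal 1 exactly.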
This should make clear what is required in order for someone’s statement to constitute zero or negative evidence. It is not enough to say that the person will likely make a false statement because he doesn’t know the truth. Rather, you need to say that, given that he does know the truth, however unlikely this is, the person is as likely to lie as he is to tell the truth. This is why examples such as the Poincare conjecture don’t make the point you think they do. It does not matter if no one else in the world is in a better position to judge about the truth of the conjecture than you are. All that matters is that, if the person chosen by lot did happen to know, he would be more likely to tell the truth than to lie.
Now, it is a very strong assertion to say that someone is as likely or more likely to lie as he is to tell the truth, and it requires very strong evidence to judge this. In the vast majority of cases, therefore, a person’s assertion of X must increase the probability of X, even if he is chosen from a population that knows next to nothing about the subject, and even if there is an arbitrarily high chance that his statement will be false.
To Joseph:
In your examples, you are not talking about a “random person”, but a person whom you are supposing to be likely to lie, or whom you suppose to have absolutely no relevant idea on the matter.
‘Random person’ in above context just means ‘person chosen randomly (uniformly) from some population’. Well, my examples involve choosing randomly from certain populations. Perhaps the original setup was only supposed to work for ‘the whole world’s population’ and no subset thereof, but that would be rather strange. Anyhow the point of my examples is merely to illustrate that there can be issues on which the population (whatever population you choose to think of) can be wrong or ignorant. Unless you disagree?
The fact that in a particular case you may know that someone is more than 50% likely to make a false statement, does not mean that in general the principle claimed by bunthorne is invalid,
Sure it does. Bunthorne said one ’should’ alter one’s estimate of P(X) when faced with contrary assertion A pulled randomly from a population P. I can think of examples of X and P where one shouldn’t. Done.
In your examples, if the answers add new information, then you should change your estimate. If they add no new information, you should not.
Bingo. That’s my point. Bunthorne’s assertion is that they always add new (positive) info. This is wrong, as you seem to agree.
(1) Overall, people make true statements more often than they make false ones.
(2) The first principle is relevant to deciding whether what someone said is true. Bunthorne proved the second proposition from the first one. [...] Sonic Charmer is disputing the truth of the second point,
Actually I dispute the truth of (1), or at least, wish to point out an inherent ambiguity in what ‘more often’ means. ‘More often’ implies a measure – that we are counting something. There are two ways I can think of to count this:
-Observe the statements people actually make and count the % that are true
-Tally up all possible propositions, force ‘random’ people (or just poll everyone) to assert something on each, and count the % of assertions that are true.
It may be that people make true statements ‘more often than not’ if reckoned the first way, but not the second, because as I said before, the first way would only count statements people choose to make, and thus it is biased to select for a lot of trivially true and/or informed statements. A lot of statements I make daily are trivially true; e.g., “I’m hungry”, or informed, e.g. “I am N years old”. But if (as the original setup implies) I were chosen randomly and forced to make a statement on a ‘random subject’ pulled out of a hat, say Bulgarian history, I probably couldn’t if my life depended on it.
So the point is that trying to reason from (1) (“most statements people make are true”) to the conclusion (hearing a ‘random’ statement A on a given proposition X should increase your estimate of A’s truth) involves a sleight of hand. The issue is whether these people are likely to make a true statement on the subject of X in particular. And that depends on what X is and what the people are like.
Best
To Bunthorne
You have characterized the meaning of my counterexamples correctly. As to your example, I follow it full well and it is a good one but it is also contrived and front-loaded to exclude precisely the possibilities I’m contemplating. You make the assumptions (some explicit, some implicit) that
-A & B & (etc) won’t lie if they know the truth
-they don’t have poor eyesight causing them to mistake 1 for 7 (and, won’t make similar errors of that nature)
-they are able to recognize and say numerals, they can speak a language to communicate with you, i.e. they have sufficient knowledge relevant to the situation (this is very implicit and you probably didn’t feel the need to say it because it seems so trivial in your example – but it’s not trivial for, e.g., quantum field theory!)
I realize that these assumptions were necessary ‘to make the math cleaner’. However they divorce the entire example from the real world. One rarely can make these assumptions in a real-world situation. But since you do, it’s no wonder you end up with the result that B’s assertion always contains positive and only positive info. You constrained things to be precisely that way from the get-go.
Now, it is a very strong assertion to say that someone is as likely or more likely to lie as he is to tell the truth,
In my view whether this is a strong or weak assertion depends on (1) the subject X and (2) the population P from which ’someone’ is drawn (and also, ‘lying’ is not the only possibility leading to an untrue assertion – there is always ignorance, innocent error, bias, etc etc). And that’s my point! Because in any particular instance of your setup, we will know those things, and thus take a view on that ratio of yours. So whether and how we choose to modify our P(X) based on hearing assertion A depends on what we think about X and P – as I’ve said all along.
Here’s how I see the math working out; it basically boils down to this setup:
-a proposition X
-my prior estimate that X is true, P(X) > 0
-A is the event that a ‘random person’, whatever that means, from some population P, makes the assertion “not-X” (~X)
The question is what is the posterior probability that X is true, given A? Do I now have to think X is less likely, i.e reduce P(X)? We need to compute P(X|A). Here’s Bayes’ rule:
P(X|A) = P(A|X)*P(X)/P(A) = Q * P(X)
where Q is the ratio P(A|X)/P(A). The question boils down to whether Q<1. We might further expand P(A)=P(A|X)*P(X) + P(A|~X)*P(~X), do some dividing, to get
Q=1 / (P(X) + R*P(~X)).
And R is your previously mentioned ratio, P(A|~X)/P(A|X), a positive number which is higher the more ‘truthful’ (in whatever sense – lying, errors, biases, etc) the population P is on the subject of X. As P(X)+P(~X)=1 it’s easy to see there are 3 cases
1) P tends to lie/be wrong on X: R < 1 so Q > 1 so P(X|A) > P(X).
2) P is neutral/ignorant/essentially a coin-flip on X: R=1 so Q=1 so P(X|A)=P(X). No meaningful new info.
3) P tends to be at least better than neutral/ignorant on X: R > 1 so Q < 1 so P(X|A)< P(X), and indeed, I alter my estimate of P(X) downward in the face of the assertion ~X.
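The three cases can be checked numerically with the formula Q = 1 / (P(X) + R*P(~X)). This is just a sketch: the function name and the sample probabilities (a prior of 1/2 and likelihoods of 1/4, 1/2, 3/4) are arbitrary values chosen for illustration.

```python
from fractions import Fraction

def posterior_after_denial(prior_x, p_a_given_x, p_a_given_not_x):
    """P(X | A), where A is the assertion "not-X"."""
    r = p_a_given_not_x / p_a_given_x       # R = P(A|~X) / P(A|X)
    q = 1 / (prior_x + r * (1 - prior_x))   # Q = 1 / (P(X) + R*P(~X))
    return q * prior_x

prior = Fraction(1, 2)
case1 = posterior_after_denial(prior, Fraction(3, 4), Fraction(1, 4))  # R < 1
case2 = posterior_after_denial(prior, Fraction(1, 2), Fraction(1, 2))  # R = 1
case3 = posterior_after_denial(prior, Fraction(1, 4), Fraction(3, 4))  # R > 1
print(case1, case2, case3)
```

With these inputs the posterior goes up to 3/4 in case 1, stays at 1/2 in case 2, and drops to 1/4 in case 3.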
Notice, your example above (involving looking at numbers on cards) assumed away possibilities (1) and (2) and thus reduced the entire world to possibility (3) only. That’s fine and your result was correct given that assumption.
But in general, the assumption that we are always in case (3) doesn’t hold. I dispute it. At least, whether I allow we are in case (3) or insist we are in case (2) or (1) depends on my judgment of the ‘truth-telling tendency’, i.e. the ratios R and Q, of the population P when it comes to the subject of X (a judgment I’m all the more comfortable making, the more knowledge of X and/or P I may have personally, of course).
So like I said: it all depends on the proposition X and the population P that JS comes from. It depends.
I hope I’ve made the point more clearly now. Also, retroactive apologies for the lengthy thread-hijacking…but it’s been fun
Best,
“Now, it is a very strong assertion to say that someone is as likely or more likely to lie as he is to tell the truth,”
In my view whether this is a strong or weak assertion depends on (1) the subject X and (2) the population P from which ’someone’ is drawn (and also, ‘lying’ is not the only possibility leading to an untrue assertion – there is always ignorance, innocent error, bias, etc etc). And that’s my point!
There are very few people and subjects such that a chosen person will be as likely or more likely to lie than tell the truth. But the real problem here is your parenthetical remark. Lying is fundamentally different from error resulting from ignorance, bias, etc. The difference is absolutely critical to the way it functions in the context of Bayes’ theorem.
And R is your previously mentioned ratio, P(A|~X)/P(A|X), a positive number which is higher the more ‘truthful’ (in whatever sense – lying, errors, biases, etc) the population P is on the subject of X. As P(X)+P(~X)=1 it’s easy to see there are 3 cases
1) P tends to lie/be wrong on X: R < 1 so Q > 1 so P(X|A) > P(X).
2) P is neutral/ignorant/essentially a coin-flip on X: R=1 so Q=1 so P(X|A)=P(X). No meaningful new info.
3) P tends to be at least better than neutral/ignorant on X: R > 1 so Q < 1 so P(X|A)< P(X), and indeed, I alter my estimate of P(X) downward in the face of the assertion ~X.
Notice, your example above (involving looking at numbers on cards) assumed away possibilities (1) and (2) and thus reduced the entire world to possibility (3) only. That’s fine and your result was correct given that assumption.
Based on this, I don’t think you understood my example. In the example, person B is precisely and deliberately constructed to be of type 1, which you say I assumed away. As given, person B most certainly “tends to be wrong” on the subject he speaks of; in fact the probability of his being wrong is roughly 99% as written, and I explained how we could arbitrarily raise the probability of his being wrong, to 99.9999999% or higher if we want. But despite the fact that he belongs in your group 1, his R value is not less than 1 but greater than 1, and his assertion provides positive evidence of what he claims. The reason I gave the example is because I gathered from your comments that you thought “R<1″ follows from “P tends to lie/be wrong on X.” The example proves that it does not so follow.
One caveat here: There is an ambiguity in the meaning of “tends to be wrong.” Take a concrete example: Given a certain demographic group, say a group eighty percent of whose members belong to the Flat Earth Society (I say only 80% so that we will not know the person’s opinion in advance). I think we can agree that a random person from this group is likely to make a false statement about the shape of the earth. Thus, we would say that such a person “tends to be wrong” on the subject. But this does not mean that his opinion has a tendency to be opposite to the truth regardless of what the truth is, but only given that the truth is that the earth is round. If the earth were in fact flat, other things being equal, such persons would tend to be right on the subject, since they would still say that it was flat. Consider the ratio R using the same example. Let’s say JS is a member of this demographic, and it turns out that he does, indeed, assert that the earth is flat. We want to analyze his ratio R:
R=P(JS asserting the earth is flat | the earth is flat) / P(JS asserting the earth is flat | the earth is not flat)
Now, you say that because JS is likely to make a false assertion on the topic, the value of R is less than 1. This means that the numerator is less than the denominator. But how do we know which is greater? Knowing that he is likely to assert that the earth is flat merely tells that both the numerator and the denominator are relatively high, but not which is higher.
This is perhaps the fundamental sticking point. I say that a given person on a given topic can be known to be such that he has a very high probability of making false statements on that topic, say probability 0.99 of being wrong, and nonetheless his assertion on the topic can constitute positive evidence; and further, that this can hold when you yourself already have a very high probability that the opposite statement is true. This needs to be clear before the argument can proceed. If you don’t think there can be such a situation, then this is where the discussion needs to focus.
Lying is fundamentally different from error resulting from ignorance, bias, etc.
Degree, not kind. While lying definitely has a bias against the truth, the other sorts of non-truths could have a bias against the truth, sometimes. If/when they do (or at least if/when I think they do) that’s precisely when I’ll be discounting ‘random assertions’.
In the example, person B is precisely and deliberately constructed to be of type 1
Not so. Let’s go slowly, adapting your setup to my framework. Proposition X is ‘the number is 47′. Suppose B, who looks at the card 1/1000th of the time, says A= “it’s not-47″ (he actually says it’s 52, or whatever). My formula above applies and we have to look at the ratio R = P(A|~X)/P(A|X).
If the number isn’t 47 (~X), B will say so w.p.
P(A|~X) = 1/1000 * 1 + 999/1000 * 99/100
The first term accounts for him looking at the card and seeing (accurately) that it’s not 47. The second term accounts for the probability he instead chooses randomly (999/1000), in which case he’ll say not-47 99/100th of the time (because there’s a 1/100th chance he’ll hit upon 47 by coincidence).
If the number is 47 (X), B will say it’s not 47 w.p.
P(A|X) = 1/1000* 0 + 999/1000 * 1/100.
In other words he only says “it’s 47″ if he is choosing randomly (999/1000) and happens to pick 47 when doing so (1/100).
So the ratio R = P(A|~X)/P(A|X) > 1. But this is case (3), not (1).
You conveniently constructed your random people specifically so that they either (a) tell the truth a nonzero proportion of time or (b) do things neutrally with respect to the truth. You specifically left out (c) lies, or has a bias, or an error, or a motive, or any other factor that skews away from truth. That’s why your contrived example is always case (3).
As given, person B most certainly “tends to be wrong” on the subject he speaks of; in fact the probability of his being wrong is roughly 99% as written,
This is a confusion in use of language. I agree that B “tends to be wrong”. But I was using that phrase as shorthand for “R<1″. As you see, for B, R>1. Period. Case (3).
But despite the fact that he belongs in your group 1, his R value is not less than 1 but greater than 1,
Huh? My group 1 was nothing more and nothing less than “R<1″. Since R>1, he’s in group 3. This is just by construction. I’d ask that you give my previous comment some more thought.
The reason I gave the example is because I gathered from your comments that you thought “R<1″ follows from “P tends to lie/be wrong on X.”
No, that was just a convenient loose phrasing to characterize group 1 and what it means for R<1. The definition is as stated above: R<1. Luckily I did my math based on R<1, not on the English phrase “tends to be wrong”…
One caveat here: There is an ambiguity in the meaning of “tends to be wrong.”
Well, that is true (as you’re illustrating).
Best replace it w/”R<1″ if it helps you.
Oops I think I had an error in the “47″ example..too many letters. Should be
P(says not-47 | not 47) = 1/1000*1 + 999/1000*99/100
P(says not-47 | is 47) = 1/1000*0 + 999/1000*99/100
Still true that R>1 of course.
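Plugging the corrected numbers in, as a quick sketch (variable names are mine; X is "the number is 47" and A is "B says some number other than 47"):

```python
from fractions import Fraction

p_look = Fraction(1, 1000)   # B reads the card
p_guess = 1 - p_look         # B picks a random number instead

# P(says not-47 | not 47): a look always yields not-47; a guess does 99/100.
p_a_given_not_x = p_look * 1 + p_guess * Fraction(99, 100)
# P(says not-47 | is 47): a look never yields not-47; a guess does 99/100.
p_a_given_x = p_look * 0 + p_guess * Fraction(99, 100)

r = p_a_given_not_x / p_a_given_x
print(r, float(r))
```

R comes out as 99001/98901, only about 0.1% above 1, but above 1 all the same.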
Huh? My group 1 was nothing more and nothing less than “R<1″. Since R>1, he’s in group 3. This is just by construction. I’d ask that you give my previous comment some more thought.
I’m sorry, but this is truly sad. This single snippet shows that I gave your comment far more thought than warranted. If you are defining your three categories only by the value of R, then your argument amounts to this:
“Your argument concludes that Bayesian analysis of the vast majority of cases results in case 3–therefore, clearly you’re ignoring cases 1 and 2!”
Yes. That’s what proofs do. They yield conclusions.
“Your argument concludes that Bayesian analysis of the vast majority of cases results in case 3–therefore, clearly you’re ignoring cases 1 and 2!” Yes. That’s what proofs do. They yield conclusions.
But you didn’t “prove” that the vast majority of cases result in case 3. You assumed that (for all cases, in fact), and then got to a “conclusion” that was predetermined by that assumption. Moreover it is completely incorrect to say that Bayesian analysis “results” in case 3. What Bayesian analysis results in is a formula
P(X|A)=Q * P(X)
with a factor Q which has a form depending on P(A|~X)/P(A|X). That’s all. To say something further you have to know whether Q is bigger/equal/less than 1 (which are, of course, just the 3 cases). But one doesn’t know that and Bayesian analysis by itself doesn’t give you any “result” along those lines without further info. Well, the idea that we always or ‘mostly’ have Q<1 is precisely what I dispute; I have questioned how one would gauge ‘mostly’ (and pointed out that in any given situation one might be able to gauge Q and then ‘mostly’ doesn’t matter), and I have given counterexamples which you have not rebutted. So that’s where we are.
P.S. The Flat Earth example is an interesting one. A very nice one! Clearly the intent of your setup is this: the population P has 80% Flat-Earthers and 20% normal people/truth-tellers. If the proposition is X=’the earth is round’, and the assertion is A=’the earth is flat’, then we have
P(A|~X)=.8*1+.2*1
P(A|X)=.8*1+.2*0
R=1/.8. R is bigger than one! So you’re right. If my P(X) is less than one, and I heard assertion A and I knew it came from population P, I would indeed reduce my P(X)! After all, P is just a mixture of ‘no info’ (the Flat-earthers) and ’some positive info’ (the 20% who would tell the truth). Nicely constructed. In the limit one could imagine writing ‘the earth is flat’ on a zillion note cards, then getting one truth-teller to write his answer on a note card and mix it in, then drawing a random card, which will almost certainly say ‘the earth is flat’, and if so one would have to reduce P(X|A) because of the fact that the card could have come from the truth-teller. I agree with you there.
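Here is the same calculation as a short sketch. The prior P(X) = 0.99 is purely an assumed value for illustration, since no specific prior is given above; the 80/20 split and the behavior of each group are as described.

```python
from fractions import Fraction

share_flat = Fraction(8, 10)   # Flat-Earthers: say "flat" no matter what
share_true = 1 - share_flat    # truth-tellers: say "flat" only if it is flat

# X = "the earth is round", A = the assertion "the earth is flat"
p_a_given_not_x = share_flat * 1 + share_true * 1
p_a_given_x = share_flat * 1 + share_true * 0
r = p_a_given_not_x / p_a_given_x

prior_x = Fraction(99, 100)    # assumed prior, for illustration only
q = 1 / (prior_x + r * (1 - prior_x))
posterior_x = q * prior_x
print(r, float(posterior_x))
```

With that prior, R = 5/4 and P(X|A) is 396/401, roughly 0.9875, so the estimate moves down slightly.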
This brings out some issues though:
First, what if my P(X)=1? In this case it doesn’t matter what “R” is, because it gets multiplied by P(~X)=0 in the expression for Q. Thus P(X|A)=P(X)=1 for all A. So the ‘note-card’/Flat Earth experiment won’t change my view. And indeed, for the record I do believe the earth is round with probability 1. (Which doesn’t mean I think that the earth being flat is ‘impossible’, just that it’s a measure-0 event, which is technically different.)
More interestingly, I think your setup helps illustrate a more subtle difference in our thinking. Let’s assume my P(X) is slightly less than 1. You can ask me to think about a situation where I know that the population P consists of X% Flat-Earthers and (100-X)% knowledgeable truth-tellers. If I really know that’s what P is like, then I do agree in that case that hearing ‘the earth is flat’ randomly from this P gives me positive info.
But let’s go back to the expression R=P(A|~X)/P(A|X). I calculated this as 1/.8 for the 80% Flat-Earth population, reasoning that if the earth were flat, then everyone including the Flat-Earthers would say so (P(A|~X)=1), if not then only the Flat-Earthers would say so (P(A|X)=.8). But not so fast, because two questions now arise in my mind:
1. If the earth really WERE flat, would the Flat-Earthers we know and love be saying so? What do I really think about Flat-Earthers? What is the nature of their wrongness? What causes it? What is their motivation? What are they like? Your setup is that they are predictable almost like robots and would answer ‘the earth is flat’ in every single universe. Fine, and your analysis works on that setup. But actually, maybe I think that real living breathing flat-earthers, as opposed to the mathematical construct ‘people who would say the earth is flat in any universe’, are something other than that – maybe I think there’s a deeper cause of their error such as a tendency to be contrarian, and resistant to science, for example. So if the earth were flat, maybe some of them would be saying it’s round! So I’m not so sure that P(A|~X) = 1. In fact, thinking about it more, for a real population of 80% Flat-Earthers, I think that P(A|~X)<1, probably.
2. How/why can I be sure that the other 20% really are knowledgeable truth-tellers? Okay, so you’ve told me – or let’s say I believed prior to the experiment – that 20% of these people are ‘normal’, and would say ‘the earth is round’ if it really is. They won’t say ‘the earth is flat’ if it’s round. That’s how I got the result P(A|X)=.8 and not something higher. But hold on. Now I’ve gotten an answer from this group, the answer is ‘flat’. This is new info. Am I still as sure as I was before that the other 20% are truth-tellers? Maybe I learn that this answer just came from a Flat-earther, in which case it’s not really new info (because R<=1 for the Flat-earth population by itself) and I wouldn’t change my P(X) anyway.
But if I learn, or think there’s a possibility that it came from the other 20%, I might just decide to change my view of that 20%. After all, I’m pretty darn sure the earth is round (my P(X) is very close to 1, and I believe I formed that P(X) on the basis of a decent amount of knowledge). Given that, seeing the answer ‘flat’ from the other 20%, I’m liable to say: well maybe that ‘other 20%’ wasn’t as full of knowledgeable truth-tellers as I thought. Maybe there are even some more Flat-Earthers (or tricksters, or confused, or..) mixed in there. So do I really think that nobody from the other 20% would say the earth is flat if it were round? (i.e. that P(A|X)=.8 only) No. I actually think that P(A|X)>.8, probably.
So having thought things through, my questions 1. and 2. lead me to the interesting place that I’m no longer quite so sure that R=1/.8. I actually think the numerator is probably less than 1 and that the denominator is probably greater than .8. In other words now all I think I know is that R1, therefore, and that I need to modify my P(X) downward?
No I am not.
This helps illustrate three main things:
1. Why what I think about the population (nature, truth, motives..), in relation to the proposition X, is important. Because I have to imagine how they would behave in a “not-X” universe, and that is not necessarily so trivial.
2. How hearing an assertion A from a population, if I take it seriously, gives me new information that could – in some cases – affect what I think of that population in a way that ruins the effort to use static Bayesian analysis on it. After all if I’m really really sure X is true (which, see point 3.), and someone says not-X, maybe my estimate of their ‘truthfulness’, or whatever you want to call it, goes down, it doesn’t stay the same as before.
3. Why my knowledge is important. Because if I have some knowledge of the subject, and/or the population, or think I do, I am more comfortable making such judgments as 1. or 2.
But these of course are just restatements of points I’ve already made in the above discussion.
best
argh html screws me up again. near the bottom of my comment there’s a muddled sentence containing “R1″. It’s supposed to be
‘now all i think i know is that R is less than 1 divided by .8. So am I still so sure that R is greater than 1 and that I need to modify my P(X) downward? No I am not.’
I guess we need to take baby steps. So I’m going to focus on one single point and not address the other difficulties in your most recent comment.
I have information sufficient to judge that a certain person, JS, will make false statements on a given subject S with probability 0.99. JS makes a statement on subject S. Given this information and only this information, I can judge that:
a) R=1
b) R>1
c) R<1
d) none of the above
What would your answer be? Don’t bother adding qualifications about what the statement is or anything else about JS. What is the answer, given only the above information
R = 1/99. Less than one.
P(A|~X)=.01, because if X is false, JS will only say that truth 1% of time
P(A|X)=.99, because if X is true, JS will falsely say it’s false 99% of time
So knowing this about JS, if JS says “not-X”, I adjust my P(X) upward. This is as I said and as it should be.
The previous answer seems to suppose that the “Subject S” was a single proposition, which could only be affirmed or denied. If that was what you meant, you wouldn’t say “makes a statement on subject S”, but affirms or denies “S”.
If I understand the question correctly, the obvious answer is that a judgment can’t be made: (d)
R = 1/99. Less than one.
P(A|~X)=.01, because if X is false, JS will only say that truth 1% of time
P(A|X)=.99, because if X is true, JS will falsely say it’s false 99% of time
So knowing this about JS, if JS says “not-X”, I adjust my P(X) upward. This is as I said and as it should be.
Remember, as Joseph pointed out, that my question didn’t specify a definite proposition, but a proposition about a definite subject. Taking this into account, do you want to revise your answer, or do you still want to maintain that the answer to my multiple choice question is (c)?
If you do, I’d just like to be clear on this point: You assert that if I know that JS will make a false statement on a given subject S with probability 0.99, then, when JS actually does make his assertion about S, let’s call it X, I should judge that the posterior probability of X is lower than I would have before JS made the statement?
For example, I judge that P(X)=some value A. JS then asserts X. I know that his statement is false with probability 0.99. So now I should judge the probability of X to be less than A. Is this a correct statement of your claim?
Joseph: Of course you are correct that answer (d) is the true one. But I expect that you see where I’m going with this.
Remember, as Joseph pointed out, that my question didn’t specify a definite proposition, but a proposition about a definite subject. Taking this into account, do you want to revise your answer, or do you still want to maintain that the answer to my multiple choice question is (c)?
No. Fine, the proposition is ‘about a subject’. But the setup was that I had a view on that subject (=X). When JS gives a view on that subject, if it takes a stand on X, what he says will either confirm X or it won’t (~X). And thus everything I said above applies.
(If what he says is unrelated to X either way, of course my P(X) is unchanged so the assertion is moot.)
You assert that if I know that JS will make a false statement on a given subject S with probability 0.99, then, when JS actually does make his assertion about S, let’s call it X, I should judge that the posterior probability of X is lower than I would have before JS made the statement?
There are now two different X’s. Argh.
I had a view, P(X). Let’s say JS gives a view ‘on the subject’ that confirms ‘not-X’. You’ve told me JS lies 99% of the time (and I’ve assumed the other 1% = truth). So P(X|A)>P(X). Yes, the posterior probability of what JS said is lower than it had been; the posterior probability of my X is higher.
For example, I judge that P(X)=some value A. JS then asserts X. I know that his statement is false with probability 0.99. So now I should judge the probability of X to be less than A. Is this a correct statement of your claim?
Now there’s yet another A!
Let me just show the calc again for this case (which has switched around the logic, and thus the inequality signs, by having JS confirm X instead of confirm ~X, but that’s ok):
P(X|JS asserts X) = P(JS asserts X|X)*P(X)/P(JS asserts X) = Q P(X), where
Q = 1/(P(X)+RP(~X)), where
R=P(JS asserts X|~X)/P(JS asserts X|X).
I calculate R here as
numerator = .99 (when ~X, JS will lie 99% according to you, and say X)
denom = .01 (when X, JS will only say X 1%)
thus R for this outcome is 99, bigger than 1. So Q is less than 1. So P(X|JS said X) < P(X).
Yes, I would lower my posterior probability, in your setup. What do I win?
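The calculation above can be checked numerically. What follows is a minimal sketch, assuming the reading used in that calculation (P(JS asserts X|~X) = 0.99 and P(JS asserts X|X) = 0.01). The function name `posterior` and the sample priors are illustrative and do not come from the thread.

```python
from fractions import Fraction

def posterior(prior, p_assert_given_x, p_assert_given_not_x):
    """P(X | JS asserts X) via Bayes' theorem, written as Q * P(X)."""
    r = p_assert_given_not_x / p_assert_given_x   # R = P(assert | ~X) / P(assert | X)
    q = 1 / (prior + r * (1 - prior))             # Q = 1 / (P(X) + R * P(~X))
    return q * prior

# Assumed likelihoods under the reading used in the calculation above.
p_given_x = Fraction(1, 100)
p_given_not_x = Fraction(99, 100)

for prior in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    post = posterior(prior, p_given_x, p_given_not_x)
    print(f"P(X) = {float(prior):.2f} -> P(X | JS asserts X) = {float(post):.4f}")
```

Under these assumed likelihoods R = 99, so Q < 1 and the posterior falls below the prior for every prior strictly between 0 and 1. Whether those likelihoods are the right reading of the setup is a separate question.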
You’ve told me JS lies 99% of the time (and I’ve assumed the other 1% = truth).
No, I didn’t. I said, quote,
“A certain person, JS, will make false statements on a given subject S with probability 0.99.”
This statement tells us nothing about the probability of JS lying about subject S, only about the probability of him making false statements (when he makes statements). As I said before, there is an absolutely fundamental difference between the two. In order to lie, JS needs to know (or believe he knows) what the truth is. He doesn’t need to know in order to make a false statement.
To use our previous example, if I assert that there is intelligent alien life elsewhere in the universe, that may be a false statement (obviously I don’t know for sure). But even if it were false, it would not be a lie unless I actually believed the contrary.
So making a false statement doesn’t imply anything about the speaker’s knowledge or belief. Lying does. Do you understand the difference now? I can’t make the point I have in mind until this is clear.
(BTW, given this important distinction, we know from the information stated that the probability of JS making a true statement is 0.01. We don’t need to assume it.)
This statement tells us nothing about the probability of JS lying about subject S, only about the probability of him making false statements (when he makes statements).
Ok, false statements. It doesn’t change my previous comment but feel free to do a find/replace on ‘lie’. We still have
P(JS asserts X|~X)=.99 (“X” is the false statement if ~X is true)
P(JS asserts X|X)=.01 (“X” is the true statement if X is true)
R=99, bigger than 1
So JS asserting X decreases my P(X). What did you think would change?
So making a false statement doesn’t imply anything about the speaker’s knowledge or belief. Lying does. Do you understand the difference now?
I understood the difference before, O Teacher. But thanks!
This is a sideshow, it doesn’t change anything. I was just using the word ‘lies’ as shorthand. It doesn’t affect my analysis.
If you think it does, point out where. Otherwise what are we talking about now?
I was just using the word ‘lies’ as shorthand. It doesn’t affect my analysis.
I believe that either you used the term “lie” because you made a certain supposition about the situation, or the other way around, because you used the term “lie”, you tricked yourself into making a certain supposition about the situation.
The supposition, which is not implied in the original statement of the question, is that abstracting from the truth, JS is equally likely to assert or to deny any given statement. And since either an assertion or its opposite denial is true, if he asserted or denied randomly, he would speak the truth 50% of the time. Thus, if he is equally likely to assert or deny any given statement, yet say something false 99% of the time, then he cannot be merely mistaken or ignorant: either he is deliberately lying, someone is deliberately deceiving him, or something similar.
Thus by your implicit supposition that JS is just as likely to assert a statement as to deny it, you’ve excluded the possibility that his 99% rate of error is due merely to ignorance.
You overlooked this difference earlier, too, when you asserted that the difference between the falsehood due to lying and the falsehood due to ignorance was a difference of degree, and not of kind.
I believe that either you used the term “lie” because you made a certain supposition about the situation, or the other way around, because you used the term “lie”, you tricked yourself into making a certain supposition about the situation.
No, I used the term ‘lie’ as shorthand, like I said. If you really think this matters so much surely it should be a simple matter to show which of my formulas & calculations change if the word ‘lie’ is replaced by ‘untruth’. Otherwise this is just you guys seizing upon my use of language in ‘gotcha!’ fashion cuz you think you see an opening (yet oddly can’t or won’t explain how it changes anything). Please tell me that’s not what this is.
The supposition, which is not implied in the original statement of the question, is that abstracting from the truth, JS is equally likely to assert or to deny any given statement.
I don’t understand where you think you’re going with this. Look, Bunthorne gave me a .99-.01 probability split between JS saying a false vs true thing. I used those probabilities exactly. I made no ‘supposition’ which changed them (or something). So what’s your point? Which of my formulas/calc’s needs to change?
Thus by your implicit supposition that JS is just as likely to assert a statement as to deny it,
Huh? ‘equally likely’? I used a 99-1 split between untruth-truth. As was stated. What should I have done? Which of my formulas/calc’s needs to change?
you’ve excluded the possibility that his 99% rate of error is due merely to ignorance.
No I didn’t. I just passed through the 99% probability lock stock from bunthorne’s statement of the setup into my formulas. I made no ‘supposition’, took no view, and frankly didn’t even think about where it came from or what it was due to, because it wasn’t necessary to my calc.
Which of my formulas/calc’s needs to change?
You overlooked this difference earlier, too, when you asserted that the difference between the falsehood due to lying and the falsehood due to ignorance was a difference of degree, and not of kind.
Saying that a difference is a difference of degree but not kind is not ‘overlooking’ a difference. It explicitly acknowledges there’s a difference for pete’s sake: “it’s a difference of degree”. See how that uses the word “difference” there?
Which of my formulas/calc’s needs to change? Either one of you, do feel free to speak up and correct my numbers anytime, at your earliest convenience, don’t be shy. Best,
I’ll even make it easy for you. Just fill in one or both of these blanks:
I calc’d P(JS asserts X|~X)=.99, and it should instead be ____.
I calc’d P(JS asserts X|X)=.01, and it should instead be ____.
These are the only two pieces of my calc that rely on info from the particular problem setup. So if and when you fill in either blank, with numbers other than .99 and/or .01, you’ll have shown where I erred. I’d appreciate it,
I calc’d P(JS asserts X|~X)=.99, and it should instead be ____.
I calc’d P(JS asserts X|X)=.01, and it should instead be ____.
Both should be unknown, due to a lack of information that would allow a value to be calculated. I stated your error in the previous post. I don’t have the time to explicate it further now. I’ll come back later to do so, but in the meantime I would appreciate your taking my last two comments seriously.
I just passed through the 99% probability lock stock from bunthorne’s statement of the setup into my formulas.
Exactly the problem. You’re doing mathematical formulas and applying them to reality, without reflecting on what real situation they would correspond to.
Let me recall a pair of examples very similar to one already given (have you forgotten about them already?)
1. Suppose JS draws a random number from 1 to 100. Then he draws a second number from 1 to 100. If the second number he gets is the same as the first number, then he says “The first number was ___” (that number). Otherwise he says “The first number was ___” (the second number).
Obviously he will make a false statement 99% of the time, and a true statement 1% of the time.
Now suppose he says “The first number was 47”. Let this be proposition X.
The probability of his asserting X, given that X is true, is 1/100 = 1%
The probability of his asserting X, given that X is false, is 1/100 = 1% (99% of making a false statement, and given that it’s false, a 1/99 chance of it being X)
So in this case:
P(JS asserts X|X)=.01
P(JS asserts X|~X)=.01
From which follows, incidentally (to relate it to the main question):
****
P(X|JS asserts X)=.01
P(~X|JS asserts X)=.99
(Probabilities aren’t changed.)
****
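Example 1 is small enough to check by exhaustive enumeration. The sketch below assumes both draws are uniform and independent and that JS always announces the second number, as described. The helper name `prob` and the predicate names are illustrative.

```python
from fractions import Fraction
from itertools import product

N = 100
TARGET = 47  # proposition X: "the first number was 47"

# Every (first, second) pair is equally likely; JS always announces `second`.
outcomes = list(product(range(1, N + 1), repeat=2))

def prob(event, given=lambda first, second: True):
    """Conditional probability of `event` given `given` over the uniform outcomes."""
    pool = [o for o in outcomes if given(*o)]
    return Fraction(sum(1 for o in pool if event(*o)), len(pool))

def asserts_x(first, second):
    return second == TARGET

def x_true(first, second):
    return first == TARGET

def x_false(first, second):
    return first != TARGET

def statement_false(first, second):
    return second != first

print("P(statement is false) =", prob(statement_false))
print("P(JS asserts X | X)   =", prob(asserts_x, given=x_true))
print("P(JS asserts X | ~X)  =", prob(asserts_x, given=x_false))
print("P(X)                  =", prob(x_true))
print("P(X | JS asserts X)   =", prob(x_true, given=asserts_x))
```

Because the announced number is drawn independently of the first number, the two likelihoods coincide, R = 1, and the posterior equals the prior.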
2. Suppose B draws a random number from 1 to 1000. Then he draws a second number from 1 to 1000. Finally he draws a third number from 1 to 1000. If the third number he gets is lower than 10, then he says “The First number was ___” (the second number). Otherwise he says “The first number was ___” (the first number).
He will make a false statement exactly 99% of the time, and a true statement 1% of the time. (I hope I don’t need to go through the calculation step by step).
Now suppose he says “The first number was 47”. Let this be proposition X.
The probability of his asserting X, given that X is true, is 9/1000 (if the third number is lower than 10)+ 1/1000 (if the third number is not lower than 10) = 0.01 = 1%
The probability of his asserting X, given that X is false, is .991/999 (the probability of the third number being 10 or more, and number two’s being the number 47 (out of 999, because it’s a given that it’s not the same as the first number–otherwise X wouldn’t be false))
P(JS asserts X|X)=.009
P(JS asserts X|~X)=.991/999=.00099199199
From which follows:
****
P(X|JS asserts X)=.009
P(~X|JS asserts X)=.991
Since to begin with, the chance of JS asserting X = 0.001, and the probability of X = 0.001, then given the assertion of X, the probability of X = 0.009. From being a one in one thousand chance that the first number was 47, it becomes a nine in one thousand chance, nine times as probable!
It remains improbable, of course, but has become more probable now that he has made the assertion, even though his assertion is probably (99%) wrong.
****
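A rough check of Example 2 with exact fractions. This is a sketch under an assumed reading of the rule: JS reports the first number when the third draw is below 10 and reports the second number otherwise, which is the reading that makes him wrong about 99% of the time. All three draws are assumed uniform and independent, and the variable names are illustrative.

```python
from fractions import Fraction

N = 1000
P_TRUTHFUL = Fraction(9, N)   # third draw is 1..9: JS reports the first number (assumed rule)
P_SECOND = 1 - P_TRUTHFUL     # otherwise JS reports the second number (assumed rule)
P_HIT = Fraction(1, N)        # chance that any single draw equals 47

prior_x = P_HIT  # P(X): the first number is 47

# Given X, JS says 47 if he reports the first number, or if he reports
# the second number and it also happens to be 47.
p_assert_given_x = P_TRUTHFUL + P_SECOND * P_HIT

# Given ~X, JS says 47 only if he reports the second number and it is 47.
p_assert_given_not_x = P_SECOND * P_HIT

p_assert = p_assert_given_x * prior_x + p_assert_given_not_x * (1 - prior_x)
posterior_x = p_assert_given_x * prior_x / p_assert

# The statement is false when he reports the second number and it differs from the first.
p_false = P_SECOND * (1 - P_HIT)

print("P(statement is false) =", float(p_false))
print("P(JS asserts X | X)   =", float(p_assert_given_x))
print("P(JS asserts X | ~X)  =", float(p_assert_given_not_x))
print("P(X | JS asserts X)   =", float(posterior_x), "vs prior", float(prior_x))
```

Under these assumptions the exact figures come out slightly different from the rounded ones quoted in the example, but the qualitative point is the same: R > 1 and the posterior is roughly ten times the prior, even though the statement is false about 99% of the time.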
Since in the one case,
P(JS asserts X|X)=.01
P(JS asserts X|~X)=.01
And in the other case,
P(JS asserts X|X)=.009
P(JS asserts X|~X)=.991/999=.00099199199
the original question lacked sufficient information to give definite numbers, and the answer was (d).
Q.E.D.
Both should be unknown, due to a lack of information that would allow a value to be calculated.
bunthorne told me .99 and I used .99.
Only thing I can think of (measure ambiguity, a problem I’ve pointed out several times now) is if bunthorne meant the .99 probability to hold on the product measure, (possible truth values of X in an ensemble of universes) x (possible statements of JS) rather than in the plain English sense (there’s a 99% chance JS will lie, right now, in this universe, given this universe). If so, I don’t understand the point of not saying so more explicitly in the setup instead of forcing me to guess.
If that’s the secret you guys want me to unlock to prove whatever point this is, fine. I’ll outline the math for that case, which would’ve meant that bunthorne was only telling me that
P(js says X|~X)*P(~X) + P(js says ~X|X)*P(X)=.99.
In which case it is underdetermined & whether I adjust my posterior depends on more info – for example, my prior, P(X). So this illustrates what I’ve been saying all along: it depends.
Is that what this is all about? You wanted to illustrate my point for me? Thanks!
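To illustrate the underdetermination claimed for this second reading, here is a small sketch. It assumes a binary setting in which JS asserts either X or ~X, with a = P(js says X|~X) and b = P(js says ~X|X). The two parameter sets are hypothetical, chosen only so that both satisfy the 0.99 constraint.

```python
from fractions import Fraction

def false_rate(a, b, p):
    """P(js says X|~X)*P(~X) + P(js says ~X|X)*P(X)."""
    return a * (1 - p) + b * p

def posterior(a, b, p):
    """P(X | js says X), given a = P(says X|~X), b = P(says ~X|X), p = P(X)."""
    says_x_and_x = (1 - b) * p
    says_x_and_not_x = a * (1 - p)
    return says_x_and_x / (says_x_and_x + says_x_and_not_x)

# Two hypothetical situations, both with an overall false-statement rate of 0.99.
cases = {
    "symmetric": (Fraction(99, 100), Fraction(99, 100), Fraction(1, 2)),
    "lopsided": (Fraction(1, 1000), Fraction(995, 1000), Fraction(989, 994)),
}

for name, (a, b, p) in cases.items():
    post = posterior(a, b, p)
    direction = "lower" if post < p else "higher"
    print(f"{name}: false rate = {float(false_rate(a, b, p)):.2f}, "
          f"prior = {float(p):.4f}, posterior = {float(post):.4f} ({direction})")
```

Both cases satisfy the same constraint, yet the assertion lowers P(X) in one and raises it in the other, so under this reading the constraint alone does not fix the direction of the update.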
First, let me repeat the original scenario just so it’ll be closer to the bottom of the combox:
I have information sufficient to judge that a certain person, JS, will make false statements on a given subject S with probability 0.99. JS makes a statement on subject S. Given this information and only this information, I can judge that:
a) R=1
b) R>1
c) R<1
d) none of the above
What’s going on here is that you’re trying to take my verbal description, “a certain person, JS, will make false statements on a given subject S with probability 0.99. JS makes a statement on subject S,” and convert it into an equivalent mathematical description. You’ve proposed two:
P(JS asserts X|~X)=.99 (“X” is the false statement if ~X is true)
P(JS asserts X|X)=.01 (“X” is the true statement if X is true)
and
P(js says X|~X)*P(~X) + P(js says ~X|X)*P(X)=.99.
Neither is a correct description. So what we need to do now is see why this is. My earlier example actually showed this, which is why Joseph brought your attention to it again, but I think we can make it simpler. Take the most reductive form possible of my earlier scenario:
I secretly choose a random number from 1 to 100. JS tries to guess the number I chose, saying, “Your number is …” whatever he thinks it might be.
When JS makes his guess, what is the probability that his statement (“Your number is …”) is false? I don’t need an R value, just the simple probability that his statement is false.
you’re trying to take my verbal description, “a certain person, JS, will make false statements on a given subject S with probability 0.99. JS makes a statement on subject S,” and convert it into an equivalent mathematical description.
You insisted I answer a tightly-constructed (yet, apparently, intentionally vague?) question, and I tried to. So yes, I tried to take your ‘verbal description’ and interpret it. That’s how humans tend to answer questions. If you’re saying that your verbal description is non-interpretable then you haven’t actually asked me a coherent question. Is that what you’re saying?
You’ve proposed two [interpretations of the question]. [...] Neither is a correct description.
Then why don’t you tell me more clearly what the question you want me to answer actually is. If/when you do that I’ll be happy to continue this game regarding “your question”. Note: not before.
I secretly choose a random number from 1 to 100. JS tries to guess the number I chose, saying, “Your number is …” whatever he thinks it might be.
When JS makes his guess, what is the probability that his statement (“Your number is …”) is false?
I don’t know because I don’t know how JS is going to choose “whatever he thinks it might be”. You didn’t say he chose it randomly, you said he chooses “whatever he thinks it might be”. That could mean he thinks he can guess by reading your face, using psychology, etc., and will thus favor certain numbers over others. Depending on what those numbers are, and what number you are thinking of, the probability could be higher or lower than .99.
So, this example ISN’T a form of your “question” scenario. In this example it is not a priori clear, and you do NOT have information sufficient to say, that there is p=.99 that JS’s statement will be false. Try again.
P.S. Is it too much to ask to skip this round-and-round and ask for the punchline? Frankly, what’s your point?
I don’t know because I don’t know how JS is going to choose “whatever he thinks it might be”. You didn’t say he chose it randomly, you said he chooses “whatever he thinks it might be”. That could mean he thinks he can guess by reading your face, using psychology, etc., and will thus favor certain numbers over others. Depending on what those numbers are, and what number you are thinking of, the probability could be higher or lower than .99.
So, this example ISN’T a form of your “question” scenario. In this example it is not a priori clear, and you do NOT have information sufficient to say, that there is p=.99 that JS’s statement will be false. Try again.
FALSE. Actually, I had to laugh at this. The scenario proposed by bunthorne was “I secretly choose a random number from 1 to 100. JS tries to guess the number I chose, saying, “Your number is …” whatever he thinks it might be.”
Given this situation, they could repeat the scenario again and again, and JS could “guess” the number 23 every single time. He would be right 1% of the time, and wrong 99% of the time.
An elementary question in probability theory.
Just to avoid a possible objection. The stated situation was “I secretly choose a random number,” not “I secretly try to choose a random number.”
A question always needs to be taken with the information provided.
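The claim about guessing can be sketched as follows, assuming the secret number is uniform on 1 to 100 and that JS's guess is made without any information about it (that is, independently of the secret). The two guessing strategies are hypothetical examples.

```python
from fractions import Fraction

N = 100
secrets = range(1, N + 1)  # the secret number is uniform on 1..100

def p_false(guess_dist):
    """P(guess != secret) when the guess is drawn independently of the secret."""
    return sum(
        Fraction(1, N) * weight
        for secret in secrets
        for guess, weight in guess_dist.items()
        if guess != secret
    )

# Hypothetical strategies: always say 23, or favour a few low numbers.
always_23 = {23: Fraction(1)}
favours_low = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

print("always 23:   P(false) =", p_false(always_23))
print("favours low: P(false) =", p_false(favours_low))
```

Under the independence assumption, how JS distributes his guesses over the numbers 1 to 100 makes no difference: the statement is false with probability 99/100 either way. If JS had some information about the secret, that assumption would fail and the figure could differ.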
I calc’d P(JS asserts X|~X)=.99, and it should instead be ____.
I calc’d P(JS asserts X|X)=.01, and it should instead be ____.
So if and when you fill in either blank, with numbers other than .99 and/or .01, you’ll have shown where I erred. I’d appreciate it.
You may have missed my second post in response to this, because apparently we were writing at the same time, and your response to my first post appears just below mine.
You can go back and read the full post. I give two scenarios which fit the suppositions of the question.
In the first:
P(JS asserts X|X)=.01
P(JS asserts X|~X)=.01
In the second:
P(JS asserts X|X)=.009
P(JS asserts X|~X)=.991/999=.00099199199
Since these are both possibilities, in one case giving R=1, and in the other case giving R>1, the correct answer to the question was: d) (I can judge none of the above).
I’d appreciate it if you would have the honesty and humility to admit your mistake for once.
Frankly, for someone who has suggested having a PhD in math, you’ve made a number of pretty basic errors.
Joseph
Given this situation, they could repeat the scenario again and again,
In my view the probability bunthorne asked me for was not the probability in repeated scenarios but the probability conditioned on this scenario, in which bunthorne has already chosen the number. The key phrase was “When JS makes his guess, what is the probability..”. He didn’t say “before this experiment starts” he said “when JS makes his guess”. But when JS makes his guess, a number exists, and the only question is will JS’s guess match that number. So, I can’t say.
What all this really points to is that the questions need to be stated better. If bunthorne wanted to speak about JS’s probability in the event space (choices of number) x (choices of JS) he should have said so and I could certainly answer that question too (although this would also require specifying better what JS does in the other universes, in particular whether it’s independent, which is not obvious..). You sincerely think I couldn’t?
I’m done with this game. Let me just save time by outlining your end of the conversation,
1. “Answer my vaguely-stated question and use only my vague info!”
2. “No! Ha! You interpreted my vague statement differently from how I secretly intended! Gotcha!”
3. Repeat.
Where is all this supposed to lead us? Do either of you even remember?
Joseph
Re: your numerical examples,
1. I see, so the issue you guys had in mind was indeed just independence. I had thought we all understood & had dismissed the case where JS’s assertion is independent as trivial, but guess not. Ok: obviously, if JS’s choice is independent of X (and in general, if the random assertion is independent of my hypothesis) then we just have R = 1 and JS’s assertion gives no info and P(X|A)=P(X), all of which bunthorne already said & I’d thought was understood by all.
bunthorne, was your question really just meant to illustrate independence giving no info? Because if so, why? Can we take that as stipulated already?
2. Your example 2. seems to be misstated in a couple ways. I could try to correct it to what I think you probably meant, but given the likelihood of a “gotcha!” response, it’s not worth it.
This passage comes to mind: “understanding neither what they are saying nor the things about which they make confident assertions.” But one wonders: does he speak from dullness, or from hardness of heart?
I do think it would be best to end this here. I’d rather spend the time working on new posts than walking through the most basic fundamentals of probability theory which should be considered prerequisites for any post involving Bayes’s theorem.
To conclude: I think he speaks partly from each.