György Gergely on genericity

This post is part of our ‘Pedagogy theory week’ series.


For a very short presentation of pedagogy theory, see the Monday post. In this post, György Gergely replies to Marion and Olivier’s concerns about the notion of a “Genericity Bias” (see in particular the Wednesday post).



Natural Pedagogy theory claims that the Genericity Bias has been selected for because it serves an adaptive epistemic cognitive function. It enables infants to rely on ostensive communication not only to receive new and relevant episodic information, but also to infer and extract culturally shared generic knowledge about more abstract types of referents (including object kinds and various other general types of cognitively opaque, conventional and/or normative aspects of shared and generalizable cultural knowledge) that they expect knowledgeable communicative others to ‘teach’ them about. Importantly, the Genericity Bias allows even preverbal infants to bridge the inferential gap from token to type by extracting and encoding culturally shared generic knowledge about kind-specifying referent properties directly even from – non- or pre-linguistic – ostensive communicative acts that can employ only deictic referential gestures (such as gaze-shift, body orientation, and pointing) that by necessity can identify particular referents only.

The Genericity Bias is hypothesized to function as a built-in interpretational constraint or strategy to direct infants in the inferential disambiguation of the intended level of genericity and width of referential scope of the other’s informative intention. According to our proposal, the Genericity Bias is set to maximize the likelihood of retrieving relevant information at the widest scope of referential generalizability that is compatible with the available communicative and contextual evidence relevant for specifying the intended referential scope of the communicative act.

But what about the argument that a built-in Genericity Bias of this sort would result in the epistemic danger of compulsively attributing referentially overly wide-scope generic interpretations to any communicative manifestation? It’s true that if ostensive signals were pre-wired to trigger the mandatory assignment of a kind-level interpretation of the intended referential scope, this would result in a curious and obviously maladaptive form of interpretive maladie of ‘specificity blindness’: a deficiency in being able to understand communications as being about episodic properties of particular referents. This would have potentially disastrous epistemic consequences as the Genericity Bias would flood the system with a constant flow of pragmatic misinterpretations by inducing the attribution of an overly wide referential scope and generic interpretation to all communicative acts.

In this regard it’s worth noting that as part of the evidence in support of the Genericity Bias (GB) hypothesis we did identify such ostensively induced pragmatic misinterpretations in young infants. One obvious example is our demonstration of the ostensively induced nature of the perseverative search error in the standard A-not-B task discussed this week by Marion in some detail. We argued that the classical finding of robust and perseverative object search errors by 10-month-olds in the Piagetian A-not-B object-hiding and searching task is, in fact, a consequence of a mistaken assignment of an overly generic level of referential interpretation to the demonstrator’s object hiding actions. This is suggested by our demonstration that infants’ search errors are induced only in the presence of the ostensive communicative signals that the model directs towards the infant while performing the hiding actions. When the infants observe the same sequence of actions performed in a non-communicative observation context without being ostensively addressed by the person doing the object hiding, the perseverative search errors are drastically reduced (if not fully eliminated).

We argued that under the spell of the ostensive signals that induced the Genericity Bias in them, the 10-month-olds were fooled into misinterpreting the basically episodic object hiding-and-searching game as a sequence of serious ‘teaching’ demonstrations. They were thus driven to make the (mistaken) inferential jump to interpret the demonstrator’s hiding actions at a level of referential genericity that would have been justified only if they were correct in treating the other’s communicative acts as pedagogical demonstrations intending to convey some generic type of knowledge about the kinds of referent objects and/or kinds of actions performed on them. As Marion is quick to point out, however, the demonstrated hiding actions are potentially compatible with a number of different but equally possible interpretations that would satisfy the expectation of genericity of referential content. For example, infants may have inferred that the adult – when repeatedly hiding the object under Container A – was pedagogically demonstrating to them that “this type of object ‘belongs to’ container A”. Given this – conventional or even normative – referential interpretation of the communicator’s informative intention, infants were led to search for the hidden object in its ‘proper place’ (i.e., in Container A where they were just ‘taught’ it ‘belongs’). This would explain why they continued to perseveratively search for the object where it ‘ought to be’ (in Container A) even when it was actually hidden in Container B during subsequent B trials.

Would this kind of ostensively induced generic level misinterpretation represent a real epistemic danger to the young infant? While we claim (and demonstrate) that such overly generic referential misinterpretations do exist (and are, in fact, developmental signatures of the early presence of GB), there is reason to suggest that such over-generic misinterpretations are likely to be sufficiently rare and developmentally transient that they do not represent a serious danger of epistemic derailment of the young infant’s conceptual development.

The argument for such errors being acceptably rare is based on the hypothesis that the Genericity Bias is not a mandatory code to map referent tokens onto kinds. Rather, it is – as its name indicates – no more than a built-in ‘bias’ on inferential processes of referential disambiguation that fosters the assignment of wide-scope and generic interpretations over episodic ones as long as the former are compatible with the communicative and contextual evidence available to specify the intended referential scope of the manifested information.

Let me illustrate this point by another developmental phenomenon: infants’ early competence at social referencing emerging around one year of age. Adults often use object-directed emotion displays accompanied by ostensive referential communicative signals to convey new and generalizable information about generic properties of the kind of objects they make deictic reference to. Imagine that while mom is baking a cake, she notices her baby approach the hot stove. She gets the child’s attention by ostensive facial-vocal signals, then looks and points at the stove putting on an exaggerated emotion display of fear while looking back and forth between the stove (the source of danger) and the infant. If as a result of the ostensively induced GB the infant were driven to assign the most generic level of referential interpretation to the mother’s communicative display, she would infer that “stoves are dangerous to approach”. This would predict that apart from avoiding the stove in the future herself, she would also become alarmed when seeing her mom approach the stove next time. However, since the infant had frequently observed her mother approach and manipulate the stove before (and, in fact, saw her do so even while ostensively addressing her fearful communicative expression to the infant), the baby is likely to be able to infer that the intended referential scope of the manifested information is more restricted, coming to the more plausible – but still generic – conclusion that the intended width of referential scope of the informative intention conveyed is something like “stoves are dangerous for children to approach”. Thus, the level of genericity and width of referential scope that the infant arrives at is a matter of pragmatic inference that is modulated by the availability of relevant contextual information and/or existing background knowledge (about mother often approaching the stove).
This further relevant evidence can be used by the infant to block the assignment of an overly general interpretation of the intended referential scope of the mother’s communicative display. Nevertheless, due to the ostensively induced GB, the infant will still be driven to assign a referential interpretation at the highest level of genericity compatible with the available contextual information instead of the equally compatible episodic alternative (such as “Mom is afraid of the stove she is looking at”).

Arguably, therefore, the overly generic misinterpretations (exemplified by the ostensively induced A-not-B search error) represent an acceptably low cost incurred by the powerful inferential cultural learning system of Natural Pedagogy when compared to the high benefit gained in cognitive relevance provided by the possibility it affords – implemented through the Genericity Bias – for extracting and fast-learning relevant and generalizable cultural knowledge about referent kinds even from single communicative manifestations that employ deictically identified particular referents only. (We like to think of the ostensively induced A-not-B search error as a sort of ‘conceptual illusion’ – the “illusion of being taught” – that is comparable to perceptual illusions that demonstrate the existence of a pre-wired interpretive mechanism of perceptual inference through its rare malfunctioning under specific input conditions.)

The ostensively induced bias to preferentially assign generic and wide-scope referential interpretations to communicative manifestations when available reflects the fact that Natural Pedagogy has evolved as a cognitive inductive learning device that employs ostensive communication as a kind of evolutionary short-cut to bridge the inductive gap to make relevance-guided inferences from token to type. In other words, Natural Pedagogy treats ostensive communicative agents as a special source of demonstrative information designed to guide the learner’s inferences to identify the generic contents of referent kinds through single and deictic referential acts of communicative manifestations of such culturally shared generic knowledge. In the terminology of Sperber & Hirschfeld, Natural Pedagogy can be hypothesized to be an adapted social cognitive system specialized for cultural learning whose proper target domain is to acquire the generalizable properties that specify abstract types of referents (such as sortal conceptual kinds).

Social referencing in infants

Gergely, G., Király, I., & Egyed, K. (2007). On pedagogy. Developmental Science, 10:1, 139-146

Sperber and Hirschfeld on ‘target domains’

The cognitive foundations of cultural stability and diversity.


  • Dan Sperber
    Dan Sperber 6 December 2010 (00:50)

    Csibra, Gergely and their collaborators have provided rich evidence showing that the same event is more likely to be interpreted by a young child as providing generic information when presented with ostensive cues than when presented without such cues. One possible interpretation of this is that children have a bias to interpret ostensively presented information as generic. Here is another possible interpretation. What ostension does in any case, for children and adults alike, is to convey the presumption that the information to which attention is being drawn is intended to be relevant to the addressee. Without ostension, you pay attention to what seems relevant to you and ignore what seems irrelevant and you have no external guidance in doing so. With ostensive communication, you are encouraged to assume that there is something in what is being ostended that the communicator had reasons to think would be relevant enough to be worth your attention. So, quite generally, it is not surprising that the same event, when accompanied with ostensive cues, should be interpreted differently. Here are two examples that have nothing to do with infants or genericity: – Imagine you are on a holiday in the mountains in a foreign country, you walk out of your hotel, and you see clouds in the sky but don’t pay attention to them. Now however a local, seeing you about to take a stroll, points to these clouds, thereby indicating that their presence is of relevance to you, and this suggests that it might rain (an old example from [i]Relevance[/i], 1986). – Imagine somebody in the street suddenly pointing a gun at you, and doing so either with or without ostensive cues. In both cases, the event is relevant, but it is relevant in quite different ways. If the person was establishing eye contact and making sure that you were paying attention to the gun, you might reasonably interpret this as a threat (and you had better freeze). 
If the person was not even paying attention to whether or not you were paying attention, you might reasonably interpret the behaviour as a move to shoot you (and you had better run for cover). So, when children interpret an ostensively presented event as conveying generic information, it may be because they have a bias regarding the interpretation of ostension (as claimed by pedagogy theory), or it may be that generic information is particularly relevant to them (for the very reasons invoked in pedagogy theory), and that therefore they quite readily assume that the relevance intended is achieved through such an interpretation. Note that this second explanation is more parsimonious: it just assumes the framework of relevance theory and a strong learning motivation. Parsimony is not enough to make the explanation right, but it puts the onus of proof on those who want to insist that, no, what is at work is not ostensive communication working in the normal unbiased way and addressed at eager learners, it is a biased interpretation of ostension itself. What is, or what would be, the evidence?

  • Davie Yoon
    Davie Yoon 11 December 2010 (23:36)

    There are two planets. When the inhabitants of Planet 1 are born, they interpret ostensive cues using relevance theory, but with a strong learning motivation. The inhabitants of Planet 2, in contrast, are endowed with a bias to interpret ostensive communication as conveying conventional and/or normative aspects of their culture. If we are planning an intergalactic vacation, will it matter whether we visit Planet 1 or Planet 2? Will the ability of these beings in terms of communication and knowledge/information transfer differ enough that one planet will have richer culture, language, technology, etc? Finally, I am not sure I understand how Sperber’s preferred explanation for the empirical data on infants’ interpretation of ostensive communication is more parsimonious. What is a “learning motivation”? Unless you specify what it is that infants all want to learn, a general motivation to learn things could just as easily take the form of a strong motivation to commit every detail of an episodic event to memory, like some kind of mnemonic savant. This would seem at odds with the current data.

  • Dan Sperber
    Dan Sperber 15 December 2010 (15:21)

    Very good point by Jennifer that can be answered, I hope, with some clarification. Indeed “learning motivation” is vague. To be more precise, I assume that human beings in general, and children in particular recognise as particularly relevant information that can be made use of in a variety of situation, including generic knowledge that can be applied to all instances of the same phenomenon and knowledge about particulars (for instance about genealogical relationships – and to begin with, the individual’s own -) that is of social relevance and that comes up again and again (both kinds of knowledge being typically cultural knowledge). In this they are right: information that has implications in a variety of situations is more relevant (by our definition of relevance). What is remarkable is the human ability to value not just immediate relevance but also, and even more so, relevance in the long term. This remarkable fact is linked to the unique capacity of human linguistic communication to communicate not just about the here and now (or the round-the-corner and in-one-minute) but about things distant or scattered in time and space. It is quite possible, by the way, that this learning motivation is not itself a single domain-general drive; it may consist in fact in a variety of domain-specific drives. Pedagogy theory recognises something quite similar regarding generic knowledge but describes it as a bias built in the very process of ostensive communication, and I don’t see good enough evidence or arguments to accept that. A learning motivation such as I am suggesting will cause misinterpretations when an ostensive stimulus is ambiguous in the right way (that is, with two competing interpretations of roughly equal plausibility, one satisfying this learning motivation, the other not). 
This is no different from the kind of misinterpretation by people with a paranoid personality, by sex maniacs, and so on: there is a bias, in cases of conflicting interpretations, towards assuming that the interpretation most relevant to oneself is the one intended (whereas the one intended is the one that seemed to the communicator – who may be unaware of the addressee’s obsessions – to have been the most relevant to the addressee). Sophisticated interpreters (hence not very young children) are able to factor out their own obsessions in their interpretation of what the speaker says (unless they have reasons to think the speaker is addressing precisely these obsessions). Now, to go back to Jennifer’s thought experiment about the two planets, Planet 1 where people have no systematic bias built in the process of interpretation (but may be biased by their own interests, including a generally shared interest in learning cultural knowledge), and Planet 2 where they have such an in-built bias towards generic interpretation of ostensive communication (and where, of course, they may also be biased by their own interests); will your interactions be different on the two planets? Suppose there is no difference in what happens, what does genericity bias do for the natives? Why should it have been selected? How do you know that there is such a bias (besides having postulated its existence)? If there is a difference, misinterpretations should be more common and hence communication should be less efficient on the second planet where a bias is built in the communication process itself. You will be running a greater risk that the natives will misunderstand you to communicate some generic knowledge when in fact you are communicating about particulars. Prefer Planet 1.

  • Davie Yoon
    Davie Yoon 16 December 2010 (16:48)

    Prof. Sperber: Many thanks for your thoughtful comment to my previous questions. I think we are zeroing in on my key confusion: What is the difference between (1) a Sperberian learning motivation for long-term relevance and (2) a Natural Pedagogical genericity bias? As explanations for the empirical data, they seem to me (a non-expert in relevance theory, unfortunately) to be identical twins called by different names. Some responses to your questions: [quote]Now, to go back to Jennifer’s thought experiment about the two planets… Suppose there is no difference in what happens, what does genericity bias do for the natives? Why should it have been selected? [/quote] Why not for the same reasons you posit for a general learning motivation for information of long-term relevance? As you write, “human beings in general, and children in particular recognise as particularly relevant information that can be made use of in a variety of situation, including generic knowledge that can be applied to all instances of the same phenomenon and knowledge about particulars (for instance about genealogical relationships – and to begin with, the individual’s own -) that is of social relevance and that comes up again and again (both kinds of knowledge being typically cultural knowledge).” Wouldn’t these benefits apply equally well both to a motivation to learn information of long-term relevance and to generic information? [quote]How do you know that there is such a bias (besides having postulated its existence)? [/quote] I think we can both agree that the empirical data suggest something more at work than simple associative learning in infants’ interpretation of ostensive communication. However, I do not see that the data distinguish between Sperber & NP and am still confused as to whether these two accounts are at all empirically distinguishable… Now some more questions about the Sperberian motivation to learn information of long-term relevance. 
[quote]A learning motivation such as I am suggesting will cause misinterpretations when an ostensive stimulus is ambiguous in the right way (that is, with two competing interpretations of roughly equal plausibility, one satisfying this learning motivation, the other not). [/quote] Why shouldn’t such a misinterpretation be compatible with the NP account? [quote]If there is a difference, misinterpretations should be more common and hence communication should be less efficient on the second planet [NP] where a bias is built in the communication process itself. You will be running a greater risk that the natives will misunderstand you to communicate some generic knowledge when in fact you are communicating about particulars. Prefer Planet 1. [/quote] It seems that there is potential for misinterpretation on both planets (see your above quote). Why should the misinterpretations you describe on Planet 1 be more frequent or more serious than on Planet 2? Sure, in your examples you talk about paranoid gun enthusiast sex addicts, which are surely rare even on the planet Earth. But anyone may have a particularly salient motivation or goal in mind that could differ from that of their communicator. For example, an infant may be extremely hungry or sick to their stomach while a caretaker is trying to inform them of a cultural convention. It seems to me that leaving it up to the individual to decide for themselves what they should learn is a path to very slow and error-filled cultural acquisition. Sure, a bias to interpret a communication as generic when the intention is to communicate about particulars could lead to the type of misunderstanding you describe. But it seems to me that this type of error could be a small price to pay for the larger benefits of rapid cultural transmission of information. 
Whether or not we decide there is an important distinction to be made between relevance-then-genericity, or genericity-then-relevance, both accounts beg for further empirical and mechanistic elaboration. For one thing, it is difficult to imagine how either interpretation style is biologically instantiated. At the risk of sounding like a broken record, I hope that this discussion inspires more researchers to push precision forward in mechanistic, neurobiological terms for both relevance theory and natural pedagogy. Thanks for your patience! Davie (aka, Jennifer) Yoon

  • Olivier Morin
    Olivier Morin 20 December 2010 (11:28)

    I am itching to jump in and debate with Davie and Dan, but for now let me reply to György’s post with an imaginary dialogue. Imagine two vision scientists (apologies to vision scientists reading this) trying to account for the human capacity to perceive the color blue. Let us call them Dr. Color-Capacity and Dr. Blue-Bias. Dr. Color-Capacity claims that natural selection favored the evolution of a sensitivity to colors. As a result, when we are exposed to a given range of wavelengths, we perceive them as blue. Sometimes (rarely) we suffer from illusions: we perceive blue shades where there is nothing blue to see, or we can’t see blue when it is there. These illusions are a side-effect of our general capacity to see colors: blue, red, yellow, etc. Dr. Blue-Bias has another take on the issue of Blue-vision. He agrees with Dr. Capacity on the existence of a general capacity to see colours, but he wants to make two additional points. His first point is that there is something special about visual illusions involving blue. According to Dr. Blue-Bias, we are more likely, all things being equal, to mistake Red for Blue than the other way round. He has some evidence to back this claim: when people are presented with ambiguous colors, they are more likely to err on the Blue side than on the red side. (Dr. Color-Capacity objects that his stimuli are not that ambiguous.) How does he explain this weird pattern of illusions? Well, Dr. Blue-Bias is unsure. Sometimes, he claims that our Blue detectors are extremely sensitive because Blue information is more important than other kinds of Color information. Thus, we are more vulnerable to Blue illusions, as compared to Red illusions. Every other day, however, Dr. Blue-Bias ventures to make a stronger point: he claims that our increased sensitivity to Blue illusions is the price to pay for our capacity to see blue. We wouldn’t be able to see Blue [i]at all[/i] without the Blue Bias. This is the kind of claim that sets Dr. 
Capacity on edge. (Dr. Capacity is an old curmudgeon.) Whichever claim he supports, Dr. Blue-Bias concludes that his findings suggest that Color Vision evolved specifically because it enabled us to see Blue. That was its first and most important function. Every other day, he ventures to claim that our high susceptibility to blue illusions was itself favored by natural selection (and Dr. Capacity grinds his teeth). Dr. Color-Capacity does not deny that we are susceptible to Blue Illusions, just as we are susceptible to Red and Yellow illusions. No visual system is perfect, and these errors are, indeed, a necessary by-product of our capacity to see colors. – Well, Dr. Blue-Bias replies, you just made my point! That is exactly what I mean! – Wait a minute. You are not merely claiming that mistakes happen in the visual system. You are saying that we are highly and specifically susceptible to Blue mistakes (as opposed to Red, Yellow or White mistakes). [i]That[/i] is not a necessary side-effect of our capacity to see colors (or to see Blue). – OK, but it is a necessary side-effect of the Blue-Bias. We are more sensitive to the presence of Blue, because it is extremely important for us not to miss it when it’s there. It’s like predator-detection: it is much more costly to miss the actual presence of a tiger than to mistake a sound in the forest for the presence of a tiger. In the first case, you die, in the second case, you just made a mistake. The same goes for Blue-detection: our vulnerability to false positives is the price we pay for our sensitivity to true positives. And it’s worth the cost. – So you agree that the Blue Bias just makes us more vulnerable to false positives, by making us less vulnerable to false negatives. You would agree that we could see the color blue – just as we see red and yellow – without the Blue Bias? – Well, yes if you put it that way. Though to me color vision cannot really be dissociated from the Blue Bias. 
– So let us come back to the Blue Bias. I don’t really get your analogy between seeing tigers and seeing Blue. Mistaking a tiger for a bush is much more costly than mistaking a bush for a tiger, granted. But why would you claim that mistaking Blue for yellow is much more costly than mistaking yellow for Blue? – Blue information is extremely useful, and with the Blue Bias we are more likely to retrieve it. – Red information is just as useful, and with the Blue Bias we are less likely to retrieve it! – Well, actually, Blue information is more useful, and the mistakes caused by the Blue Bias are quite rare anyway. – But then the Blue Bias must be very weak? – Not necessarily. If one is strongly biased towards blue, and blue is a very common color, then one will make few mistakes. – But is blue that common and that useful, as compared to other colors? Fortunately for them, Dr. Color-Capacity and Dr. Blue-Bias have ready means to answer these questions. They are also lucky to possess an almost complete knowledge of the mechanisms underlying color vision: what stimuli are more likely to be perceived as blue, why, what counts as an ambiguous color stimulus, etc. Sadly, we don’t have such knowledge of human communication. How can we know that we have a Genericity Bias, as opposed to a Communication Capacity vulnerable to occasional genericity illusions? Answering that question would require, at least, a definition of genericity, and some knowledge of what usually causes us to perceive communication as generic. What bothers me about Pedagogy theory is that it could easily be taken as providing such knowledge. It doesn’t. Saying we are victims of Blue illusions does not explain how we are able to perceive the color blue. The Genericity Bias, if it exists, does not explain why we are able to understand generic communication.

  • Dan Sperber
    Dan Sperber 21 December 2010 (14:27)

    The structure of my argument was this: one can argue either that the relevance-guided-ostensive-communication-plus-natural-pedagogy (RT+NP) and the relevance-guided-ostensive-communication (RT) hypotheses make the same predictions, or that RT+NP predicts more misinterpretations in communication than RT. In the first case, RT is more parsimonious. In the second case, where is the evidence? And should we expect to find such evidence given that these extra misinterpretations (where non-cultural information is mistaken for cultural information) are unlikely to be adaptive? To anticipate further moves: Why not have, you might want to argue, NP without RT, and then the argument from parsimony collapses? Because, in any case, NP needs a theory of ostensive communication. The only fully-fledged such theory is RT. So NP needs either RT or an alternative. At present there is no alternative. If there were, the comparison would be more complex, obviously, but we are not there and it is unclear that it would be such a good idea to go there. RT is not bad and can be improved if need be. Doesn’t RT need to assume something like a learning motivation (or possibly a variety of domain-specific such motivations)? Yes, but isn’t something of the kind also assumed in PT? Another related point: Misinterpretation towards genericity is only one of the cases of misinterpretation common in human communication. For all of these, RT proposes a unified account: addressees tend to default to the assumption that the intended interpretation is the one most relevant to them (rather than the one that might have seemed to an honest communicator to be relevant enough to the addressees). Young children have only the default heuristics at their disposal. More mature communicators can make use of more sophisticated heuristics, but the default one may still bias them (see Sperber, “Understanding verbal understanding”, 1994). 
Hence RT suggests a general interpretation for interpretive biases, including the genericity bias.

  • Olivier Morin
    Olivier Morin 22 December 2010 (11:21)

    Davie asks: what is the difference between a Motivation to learn relevant things and a Genericity Bias? First, as pointed out by Dan, there is a subtle difference between a) being interested in a topic to such an extent that one considers interpretations that feed one’s obsession in priority; and b) being utterly unable to prioritize other interpretations because of the way one’s communicative faculties are built. This difference, however, is subtle, and may well amount to the same behavioral outcome. In that case Davie wouldn’t be far from the truth when she sees nothing but a distinction without a difference here. But there is, I think, a much more important difference, namely the difference between relevance and genericity. Generic representations, if I understand Gyuri and Gergo well, are supposed to apply to sets of objects in a way that is more or less robust to changes in circumstances, to the passing of time, etc. The fact that “chairs made in Japan cannot be painted a certain shade of beige because that kind of paint is not available in Japan” is generic. But it is not the kind of information that most animals would spend cognitive energy learning. Generic information is relevant only for certain individuals in certain circumstances. Compare a child and an adult. Ms. Adult is a fluent speaker of her own language, she knows how to operate most artifacts around her, she is a proficient user of her local system of laws and norms. When she takes a plane, she does not pay attention to the security instructions she is hearing for the nth time. For Mr. Child, on the other hand, generic information is new, relevant and extremely useful. He wants to know the names, the functions, the nature of things around him. If Mr. Child is Eager-to-learn-relevant-things, his passion for generic information will, in time, fade into a broader interest for relevant information. If, on the other hand, Mr. 
Child is Genericity-obsessed, he will become a very dysfunctional adult. Pedagogy theory does not claim that we are Genericity-obsessed. It states that we prioritize genericity only when we deal with ostensive communication. If genericity-obsession is ruled out, then, we are left with two possible explanations: – Relevance: Generic interpretations are favored if and when they are interesting (for example, when you are a child in front of an adult with a new toy). – Genericity-Bias: we can’t help favoring generic interpretations, whether or not they interest us. The latter solution would clearly make communication very difficult, or at least boring, for Planet 2 inhabitants.

  • György Gergely 28 December 2010 (10:42)

    Dr. Opacity from Planet 2: Let’s get back to Earth (for some data)
    First, three cheers and countless webhandshakes for Davie Yoon for her brave and spirited request asking Dan Sperber to provide us with specifications and guidance to unpack the “strong learning motivation (SLM)” component of Relevance Theory (RT+SLM – sans NP) before jumping to embrace it on grounds of parsimony (over RT+NP). And ‘salut et bonne santé’ to Dan with admiration for having taken Davie’s challenge as seriously as her argument deserves and for going so much further in spelling out what is (and how much is not) at stake here. And while I am at it, thanks to Olivier Morin for the mental pleasure provided by reading NP week on ICCI: it was really the needed input to strengthen one’s soul before entering the yearly Christmas Game. (I must correct him on one terminological point though: the You-Know-Who is really called Dr. Opacity on Planet 2 – Dr. Blue-Bias is an urban legend, obviously from Planet 1. But more on that on a different occasion, if I may.) There would be much to react to in relation to Davie’s and Dan’s most informative exchange pro and con NP’s genericity bias and RT, but under the pressure of Christmas let me just make one quick point to counterbalance what I feel is an unduly early closure and apparent convergence of views that treats the two positions (“RT vs. RT+NP”, as Dan parsimoniously referred to them) as being basically just too close to each other to be usefully differentiable empirically (or in some conceptually sufficiently interesting way). For example, Davie starts her – otherwise equally brave and spirited – comments on Dan’s clarifications about how RT with “SLM Unpacked” (RT+SLMU) should be conceived of by saying: “What is the difference between (1) a Sperberian learning motivation for long-term relevance and (2) a Natural Pedagogical genericity bias? 
As explanations for the empirical data, they seem to me… to be identical twins called by different names.” But hold on, Davie, not so fast… (Honestly, I’m afraid that you may be letting NP be too hastily relegated to the status of some – probably to be short-lived – ‘conceptual appendage’ that upon due and fair scrutiny can and will be removed without any loss, complications or even too much pain from the parsimonious body of RT+SLMU). In my view, at least, it is far from clear yet whether the empirical data generated by NP theory can be accounted for equally well by the two alternative accounts (not to speak about being “more parsimoniously” explained by ‘RT+SLMU’, as suggested by Dan). So before starting to discuss the (otherwise really interesting) question of how many and how disastrous misinterpretations we would have to be resilient enough to tolerate to live on Planet 1 (RT+SLMU) vs. Planet 2 (RT+NP), let me intervene and ask you all to stay on Earth for a moment and consider the following questions about the relevant evidence that we do have: Dan makes two (uncontroversial) claims (about SLM): (i) Human infants are “relevance-seeking” creatures with a strong built-in epistemic learning drive, who “without ostension, pay attention to what seems relevant” to them while ignoring what seems less relevant. (ii) Infants are born with an SLMU that has high on its pre-wired relevance agenda the need to learn about generic facts of the world and culture. That is, everything else being equal, information that applies more generally (to a wider scope of referents, or across a wider scope of situations) is taken to be (or ‘sensed as’) more relevant (due to preparedness for relevance in the long term) than the same information if it applies only to a particular referent in the here-and-now. 
This is sufficient ground, says Dan, to account (staying within the framework of RT+SLMU) for the finding that “children interpret an ostensively presented event as conveying generic information”. This is a simple (and parsimonious) consequence of the fact that “generic information is particularly relevant to them” (this being part of SLMU) and not because of a genericity “bias regarding ostension”. Fine. But shouldn’t this argument equally apply to the interpretation of events that are observed in non-ostensive contexts as well? “Without ostension, pay attention to what seems relevant”, so infants endowed with Dan’s SLMU should, by hypothesis, assign the more generic interpretation (that is “particularly relevant” for them) also when presented with the same event in (an equally ambiguous) non-communicative observation context. Several pieces of converging evidence from our studies indicate, however, that they don’t do so. Let me use just one example to illustrate this point: Take our recent function-based artifact individuation study (Futó et al., 2010, Cognition) applying the well-known Xu & Carey (1996) paradigm to 10-month-olds who repeatedly (but separately) witnessed either of two novel artifacts being brought out by a hand from behind an occluder (one at a time). The hand then demonstrated a specific functional use of Artifact 1 on one side of the screen (Pull handle – Lights illuminate, see Fig. 1) and a different functional use of Artifact 2 on the other side (Turn knob – music plays, Fig. 1). Futó et al. varied across groups whether the artifacts and their respective uses were presented in an ostensive communicative or in a non-communicative observation context. In the ostensive condition infants were first greeted in (good old Hungarian) motherese before the objects were presented: They heard “Hi, baby hi! Look!” (“Szia baba, szia! Nézd csak?!”) uttered by the voice of the young woman hiding behind the occluder (who was thus heard but not visible to the infant) before her hand brought out and functionally manipulated the two artifacts in turn. (In the non-ostensive presentation condition we used a synthesized backward non-speech sound transform of the motherese greeting as the attention-getting control stimulus before presenting the artifacts.) In the test phase the screen was removed to reveal either both or only one of the two artifacts behind it. The 10-month-olds expected two objects behind the screen (showing violation-of-expectation by looking longer when seeing only one), but they did so only in the ostensive demo condition. This indicates that, when ostensively addressed, infants assigned a generic kind-level interpretation to the functional use of the novel artifact, representing it as (a token of) the kind-specifying generic functional property of the artifact kind that the novel object observed belongs to. It seems therefore that 10-month-olds are conceptually committed – not by ostensive guidance but by evolved innate conceptual architecture, i.e., by SLMU – to consider artifacts to belong to kinds that are individuated by their essential function (apparently assuming a one function-one kind mapping for artifact concepts). As a result, they seem to have concluded that since there were two different artifact kind functions demonstrated, there must have been two separate individual artifacts (one belonging to each kind) behind the screen. Fine, says Dan, that’s precisely what ostension does to you: it presumes and promises an extra amount of intended relevance to be harnessed for the interpreter and “since generic interpretation is particularly relevant to them [babies]” (NOT because of ostension, but as part of SLMU) “they quite readily assume that the relevance intended is achieved through such an interpretation”. 
Fine, but if so, the question is: why didn’t they similarly assign the same generic interpretation (that is ‘particularly relevant’ to them) also in the – equally ambiguous – non-ostensive demonstration condition? Shouldn’t they have done so on Dan’s RT+SLMU account (where generic info is in general of higher relevance for the SLMU baby than corresponding episodic info)? The results, however, indicate that the genericity bias effect is selective: Without ostension there was no indication of assigning a generic interpretation to the observed functional uses of the artifacts (as there was no expectation of two objects rather than one during test). This seems hard to accommodate by RT+SLMU (unless the linkage between ostension and genericity is going to be Unpacked in the SLMU component of RT+SLMU in some – hitherto unspecified – manner). In contrast, apart from explaining the genericity bias effect in the ostensive condition, the RT+NP view’s ostensively-induced genericity bias hypothesis can easily account for the lack of kind assignment in the non-ostensive function demonstration condition as well: so much for relative parsimony. I, of course, agree with Dan that “parsimony is not enough to make the explanation right, but it puts the onus of proof on those” who want to account for the selectively ostension-linked genericity bias effect as a general consequence of RT+SLMU (sans recourse to NP). (I should call attention to an important empirical question here that remains to be explored: the issue of what role statistical observational learning plays in the inductive assignment of generic kind representations. 
Clearly, if infants are eager and prepared to learn about kinds and generic knowledge (given SLMU), it may well be possible to quench their epistemic thirst the slow way as well (gulp-by-gulp, as it were): by accumulating and extracting regularities from repeated observations (say, about the reoccurring stereotypic functional use of the same – or of several different – exemplar(s) of a novel artifact kind) and by assigning and representing the extracted invariant information as a relevant kind-specifying generic property of the object kind. In fact, during the habituation phase of the Futó et al. study infants observed numerous repetitions of the same functional uses of the novel artifacts. Still, it seems that at least for these young 10-month-olds this amount of statistical evidence was not sufficient in the non-ostensive condition to commit themselves to a kind-level generic interpretation of the repeatedly observed functional use of the artifacts.) Above we have been discussing ostensive vs. non-communicative contexts that – in-and-of-themselves – were neutrally ambiguous between an episodic vs. a generic interpretation of the same events observed. However, the adult versions of Davie’s own study with 9-month-olds (Yoon et al., 2008) that Hanna Marno is carrying out in her PhD research (e.g., Marno et al., 2009) provide an interesting and much stronger case where the genericity bias can be demonstrated in adults as well and – most importantly for the present argument – where the need for episodic encoding of the stimulus display is made clearly more relevant by the specific requirements of the task context. Briefly, Marno et al. applied a change detection paradigm, where subjects first saw an array of objects with different features placed at different spatial locations on a table. 
A female demonstrator sitting at the table first either ostensively addressed the subject (Ostensive condition: direct eye-contact) or was acting in a non-ostensive way (Non-communicative condition: averted eye-gaze), before proceeding to point at one of the objects on the table. Then the display momentarily disappeared (change blindness procedure), and when it became visible again subjects had to indicate which of the objects had undergone a change (either in features or in spatial position) by touching it on the touch-screen. The results replicated the pattern of findings Yoon et al. (2008) reported in 9-month-olds where in an ostensive communicative context, infants devoted their limited memory resources to encoding the feature-based identity of novel objects at the expense of encoding their spatial location, which was however preferentially retained in non-communicative contexts. In particular, Marno et al. reported that in the ostensive (direct eye-gaze) condition adults made more errors in change detection when the target object’s spatial location was changed while performing better when the object’s features were changed. In contrast, the opposite pattern was found in the non-ostensive (averted eye-gaze) condition where change in spatial location was detected better than feature change. Note that this pattern of findings demonstrates the influence of the ostensively-induced genericity bias of NP on object processing even when it is clearly dysfunctional for the optimal performance of the particular type of task that requires (and pragmatically specifies the need for) registering any kind of momentary change (be it featural or spatial) as relevant to attend to in order to solve the task. 
My point again is that while an ostensively induced genericity bias (as postulated by NP) can actually (and non-trivially) predict this – in the specific task context, non-adaptive – processing bias and the consequent error pattern, the RT+SLMU account – as far as I can see – has no principled way to deal with this finding. So let me pass the buck (and the onus of proof) back again to RT+SLMU on parsimony’s ground, if I may. So Merry Christmas to all the lucky ones who are in a position to read NP week on ICCI instead of opening more presents under the Tree (that has been featurally distorted and spatially displaced from its natural habitat into a living room of humans by cultural misfortune). I am sending my best regards from my (longish) field trip to Earth before returning to – for the time being and if you don’t mind – Planet 2. Gyorgy
    References:
    Futó, J., Téglás, E., Csibra, G., & Gergely, G. (2010). Communicative function demonstration induces kind-based artifact representation in preverbal infants. Cognition, 117, 1-8.
    Marno, H., Csibra, G., & Davelaar, E. (2009). Object perception in social-communicative context. EPS Meeting, University College London, January 5-6, 2009.
    Yoon, J. M. D., Johnson, M. H., & Csibra, G. (2008). Communication-induced memory biases in preverbal infants. Proceedings of the National Academy of Sciences of the United States of America, 105, 13690-13695.

  • Dan Sperber 30 December 2010 (23:05)

    In my interventions in this discussion, I have evoked the hypothesis that humans in general and children in particular have a strong learning motivation. Then I had to correct the misinterpretation I may have fostered that this is one distinct and unitary mental drive. Such a learning motivation may be subserved by many domain-specific competencies or modules that cause the individual to pay particular attention to information relevant in these domains. Before too much is made of this idea of a strong learning motivation (which now appears with an acronym “SLMU” – “U” for ‘unpacked’ – in Gyuri’s latest comment), let me point out that it may be no more (and no less) than a consequence of the first or cognitive principle of relevance of Relevance Theory (look here for a recent précis of the theory). According to the principle, human cognition tends to be geared to the maximisation of relevance, meaning that humans tend to spontaneously pay attention to the stimuli most likely to have rich consequences inferable at relatively low cost and to interpret them in a manner that maximizes this relevance. This tendency to maximize relevance will favour more general (or ‘generic’ as Gyuri and Gergo use the term) information when available, and – see my next comment – ostensive communication (made possible by mindreading and by the cognitive principle of relevance and governed by the second or communicative principle of relevance) is a uniquely rich source of general information. The more I think of it, the more I am tempted to think of Pedagogy Theory as the discovery and exploration of a whole range of major consequences of the role of relevance in human cognition and communication that Deirdre Wilson and I had not thought about, let alone understood. 
I am thrilled by this, both because of my interest in relevance and because of my interest in culture: An extended Relevance Theory that would incorporate Pedagogy Theory would be a new theoretical enterprise of much greater scope than either, or than the mere addition of both. Of course, this is just a hunch and it may be that Relevance Theory and Pedagogy Theory, however neatly they dovetail, are best seen as distinct theories that could each stand or fall independently. Still, if it turns out that the best theoretical case is for incorporation, I believe that this would greatly contribute to the scientific importance of both theoretical contributions.

  • Dan Sperber 30 December 2010 (23:13)

    Gyuri writes in his latest comment:
        everything else being equal, information that applies more generally … is taken to be … more relevant than the same information if it applies only to a particular referent in the here-and-now. This is sufficient ground, says Dan, to account … for the finding that “children interpret an ostensively presented event as conveying generic information”. This is a simple (and parsimonious) consequence of the fact that “generic information is particularly relevant to them” … and not because of a genericity “bias regarding ostension”. Fine. But shouldn’t this argument equally apply to the interpretation of events that are observed in non-ostensive contexts as well? “Without ostension, pay attention to what seems relevant”, so infants endowed with Dan’s SLMU should, by hypothesis, assign the more generic interpretation (that is “particularly relevant” for them) also when presented with the same event in (an equally ambiguous) non-communicative observation context.
    In my initial comment, I had tried to explain why it is not so. Let me try to do a better job of it. I assume that our cognitive mechanisms and dispositions operate under some strong epistemic constraints. Barring some exceptions (see below) and everything else (processing costs in particular) being equal, cognitive processes are more adaptive when they provide the organism with genuine information rather than misinformation. In other words, there has been strong selective pressure for cognitive processes to be epistemically sound. (This is integrated in current relevance theory by claiming that only “positive cognitive effects” contribute to relevance, where positive means contributing to genuine knowledge.) The rare clear exceptions have to do with cases where false positives and false negatives have very different fitness costs and where a bias towards overestimating dangers may be adaptive (‘better safe than sorry’). 
Assuming a cognitive bias in other cases (not just an attentional bias, but a bias, e.g. an interpretive one, that is subject to epistemic evaluation), and moreover arguing that such a bias is adaptive, as is done for the ‘genericity bias’ in Pedagogy Theory, requires overcoming prima facie implausibility. Given epistemic constraints, automatically generalizing from one’s own observation of single instances would not be a way to maximize relevance, quite the contrary. In fact, human – and children’s – tendency to generalize from single instances is very much controlled by domain-specific or even concept-specific dispositions: from seeing an animal of a certain species hop, we readily generalize to the species being disposed to hop; from seeing an animal of a certain species limp, we do not generalize in the same way. Now, when a single instance is being displayed in an ostensive manner, the epistemic situation is different. The communicator is conveying that the display is relevant enough to be worth attention, and often this is not the case unless a generalization from the single instance is warranted. Ostension comes with a relevance warrant that, in most cases, implies an epistemic warrant. Of course, I am not suggesting that children, or, for that matter, adults, have to go through a reflective examination of these warrants. They are just competent addressees (with competence becoming more sophisticated with age) of ostensive communication. This is the framework that, I suggest, might explain the effects discovered by pedagogy theorists (more generic interpretations of the same display when ostensively presented) without assuming an ad hoc cognitive bias.

  • Davie Yoon 31 December 2010 (18:31)

    Half of each of the teams in this debate are missing! Hopefully Deirdre Wilson and Gergely Csibra will join in after New Year’s festivities have concluded. I wonder if the possible convergence Prof. Sperber is pointing towards will be acceptable to all principal parties… In the meantime, Happy New Year – this one goes to 11!!!