Article for January: Iconicity and the Emergence of Combinatorial Structure in Language

Our first paper in 2017 is “Iconicity and the Emergence of Combinatorial Structure in Language” by Verhoef, Kirby and de Boer (2016), published in Cognitive Science. They present results from a transmission chain experiment in which participants used a slide whistle to learn and reproduce sounds representing novel objects. Their focus is on the emergence of combinatorial structure in this whistled “language” and on how it interacts with the possibility of creating iconic form-meaning pairings, which was manipulated experimentally. The results suggest that 1) learnability of the systems improves over generations, 2) increasing structure emerges over generations, and 3) the possibility of using iconic signals has a short-term influence that delays the structuring process.

Feel free to comment on the paper below!


  • Thomas Müller 26 January 2017 (10:31)

    A really nice paper overall that I enjoyed very much. I especially liked how they were able to quantify the distance between the whistles, which gave them a continuous outcome measure for the deviation between the continuous sound signals.
    At the same time, however, they also used this measure for the “reproduction constraint” employed to prevent participants from producing the same signals, which I see as a problem for the interpretation of the results: the “human tendency to find and create structure” that they claim to have found may then be limited to situations in which underspecification is prevented.
    Additionally, the measure of iconicity, as noted in the paper, was subjective and not reliable. A different way to achieve this could be to let completely new participants only play the guessing part of the original experiment with the previously produced data. Listening to signals from both conditions, they could provide a behavioural and more objective measure.
    The main relevance of the results that I see for our work here is that starting out with iconic symbols should prevent combinations in our experiments. However, in the current versions of our Colour Game, the use of preestablished symbols means that at most we can get compositional, but not combinatorial systems; so I’m not sure what we would have to expect. Any thoughts on that?

    • James Winters 13 February 2017 (16:44)

      Wrt your point about the reproduction constraint: I think the authors would count underspecification as structure. It’s just that a compression pressure alone isn’t sufficient to account for combinatorial structure; you also need something similar to the reproduction constraint.

      On the colour game combinations: I wouldn’t necessarily rule out the possibility of transitioning from compositional to compositional and combinatorial signals. There is actually real-world evidence for this in Al-Sayyid Bedouin Sign Language (ABSL), where the community does not have a conventionalised level of meaningless elements, i.e., combinatorial structure, but it does have compositional structure at the levels of morphology and syntax. This might be a good reason for incorporating iterated learning chains into the colour game: you won’t get combinatorial structure without naive learners.

      • Thomas Müller 15 February 2017 (12:59)

        Very good point about underspecification also being a kind of structure. Still, if combinatorial structure doesn’t come to be by transmission chains alone, but requires this additional constraint, couldn’t that mean that the constraint is actually the more important factor? At the very least they both play a role.

        On the second point, I’m not fully sure I understood what you wanted to say: Do you mean that naive learners could lead to developments of single symbols becoming meaningless on their own, thus creating a transition from purely compositional to compositional plus combinatorial?

        • James Winters 17 February 2017 (09:25)

          Depends on what you mean by important. For instance, an additional constraint might be relevant for creating, say, distinct categories, but the transmission component is needed to make things compressible. So, from my perspective at least, both constraints are important in different ways. I think the point Kirby et al are normally trying to make is that you need both; they don’t really put a premium on one constraint over another, even if the transmission component is highlighted.

          You’re pretty much spot on wrt the second point: we might want to consider running transmission chains as, in complex compositional combinations, I reckon the transparency between form-meaning mappings will be lost (or, to be more accurate, reanalysed) when transmitted to naive learners. Furthermore, I’d predict this to be the case with more abstract symbols, as opposed to iconic ones.

  • Piers Kelly 3 February 2017 (13:59)

    This paper appealed to my interest in the relationship between two classes of graphic codes: semasiographic systems and glottographic systems. Semasiographic systems communicate meaning without modelling any phonological features of language, while glottographic systems use graphemes to stand for phonological building blocks, usually syllables, phonemes or both together.

    Examples of semasiographies include Australian message sticks, Native American pictographs, and khipus among others. All of these rely on iconicity to a greater or lesser degree, while their combinatoriality is limited or non-existent.
    Examples of glottographic systems are Egyptian hieroglyphics, the Roman alphabet, the Chinese script, etc. These are all systems that coincide with what we call writing. The scripts are generally non-iconic (with a few high-profile exceptions) and highly combinatorial.

    From one perspective it might be assumed that glottography has a huge advantage over semasiography since it encodes spoken language (albeit imperfectly), giving it great versatility. Semasiography, on the other hand, requires some kind of synchronous oral exposition and is usually restricted to a limited set of meanings or ritualised performances. In other words, semasiographic communication is very much a context-dependent activity. This is not to argue that semasiographic systems are ‘deficient’. Indeed they do exactly what they need to do and they are certainly not trying-and-failing to be glottographic writing, as so much earlier commentary has proposed.

    Nonetheless, glottographic writing has always arisen in sites where semasiographic systems were already in use, with the implication that glottography benefits from having a semasiographic launch pad. And yet such a launch is by no means inevitable or easy.
    Verhoef et al’s paper points to some of the possible reasons for this difficulty. Iconicity (typical of many semasiographies) can be both a blessing and a curse when it comes to the emergence of codes. The blessing is that iconicity is thought to assist in the acquisition process (but not in all environments, see Tolar, Lederberg, Gokhale, & Tomasello, 2008) and provides a starting point for grounding communication. But the curse is that it seems to resist combinatoriality. Verhoef et al speculate that this is because iconicity relies on an association between a vague property of a sign and its meaning and thus “as long as the correct association between this property and the meaning is preserved, the precise realization of the signal is of less importance.” For this reason, enforcing a distinction between signals becomes harder. A single meaning may be represented with two or more iconic signs, and users also need to align on “which properties of the meaning are in focus when mapping form to meaning”. In effect, iconicity resists standardisation: in the number of variant signs per meaning, the forms taken by those signs, and the contrasts between them.

    These kinds of effects are visible in real-world graphic codes. Australian message sticks, for example, make use of iconicity but are short on meaningful graphic contrasts in the manner of ‘minimal pairs’. I have not yet come across definitive instances of synsemantics, whereby two graphic elements with semantic content can combine as building blocks to produce novel meanings (which is not to say this is impossible or doesn’t ever happen). Instead, message sticks are ‘read’ as diagrams where multiple relationships between graphic elements can be expounded upon orally. In direct contrast, glottographic scripts must be read in a determined order, and can get away with exhibiting less iconicity. More importantly, the graphemes (whether iconic or not) are standardised such that even minor contrasts can become productive. Compare the various Egyptian glyphs of human figures that differ in ‘gesture’ only, where even slight relaxation of a standard would risk a misreading. Since glottography cannot always rely on an external context shared by addressor and addressee, including an oral channel for immediate repair when communication goes awry, it is necessarily more standardised, linear and minimally contrastive: all properties that favour combinatoriality.

    Orly Goldwasser has uncovered similar effects in two incipient glottographies: the Proto-Sinaitic and Proto-Canaanite alphabets of the Middle Bronze Age. Here a number of icons were borrowed from Egyptian hieroglyphics in the formulation of the new alphabets, but with their iconicity still intact, and these appeared to resist standardisation relative to non-iconic graphemes. Later, the alphabets gave way to linear scripts that were less iconic and more standardised. On the strength of Verhoef et al, it would be no surprise if such a loss of iconicity went hand in hand with increased combinatorial potential manifested in their linearity and contrastiveness.

    • Olivier Morin 13 February 2017 (12:33)

      Thanks Piers, your comments make me understand iconicity better. One small question that’s connected with Verhoef et al.’s paper.

      They note (rightly, I think) that iconicity is very much in the eye of the beholder, which makes it a difficult notion to handle. Isn’t this true also of at least some cases of purported iconicity in writing systems? I am thinking of cases where the non-arbitrariness of the sign becomes obvious (or at least apparent) when it is pointed out to you, but you would never have inferred it on your own. (For instance, how the letter A is a bull’s head in reverse, or how the Egyptian hieroglyph for ‘city’ represents a crossroads.) Message stick markings don’t look very iconic to a naive eye like mine, but they could be iconic in this way.

      If this is on the right track, it might be fruitful to distinguish between iconicity with direct communicative functions—signs whose visual properties directly convey some meaning to most spectators, without instructions—and iconicity as interpretation.

  • Olivier Morin 13 February 2017 (12:26)

    A very exciting paper! (As often with this team.) I take their main point, that transmission and memorisation constraints are likely to make signals more compressible and so, more structured. (Whether iterated learning is the only way to get there is of course another matter.)

    One minor regret I had about that particular experiment: The title role is almost absent from the play! Iconicity makes a fleeting appearance, but fails to provide a stable communicative strategy. Asking naive participants (not privy to the transmission chain) to guess a signal’s referent on hearing the sound itself might have demonstrated iconicity (but it could also fail to demonstrate it). Still, the paper is worth a read as a nice replication of the general finding that structure can grow in transmission chains.

  • Piers Kelly 14 February 2017 (14:49)

    In response to Olivier’s first comment: this problem points to the need to have a robust definition of iconicity, and an equally robust definition of context. I won’t attempt to formulate such definitions here (James W has a good definition of context in his thesis), but these concepts are to a certain extent interdependent.

    To say that an iconicity judgment relies on prior familiarity with the relevant pictorial conventions is another way of saying that iconicity is context-dependent. The context, in this case, is the shared knowledge of these pictorial conventions.

    I would argue that semasiographic systems are context-dependent across more than one dimension: the immediate synchronic context of the real-world interaction (be it performance, ritual or other) including oral glossing, the context of the shared knowledge acquired from past interactions of a similar kind, and the context of shared cultural norms of graphic interpretation.

    Of course asynchronous glottographic communication also relies on context in order to be successful but this context is arguably of a different kind or degree. On the face of it, less context is necessary all round for written communication to be successful, and existing pictorial conventions are arguably irrelevant altogether: the shared conventions that really matter are the standardised (and deliberately learned) pairings of sign-to-morpheme and/or sign-to-sound-value. Iconicity may help in the directed learning of this standard but it hinders the conventionalisation process. Inferential interpretations of icons are not constrained enough to be efficient.

    I would be inclined to argue, however, that perhaps glottographic writing is in some ways just as context-dependent as semasiography if we consider the fact that glottographic writing smuggles its most useful context along with it. A glottographic text is, after all, one that uses graphemes to model salient features of spoken language.

    Reading is a process of uncovering a linear sequence of linguistic sounds and/or morphemes, and this sequence provides its own narrow context of interpretation that ultimately resolves into an unambiguous linguistic ‘utterance’. This is why ancient scripts like Mayan and Linear B were able to be deciphered at all despite enormous gaps in our knowledge of the ethnographic-historical context of their production. It’s also why, I suspect, presumptive semasiographies like Rongorongo, or Vinca, resist decipherment. We lost the real-world context, oral glossing etc, and there’s no linguistic structure hiding inside to help us fill in the blanks.

    Anyway, these ideas are all perhaps a little confused. I’ve just been exploring the idea of relative context dependency lately because I have a feeling that it matters for understanding not just how graphic codes evolve but what conditions need to be in place to permit a transition between a synchronous and an asynchronous code.