“maluma” and “takete”
What I call the synesthetic hypothesis suggests that the perception of acoustic smoothness by one of our five senses, hearing, is somehow linked to the perception of smoothness by two other senses: vision (seeing a curvy figure instead of a jagged one) and taste (tasting a creamy taste instead of a sharp one).
Synesthesia is the general name for the phenomenon of strong associations between the different senses. Some people, like Dan Slobin, a Berkeley professor of psychology and linguistics, are very strong synesthetes. For Slobin, each musical key is associated with a color: C major is pink, C minor is dark red tinged with black. But the bouba/kiki results suggest that, to at least some extent, we are all a little bit synesthetic. Our senses of taste/smell, vision, and hearing are linked at least enough that what is smooth in one is associated with being smooth in another, so that we feel the similarity between sharpness detected by smell (as in cheddar), sharpness detected by touch or vision (like acute angles), and sharpness detected by hearing (abrupt changes in sound).
We can see this link between the senses even in our daily vocabulary. The words sharp and pungent both originally meant something tactile and visual: something that feels pointy or subtends a small visual angle, but both words can be applied to tastes and smells as well.
It’s not clear to what extent these synesthetic links are innate or genetic, and to what extent they are cultural. For example, nomadic tribes in Namibia do associate takete with spiky pictures, but, unlike speakers of many other languages, they don’t associate either the word or the pictures with the bitterness of dark chocolate or with carbonation. This suggests that the fact that we perceive bitter chocolate as “sharper” than milk chocolate or carbonated water as “sharper” than flat water is a metaphor that we learn culturally to associate with these foods. But we really don’t know yet, because we are just at the beginning of understanding these aspects of perception.
There are, however, some evolutionary implications of the synesthetic smoothness hypothesis and of the frequency code.
John Ohala suggests that the link of high pitch with deference or friendliness may explain the origin of the smile, which is similarly associated with appeasing or friendly behavior. The way we make a smile is by retracting the corners of the mouth. Animals like monkeys also retract the corners of their mouths to express submission, and use the opposite facial expression (Ohala calls it the “o-face”), in which the corners of the mouth are drawn forward with the lips possibly protruding, to indicate aggression.
Retracting the corners of the mouth shrinks the size of the front cavity in the mouth, just like the vowels ɪ or i. In fact, the similarity in mouth position between smiling and the vowel i explains why we say “cheese” when we take pictures; i is the smiling vowel.
Ohala’s theory is thus that smiling was originally an appeasement gesture, meaning something like “don’t hurt little old me.” It evolved when mammals were in competitive situations as a way to make the voice sound more high pitched and the smiler appear smaller and less aggressive, and hence friendlier.
Both the frequency code and the synesthetic smoothness hypothesis may also be related to the origin of language. If some kinds of meaning are iconically related to sounds in the way that these hypotheses suggest, it might have been a way for speakers to get across concepts to hearers early on in the evolution of language. The origins of language remain a deep mystery. We do, however, have some hypotheses, like the “bow-wow” theory of language evolution, the idea that language emerged at least partly by copying nature, naming dogs after their bark and cats after their meow and so on. The frequency code suggests that perhaps one of the earliest words