CALLING A TABLE MARY
Anthology on Cross-Linguistic Lexical Semantic Differences
Wikipedia:The Pope is Catholic
The Pope is Catholic. You probably knew that. But does a person completely ignorant of Christianity or religion in general know that? What about a person who is isolated from knowledge of the Western World? At some point in everybody's life, each of us did not know the Pope was Catholic. (Perhaps you were young then, but young people can use Wikipedia, too.) All of us are born ignorant, and only come to knowledge through learning.
While there is no need to go into highly specific detail, it's always good to provide some context. Even if something is very well known among English speakers, please remember that Wikipedia exists in many languages. Even though many Westerners know that Pope Francis is not a Methodist, does everyone in the world know that? (Let's ignore the fact that not everyone has access to Wikipedia. Yet. One of Wikipedia's goals is that they will.)
Leach could not have explained this conundrum of prototype theory and categorization/worldview varying across languages any better. "Because my mother tongue is English, it seems self evident that bushes and trees are different kinds of things. I would not think this unless I had been taught that it was the case." - Edmund Leach (anthropologist), 1964
Recent linguistic data from colour studies seem to indicate that categories may have more than one focal element - e.g. the Tsonga colour term rihlaza refers to a green-blue continuum, but appears to have two prototypes, a focal blue, and a focal green. Thus, it is possible to have single categories with multiple, disconnected, prototypes, in which case they may constitute the union of several convex sets rather than a single one.
Rihlaza is a so-called "grue" term.
Terms for green-blue (Kay and McDaniel call it grue) are bifocal, i.e. they have both a focal blue and a focal green, rather than just one focal colour.
Prototypes, Whorf, and grue...
A common argument for the Sapir-Whorf Hypothesis is the perception of colour across languages. According to the hypothesis, if one language categorizes colour differently from another, then speakers of the two languages should also perceive colour differently. In a study done in the 1970s, a group of researchers compared colour perception in English speakers with that of the Berinmo, a small tribe from Papua New Guinea. The Berinmo were given a sample of 160 different colours and asked to categorize them. The Berinmo not only had fewer categories, they also did not differentiate between the English colours blue and green. However, they did draw a boundary between two colours in their language, nol and wor, which in English would both be perceived in the category of yellow. The researchers found that Berinmo speakers were better at matching colours across their nol and wor categories than across the English blue and green categories, and that English speakers were better at matching colours across blue and green than across the Berinmo nol and wor (Sawyer, 1999). According to the researchers, by showing that the colour perception of the two language groups depends on the categorization in their language, the results support the Sapir-Whorf Hypothesis.
In Whorf's words: "We dissect nature along lines laid down by our native language. The categories and types that we isolate from the world of phenomena we do not find there because they stare every observer in the face; on the contrary, the world is presented in a kaleidoscope flux of impressions which has to be organized by our minds—and this means largely by the linguistic systems of our minds. We cut nature up, organize it into concepts, and ascribe significances as we do, largely because we are parties to an agreement to organize it in this way—an agreement that holds throughout our speech community and is codified in the patterns of our language [...] all observers are not led by the same physical evidence to the same picture of the universe, unless their linguistic backgrounds are similar, or can in some way be calibrated."
What Whorf wanted to account for was behavior as a product of ambiguities in lexical items - what he called linguistic analogies (Whorf, 1956). In this paper I want to bring to bear a line of research not well attested in the literature, viz. semantic change. Semantic change promises to be particularly enlightening in the Whorfian debate because one can find in the data answers to the root questions of the relationship between language and thought - do changes in the semantics of particular words regularly reflect changes in conceptual structure? Or are there instances of stable conceptual structure around which the linguistic material circulates? Or are there even instances in which the conceptual structure changes but, mutatis mutandis, the semantics of the words are stable?
Cross-linguistic research has shown that boundaries for lexical categories differ from language to language.
In different languages the world is carved up differently. This cross-linguistic variation has been shown for domains as varied as color, causality, mental states, number, body parts, containers, motion, direction, and spatial relations (Malt & Majid, 2013; Malt et al., 2015). Malt and colleagues, for instance, described how different languages label a set of household containers and found that not all languages observe the same distinctions, despite perceiving the similarity of the objects in the same way (Ameel et al., 2005; Malt, Sloman, & Gennari, 2003; Malt et al., 1999). For example, the Dutch word fles encompasses objects that in French are either called bouteille or flacon. Not only are there differences in the number of distinctions made in different languages, there is crosscutting in the way exemplars of a category are grouped together as well (Malt et al., 2003). The roughly equivalent French bouteille and Dutch fles demonstrate a difference in how they map onto a shared similarity space, which reflects a cross-linguistic difference in meaning representation. Additionally, the categories fles and bouteille each include a different number of objects, indicating differences in category extension as well. Although these cross-linguistic differences have received growing attention in recent years, within-language variation exists as well (McCloskey & Glucksberg, 1978; Verheyen, Hampton, & Storms, 2010). Inter-individual differences in linguistic categorization have been described in relation to vagueness (Black, 1937; Verheyen & Storms, 2013). A distinction is made between vagueness in criteria and vagueness in degree (Devos, 2003). The former is involved when individuals use different criteria to determine if an object belongs to a category. When individuals agree on the criteria for category membership but use a different cut-off for separating members from non-members, the latter type of vagueness is in play.
In seemingly homogeneous groups of speakers of the same language, groups that display one or both types of differences have been identified (Verheyen & Storms, 2013).
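The two kinds of vagueness described above can be made concrete with a toy model. What follows is only a minimal sketch: the objects, the feature names (neck, holds_liquid), the scores, and the weighted-average rule are all hypothetical illustrations and are not taken from the studies cited. A categorizer who shares the criteria but uses a higher cut-off shows vagueness in degree; one who weighs different criteria shows vagueness in criteria.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical feature scores for three containers (illustrative, not real data).
OBJECTS: Dict[str, Dict[str, float]] = {
    "wine bottle": {"neck": 0.9, "holds_liquid": 1.0},
    "jam jar": {"neck": 0.2, "holds_liquid": 0.8},
    "spray flask": {"neck": 0.6, "holds_liquid": 0.9},
}


@dataclass
class Categorizer:
    weights: Dict[str, float]  # which criteria count, and how much
    cutoff: float              # how much evidence is needed for membership

    def score(self, features: Dict[str, float]) -> float:
        # Weighted average of the features this categorizer attends to.
        total = sum(self.weights.values())
        return sum(w * features[f] for f, w in self.weights.items()) / total

    def is_member(self, features: Dict[str, float]) -> bool:
        return self.score(features) >= self.cutoff


# Same criteria, different cut-off: vagueness in degree.
base = Categorizer({"neck": 1.0, "holds_liquid": 1.0}, cutoff=0.7)
strict = Categorizer({"neck": 1.0, "holds_liquid": 1.0}, cutoff=0.8)
# Same cut-off, different criteria: vagueness in criteria.
other = Categorizer({"holds_liquid": 1.0}, cutoff=0.7)

for name, features in OBJECTS.items():
    print(name, base.is_member(features), strict.is_member(features),
          other.is_member(features))
```

In this toy setup the three categorizers agree on the clearest case and part ways on the borderline ones, which is the pattern the distinction is meant to capture.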
One would expect the variability within a language to be less pronounced in comparison to cross-linguistic differences. Non-linguistic appreciation of properties of domains seems to be universal (at least for some domains, including the one studied here), but the relation of this non-linguistic understanding to linguistic categorization is complex. That is, linguistic categories do not map directly onto similarity clusters (Malt et al., 1999).
These complex patterns of lexical variation for categories of everyday objects emerge not only between languages but within a language as well. Especially the latter differences seem to be more complex than earlier assumed. Vagueness in degree and criteria seem to cause complex patterns of lexical variation between latent groups of categorizers that resemble the patterns of lexical variation at a cross-linguistic level. The amount of variability observed within one language poses a challenge for cross-linguistic research. It is common practice in cross-linguistic research not to take into account within-language differences and to average across all individuals within a language, provided the sample comes from a restricted geographic region, implying a shared dialect. This may lead to conclusions that do not hold for the latent groups a language might harbour.
For example, based on Figure 1 one might believe that the difference between the categories doos and boîte mainly consists of a difference in degree. The correlation of 0.78 between the language groups is imperfect but indicates that there is substantial agreement between both language groups as well. However, taking into account the within-language differences, it becomes clear that this conclusion could vary a great deal depending on the combination of latent groups, since these correlations vary from 0.20 to 0.86. Future research will pinpoint possible causes for the observed variation. A possible path involves relating personal characteristics (age, gender, education level) and item characteristics to the parameter estimates of the different latent groups. How are we able to manage the considerable interindividual differences during the communication process and prevent a breakdown in communication? Possible answers to this question may lie in the way polysemy or words with new meanings such as eponyms are dealt with during communication using processes of sense creation and selection (Clark & Gerrig, 1983; Foraker & Murphy, 2012). Even for common nouns, referring to familiar objects in their most literal sense, these processes are relevant in the context of inter-individual differences.
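The contrast drawn above between one pooled correlation and the spread of correlations across latent groups can be sketched in a few lines. This is only an illustration under stated assumptions: the group labels and naming proportions below are invented, each list holds the proportion of speakers applying the category name to each of five objects, and a plain Pearson correlation stands in for whatever statistic the original study used.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_profile(groups):
    """Average the per-object proportions over all latent groups of a language."""
    return [sum(col) / len(col) for col in zip(*groups.values())]


# Hypothetical naming proportions per object, per latent group (not real data).
dutch = {"group_1": [0.9, 0.8, 0.3, 0.1, 0.6], "group_2": [0.7, 0.9, 0.6, 0.2, 0.2]}
french = {"group_1": [0.8, 0.9, 0.2, 0.2, 0.5], "group_2": [0.3, 0.6, 0.7, 0.1, 0.9]}

# One number for the whole language pair, averaging over latent groups.
pooled = pearson(mean_profile(dutch), mean_profile(french))

# One number per combination of latent groups.
pairwise = {(d, f): pearson(dutch[d], french[f]) for d, f in product(dutch, french)}

print("pooled:", round(pooled, 2))
for pair, r in pairwise.items():
    print(pair, round(r, 2))
```

Averaging first and correlating afterwards yields a single summary value, while the pairwise table shows how much that value can hide.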
Classic studies in anthropological linguistics suggest that there are also substantial differences in semantic categories in social arenas such as kinship (Romney & D’Andrade, 1964; Danziger, 2001; Foley, 1997). It is important to test whether these linguistic differences have cognitive consequences. There are direct studies of the cognitive effects of social semantics. Boroditsky and Schmidt (2000) found effects of linguistic gender on people’s encodings of objects. For example, they taught Spanish-English and German-English bilinguals English names for objects (such as “Mary” for a table) and found that people retained the names better when the gender was consistent with the gender of the noun in their first language. In addition, bilinguals’ English descriptions of the objects were consistent with the gender in their first language. Sera et al. (2002) have also shown that gender retains semantic context, in that cross-linguistic differences influence classification.
Many theories hold that semantic variation in the world’s languages can be explained in terms of a universal conceptual space that is partitioned differently by different languages. Recent work has supported this view in the semantic domain of containers (Malt et al., 1999), and assumed it in the domain of spatial relations (Khetarpal et al., 2009), based in both cases on similarity judgments derived from pile-sorting of stimuli.
The semantic systems of the world’s languages vary considerably. This observation has suggested two opposed accounts of the relation between language and thought. The Sapir-Whorf hypothesis holds that such cross-language differences cause corresponding differences in cognition, leading speakers of different languages to think about and perceive the world substantially differently (Lucy, 1992; Majid et al., 2004; Roberson et al., 2000). In contrast, many other theories accommodate such variation by positing a universal conceptual space that is partitioned in different ways by different languages (Berlin & Kay, 1969; Croft, 2003:139; Levinson & Meira, 2003; Majid et al., 2008; Malt et al., 1999; Regier et al., 2007). On this view, the significant point about the variation is that many logically possible semantic configurations are never attested – thus, the constrained variation illuminates underlying commonalities in human cognition.
Although the starting point for this debate is linguistic – namely the observation of semantic diversity across languages – a natural means of testing it is by probing nonlinguistic cognition. The Whorfian view predicts that speakers of languages with different semantic systems should conceive of the world differently, each group in line with their own language’s semantic system. The universal space view in contrast predicts that speakers of different languages should conceive of the world similarly.
Different languages exhibit different systems of semantic categories. It is often assumed that this semantic variation is constrained by, and can be explained by, a universal conceptual space that is partitioned in different ways by different languages. Malt et al. (1999) found evidence consistent with such a language-invariant space, and Khetarpal et al. (2009) assumed such a space existed. In both cases conceptual similarity was assessed through pile-sorting.
First, the findings suggest a particular view of the relation of language and thought, namely that: (a) there is a set of fine-grained and potentially cross-cutting conceptual distinctions that may be made, and some languages will happen to mark more of these distinctions than will other languages; (b) distinctions that are unmarked in a language are nonetheless conceptually available to speakers of that language – this is suggested by the fine-grained sorting; and (c) a distinction becomes more salient if it is marked linguistically in one’s native language (Hespos & Spelke, 2004) – this is suggested by the effect of language we find. This interpretation is consistent with the general view that “Whorf was half right” and correspondingly half wrong, as has been argued elsewhere (Regier & Kay, 2009).
Second, our results are compatible with the possibility that language may influence cognition in relatively subtle ways that are detectable by some analyses and not by others. Edit distance applied to pile-sorting may be a useful analytical tool, when used in tandem with others, in pursuing this question more generally.
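The excerpt does not spell out how edit distance is applied to pile-sorting, so the following is only a sketch under an assumption: it treats each pile-sort as a partition of the stimuli and counts the minimum number of items that must be moved between piles to turn one sort into the other (sometimes called the transfer distance). The function name and the example sorts are hypothetical, and the brute-force search over pile matchings is only practical for a small number of piles.

```python
from itertools import permutations


def partition_edit_distance(sort_a, sort_b):
    """Minimum number of items to move to turn sort_a into sort_b.

    Each sort is a list of piles; each pile is a collection of item labels.
    Assumed definition: total items minus the best one-to-one pile overlap.
    """
    piles_a = [set(p) for p in sort_a]
    piles_b = [set(p) for p in sort_b]
    items = set().union(*piles_a)
    if items != set().union(*piles_b):
        raise ValueError("both sorts must cover the same items")
    # Pad with empty piles so every pile has a partner in the other sort.
    k = max(len(piles_a), len(piles_b))
    piles_a += [set()] * (k - len(piles_a))
    piles_b += [set()] * (k - len(piles_b))
    # Try every one-to-one matching of piles and keep the largest overlap.
    best_overlap = max(
        sum(len(a & b) for a, b in zip(piles_a, perm))
        for perm in permutations(piles_b)
    )
    return len(items) - best_overlap


# Hypothetical example: one participant's sort versus a language's naming partition.
participant = [{"cup", "mug", "glass"}, {"bottle", "jar"}]
language = [{"cup", "mug"}, {"glass", "bottle", "jar"}]
print(partition_edit_distance(participant, language))
```

A smaller distance between a participant's piles and a language's naming partition would then be read as closer alignment between sorting and that language's categories.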
Finally, our results suggest that caution is needed when basing accounts of semantic variation on an ostensibly universal similarity space derived from pile-sorting (e.g. Khetarpal et al., 2009) – because universality cannot be assumed. Similarity judgments are likely to be similar but not identical across languages, as was the case in our analyses. This highlights an unavoidable tension. A universal conceptual space is a useful theoretical construct for explaining semantic variation, but we have no guarantee that such a thing actually exists – nor, if it does, do we have a completely reliable means of assessing it. Instead, we have somewhat language-colored approximations to such a space, and these should be treated as such. A reasonable treatment may be to average together similarity judgments obtained from speakers of different languages in an attempt to better approximate a universal similarity space, as Khetarpal et al. (2009) did. But any interpretation of results based on such an approximation should be tempered by the awareness that it is merely an approximation.
At the same time, our results leave a number of questions open. The first concerns the contrast between our findings and those of Malt et al. (1999). They found that language was not reflected in sorting by overall similarity, and we found that it was, based on the same data. One possibility, as mentioned above, is that our edit distance analysis is more sensitive than some others, such that it picks up on differences that are missed by other analyses. Is this conclusion correct? Or is our analysis itself inappropriately biased in some respect? Which set of results should be believed? Answering this question is critical to placing our present findings in their proper context.
A second question raised by our findings is the extent to which they generalize to other languages. If we were to examine a new language that partitions semantic space more finely than the languages we have examined here, we would expect to find that pile-sorts produced by people of all backgrounds tend to align more closely with this new fine-grained language than they do with the more coarse-grained languages we have already examined. Is this the case? This question provides a straightforward means of further testing these ideas.
There is also the question of whether these results generalize to other semantic domains. While we have restricted ourselves to the two domains of spatial relations and containers, this was simply a matter of convenience, as the data were readily available. The reasoning behind these ideas however is general in scope, and we would expect to find supporting evidence in other semantic domains as well.
Finally, while these results demonstrate a correlation between language and sorting behavior, they do not demonstrate the causal link claimed by the Whorf hypothesis. It remains an open question whether the observed correlation is attributable to an effect of language on cognition, or to other factors, such as culture influencing both language and cognition. Regardless of how these questions are eventually answered, we hope that our present initial findings help to make plausible the central idea we have promoted here: a fine-grained conceptual space, largely shared in structure across speakers of different languages, but nonetheless also reflecting the speaker’s native language.
Second language learners face a dual challenge in vocabulary learning: First, [···]. Second, after some time, they discover that these names do not generalize according to the same rules used in their first language. Lexical categories frequently differ between languages (Malt et al., 1999), and successful language learning requires that bilinguals learn not just new words but new patterns for labeling objects.
(Sandra says: NEW PATTERNS MY ASS! MY ENGLISH IS NOT "SUCCESSFUL" IN YOUR WORDS, BUT IT'S MY IDIOLECT AND THE WAY I, A BORN SPANIARD/SCE [STANDARD CONTINENTAL EUROPEAN], VIEW THE WORLD. IT'S PART OF WHAT MAKES ME SANDRA DERMARK!!!).
Malt, B. C., Sloman, S. A., Gennari, S. P., Shi, M., & Wang, Y. (1999). Knowing versus naming: similarity and the linguistic categorization of artifacts. J. Mem. Lang. 40, 230–262. doi:10.1006/jmla.1998.2593
Decades of research have indicated differences in lexical categorization across languages (such as the seminal comparison of colour categories by Landar et al., 1960), extending beyond abstract domains to concrete domains such as furniture, clothing, and household storage and serving vessels, and observed across Spanish, English, Chinese, Dutch, French, Russian, and more (Graham and Belnap, 1986; Malt et al., 1999, 2003; Ameel et al., 2005; Pavlenko and Malt, 2011; see Malt and Majid, 2013 for review). These differences mean that to use words as a native speaker does, language learners must acquire non-obvious, language-specific ways of generalizing names to new objects. For native speakers, fine-tuning of lexical categories may begin in infancy, but it continues beyond childhood, at least up to 14 years of age (Ameel et al., 2008), reflecting the significant challenge in language acquisition that word learning poses, even for monolinguals (see also Bowerman and Levinson, 2001). Developing adult, native-like boundaries between close competitor names requires attention to an increasing number of features of an object over time (Ameel et al., 2008). For example, no single concrete or abstract feature is sufficient to isolate members of the English category bottle from the set of 60 common household containers used by Malt et al. (1999). Instead, an interplay between features such as shape (typically cylindrical), material (plastic or glass), and function (containment of a fluid) define this broad category of container-like objects.
Learners of a second language, including children who acquire two languages simultaneously, are thus faced with a major incongruity between languages.
Lexical categorization is a valuable tool for identifying variation in lexical semantic mappings among speakers, and with this more sensitive measure of lexical semantic variation, second language lexical proficiency may no longer be sufficiently described by the accumulation of a list of words as tested by most picture naming, lexical decision, and fluency tasks. Instead, lexical semantic mappings are more precisely probed when many similar objects are named, which allows inferences about the boundaries of a given speaker’s lexical category. For instance, the researcher can examine which drinking vessels are named cup and which similar objects receive a different name (such as mug or glass) by a speaker.
Recent work has investigated whether and how bilinguals can maintain native-like lexical semantic representations in each language despite these differences. Ameel et al. (2005) tested simultaneous Dutch–French bilinguals on the names of common containers and serving dishes. Significant influences of both Dutch and French mappings were measured in the bilinguals’ categorization patterns for both languages, and the differences between lexical categories in the bilinguals’ Dutch and French were significantly smaller than the differences between monolinguals of each language. In effect, the simultaneous bilinguals partially converged across the two languages. They achieved this convergence by shifting category centroids in each language toward one another for greater consistency between approximate translation equivalents and reducing the number of features used to define category boundaries (Ameel et al., 2009). As such, convergence produces more similar lexical categories in each language and minimizes the conflict faced by the simultaneous bilinguals in organizing the objects into named categories. Sequential bilinguals also show similar trends toward convergence (Pavlenko and Malt, 2011; Malt et al., under review). The accumulating findings in lexical categorization behavior of simultaneous and sequential bilinguals are highly suggestive of a dynamic representation for lexical semantics, mutually influenced by both languages, susceptible to change well into adulthood.
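The idea of category centroids shifting toward one another can be pictured with a small numerical sketch. Everything here is an assumption made for illustration: the two-dimensional feature space, the centroid coordinates, the rate parameter, and the linear interpolation rule are hypothetical and are not the procedure used by Ameel et al.

```python
from math import sqrt


def distance(c1, c2):
    """Euclidean distance between two centroids in a shared feature space."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def converge(c1, c2, rate):
    """Move each centroid a fraction `rate` of the way toward the other.

    rate=0 leaves both untouched; rate=0.5 makes them meet in the middle.
    """
    new1 = [a + rate * (b - a) for a, b in zip(c1, c2)]
    new2 = [b + rate * (a - b) for a, b in zip(c1, c2)]
    return new1, new2


# Hypothetical monolingual centroids for two approximate translation equivalents.
fles = [0.0, 0.0]
bouteille = [4.0, 0.0]

bi_fles, bi_bouteille = converge(fles, bouteille, rate=0.25)
print(distance(fles, bouteille), distance(bi_fles, bi_bouteille))
```

With any rate between 0 and 0.5 the bilingual centroids end up closer together than the monolingual ones without merging, which mirrors the partial convergence described above.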
These cross-language transfer and convergence effects can be thought of in terms of how exposure to one language might change mappings from objects’ representative features to words in the other language of the bilingual speaker. Theoretical models of lexical semantic representation, such as Van Hell and De Groot’s (1998) Distributed Feature Model describe a set of underlying features whose combination may be used to define lexical concepts by linking these features to a lexical node. Models that use feature-based representations have been further adapted to accommodate broader asymmetry between languages (Dong et al., 2005) and the relative salience of different features in bilingual categorization (Ameel et al., 2009).
At least two computational models have attempted to simulate bilingual lexical categorization (Zinszer et al., 2011; Fang et al., 2013), drawing on connectionist architecture to translate language-specific mappings into training parameters for lexical nodes and high-dimensional semantic representations. These models are consistent with previous connectionist models of monolingual word learning (such as McClelland and Rogers, 2003) that rely on distributed feature representations to reproduce semantic category hierarchies (e.g., sunfish belongs to fish, which belongs to animals, all of which differ from plants) as a result of feature overlap between exemplars.
Although there are only a few quantitative accounts of bilingual lexical categorization, a number of likely predictors for development of lexical categories are apparent from the broader study of second language acquisition. The extent of L2 immersion, age of second language onset, time spent learning the second language in a formal setting (classroom training), and patterns of language use (the extent to which the languages are intermixed in use) all appear to be involved in non-native learners’ degree of success in learning a second language. Further, because name choice for an object may vary across speakers (e.g., Malt et al., 1999), the categorization norms of a linguistic community are an important means of quantifying a language learning environment, describing the variety of lexical semantic mappings used by native speakers in that community.
While many studies in second language acquisition explore the influence of language history variables on lexical learning, fewer studies have evaluated a combination of such variables simultaneously and properly controlled for interaction among the variables and statistical obstacles to measuring effects of variables individually, as outlined by Stevens (2006). None of the research to date has simultaneously related all of these variables to lexical categorization as a measure of word learning. We now consider these variables and how they may impact L2 development of lexical categorization in more detail.
We find only a modest age of earliest exposure effect on L2 category native-likeness, but importantly, we find that classroom instruction in L2 negatively impacts L2 category native-likeness, even after significant immersion experience. We also identify a significant role of both L1 and L2 norms in bilinguals’ L2 picture naming responses.
Classroom learning includes little exposure to the within-category variation necessary to acquire native-like lexical semantics. Consequently, non-immersed learners are likely to follow different developmental trajectories than immersed learners, as their respective language inputs differ fundamentally in the lexical semantic domain.
At the earliest stages of learning, L2 learners draw heavily on L1 representations for production (see the Unified Competition Model of MacWhinney, 2012).
These early learners’ L2 categorization patterns should reflect their confidence in L1 naming (i.e., the extent of L1 dominant name agreement) because, in the absence of L2-specific lexical semantic knowledge, inferences about L2 words are based on knowledge of their L1 translation equivalents. Typicality ratings also can be construed as a measure of native speakers’ confidence about the name of an object, and Pavlenko and Malt (2011) found that Russian–English bilinguals relied on both Russian and English native typicality norms for individual objects when naming these objects in Russian, suggesting that their intuitions about categorization were influenced by the perceived confidence of each language community in an object’s category membership.
It is evident that in many instances of simultaneous and sequential bilingualism, the category information provided to bilinguals by the native-speaker communities of each language is variable and yet still bears a significant influence on their production in both languages. With relatively few lexical category stimulus sets normed for native speakers of more than one language and tested on sufficiently advanced bilinguals of both languages, the exact degree and means of this cross-language influence remains to be explored. However, native category norms that represent the full distribution of names produced and thus the degree of name agreement and variation among native speakers may allow an elaborated view of cross-language competition and transfer. The extent to which L1 representations are vulnerable to change may vary as a function of their own entrenchment, with greater native naming agreement representing more robust L1 representations.
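Name agreement and the full distribution of names, as discussed above, are straightforward to compute from raw naming responses. The sketch below is a minimal illustration: the function names and the response data are hypothetical, and agreement is assumed to mean the proportion of speakers who produce the most frequent (dominant) name for an object.

```python
from collections import Counter


def name_distribution(responses):
    """Proportion of speakers producing each name, most frequent first."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.most_common()}


def dominant_name_agreement(responses):
    """Return the most frequent name and the proportion of speakers using it."""
    counts = Counter(responses)
    name, count = counts.most_common(1)[0]
    return name, count / len(responses)


# Hypothetical responses from ten native speakers naming one object.
responses = ["bottle"] * 8 + ["jar"] * 2

print(name_distribution(responses))
print(dominant_name_agreement(responses))
```

Keeping the whole distribution rather than only the dominant name preserves the alternative names, which is the information a norm based on full naming distributions would add.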
L2 CLASSROOM INSTRUCTION
Malt and Sloman’s (2003) study of L2 English learners found that formal training in English prior to immersion offered no predictive power after accounting for years of L2 immersion. Based on this result, L2 training would seem to have minimal value for acquiring native-like L2 lexical semantic mappings.
The latter result does not strongly contradict the Malt and Sloman (2003) finding, however, in that the (Chinese) students of English at both levels still relied primarily on their native semantics when making English judgments, showing greater similarity to the monolingual speakers of their mother tongue than to bilingual native English speakers. Both the improvement toward slightly more native-like English associations and the general bias toward Mandarin Chinese semantics are reflected in the learners’ significant convergence, producing semantic similarity judgments that were more similar across languages than the judgments between the Mandarin Chinese monolinguals and native English bilinguals. For these sequential bilingual learners, language systems interacted to allow a small degree of transfer of learned L2 mappings onto L1 while never overcoming the overall L1-likeness of the representations in both languages. Thus the role of classroom experience in acquiring native-like L2 lexical categorization deserves more scrutiny.
The present study adds the unique corollary that L2 learning without immersion may, in fact, hinder native-likeness. This effect may be due to the entrenchment of L1 structures in learners’ L2 as a result of impoverished input. Common classroom techniques for learning translation equivalents or naming highly prototypical objects encourage learners to export their inferences about object categories from L1 to L2 by way of one-to-one translation. However, native-like L2 mappings only become available to the learner with more diverse input from an immersion environment or (potentially) another immersive instructional setting such as the highly enriched virtual environments that may be simulated in computer games (see Legault et al., 2014, for example). The more time that L2 learners spend learning lexical semantic mappings in a non-immersive environment, the more entrenched the L1-driven mappings in L2 may become. Considered against this perspective and the relative proportion of L1 vs. L2 input in the non-immersive environment, the patterns in our data become less surprising but provide an important lesson for language instruction practice.
IMPLICATIONS FOR SECOND LANGUAGE INSTRUCTION
The present study offers several new insights into the role of language history, language training, and language use in second language lexical semantic learning. Most importantly, we find that greater time spent studying a second language before immersion predicted lower levels of eventual L2 native-likeness, likely due to the entrenchment of L1-like lexical-semantic mappings. Although we do find an age of onset effect, even after controlling for immersion and duration of language training, the magnitude of this age effect is proportional to the benefits of immersion, and the benefits to L2 native-likeness from early age of onset are small relative to the effects of more pre-immersion training.
On the extreme end, one might propose that pre-immersion language instruction is actually counter-productive to native-like lexical semantic development, and second language education would be best postponed until immersion opportunities arise. However, this viewpoint is impractical for most non-immigrant learners, and likely over-stated, as our analysis of language-specific variables (native-speaker agreement and alternative names) shows that learners are, in fact, highly sensitive to the inconsistent input that describes native-like lexical categorization. Lexical semantic learning in non-immersion environments might therefore be improved by introducing learners to a greater variety of referents and the naturally diverse naming patterns associated with those referents, allowing them to develop more native-like intuitions about the relationships between objects that define lexical categories. The method of using a diverse set of naming patterns in second language instruction clearly contradicts the traditional classroom teaching method, in which training focuses primarily on one-to-one translations; such a focus underestimates cross-language differences, and by our findings, encourages the use of L1 patterns for L2 words and therefore impedes learners’ later ability to acquire native-like lexical semantic mappings.
BARBARA C. MALT Lehigh University
Languages vary idiosyncratically in the sets of referents to which common nouns are applied. To use nouns as a native speaker would, second language learners must acquire language-specific naming patterns, not merely a language-to-language correspondence. We asked second language learners to name household objects in English and in their native language, to judge the objects’ typicality with respect to English names, and to provide naming strategy reports. The least experienced learners’ naming and typicality judgments diverged substantially from native responses.
The words of one language cannot always be mapped directly onto the words of another. For example, the English words “fate” and “destiny” have no equivalent in some languages (Wierzbicka, 1992). Russian has separate words for one’s wife’s brother, wife’s sister’s husband, and husband’s brother, all of which would be labeled “brother-in-law” in English (Lyons, 1968), and Spanish uses a single preposition, “en”, for spatial relations that are divided into “in” and “on” in English (e.g., Bowerman, 1996). Many such cases of non-equivalence concern words for abstract and socially constructed concepts (e.g., De Groot, 1993; see Pavlenko, 1999, 2002; Altarriba, in press, for further examples). Cross-linguistic differences in such domains are not surprising. For words referring to common artifacts, though, one might expect a closer correspondence. Objects such as tables, chairs, plates, bowls, shoes... are similar in design and use across many cultures. If objects are grouped by name according to their shared properties, languages should make parallel distinctions in labeling everyday artifacts such as these. Indeed, direct correspondence across languages for words for common objects is often assumed in practical and theoretical approaches to second language vocabulary acquisition. Second language instruction has typically taught vocabulary for familiar objects as a matter of paired associate learning: students learn that “chair” is “chaise” in French or “silla” in Spanish, that “bottle” is “bouteille” in French or “botella” in Spanish, and so on. 
Psycholinguists studying the process of second language learning have focused on issues such as how to facilitate the learning of the pairs through mnemonic devices or grouping (e.g., Crutcher, 1998; Schneider, Healy and Bourne, 1998), how the existence of cognate pairs might be exploited to speed vocabulary acquisition (Meara, 1993), and whether members of a pair share a common conceptual store (e.g., Potter, So, Von Eckhardt and Feldman, 1984; Kroll, 1993). The assumption of direct correspondence is consistent with the suggestion that common nouns capture structure in the world that is obvious to all perceivers (e.g., Rosch, Mervis, Gray, Johnson and Boyes-Braem, 1976; Berlin, 1992). Further, some theorists have explicitly suggested that concrete nouns are the strongest candidate for having corresponding conceptual representations across languages (e.g., De Groot, 1992, 1993, 2002; Kroll, 1993).
Observational and experimental evidence now indicates, though, that the assumption of direct mapping is not necessarily correct even for the naming of common, concrete objects. Polish speakers label a telephone table and a coffee table by one word and a dining room table by another, although English speakers use the same label for all three (Wierzbicka, 1992). English speakers use the same name for a large, stuffed seat for one person (“chair”) as they do for a smaller wooden seat, but Chinese speakers give the stuffed one the same name that they would give a stuffed multi-person seat (what English speakers would call “sofa”; Gao, personal communication). Kronenfeld, Armstrong and Wilmoth (1985) found that speakers of English, Hebrew, and Japanese partitioned a set of 11 ordinary drinking vessels by name in different ways. For example, the American speakers of English grouped together by name a paper drinking vessel and one for drinking tea (calling both “cup”), but the Israeli speakers of Hebrew called them by different names. Speakers of Japanese used three different names in partitioning the objects, which were partitioned by only two different names in English and in Hebrew. Paradis (1979) and Graham and Belnap (1986) provide further examples. Malt, Sloman, Gennari, Shi and Wang (1999) looked at naming for a set of 60 common containers by speakers of American English, Mandarin Chinese, and Argentinean Spanish and similarly found substantial differences in the naming patterns across speakers of the three languages. For instance, the 16 objects named “bottle” in English were spread across seven different linguistic categories in Spanish, and the Chinese category that contained the 19 objects called “jar” in English also included 13 objects called “bottle” in English and eight called “container”.
Malt, Sloman and Gennari (in press) examined in more detail the relation among the linguistic categories for the 60 containers and found a complex pattern. Some of the categories shared prototypes across the three languages but others did not; some cases of nesting occurred (the categories of one language were contained within those of another); and some cases of cross-cutting were found (pairs of objects were put into a single category by one language but into different categories by another language).
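The two relations described above, nesting and cross-cutting, can be made concrete with a minimal sketch. It assumes that each language's naming pattern is recorded as a mapping from objects to the name they most often receive; the object labels, the names, and the helper functions (`categories`, `nested_categories`, `cross_cut_pairs`) are all hypothetical and are not taken from Malt et al.'s materials or analysis.

```python
from itertools import combinations


def categories(naming):
    """Group objects by the name they receive: {name: set of objects}."""
    groups = {}
    for obj, name in naming.items():
        groups.setdefault(name, set()).add(obj)
    return groups


def nested_categories(naming_a, naming_b):
    """Pairs (name_a, name_b) where a category of language A is a
    proper subset of a category of language B."""
    cats_a, cats_b = categories(naming_a), categories(naming_b)
    return sorted(
        (name_a, name_b)
        for name_a, objs_a in cats_a.items()
        for name_b, objs_b in cats_b.items()
        if objs_a < objs_b  # proper-subset test on sets
    )


def cross_cut_pairs(naming_a, naming_b):
    """Object pairs that share a name in one language but not in the other."""
    pairs = []
    for x, y in combinations(sorted(naming_a), 2):
        same_in_a = naming_a[x] == naming_a[y]
        same_in_b = naming_b[x] == naming_b[y]
        if same_in_a != same_in_b:
            pairs.append((x, y))
    return pairs


# Invented toy data: four objects named in two hypothetical languages.
lang_a = {"obj1": "a1", "obj2": "a1", "obj3": "a2", "obj4": "a2"}
lang_b = {"obj1": "b1", "obj2": "b2", "obj3": "b2", "obj4": "b2"}

nested_a_in_b = nested_categories(lang_a, lang_b)
nested_b_in_a = nested_categories(lang_b, lang_a)
cross_cut = cross_cut_pairs(lang_a, lang_b)
```

In this toy case category a2 is nested inside b2, category b1 is nested inside a1, and obj2 is the object whose pairings are cut differently by the two languages. A real analysis would also have to handle objects that receive several names with different frequencies, which this sketch ignores.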
ON CATEGORY: By “linguistic category” we mean any set of objects that shares a name in a given language (or for a given speaker). We do not assume that linguistic categories correspond directly to conceptual groupings of objects. The cross-linguistic variability in naming, along with the dissociation between naming patterns and perceived similarity discussed below, suggests, in fact, that they do not (see also Sloman and Malt, in press, for arguments against assuming fixed conceptual groupings of objects).
Both Kronenfeld et al. (1985) and Malt et al. (1999) found that although the naming patterns diverged across speakers of the different languages, judgments of similarity among the objects by those speakers were largely the same. This dissociation of naming from similarity, along with the cross-linguistic variation in naming itself, argues against the sort of universal-prototype model of naming that the idea of direct mapping would suggest. That is, naming must involve something more than, or different from, learning prototypes of universally perceived groupings and the names associated with them, and then labeling objects according to their similarity to the prototypes (e.g., Smith and Medin, 1981; Hampton, 1993). Malt et al. (1999, in press) and Malt, Sloman and Gennari (2003) argue that the naming patterns of a language are influenced by a language’s history and the history of the culture that uses it. The vocabulary of each language (or dialect) changes over time and is shaped by factors such as what names happened to exist in that language at earlier times and so were available for extending to new objects, what new names happened to be introduced through language contact or manufacturer invention, what objects were present in the culture at earlier times and formed similarity clusters that were named, what domains have been of particular interest to the culture at some point and so have led to finer linguistic differentiation of the conceptual space, and so on. Such factors contribute to the choices of names a speaker has for an entity and which is dominant. For native speakers, then, a grasp of the linguistic categories of their language must come in part from language-specific knowledge accumulated through extended exposure to individual objects and the names assigned to them by mature speakers, as well as from perception of the properties of the objects themselves.
What of the second language learner, then? This perspective implies that to name objects as a native speaker would, he or she must acquire similar knowledge of language- and culture-specific naming patterns, not merely knowledge of a language-to-language correspondence. But such knowledge is not easy to come by. Paired-associate vocabulary learning in the classroom, and exposure to words in the absence of their referents (for example, in books), do not typically provide such knowledge. Even in an immersion environment, accumulation of the experience needed to generate native naming may be a lengthy process.
These observations lead to several predictions about the acquisition of native naming patterns for common objects by second language learners. First, naming patterns for learners with relatively modest levels of experience with the second language will not fully match those of native speakers even when basic vocabulary for the domain has been acquired. Second, because of their incomplete grasp of the membership of the linguistic categories (and hence the central tendencies), these learners’ identification of the linguistic category prototypes, as reflected in judgments of object typicality, will also tend to be poor. Third, as the level of experience with the language increases, naming patterns will become more similar to those of native speakers because learners will acquire more exposure to specific object-name pairings. Fourth, judgments of typicality for learners with a higher level of experience should also more closely match those of native speakers, because they have gained more experience with the linguistic category membership. Finally, for all learners, years of immersion in an English-speaking environment should be a more important predictor of match to the native naming pattern than years of formal instruction, because it is exposure to individual instances and the names they receive that is critical to mastering the second language categories, not merely instruction based on direct mappings or exposure to words in the absence of their referents.
These predictions contrast with those that would follow from a view in which concrete nouns correspond directly across languages. Under such a view, learners with any level of experience, once they have acquired vocabulary for a domain, should be able to generate appropriate usage and typicality judgments, and classroom learning should be as effective as immersion experience in providing mastery of the correspondence.
The data also allow us to address two subsidiary questions. First, is age of acquisition related to performance? This variable has been proposed as an important predictor of mastery of syntax, morphology, and phonology in second language learning (e.g., Krashen, Long and Scarcella, 1982; Johnson and Newport, 1989; Singleton, 2001). In general, semantics has not been included in discussions of critical periods for language learning. An implicit assumption is that learning meanings or uses of words is like domain-general learning of other sorts and so would not be affected by critical periods that may exist for other aspects of language acquisition. Indeed, recent work has explicitly proposed that word learning draws on domain-general mechanisms (e.g., Markson and Bloom, 1997; Smith, 1999; Bloom, 2000). However, this proposal remains controversial (e.g., Markman, 1992; Waxman and Booth, 2000, 2001). Furthermore, even if word learning proceeds in a domain-general fashion, the more firmly entrenched native linguistic categories are (i.e., the longer they have been held), the harder it may be to acquire second language categories that do not map directly onto the native categories (e.g., MacWhinney, 1992). Our data will allow us to assess the importance of age of acquisition to performance on our naming measure.
Second, exactly how might the fact of an incomplete knowledge base result in the naming patterns that learners generate? The most obvious explanation is that learners try to map directly from a word in the native language to a word in the target language. The observation that classroom vocabulary teaching is typically treated as paired-associate learning, with lists of word equivalents to be learned, suggests that learners in the earliest stages of acquisition may use such a strategy. In addition, research on lexical retrieval in bilinguals (e.g., Kroll and Curley, 1988; Kroll and Stewart, 1994) suggests that such learners access words in the second language lexicon through their links to words in the native lexicon. However, this research also indicates that as learning progresses, learners develop direct connections between conceptual knowledge and words in the second language lexicon, and they can access this lexicon without mediation from the native lexicon. The strategy reports, along with data from naming in the native language, will allow us to assess whether discrepancies from native naming derive from attempting to map words of the native language directly onto English words, or whether our learners avoid a direct mapping strategy but still fail to generate correct usage. We consider in more detail why errors may be generated in the latter case in the general discussion section below.
Learner strategies for naming in English
To evaluate whether second language learners generate names for objects in English by trying to translate directly from the name they would have used in their native language, we examined two measures: the strategy reports participants gave at the end of each English naming session, and the relation between the names that individuals generated in English and the names they gave in their native language.
The most notable feature of the distribution of reports is that for both the dishes set and the bottles set, the proportion of choices indicating attempts at direct translation was relatively low, including for participants with the fewest years of immersion in an English-speaking environment. Participants also rarely indicated that they were just guessing, including those at the lower levels of learning. The first option, indicating that a word “just felt right”, and the third, indicating using particular features as criteria for applying particular words, dominated the strategy selection for all the levels of experience. The other major feature of the choice distribution is that there is an increase in the first choice (“just felt right”) for those with the highest amount of experience with English, especially for the dishes set (and, for the dishes set, this increase is mirrored by a decrease in choices indicating use of specific features). Thus the strategy report data suggest that participants, even those who are relatively inexperienced learners, are not primarily attempting direct translation from their native language to English in generating names for the objects. Rather, they draw on semantic information that they associate directly with the English words, either consciously (reflected in reports of using specific features) or unconsciously (reflected in reports that a word “just felt right”). As one might expect, the most experienced speakers of English appear to engage less in the conscious consideration of properties and more in the intuitive generation of words.
How does incomplete lexical knowledge affect second language naming and typicality judgments?
Kroll and colleagues (e.g., Kroll and Curley, 1988; Kroll and Stewart, 1994) have suggested that for beginning second language learners, access to words in the second language lexicon is mediated by words of the first language. For instance, a Spanish speaker learning English who is presented with a container to label would access stored conceptual knowledge about that object. Accessing that knowledge would activate a Spanish name such as “botella”, and the Spanish name would then activate a linked English word such as “bottle”. For learners at this stage of acquisition, an account of the impact of an incomplete lexical knowledge base on second language naming is straightforward. The word chosen will be determined by the links established between first and second language words. Because those links do not accommodate any discrepancies in the distributions of the words between languages, the second language use of any given word will simply follow the first language distribution of the linked word. Graham and Belnap (1986) provide evidence of such a pattern for native speakers of Spanish in early stages of learning English.
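The consequence of first-language mediation described above can be illustrated with a small sketch: if every second language name is reached through a single link from the first language name, the learner's second language naming simply inherits the first language category boundaries. The link "silla" to "chair" follows the Spanish–English contrast reported by Graham and Belnap (1986); the two toy objects, the function names (`mediated_naming`, `match_to_native`) and the proportion-match score are assumptions made for illustration only, not part of any cited model or analysis.

```python
def mediated_naming(l1_naming, translation_links):
    """Name each object in the L2 by following the link from its L1 name.

    l1_naming:         {object: L1 name}
    translation_links: {L1 name: linked L2 name} (one link per L1 word)
    """
    return {obj: translation_links[l1_name] for obj, l1_name in l1_naming.items()}


def match_to_native(learner_naming, native_naming):
    """Proportion of objects for which the learner's name equals the native name."""
    hits = sum(learner_naming[obj] == name for obj, name in native_naming.items())
    return hits / len(native_naming)


# Invented toy objects; "silla" covers both in Spanish, while English
# divides them into "chair" and "stool".
spanish_naming = {"kitchen_chair": "silla", "bar_stool": "silla"}
english_native = {"kitchen_chair": "chair", "bar_stool": "stool"}
links = {"silla": "chair"}  # a single paired-associate link

learner_english = mediated_naming(spanish_naming, links)
score = match_to_native(learner_english, english_native)
```

Because the single link cannot represent the English distinction, the learner calls both objects "chair" and matches the native pattern on only half of the items. Any discrepancy between the two languages' distributions is carried over in the same way.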
Kroll’s data indicate, however, that as proficiency increases, links are built from conceptual knowledge directly to words in the second language lexicon so that mediation by the first language vocabulary is no longer needed. As discussed earlier, even our least experienced learner groups were not beginners in English. They were no longer studying English in a classroom but rather were immersed in an English-speaking environment and were using English in their daily activities. Thus they are likely to have progressed to the stage where direct links between knowledge about objects and English words are being built. Indeed, the strategy reports and measure of match to participants’ own native naming argue against a direct mapping strategy for our sample, including the least experienced learners. The discrepancy from native patterns that our participants at the lower levels of experience showed may thus be a more direct consequence of the incomplete knowledge base: learners activate names directly from their conceptual representations, but because there are many language-idiosyncratic object–name pairings that they have not been exposed to yet and for which appropriate object–name links have not been built, they may generate incorrect labels by attempting to generalize from a limited number of known object–label pairings.
If lack of knowledge of the word uses were the only factor hindering our learners’ performances, one might expect that the most experienced learners, who have been in an English-speaking environment for ten or more years, would have acquired enough information to match the native patterns. The fact that naming patterns remained distinguishable from those of native speakers even for the most experienced learners suggests that there may be other ways in which the differences between the learners’ native naming patterns and the to-be-learned patterns affect their performance. One way is that second language learners may not simply be building links to the second language lexicon from a tabula rasa starting point as a young native learner would be. They may initially import the pattern of links from objects to words that their native language uses and thus experience interference from the imported pattern in the process of acquiring the new pattern (as first-language syntax appears to interfere with learning second language syntax; see, e.g., MacWhinney, 1992; Tao and Healy et al., 1998; see also Jiang, 2000, for discussion of the lexical level from a slightly different perspective). Learners must not only acquire new links but unlearn the original ones, which may be difficult to do (e.g., Barnes and Underwood, 1959). In addition, Kroll (e.g., Kroll, 1993; Jared and Kroll, 2001) suggests that links between the native and second language lexicons may be retained even as links are built directly from conceptual knowledge to the second language lexicon, with the result that native language vocabulary may be activated along with second language words under some circumstances. 
This possibility suggests that when the pattern of object-name links is not parallel in the two languages, an object might activate a second language name and, through its link to a native word, also activate a different second language name that competes with the first and reduces the correspondence of names produced to those that would be used by native speakers. Thus the second language learner is at a disadvantage both in establishing the native pattern of object-name links and in selecting the native name choice upon seeing an object, relative to a young native learner.
MacWhinney (1992) suggests that “fossilization” of native patterns may occur through increased automatization of the first language system with increasing age. This notion suggests that age of immersion may be an important variable in determining ultimate mastery, although we were unable to test this possibility well because age of immersion is so closely related to years of immersion in our sample.
A second factor that may contribute to the persistence of some degree of error is mature learners’ sophisticated ability to use contextual information in communication. As MacWhinney (1992) points out in discussing why second language learners may not achieve fully native syntax and phonology, mature learners can often use context to understand sentences without paying attention to details of form and without noting discrepancies from their own implicit version of the same material. They may tend to do so more than young native learners do (see also Newport’s, 1990, “less is more” hypothesis and Cochran, McDonald and Parault, 1999). In parallel, in production, it is rare that communication between adults is hindered by minor discrepancies in syntax or phonology. The same would seem to be true for the names applied to objects. If the object is physically present, the intended referent is often obvious regardless of what name is applied. If it is not present, the speaker and addressee may never become aware if there is a slight discrepancy between what the speaker has in mind and what the addressee takes to be the referent. Thus both in production and comprehension, whether the referent is present or absent, some discrepancies from native understanding of the distribution of a word may go unnoticed. Communication needs may push learners to reach a certain level of knowledge, but there may be little miscommunication feedback pushing them farther.
Finally, many second language learners continue to use their native language with family and friends even when living in the second language environment, as about half of our participants indicated that they do. This fact will keep links from objects to native names active and more likely to interfere with second language naming than if the native language were left unused. In addition, even when learners are using the second language, their conversational partners frequently may not be native speakers of the language themselves, and so non-native patterns may be reinforced.
The discussion thus far has centered on naming choices, but our data indicate that typicality judgments also remain distinct from native judgments for some of the lexical categories. This result follows naturally from the possibility that object–name links are not fully native-like. Objects are presumably judged typical of their linguistic category to the extent that the link between the object and the name is strong and the object shares many features of other entities to which that name is also linked. If the strength and pattern of object–name links deviate from those of native speakers, then typicality judgments will also diverge. The fact that our learners showed more divergence for some categories than for others may be a function of how intrinsically difficult the categories are to master (that is, of how featurally diverse the membership is), as already noted. If our suggestion that native naming patterns are imported as a starting point is right, a second contributor may be how much the categories resemble those of other languages and thereby the native categories of each learner. (See Malt et al., in press, for evidence that some of the lexical categories of the bottles stimulus set are more closely shared across English, Spanish, and Chinese than others are; see Aitchison, 1994, for a brief report suggesting an enduring influence of native object–name links on second language typicality judgments.)
Implications for models of second language lexical development
The simplest version of how conceptual knowledge might be represented in models of the bilingual lexicon is in terms of nodes, with each node representing a concept (which is then linked to a word in the native language and ultimately to one in the second language). However, the observation that roughly comparable words are not necessarily equivalent across languages dictates that a complete model must unpack conceptual knowledge, so that some of the knowledge associated with a word in one language can differ from that associated with the most closely comparable word in another language. De Groot (1992, 1993) suggests that, especially for abstract words, one should conceive of a word meaning as a composite of elements that are not always fully shared between a pair of translated words.
We suggest that a more radical approach than De Groot’s is needed on two fronts. First, our data (Malt et al., 1999; Malt et al., in press, and the current results), along with Kronenfeld et al.’s (1985) data on drinking vessels and the more anecdotal observations cited earlier, indicate that the need to unpack the conceptual representation is not limited to abstract words but applies to concrete nouns referring to common, everyday objects as well. This view is further reinforced by the issue of polysemy for concrete nouns. Although the applications of “bottle”, “jar”, etc. to various objects in our stimulus set may not be different enough to say that these uses correspond to distinct senses of the words, many concrete nouns do have uses divergent enough that they are considered to reflect different senses. And the different senses will not necessarily be shared across languages, even if the central sense is comparable. For instance, in English we speak of a human foot, an animal foot, the foot of a bed and of a table, a foot soldier, and so on. In Spanish, “el pie” is used for a human foot, but “la pata” is used for a (non-human) animal foot, as well as the foot of a piece of furniture. The end of a bed or table is “el extremo”, and a foot soldier is a “soldado de infantería”. Thus differences between languages in the knowledge associated with roughly comparable words are likely to be pervasive even for common, concrete nouns (see also Pavlenko, 1999). We propose that for ALL nouns (as well as other parts of speech such as prepositions, e.g., Bowerman, 1996, and verbs, e.g., Talmy, 1985), models must accommodate the fact that roughly equivalent words in two languages will not necessarily fully share conceptual representations. Second, it is not merely “elements of meaning” that can be shared or not shared by roughly comparable words in different languages.
The “elements of meaning” notion suggests that the knowledge associated with a word can be captured in a single summary representation, albeit one in which different elements are specified. Our view implies that the knowledge that allows native-like use of a given word cannot be captured by a summary representation. Some object names are not fully predicted by their properties; rather, they are consequences of historical linguistic and cultural forces that are not necessarily transparent even to native speakers of the language. In such cases, only knowledge of specific object–name conventions will allow native-like labeling of the objects. The case of polysemy again bolsters this argument. The differences between Spanish and English in the uses of “foot” versus “pie” are not predictable from an understanding of the most literal or central uses of the words. The proper uses in each language can only be generated (in part) by additional knowledge about the individual cases. (Indeed, the different senses of polysemous words may be represented separately in memory rather than captured in a summary representation; Klein and Murphy, 2001, 2002). Thus we suggest that a complete model of bilingual lexical knowledge will require substantial information about individual uses along with any explicitly represented featural or summary information.
The argument that the bilingual lexicon must incorporate two different sets of links from words to knowledge about objects raises the question of whether the linking patterns influence each other. For instance, we have already suggested that the first language pattern may be initially imported into the second language. As the second language linking pattern becomes more native-like, if the second language becomes the dominant one, is there a backward influence in which the first language links shift to be more like the second language pattern? And for balanced bilinguals, raised hearing native speech from two languages, will two separate and native-like sets of links be established, or will there be a mutual influence such that neither is fully native-like (as has been suggested for phonology, for instance; see, e.g., Singleton, 2001)? If there is evidence for such influences, a challenge for developing models will be how best to capture them to enable predictions about performance.
Implications for second language teaching
Finally, our consideration of the reasons that second language learners may deviate from native naming patterns suggests several avenues for improving mastery of naming patterns. It may be inevitable that beginning language instruction will use paired-associate learning of rough translational equivalents as a major means of providing vocabulary information; this is an efficient way of conveying information that allows the beginner to use a word in approximately the right way. However, providing some explicit meta-knowledge to students about the potential for discrepancies between languages may help sensitize them to the need to pay attention to where uses diverge. To the extent that maintaining versus overriding the imported mappings has a voluntary aspect, awareness of the potential for differences may also speed overriding of the inappropriate links. Most importantly, the input needed to master the patterns can only come from extensive observation of object–word pairings, and this sort of observation will come best through immersion in the second language environment (see also Pavlenko, 1999; Jiang, 2000; Dewaele and Regan, 2001; and studies of mastery of morpholexical variables such as gender agreement, e.g., Dewaele and Veronique, 2001). In the absence of study abroad or other immersion opportunities, learning might be enhanced in the classroom by focusing less on reading books (in which the language alone, without corresponding visual input, is provided) and more on films, plays, and interactive activities in which students observe real-world referents of the words spoken.
We examined first language (L1) naming of common household objects in three groups of Russian–English bilinguals: early, childhood and late bilinguals. Their naming patterns were compared with those of native speakers of Russian and English, in order to detect possible second language (L2) English influence on L1 Russian naming patterns. We investigated whether such influence is modulated by the speaker’s linguistic trajectory, specifically, their age of arrival in the L2 environment, which in turn influences their relative proficiency and dominance in the two languages. We also examined whether the potential for L2 shifts can be linked to specific characteristics of the categories in the L1 or L2. L2 influence was evident in the data, increasing with earlier age of arrival but most pronounced with lowest L1 proficiency. The changes entailed both narrowing and broadening of linguistic categories. These findings indicate that L1 word use is susceptible to L2 influence even for concrete nouns referring to familiar objects, and the nature of the shift for a given word appears to be driven by several factors.
Second language (L2) influence on the first language (L1) has been documented across a variety of linguistic domains, from phonology to pragmatics (for overviews, see Cook, 2003; Pavlenko, 2000, 2004). It appears, however, that different lexical, semantic and syntactic domains may display differential vulnerability to L2 influence on the L1. For instance, in Pavlenko’s (2002, 2010) studies, late Russian–English bilinguals displayed L2 influence on L1 in lexicalization of emotions but not in lexicalization of motion. Thus, although there is growing evidence that the L1 is vulnerable to a backwards effect of learning an L2, much remains to be understood about the timing and scope of this effect.
Malt and associates (1999, 2003) outlined several factors that could lead to differences in naming, including the salience of the domain in a particular community, the order, timing of appearance, and function of particular objects in the community, and the level of differentiation among objects encouraged by the languages’ morphology. Because each language community has a unique combination of values on these dimensions, cross-linguistic variation in naming such objects is likely to be the rule rather than the exception.
Although Malt and Sloman (2003) were not able to fully evaluate if the slow mastery was directly related to interference from the L1 mappings (due to the diverse language backgrounds of their participants), Graham and Belnap (1986) provide evidence that cross-linguistic differences may lead to divergence from native naming patterns. The researchers examined the naming patterns of L1 Spanish learners of L2 English in contexts where category boundaries in English did not correspond to those in Spanish (e.g., "silla" in Spanish covers the range of objects divided into "chair" and "stool" in English). They found that intermediate and advanced L2 learners of English who had resided in the US for less than a year followed L1 naming patterns in the use of the L2.
Ameel et al. (2005) compared naming patterns of twenty-five Dutch–French simultaneous bilinguals with those of monolingual speakers of the two languages, holding the social context constant (all participants resided in Belgium). The participants were asked to name common household objects and to judge their similarity. Similar to the previous studies by Malt et al. (1999) and Malt and Sloman (2003), the stimuli consisted of large sets (more than sixty each) of pictures of common storage containers and housewares. Monolinguals’ responses revealed differences between Dutch and French naming patterns. For instance, the twenty-five objects called fles (roughly, English "bottle") in Dutch were divided between the categories of bouteille (for larger bottles, up to 1/5 l) and flacon (for smaller ones) in French. Simultaneous bilinguals displayed a converging naming pattern, using words in the two languages in more similar ways than the monolinguals did. For instance, French bouteille was used more similarly to Dutch fles by the bilinguals, leaving their use of flacon for fewer objects.
The presence of L1 transfer and discrepancies between L2 users and target language speakers for late bilinguals, and of converging naming patterns for simultaneous bilinguals, reinforce the suggestion from studies of monolinguals that the language-specific naming conventions for household objects may be difficult to acquire. They further suggest that a source of difficulty in establishing target-like mappings may lie in the influence one language has on the other in establishing these mappings. Target-like word-to-referent mappings would require a speaker to avoid or overcome the influence of the other language, but the data suggest that this may not be possible. These studies, however, are limited to L1 influence on L2 and the mutual influence of two languages acquired in parallel. We now consider implications for the current issue of interest, the possibility of an L2 influence on L1.
Our findings show that an L2 → L1 influence in the mental lexicon can occur even in a domain involving concrete nouns naming familiar, common household objects. Although one might speculate a priori that such a domain might not produce L2 → L1 influence, our preceding observation may provide a key insight into why such an influence can take place. Namely, although the objects involved are familiar, and the nouns are common and their referents concrete, the linguistic categories that the nouns define are not, themselves, strongly determined by unique perceptually given property clusters in the world. That is, although use of a noun is linked to certain sets of properties in each individual case, there are multiple possible ways of dividing up the domain to form meaningful sets of objects sharing overlapping properties. Because different languages evolve different solutions to the problem of dividing up the objects by name, and because the clusters to be learned are not self-evident independent of language input, word use may be less stable and less pre-determined by mere observation of structure in the world than one might imagine. Thus naming patterns for these common objects in one language can be swayed by exposure to patterns in another, just as use of more complex or abstract language is.
Third, our data have illuminated the progression of L2 → L1 influence as a function of age of arrival in the L2 environment (a variable that reflects, at least within our groups, both extent of L1 mastery and length of immersion and extent of L2 mastery). Not surprisingly, it was strongest in the early bilinguals for whom their chronological L2 English is the dominant language, and weakest in late bilinguals for whom L1 Russian is still the dominant language. At the same time, all three groups demonstrated the L2 influence in the structure (i.e., salient attributes) and the boundaries of linguistic categories. It is striking that the late bilingual group showed some L2 influence, given that they have achieved full native mastery of L1 before leaving Russia, and, furthermore, their exposure to English has been relatively limited and their (self-rated) mastery of English is incomplete. This finding suggests that modest L2 → L1 influence may occur for virtually any group of speakers given moderate exposure to the L2. Since, for the most part, our late bilinguals had been in the US only a short time, their L2 influence might increase with longer immersion in the English language environment. However, childhood bilinguals showed a relatively small increase in the degree of L2 influence relative to the late bilinguals, despite a substantially longer period of stay in the US and higher self-rated proficiency in English. The largest L2 influence, and the biggest jump from the previous group, was in the case of the early bilinguals, who rated themselves only slightly more proficient in English than the childhood bilinguals but substantially less proficient in Russian. This pattern suggests that it may be the incomplete mastery of the L1 that leaves it most vulnerable to L2 influence, rather than the degree of exposure to or mastery of the L2 itself. 
This observation about the locus of the major L2 influence indicates that L1 use within the family may not be sufficient to ensure that children who arrive at early ages develop native-like L1 naming patterns even for the most familiar household objects. Our early bilinguals showed substantial alteration of their pattern of mapping.
Despite their close connection, though, the act of naming differs critically from the act of recognition. Naming is part of a communication process, whereas recognition is not. The name selected for an object may reflect requirements for successful communication, whereas the representation of commonalities presumably is influenced primarily by constraints such as storage efficiency and the ability to support inference. Because of this fundamental difference in the nature of the two acts, the coupling between recognition and naming may be less than perfect. In particular, we posit that names used for objects reflect influences that are independent of the process of internal representation. The goal of the work presented here is to explore the nature of the relation between recognition and naming of common artifacts. We begin with the observation that the boundaries for linguistic categories (that is, groups of objects called by the same name) may differ from language to language.
We ask whether speakers of languages that have different linguistic category boundaries for a set of objects show differences in their perception of the similarity among the objects, and whether any such differences parallel the differences in how they name the objects. If naming of objects is tightly coupled to their encoding relative to other objects, the answer to both these questions should be yes. If naming and encoding are completely independent, then differences in linguistic category boundaries will not be paralleled at all by differences in perceived similarity among the objects. If the two are partially independent, then we should expect some parallels as well as some systematic differences.
Recent cross-linguistic comparisons of the sets of objects or things to which category names refer have revealed substantial differences in the way that speakers of different languages segment stimulus space by name. Such differences arise not only in naming of abstract or socially constructed domains such as kin or emotion but even for concrete nouns referring to common objects, which one might expect to correspond closely across languages (e.g., De Groot, 1993; Kroll, 1993). For example, the linguistic boundary between chair and sofa is not the same in Chinese as in English. In English, a large stuffed seat for one person is given the same label as a wooden chair, but Chinese speakers give the stuffed one the same label that they would give a stuffed multi-person seat, which English speakers call sofa (Malt, Sloman, & Gennari, 2003). Kronenfeld, Armstrong, and Wilmoth (1985) found that speakers of English, Hebrew, and Japanese grouped 11 drinking vessels by name in different ways. For instance, the Americans gave the same name to a paper drinking vessel and a vessel for drinking tea (calling both "cup"), while the Israelis did not. Malt, Sloman, Gennari, Shi, and Wang (1999) examined naming for a set of 60 common containers (mostly called bottle or jar in English) by speakers of American English, Mandarin Chinese, and Argentinean Spanish and found substantial differences in the linguistic category extensions across speakers of the three languages. For the 15 objects named container in English, four different names were used in Chinese, and the Spanish category that contained the 19 objects called jar in English also included six objects called bottle in English and three called container. Malt et al. (2003) examined in more detail the relation among the linguistic categories for the 60 containers and found a complex pattern. Some of the categories were very similar across the three languages, but some categories of one language were nested within those of another, and others showed cross-cutting, in which pairs of objects put into a single category by one language were put into different categories by another language.
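The three kinds of relation just described (similar, nested, and cross-cutting categories) can be made concrete with a small set-based sketch. The object ids, the category names, and the exact definitions below are illustrative assumptions rather than the procedure of Malt et al. (2003): a category is treated as the set of objects sharing a dominant name, and "cross-cutting" is read as two categories that overlap without either containing the other.

```python
def extensions(naming):
    """Group object ids by name: returns {name: set of object ids}."""
    groups = {}
    for obj, name in naming.items():
        groups.setdefault(name, set()).add(obj)
    return groups


def relation(ext_a, ext_b):
    """Classify how two category extensions (sets of objects) relate."""
    if ext_a == ext_b:
        return "equivalent"
    if not ext_a & ext_b:
        return "disjoint"
    if ext_a < ext_b or ext_b < ext_a:
        return "nested"
    # Overlapping, but neither category contains the other.
    return "cross-cutting"


# Hypothetical dominant names for six objects in two invented languages.
lang_x = {"o1": "p", "o2": "p", "o3": "p", "o4": "q", "o5": "q", "o6": "r"}
lang_y = {"o1": "s", "o2": "s", "o3": "t", "o4": "t", "o5": "t", "o6": "u"}

ext_x, ext_y = extensions(lang_x), extensions(lang_y)
for name_x, objs_x in ext_x.items():
    for name_y, objs_y in ext_y.items():
        kind = relation(objs_x, objs_y)
        if kind != "disjoint":
            print(name_x, name_y, kind)
```

Working from dominant names alone discards the variability in what individual speakers call each object, so a sketch like this gives only a coarse picture of how two naming patterns align.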
Dissociation between naming and similarity
Given the cross-linguistic differences in linguistic category boundaries, if a close connection exists between categorization and similarity (Kruschke, 1992; Medin & Schaffer, 1978; Nosofsky, 1984, 1986; Rosch & Mervis, 1975), then one must expect differences across speakers of different languages in what they know or understand about the objects. However, Malt et al. (1999) found that although the naming patterns diverged across speakers of the different languages, similarities among the objects were perceived in much the same way. Likewise, Kronenfeld et al. (1985) found comparable similarity judgments for the drinking vessels for their speakers of Hebrew, English, and Japanese. Hence, the perception of object similarities (and so the way that people conceptualize the objects non-linguistically) may be largely universal, while naming of objects (and so the way that people categorize them linguistically) is language-specific. Based on the dissociation of naming and similarity, Malt et al. concluded that naming cannot be driven only by featural commonalities that speakers perceive among objects. Other constraints on name choice that have evolved over the course of the language's history (such as convention, pre-emption, and chaining; see General discussion) must contribute to the naming patterns of each language. To understand how monolingual speakers of a language link their knowledge of words to knowledge of the world, then, a distinction must be made between lexical concepts, which may be language-specific, and general non-linguistic understanding of the world, which may be universal (Levelt, Roelofs, & Meyer, 1999; see also Bierwisch & Schreuder, 1992; Levinson, 1997).
Cross-linguistic diversity and the challenge of bilingualism
The dissociation of naming from similarity and the differing patterns of naming across languages create a dilemma for speakers of more than one language.
For those who acquire one language as their native language and later learn a second language, the naming patterns of the first language are presumably mastered; the problem to be surmounted is how to then acquire a different naming pattern that is associated with the second language. Malt and Sloman (2003) found that second language learners of English from a variety of first-language backgrounds had substantial difficulty with this task; they showed discrepancies from native speakers in their English naming patterns even after many years of immersion in an English-language environment. The difficulty posed to those who grow up exposed to two languages from birth is perhaps even greater. To be completely native-like in both languages, the child learner must attend to the distinctions between the two languages' naming patterns, acquire both patterns, and maintain them as distinct over time. But evidence suggests that the two lexicons of proficient bilinguals are not isolated from one another. For instance, Schwanenflugel and Rey (1986) found cross-language semantic priming effects in a lexical decision task with Spanish–English bilinguals. Recognition of words in one language following other-language primes was as fast as that following same-language primes. Similarly, Guttentag, Haith, Goodman, and Hauch (1984) found comparable facilitation for item categorization when the target words were surrounded by unattended words highly related to the category regardless of whether the two words were within-language or across-language.
Other cross-language semantic priming studies (e.g., Altarriba, 1992; Chen & Ng, 1989; Kroll & Curley, 1988; Williams, 1994) as well as evidence from picture naming and translation tasks (Potter, So, Von Eckhardt, & Feldman, 1984), word and picture identification and classification (Shanon, 1982), and word association and lexical decision tasks (Van Hell & Dijkstra, 2002), support the notion that the two lexicons of proficient bilinguals are interconnected (see also Francis, 1999; Kroll & Sholl, 1992). Recent evidence also indicates that there is cross-talk between the syntaxes of the two languages of the bilingual (e.g., Dussias, 2001, 2003; Hartsuiker, Pickering, & Veltkamp, 2004), and some cross-language contamination of phonology as well (e.g., Bullock & Gerfen, 2004; Kehoe, Lleó, & Rakow, 2004). Thus, the representations of the bilingual's two languages may be readily and broadly permeated by one another. If the mental lexicons of the bilingual's two languages have direct interconnections or indirect feedback loops involving links between word forms and representations of referents, it may be difficult or impossible for bilinguals to maintain two separate and distinct patterns of mappings from word forms to referents. The present study was designed to address the issue of how linguistic diversity in naming patterns affects the bilingual lexicon. Specifically, we investigated the relation between bilinguals' two naming patterns and the relation of their two patterns to the corresponding monolingual naming patterns. The study was carried out in Belgium, a bilingual country where French- and Dutch-speaking monolinguals live alongside bilinguals who are brought up learning French and Dutch simultaneously. This situation provides an ideal laboratory in which to address these questions. We studied compound bilinguals having a French-speaking (monolingual) mother and a Dutch-speaking (monolingual) father or vice versa.
Compound bilinguals learn and use their languages interchangeably in the same environment and in the same situations. Compound bilinguals are to be distinguished from coordinate bilinguals, who acquire and use their languages in strictly distinct environments, and from subordinative bilinguals, who learn the second language as a foreign language (i.e., are first exposed to it later in life) (Ervin & Osgood, 1954; see also Weinreich, 1953).
The one-pattern hypothesis vs. the two-pattern hypothesis
If different languages with different histories maintain different naming patterns, what does a bilingual, acquiring two different languages simultaneously, learn about how to name objects? Two contrasting hypotheses are suggested, presented schematically in Fig. 1. The geometric figures (circles, squares, and triangles) represent objects being named in Language 1 (L1) and Language 2 (L2). Monolingual speakers of L1 name the square and the triangle in the same way as the black circles, whereas monolingual speakers of L2 name them in the same way as the white circles. The first hypothesis, which we will call the two-pattern hypothesis, states that bilinguals acquire and maintain two distinct sets of connections of word forms to their referents. For each language separately, the naming pattern parallels that of the corresponding monolinguals. In Fig. 1A, this is represented by the overlapping linguistic boundaries of monolinguals and bilinguals in L1 and L2, implying that in L1, bilinguals put the square and the triangle with the black circles (i.e., analogous to the monolinguals of L1), whereas in L2, they put them with the white circles (i.e., analogous to the monolinguals of L2). The two-pattern hypothesis assumes no interactions, connections, or feedback loops between the two languages of bilinguals. It thus predicts that the French and Dutch naming patterns will parallel the naming patterns of, respectively, the French-speaking monolinguals and the Dutch-speaking monolinguals. This representation of the bilingual lexicon requires substantial memory capacity, since two different mappings from word forms onto objects need to be stored separately. However, for bilinguals to demonstrate full native proficiency in each language, these separate mappings must be maintained.
The second hypothesis, which we will call the one-pattern hypothesis, assumes that through the simultaneous exposure to the two languages, bilinguals develop direct inter-connections or indirect feedback loops between the word forms of the two languages. At the same time, connections are developed from the word forms in each language to knowledge about referents. The continuous interaction between the two languages combines elements of the lexical concepts from both languages, so that the bilinguals' semantic knowledge deviates from that of both monolingual groups. Consequently, the connections between the word forms and the associated extensions in the two languages are tuned to one another. The two naming patterns merge into one naming pattern that differs from either monolingual naming pattern. This is represented in Fig. 1B by the single linguistic boundary of bilinguals, situated between the linguistic boundaries of monolingual speakers of L1 and L2. The bilinguals segment the stimulus space in a way different from both monolingual language groups: the square is put with the black circles, the triangle with the white circles. The resultant naming pattern can be considered as a compromise that is reached between the two languages in which differences in naming patterns between the languages are smoothed out. Depending on the relative influence of the languages, the merged naming pattern can take different forms, varying from largely dominated by one language to a balanced situation in which both languages carry equal weight in determining the naming pattern, to largely dominated by the other language. The one-pattern hypothesis predicts that the bilinguals use a single naming pattern both for the French and the Dutch naming, and that it will differ from the corresponding monolingual naming patterns.
In comparison to the two-pattern hypothesis, a merged pattern is more cognitively economical, since storing only one set of connections between word forms and referents is less demanding on the limited resources of permanent memory. However, it means that bilinguals will not show fully native-like naming performance in one or both of their languages. The hypotheses just outlined occupy two extreme positions along the continuum of possible bilingual lexical organization. However, the truth may also be situated somewhere in between: the two naming patterns of bilinguals may converge toward one common naming pattern but not match perfectly. To take such an intermediate possibility into account, we consider a weaker version of the one-pattern hypothesis later.
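One way to see how the two hypotheses could be told apart is to quantify how closely the naming pattern in one language tracks the pattern in the other, and then to compare that agreement for bilinguals and for monolinguals. The sketch below rests on simplifying assumptions: the data are invented, each object gets a single dominant name, and "name similarity" is reduced to a 0/1 indicator of whether two objects share a name. It is not the measure used in the study itself. Under the two-pattern hypothesis, the bilinguals' cross-language agreement should be no higher than the agreement between the two monolingual groups; under the one-pattern hypothesis it should be higher.

```python
from itertools import combinations
from math import sqrt


def same_name_vector(naming, objects):
    """One entry per object pair: 1 if both objects get the same name, else 0."""
    return [int(naming[a] == naming[b]) for a, b in combinations(objects, 2)]


def pearson(xs, ys):
    """Plain Pearson correlation; assumes neither vector is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cross_language_agreement(naming_l1, naming_l2):
    """Correlate the pairwise same-name indicators of two naming patterns."""
    objects = sorted(naming_l1)
    return pearson(same_name_vector(naming_l1, objects),
                   same_name_vector(naming_l2, objects))


# Invented dominant names for six objects (hypothetical data).
mono_l1 = {"o1": "a", "o2": "a", "o3": "a", "o4": "a", "o5": "b", "o6": "b"}
mono_l2 = {"o1": "x", "o2": "x", "o3": "y", "o4": "y", "o5": "y", "o6": "y"}
bil_l1 = {"o1": "a", "o2": "a", "o3": "a", "o4": "a", "o5": "b", "o6": "b"}
bil_l2 = {"o1": "x", "o2": "x", "o3": "x", "o4": "y", "o5": "y", "o6": "y"}

r_mono = cross_language_agreement(mono_l1, mono_l2)
r_bil = cross_language_agreement(bil_l1, bil_l2)
print(f"monolingual agreement: {r_mono:.3f}")
print(f"bilingual agreement:   {r_bil:.3f}")
```

In this toy data the bilinguals' second-language boundary has shifted toward their first-language boundary, so their agreement score is the higher of the two. A real analysis would also need a significance test and naming distributions rather than single dominant names.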
General discussion
Dissociation of naming and sorting
Using French- and Dutch-speaking monolingual Belgians, we replicated the findings both of different linguistic segmentation of common objects by different languages and of a dissociation between linguistic categorization (naming) and non-linguistic understanding (sorting) obtained by Malt et al. (1999) for speakers of English, Spanish, and Chinese. The analysis of the dominant names, the analysis of similarities among naming distributions, and the application of the Cultural Consensus model (CCM) revealed substantial differences between the naming patterns of French- and Dutch-speaking monolinguals. In contrast, virtually no differences were found in their perceptions of the commonalities among the objects, as revealed by the high correlation between the sorting data of the two monolingual language groups and by the CCM. The dissociation was found both for the bottles set and for the dishes set. This finding is consistent with Levelt et al.'s (1999) distinction between universal non-linguistic and language-specific lexical concepts (see also Bierwisch & Schreuder, 1992; Levinson, 1997). Based on this result, we can conclude that naming is not fully driven by the shared understanding of commonalities among the objects. Language-specific factors as well as similarity must contribute to how people segment a domain into linguistic categories. The fact that we replicated Malt et al.'s (1999) results with language groups that live in close proximity and share virtually the same culture supports Malt et al.'s argument that naming patterns are affected by a language's history. The vocabulary of each language (or dialect) appears to evolve over time and to be shaped by mechanisms such as convention, pre-emption, and chaining.
A particular name can become associated with an object by linguistic convention rather than because of specific similarity relations to other objects associated with the category name; for instance, the name can be introduced by a manufacturer. Pre-emption occurs when people avoid calling an object by a particular category name because using that name would lead to ambiguity or confusion with another object. Fig. 8 shows some examples found in our data set of naming that may reflect convention and pre-emption. The object in Fig. 8A was called "beker" by most of the Dutch-speaking monolinguals, but its average similarity was greater to the objects called "tas" (Dutch for cup) than to the other objects called "beker". This object may therefore be named "beker" and not "tas" by convention rather than because of similarity. The origin of this convention may have been a pre-emption. The word "beker" in Dutch is used for a plastic cup even though its features fall within the range of objects called "kop" or "tas". But calling it a "kop" or "tas" would create referential confusion with porcelain cups. The use of "kop" or "tas" for the plastic cup may therefore be pre-empted by the other uses of these names. The name "beker" is used here to distinguish it from porcelain cups. A similar example was found for the French-speaking monolinguals (Fig. 8B).
Chaining is at work when an object, similar to central examples of a category (C1) receives a different name (C2) through links to near items that are more typical objects of the C2 category and that may be at some distance from central examples of the C1 category. Fig. 8C shows an object of the bottles set that was called "fles" by Dutch-speaking monolinguals, although it was more similar on average to objects labeled "bus". We suggest that the object has received its name through links to more typical objects in the fles category. Fig. 8D shows a similar example for the French-speaking monolinguals. As Malt et al. (1999) note, we are not able to reconstruct all the links in the chain that may lead to the name "fles", since our set of stimuli, though selected with the intention of representing the variability that exists within each domain, is not an exhaustive collection of all the forms of dishes that currently exist or historically did exist during the evolution of the current naming pattern.
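The examples above share one diagnostic: an object carries one name although, on average, it is more similar to the objects carrying another. A minimal sketch of that check follows. The similarity values and object ids are invented for illustration, the function names are hypothetical, and the check only flags candidates; it cannot say whether convention, pre-emption, or chaining is responsible.

```python
def mean_similarity_by_name(obj, names, sim):
    """Average similarity of obj to the other objects under each name."""
    per_name = {}
    for other, name in names.items():
        if other == obj:
            continue  # an object is not compared with itself
        per_name.setdefault(name, []).append(sim[frozenset((obj, other))])
    return {name: sum(vals) / len(vals) for name, vals in per_name.items()}


def mismatched_objects(names, sim):
    """Objects whose own name is not the category they resemble most."""
    flagged = []
    for obj, name in names.items():
        means = mean_similarity_by_name(obj, names, sim)
        closest = max(means, key=means.get)
        if closest != name:
            flagged.append((obj, name, closest))
    return flagged


# Hypothetical dominant names and pairwise similarities (0 to 1).
names = {"t1": "tas", "t2": "tas", "b1": "beker", "b2": "beker", "x": "beker"}
pairs = {
    ("t1", "t2"): 0.9, ("b1", "b2"): 0.9,
    ("t1", "b1"): 0.2, ("t1", "b2"): 0.2,
    ("t2", "b1"): 0.2, ("t2", "b2"): 0.2,
    ("x", "t1"): 0.8, ("x", "t2"): 0.7,
    ("x", "b1"): 0.4, ("x", "b2"): 0.3,
}
sim = {frozenset(pair): value for pair, value in pairs.items()}

print(mismatched_objects(names, sim))
```

Here only the object labelled x is flagged: it is called "beker" but is closer on average to the objects called "tas", mirroring the situation described for Fig. 8A.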
Evidence against two separate naming patterns in bilinguals
The second, and primary, goal of the study was to evaluate the nature of lexical knowledge of bilinguals. Do the bilinguals maintain two separate sets of mappings of word forms to referents, one for each language (the two-pattern hypothesis), or do the naming patterns converge onto one naming pattern (the one-pattern hypothesis), implying some form of interconnections or feedback loops between the sets of word forms of the two languages and knowledge about their referents? The data force us to reject the two-pattern hypothesis. At the group level both for the bottles and for the dishes set, the correlation between the measures of name similarity of the bilingual naming patterns was significantly higher than the correlation between the monolingual naming patterns, indicating that the bilinguals in their
The word flacon, in contrast, refers more specifically to a small bottle containing perfume or tablets. Through the years, the use of "flacon" might have been introduced in the vocabulary of French native speakers (monolinguals) to differentiate small bottles holding perfume or tablets from the more ordinary bottles (pre-emption). The Dutch-speaking monolinguals do not have a distinct category name for this kind of more atypical bottle. For bilinguals, both naming patterns seem to be unaffected by this mechanism of pre-emption. Putting the French monolingual and bilingual naming patterns side by side, we can say that a restructuring (Pavlenko, 1999) has taken place for the extension.
Most of the bilinguals were students of a Dutch-language university or college. Keeping the language input permanently balanced is extremely difficult (Schaerlaekens, 1998). As the child grows older, if the language spoken outside the home is the same as one of the two "home" languages, this language may play a more decisive role in naming than the other "home" language. Besides language dominance, language-specific properties that favor one language over the other (e.g., the number and ambiguity of lexical alternatives) may also drive the common naming pattern more in the direction of one language. Mechanisms such as chaining, convention, and pre-emption still contribute to naming choices of bilinguals, but these sources of cross-linguistic diversity seem to operate to a smaller degree upon naming of bilinguals than upon naming patterns of monolinguals.
Another type of bilinguals is the group of subordinative bilinguals who learn the second language as a foreign language at a later age than the first (native) language. Malt and Sloman (2003) found that the English naming patterns of second language learners low in proficiency diverged substantially from naming of native speakers of English. More advanced learners improved, but even those with the most English language experience retained some discrepancies from native patterns. Malt and Sloman (2003) suggested that people learning a second language might start the second language acquisition by importing the word–object mappings from L1 (see also Jiang, 2000; Kroll & Stewart, 1994; Potter et al., 1984). So, for these learners, the influence may be unidirectional: from L1 to L2. This may result in a common naming pattern for both languages completely dominated by (and so, similar to) L1. This common naming pattern will differ from the common naming pattern of compound bilinguals, which is shaped by a mutual influence between the two languages.
However, the adjustments of second-language learners may never be quite sufficient, for various reasons. For instance, there may be a lasting interference from the native language pattern of links between word forms and object representations, both because this pattern has initially been imported and because most second language learners continue to use the first language in many contexts (Jiang, 2000; Kroll & Curley, 1988; Kroll & Stewart, 1994; Malt & Sloman, 2003).