Arabic as a Minority Language

edited by Jonathan Owens

Mouton de Gruyter
Berlin · New York

2000

Modelling intrasentential codeswitching: a comparative study of Algerian/French in Algeria and Moroccan/Dutch in the Netherlands

Louis Boumans and Dominique Caubet

Our contribution addresses grammatical regularities in codeswitching. We will discuss the description and interpretation of codeswitching patterns in general, and compare Moroccan Arabic/Dutch codeswitching in the Netherlands to Algerian Arabic/French in Algeria. Section 1 introduces an insertional model of codeswitching which combines the merits of earlier matrix language approaches in an eclectic manner, termed the Monolingual Structure Approach. The Moroccan Arabic/Dutch and Algerian Arabic/French data corpora are then described according to the principles of this framework in sections 2 and 3. Subsequently, the major patterns of insertion in both corpora will be compared in section 4, where an account for some of the similarities and dissimilarities will be proposed. A summary concludes this article.

Boumans wrote section 1, which reflects his point of view, as well as the section on Moroccan Arabic/Dutch and the comparison of codeswitching in both data corpora. Caubet wrote section 3 on Algerian Arabic/French. Finally, we have revised each other's sections as needed.

1. The matrix language

1.1. Introduction

In the following we will adopt an insertional approach to the description of codeswitching patterns. This means that codeswitching is viewed as the insertion of smaller or larger constituents from one language, to be called the embedded language, into a syntactic frame set by another language, the matrix language.¹ In Boumans (1998a) it is argued that this approach efficiently and economically describes most codeswitching phenomena, even if some data seem to systematically undermine the model. In the past decades several scholars have made proposals for insertion models of codeswitching which differ from each other with respect to the
exact definition of the matrix language. This definition determines what types of embedded elements are possible, and what constitutes counter-evidence. A central question is whether the matrix language for a particular syntactic structure can be identified on the basis of the larger context. In the present chapter we follow the model Boumans developed for his PhD research on Moroccan Arabic/Dutch codeswitching (Boumans 1998a). This model, called the Monolingual Structure Approach, combines in an eclectic manner insights from a number of scholars including, most prominently, Hasselmo (1972, 1974), Bautista (1975, 1980), Klavans (1985), Nishimura (1986) and Myers-Scotton (1993; Myers-Scotton and Jake 1995).

Since the early 1990s, Myers-Scotton's Matrix Language Frame model has been the predominant insertional framework (Myers-Scotton 1993, 1997; Myers-Scotton and Jake 1995). The main point of divergence between the Matrix Language Frame model and the Monolingual Structure Approach concerns the scope of the matrix language and, related to this, the possibility of layered insertion (for which see below). The Matrix Language Frame model assumes a matrix language/embedded language dichotomy for only one particular syntactic level, the complementizer phrase, while inferring the matrix language from the make-up of mixed constituents within the complementizer phrase. This implies that according to the Matrix Language Frame model the same matrix language provides the morphosyntactic frame for the complementizer phrase and for each mixed constituent within the complementizer phrase. Layered or recursive insertion is thus precluded.

The stance we take here is that the matrix language of a particular clause does not necessarily coincide with the language of the discourse as a whole, the speech turn or any larger sample than the clause itself. Likewise, the language that provides the syntactic frame of the clause as a whole does not necessarily provide the morphosyntactic structure of each constituent within this clause. The idea that the definition of the matrix language must refer to features of the matrix structure itself also entails that on the supra-clausal level it becomes problematic to define the matrix language in a non-circular way. This problem concerns particularly the use of extra-clausal discourse markers (e.g. conjunctions and question tags) which combine with clauses from another language, and which are located in the specifier position of the complementizer phrase according to the Matrix Language Frame model.

A second difference between the Monolingual Structure Approach and the Matrix Language Frame model concerns the definition of embedded
language constituents. In the Monolingual Structure Approach embedded language material can be recognised as an embedded language constituent if it constitutes a possible constituent according to the grammatical rules of the embedded language, using traditional criteria for constituency such as distributional properties (Jacobson 1995). This implies that indefinite pronouns, for instance, may be classified as a type of noun phrase by virtue of their distributional properties; the embedded Dutch iedereen 'everyone' in (22) below exemplifies this. Likewise, singly occurring adverbs may be classified as full adverbial constituents. The Matrix Language Frame model does not allow for embedded single morpheme constituents, called "embedded language islands" in the model. In Myers-Scotton's words, "all islands must be composed of at least two lexemes/morphemes in a hierarchical relationship" (Myers-Scotton 1993: 138).

In the remainder of this section, the definition of the matrix language according to the Monolingual Structure Approach is expounded (1.2 and 1.3). After that, challenges and limitations of the Monolingual Structure Approach will be pointed out, with particular attention to clause-external discourse markers (1.4). Section 1.5 summarises this discussion and section 1.6 explains how the Monolingual Structure Approach is employed to describe insertion patterns in a text corpus.

1.2. Identifying the matrix language

The success of any insertion approach hinges on the proper definition of the matrix language and the identification of embedded elements. The point of departure is that the make-up of each syntactic structure identified, e.g. clause or phrasal constituent, can be attributed to one and only one of the participating languages, which is thus identified as the matrix language for this structure. The matrix language governs the selection and relative order of the constituent parts that make up the structure, whether these constituents are from the same language or another language (the embedded language). Constituents are defined by traditional criteria such as distributional properties (Jacobson 1995).

The process of identifying the matrix language makes use of generalisations over a set of individual instances of codeswitching (Boumans 1998a: 61-90). A given set of data is described as a collection of insertions, using the smallest possible number of insertion types. This principle leads to the conclusion that word order and function morphemes are usually indicative of the matrix language; content words (including
certain types of derived or inflected content words) are more liable to be inserted than functional morphemes; furthermore, in codeswitching with languages marking tense and/or aspect on the verb, verbal inflection for these categories is a reliable indicator of the matrix language on the finite clause level, a principle introduced by Klavans (1985) and supported by Treffers-Daller (1994: 204). The following examples illustrate how the matrix language is identified.

1.2.1. Phrasal constituents as matrices

In example (1) we find the Dutch word uitkering 'benefit' in a sentence that is otherwise in Moroccan Arabic. The noun uitkering is part of a nominal constituent (noun phrase) l-uitkering dyal-hŭm 'their benefit'. This constituent can be regarded as a matrix.

(1) ye-ʕṭi-w    n-nas       l-uitkering   dyal-hŭm
    3-give-PL   DEF-people  DEF-benefit   of-3PL
    'They'll give the people their [social security] benefit'. MA/Dutch (Hayat)²

Moroccan Arabic is identified as the matrix language on the basis of the internal make-up of this noun phrase: all function morphemes and the relative order of all morphemes can be attributed to this language. The analytical possessive construction too embodies a recognisably Moroccan Arabic pattern: Dutch would use a possessive pronoun here (hun uitkering). Therefore uitkering in (1) is analysed as a Dutch (embedded language) content morpheme embedded in a Moroccan Arabic (matrix language) nominal constituent.

While this is a straightforward example, it is principles of generalisation that favour this analysis over the alternative, in which l-uitkering dyal-hŭm is considered a Dutch noun phrase with the Moroccan Arabic article l- and the possessive prepositional phrase dyal-hŭm as embedded elements, the entire noun phrase being embedded in a Moroccan Arabic matrix clause. After all, embedded Dutch nouns occur in all positions in which Moroccan Arabic nouns can occur in monolingual Moroccan Arabic, and the insertion of single nouns is widely attested in codeswitching with any language pair. The insertion of single articles, affixes, or possessive markers, on the other hand, is cross-linguistically rare; the Moroccan Arabic/Dutch data bear no evidence at all of insertion of either Moroccan Arabic or Dutch articles. No Moroccan Arabic articles
appear in otherwise Dutch finite clauses, nor is there any indication that the distribution of Moroccan Arabic articles follows the rules of Dutch grammar (or vice versa).

Though the insertion of single function morphemes is very rare, embedded complex word forms consisting of a content morpheme and one or more function morpheme affixes or, sometimes, clitics are quite common. Examples of this are embedded language words that contain derivational affixes. More interesting are embedded complex words where the embedded language function morpheme functions in the grammatical system of the matrix language. A rather common pattern is the insertion of embedded language nouns with embedded language plural markers. In the next example the English noun steaks is embedded in a Spanish noun phrase. The English plural marking in this word triggers agreement elsewhere in the Spanish noun phrase, namely in unos and sabrosos. This demonstrates that English steaks functions as a plural form in the Spanish nominal paradigm.

(2) daban      unos        steak-s   tan  sabroso-s
    they·gave  INDEF·M·PL  steak-PL  so   tasty·M-PL
    'They served some steaks so tasty'. Spanish/E. (Pfaff 1979: 306)

The insertion of content words accompanied by inflectional affixes or clitics is subject to regularities that are related to features of both the matrix language and the embedded language. To give an example: it is common for the plural of embedded language nouns to be marked by an embedded language affix when the matrix language is Arabic or one of the (Indo-)European languages, but when the matrix language is an agglutinative language like Swahili, Finnish or Turkish, the singular form of the noun is often inserted and inflected with a matrix language plural marker. In order to describe which complex word forms are inserted and which complex forms are not, the point of departure may be that only content morphemes are embedded. Subsequently the conditions under which inflectional affixes and clitics may accompany embedded language content morphemes can be formulated (see Boumans 1995; 1998b for an elaboration of this approach). "Content word" is used as a cover term to refer to both content morphemes and more complex word forms like plural nouns in contexts where the distinction is not critical.
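
The constituent-level reasoning of this section can be sketched in code. The following Python fragment is only an illustration under simplifying assumptions and is not part of the Monolingual Structure Approach as published: the names `Morpheme`, `infer_matrix_language` and `embedded_content` are hypothetical, every morpheme is assumed to be hand-annotated for source language and for function or content status, and word order is not modelled at all. It applies the rule of thumb that the language supplying the function morphemes of a constituent is its matrix language, using the noun phrase from example (1).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Morpheme:
    form: str      # surface form as transcribed
    language: str  # source language label (annotation assumed given)
    kind: str      # "function" or "content" (annotation assumed given)


def infer_matrix_language(morphemes: List[Morpheme]) -> Optional[str]:
    """Return the single language that supplies all function morphemes.

    Returns None when there are no function morphemes or when they come
    from more than one language, i.e. when this heuristic is undecided.
    """
    function_languages = {m.language for m in morphemes if m.kind == "function"}
    if len(function_languages) == 1:
        return next(iter(function_languages))
    return None


def embedded_content(morphemes: List[Morpheme], matrix: str) -> List[str]:
    """List content morphemes whose language differs from the matrix language."""
    return [m.form for m in morphemes if m.kind == "content" and m.language != matrix]


# Noun phrase from example (1): l-uitkering dyal-hŭm 'their benefit'
noun_phrase = [
    Morpheme("l-", "Moroccan Arabic", "function"),
    Morpheme("uitkering", "Dutch", "content"),
    Morpheme("dyal", "Moroccan Arabic", "function"),
    Morpheme("-hŭm", "Moroccan Arabic", "function"),
]

matrix = infer_matrix_language(noun_phrase)
print(matrix)                                 # Moroccan Arabic
print(embedded_content(noun_phrase, matrix))  # ['uitkering']
```

The sketch is deliberately naive. An embedded complex word such as steak-s in (2) carries an embedded language plural affix, so the function morphemes would come from two languages and the function would return `None`; handling such forms would require treating the affix as part of the inserted content word, in line with the cover term "content word" introduced above.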

1.2.2. The finite clause as a matrix

Examples (3) and (4) exemplify the insertion of nominal and prepositional constituents. The Moroccan Arabic independent pronoun hadak š-ši 'this' in (3) occupies the topic position in a Dutch finite clause. In accordance with Dutch grammar the topic constituent, often coinciding with the grammatical subject, is followed by the finite verb in a declarative main clause. When the topic constituent is not the subject, like the prepositional phrase mʕa-k in (4), the relative order of the finite verb and subject is reversed. This is known as the West-Germanic verb-second rule.

(3) hadak  š-ši       is  eh  uit   den  boze
    DEM    DEF-thing  is  er  from  the  evil
    'No, this is fundamentally wrong'. (Samir)

(4) mʕa-k     ben  ik  mezelf
    with-2SG  am   I   myself
    'With you I'm being myself'. Moroccan Arabic/Dutch (Samir)

The order of the immediate constituents in (3) and (4) can be ascribed to Dutch syntactic rules. The Moroccan Arabic prepositional phrase and noun phrase constituents in these examples occupy a position within a larger matrix structure. This structure encompassing the verb and its arguments will be defined as a clause containing one and no more than one finite verb, labelled here as the finite clause. This definition is derived from the psycholinguistic model of speech production proposed by Levelt, according to whom the finite clause is the unit in which the ordering of constituents takes place (Levelt 1989: 256).

The finite verb is probably the best criterion for the identification of the matrix language at the finite clause level. There turns out to be a constant correlation between the language of the inflection of the finite verb, and the language to which the order of the major constituents (the verb and its arguments) must be attributed. Here again generalisation provides the major argument: in those cases where the languages involved clearly do differ with respect to constituent order, it is hardly ever possible to consider the inflection of the finite verb itself to be an insertion. The relationship between constituent order and the finite verb can be generalised to comprise those cases where both languages share the same constituent order, so that the inflection of the finite verb determines the matrix language in all cases. Note that non-finite verb forms as well as verb stems can be inserted, as will be shown in sections 2 and 3. Therefore, Klavans (1985) provides the necessary precision when she contends that the "inflection bearing element" of the finite verb is indicative of the matrix language.

1.3. Layered insertion

The preceding examples showed insertion in two kinds of matrix structures. A content morpheme or complex word form from the embedded language can be embedded in a matrix language constituent. The matrix language on this level is inferred from the internal make-up of the constituent. A complex constituent can also be embedded in a finite clause. In that case the matrix language is defined as the language of the inflection of the finite verb. Since the matrix language is defined independently on more than one level, the matrix language of the finite clause does not necessarily govern the internal structure of each complex constituent within the finite clause. The possibility of having embedded language constituents in a matrix clause already makes this apparent. Embedded language constituents may also themselves be matrix structures in which an element of the other language is inserted. This is called "layered insertion" in the Monolingual Structure Approach.

If we assume that within one finite clause insertion can occur on more than one level - and there is no reason to exclude this possibility - we can account for many instances of codeswitching that are problematic for matrix language approaches identifying a matrix language on just one level. Nishimura (1986) proposes such a "layered insertion" analysis in order to account for her Japanese/English data. Consider one of her examples:

(5) I slept with her basement de
                             LOCATIVE
    'I slept with her in the basement'. Japanese/E. (Nishimura 1986: 130)

Nishimura views I slept with her basement de as an English sentence; in this English sentence a Japanese postpositional constituent, basement de, is inserted. This postpositional phrase in its turn constitutes a matrix structure that contains the English noun basement. Nishimura's analysis fits well into the Monolingual Structure Approach, which identifies a matrix language on more than one level. To consider the Japanese postposition to be the embedded element is not an attractive alternative. Firstly, this would cause a new insertion type, namely "insertion of postposition", to be added to the corpus description. This runs counter to the observation that "single" function morphemes tend not to be embedded, whereas the insertion of English
nouns in Japanese/English is not controversial. Secondly, the relative order of noun and adposition argues against the analysis of basement de as being an English prepositional phrase.³ More examples of layered insertion will be discussed in sections 2.3.1 and 3.4 (exx. 41, 42, 78, 82).

1.4. Counter-examples and limitations

While the Monolingual Structure Approach offers a coherent account of a great many aspects of codeswitching, it does not account for all aspects of codeswitching attested in the literature. In 1.4.1 we perfunctorily list five potential sources of difficulty, acknowledging that we have no ready answers to them. In 1.4.2 the status of discourse markers and clause-external topics is taken up, as the description of these two elements as constituents within a higher order syntactic structure poses particular problems for an insertional approach to codeswitching.

1.4.1. Counter-examples

"Bare" forms

An aspect of codeswitching is that in some cases function morphemes that are obligatory according to the matrix language grammar tend to be omitted. Moroccan Arabic/Dutch codeswitching offers clear examples of this phenomenon: preceding an embedded Dutch noun more often than not the Moroccan Arabic definite prefix l- is omitted, even in contexts where it is obligatory according to Moroccan Arabic grammar. This will be discussed in detail in section 2.2.1 below. In itself, the insertion approach to codeswitching cannot account for the omission of function morphemes, as this approach assumes that the matrix language governs the distribution of morphemes. Loosely stated, the absence of function morphemes must be attributed to the unproductivity of the (morphological) process that attaches inflections to "new" content morphemes. The Monolingual Structure Approach is a useful tool, however, for the detection of contexts in which certain morphemes tend to be omitted.

Collocations of content morphemes

A further complication for the insertion model comes from the insertion of collocations of content morphemes (or content words) that typically retain embedded language word order. Common examples are noun-adjective and verb-object collocations. If these embedded collocations cannot be classified as well-formed embedded language constituents, the source language word order challenges the idea that the matrix language governs the distribution of the elements that make up the matrix structure, see islamitische school 'Islamic school' in (17) below. The Dutch word order adjective-noun is retained in these two words, but they do not constitute a well-formed Dutch constituent as Dutch would require a determiner. A possible explanation for this phenomenon is that collocations are stored together with their internal word order in the speaker's mental lexicon, from which they are retrieved as a whole in speech production. Such an explanation is particularly plausible for idiomatic collocations. In any case, the effect of collocational ties on the word order of embedded language forms challenges the idea that the matrix language determines word order inside the matrix language structure.

Attributive adjectives

Attributive adjectives generally constitute a problem area for insertion models. There are two word order possibilities: either the adjective precedes the head noun, or it follows it. The word order should in all cases be in conformity with the matrix language of the nominal constituent, but here counter-examples are relatively common (cf. Santorini and Mahootian 1995). The matrix language in itself does not predict the order of embedded attributive adjectives very well. Instead, attributive adjectives recurrently display source language word order when embedded. It is possible that the matrix language is just one of several factors that together determine the word order of attributive adjectives.⁴

Major constituent order

The most serious threat to the Monolingual Structure Approach on the level of the finite clause occurs when the inflection of the finite verb fails to predict the order of the major constituents. Such counter-examples are rare, however (cf. Stenson 1990: 173-4). One possibility is that the verb itself is embedded (see Igla 1991 and Boumans 1998b for further discussion).

Modal and aspectual adverbs

Various scholars have shown that certain embedded language adverbs display syntactic features of their source language, rather than the language of the other constituents of the clause. The problematic adverbs often serve a discourse marking function, marking sequences in the structure of the text
(secondly, finally) or expressing subjective modal values (definitely); other embedded language adverbs displaying embedded language word order properties are more readily associated with aspect marking (still, already). Examples of "embedded" adverbs that display source language word order properties are discussed by Stenson (1990: 173, 182-3) and Hasselmo (1974: 223-4), among others, on English adverbs in Irish/English and Swedish/English respectively. We will not pursue this matter here, but note that the word order properties of these adverbs, although they appear to undermine the idea of a matrix language, will be very difficult to examine if the matrix language model is rejected altogether.

Independent development of the codeswitching variety

In a community where codeswitching is a regularly used mode of communication, it is possible that the codeswitching variety itself develops independently of the two constituent languages. Such a development can be conceived of both on an individual level, as when a speaker changes her linguistic behaviour over time, and on the level of smaller or larger speech communities in which several speakers imitate each other. As a consequence of this development, some regularities in codeswitching behaviour can no longer be analysed as simply the combination of elements from the two monolingual varieties as the insertion approach claims. A relatively common example of this is the use of a periphrastic construction with a matrix language auxiliary verb to incorporate verbs or, in some West African languages, adjectives from the embedded language (Myers-Scotton 1993: 150-1; Meechan and Poplack 1995). Even if this practice builds on an existing matrix language construction, its frequent use in the codeswitching variety typically leads to grammaticalisation and semantic bleaching of the matrix language auxiliary verb. The Nijmegen corpus of Moroccan Arabic/Dutch codeswitching shows this process rather clearly as individual respondents display varying degrees of grammaticalisation of the periphrastic construction in which Dutch verbs are embedded (see section 2.2.6).

1.4.2. Discourse markers and clause-external topics

The concept of matrix language applies less well to various types of elements that function on the discourse level. In the present article the major problems will be pointed out. Various complications emerge with respect to "discourse markers", a heterogeneous group of particles, adverbs and
expressions that either order the text into sequences (e.g. though, so, and), or express the speaker's attitude toward what is being said, or toward her/his interlocutor (already, really, you see?). Above it was already noted that some embedded language adverbs tend to retain source language word order properties. In the case of discourse markers that occur clause-initially or clause-finally it is not at all obvious that they belong to the finite clause as a matrix structure and, therefore, can be viewed as embedded elements. Such discourse markers do not constitute straightforward counterexamples, but the Monolingual Structure Approach cannot account for their placement either.

Similar difficulties arise with respect to extra-clausal foregrounding strategies. The Arabic emphatic pronouns that indicate a shift in topic and mark contrastive topics illustrate this. In codeswitching with Arabic, Arabic full personal pronouns commonly precede a finite clause in the other language (English, French, or Dutch), as the Moroccan Arabic/Dutch example below illustrates; and see (80) below.⁵

(6) ʕend-na  l-mašakel     zeʕma      n-gul-u   bḥal  bḥal  ik  eh  muhimm
    at-1PL   DEF-problems  EPISTEMIC  1-say-PL  same  same  I   er  anyway

    nti-yaᵢ     voor  JOUᵢ  was  het  misschien  ehm  iets      moeilijker
    2SG·F-EMPH  for   you   was  it   maybe      er   somewhat  more·difficult

    'We have - let's say - the same problems. I er ... Anyway for YOU it was maybe somewhat more difficult'. Moroccan Arabic/Dutch (Samir)

Within Myers-Scotton's Matrix Language Frame model several discourse phenomena are analysed in terms of a higher-order syntactic structure called complementizer phrase in X-bar theory. The complementizer phrase matrix contains as its immediate constituents the finite clause, and the discourse marker or dislocated or topicalised constituent in the so-called specifier position of the complementizer phrase (Jake 1994; Myers-Scotton, Jake and Okasha 1996). It then becomes possible to consider either the finite clause, or the discourse marker or topicalised constituent to be an embedded language constituent. In the Matrix Language Frame model, the stretch ntiya ... moeilijker in example (6) above would probably be analysed as an Arabic complementizer phrase in which the stretch voor jou ... moeilijker is an embedded Dutch finite clause island. This appears to be an attractive account. However, the definition of the matrix language remains an intricate problem. Only Arabic can account for the syntactic and
textual distribution of these clause-initial topic pronouns. If the topic pronoun itself is decisive in identifying Arabic as "the language which sets the frame for the entire complementizer phrase", the whole reasoning becomes circular: the pronoun (or other discourse marker) designates the matrix language, and the matrix language predicts the selection and placement of the pronoun and the finite clause constituent. No interesting generalisations about insertion types can be formulated on this basis. In fact, the whole operation will amount to a roundabout way of saying that the distribution of discourse markers is predicted by their source language.⁶

For this reason the Monolingual Structure Approach does not assume a hierarchical relationship between clause-initial or clause-final discourse markers and the finite clause. The source language of the discourse marker itself seems to predict its distribution in a text, and there is no independent criterion that consistently identifies the source language of the discourse marker as the matrix language. This means that the Monolingual Structure Approach cannot handle phenomena above the finite clause level.

1.5. The matrix language: summary

The following are the major features of the Monolingual Structure Approach, as compared to other insertion models:

Firstly, the matrix language/embedded language dichotomy is a strictly grammatical distinction logically independent of the difference in social status of the languages involved. Defining the matrix language on grammatical grounds only, we can investigate how insertion patterns correlate with the social status of the matrix language.

Secondly, in a hierarchical representation of sentence structure, it is possible to recognise a matrix language for grammatical constructions on more than one level, notably the level of the finite clause and the level of a nominal or prepositional constituent within that clause. As a consequence, it is possible to have insertions inside insertions, e.g., a language x content word in a language y nominal constituent that is part of a language x clause.

Thirdly, on the finite clause level the matrix language is the language of the inflection bearing element of the finite verb; on the constituent level, there is no such independent criterion, and the matrix language must be inferred from the internal make-up of the constituents, and from generalisations with respect to attested insertion types.

Finally, the Monolingual Structure Approach does not a priori exclude the possibility of inserting function morphemes. Instead, the observation that single function morphemes are not usually inserted follows from the application of the Monolingual Structure Approach to a set of codeswitching data.

There are
a number of recurrent codeswitching patterns that cannot be dealt with satisfactorily in an insertion model. However, even for phenomena that appear to undermine the model, the assumption of a matrix language is useful for recognising them and describing them economically.

1.6. Corpus description in the Monolingual Structure Approach

A description of codeswitching data according to the Monolingual Structure Approach consists in identifying types of embedded elements (morphemes, word forms, and constituent types) for each matrix language. Crucially, either language of a language pair can in principle occur as the matrix language, even though it will often turn out that the insertion types are highly dependent on the language which functions as the matrix language. That is, codeswitching is typically asymmetric. In assessing insertion types, the use of general terms like noun phrase and pronoun calls for particular care because insertion is often restricted to a subclass of these categories, e.g., only certain types of noun phrase. A type of embedded element is basically assumed to be congruent with a corresponding matrix language category, i.e., to have the same syntactic distribution as its matrix language counterpart. Therefore we will discuss the distribution of an embedded language category only insofar as it turns out to diverge from the expected pattern.

In sections 2 and 3, we will use the Monolingual Structure Approach to describe two bilingual data corpora: Moroccan Arabic/Dutch conversations recorded in the Netherlands, and Algerian Arabic/French as spoken in Algeria. The application of the same principles of interpretation and classification enables us to assess the insertion patterns which are particular to each codeswitching variety, as well as the patterns common to both of them.

2. Moroccan Arabic/Dutch codeswitching in the Netherlands

The Moroccan Arabic/Dutch data are extracted from Boumans (1998). Section 2.1 will provide the necessary information on the Moroccan community in the Netherlands, the Nijmegen data corpus, and the respondents who took part in the recordings. Then the morphological and syntactic description of the data is divided into two parts. The first part (2.2) concerns insertions of Dutch morphemes and constituents in Moroccan Arabic matrices; the second (2.3) concerns Moroccan Arabic insertions in Dutch.

The grammatical description is a qualitative, rather than a quantitative, analysis of the data. However, in order to exclude non-recurrent phenomena the following minimum standards are observed: unless stated otherwise, the phenomena discussed occur at least five times in the data, distributed among at least two respondents. Only in the case of less frequent insertion types (fewer than 20 occurrences) is the absolute number of tokens given, so as to allow some comparison of different insertion types. All the respondents' names are pseudonyms. For each example cited the pseudonym of the respondent will be indicated, so that the reader may link the examples to the sociolinguistic information provided in section 2.1. The pseudonyms are further used to indicate the distribution of particular codeswitching phenomena among the respondents.

2.1. The Nijmegen data corpus
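The minimum standards for recurrence stated in the introduction to section 2 (at least five occurrences, spread over at least two respondents, with absolute counts reported only below 20 tokens) amount to a simple filter. The following is a minimal sketch of that filter, not the procedure actually used for the corpus: the function name, the input format of (phenomenon, respondent) pairs, and the labels `type_a`, `R1` and so on are all hypothetical.

```python
from collections import defaultdict

MIN_TOKENS = 5           # a phenomenon must occur at least five times
MIN_RESPONDENTS = 2      # ... distributed among at least two respondents
REPORT_COUNT_BELOW = 20  # absolute counts are given only for fewer than 20 tokens


def summarise(occurrences):
    """Map each recurrent phenomenon to its token count, or to None when the
    count is 20 or more and would therefore not be reported."""
    counts = defaultdict(int)
    speakers = defaultdict(set)
    for phenomenon, respondent in occurrences:
        counts[phenomenon] += 1
        speakers[phenomenon].add(respondent)

    summary = {}
    for phenomenon, n in counts.items():
        if n >= MIN_TOKENS and len(speakers[phenomenon]) >= MIN_RESPONDENTS:
            summary[phenomenon] = n if n < REPORT_COUNT_BELOW else None
    return summary


# Hypothetical input: labels and respondent codes are invented for illustration.
example = (
    [("type_a", "R1")] * 3 + [("type_a", "R2")] * 2     # 5 tokens, 2 respondents
    + [("type_b", "R1")] * 6                            # 6 tokens, 1 respondent
    + [("type_c", "R1")] * 15 + [("type_c", "R2")] * 10 # 25 tokens, 2 respondents
)

print(summarise(example))
```

In this invented input, `type_b` is dropped because all its tokens come from a single respondent, while `type_c` is kept without a count because it exceeds the reporting threshold.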

2.1.1. The Moroccan community in the Netherlands

According to the so-called "combined birth country criterion", which includes everyone born in Morocco or having one parent born in Morocco, there were 195,536 Moroccans in the Netherlands as of January 1, 1992 (Martens, Roijen and Veenman 1994). Moroccan migration to the Netherlands originated in the 1960s, when Dutch employers began to recruit personnel from various Mediterranean countries because of a shortage of unskilled labour in the Netherlands. After the economic crisis of 1973, the recruitment of foreign workers came to an end. Immigration continued, though, as Moroccan workers had their wives and children come over and, particularly since 1984, due to new marriages and relations (Muus 1993: 56). People from the Berberophone Rif area in northern Morocco are well represented. It is estimated that about 70 per cent of the Moroccans in the Netherlands speak a variety of Berber as their mother tongue. Since Moroccan Arabic is the lingua franca in Morocco, nearly all Berberophone immigrants have a good command of this language, with the exception of some elderly women and Dutch-born children. Dutch-born Moroccans in general speak Dutch most of the time and are more fluent in this language than in their home language (El Aissati 1996). The present social situation of the Moroccan community is far from ideal. The community faces a dramatic unemployment rate, and problems of poor educational achievement and youth delinquency, though a clear improvement has been signalled with respect to schooling in recent years (see the chapter on the Netherlands in Basfao and Taarji 1994). Furthermore,
the ideological antagonism between Islamic and Western values is increasingly accentuated. As a consequence Moroccans encounter unfavourable attitudes and discrimination in Dutch society, which makes them very much aware of their ethnic identity.

2.1.2. The Nijmegen data corpus

The Moroccan Arabic/Dutch codeswitching project at the University of Nijmegen was initiated by Jacomine Nortier in 1991. She undertook the collection and ordering of the data corpus until February 1992. In 1993 Boumans became responsible for the codeswitching project, and the data that had been gathered were put at his disposal. These data consist of audio recordings of interviews and spontaneous conversations among Moroccan immigrants and immigrants' children, in addition to reports on their immigration history and patterns of language use. Youssef Azghari, a student of Moroccan descent, organised the recording sessions and also took part in most of them. He transcribed the passages that contained codeswitching. The transcripts were checked by Nortier and finally by Boumans. In this article the discussion will be limited to seven male and six female respondents, distributed over eight conversations (approximately 10 hours). Ten of these do not speak a Berber language at all; three young men have Tashelhit as their mother tongue (see below), but their Moroccan Arabic and their Moroccan Arabic/Dutch codeswitching variety did not differ noticeably from those of the Arabophones. No Berber was spoken during the recordings.⁷ The data are heterogeneous in many respects: the sociolinguistic backgrounds of the respondents, their speech behaviour in terms of language choice and the conversational settings.

2.1.3. The respondents

The most salient sociolinguistic factors influencing individual respondents' codeswitching patterns are their competence in Moroccan Arabic and Dutch and the amount of use they make of these languages in daily communicative interactions.
Both are related to the amount of time the respondent spent in Morocco and in the Netherlands, and the age at which (s)he began acquiring Dutch. The respondents were asked whether they were more confident in speaking Moroccan Arabic or Dutch. Jamal and Abdellah, who arrived in the Netherlands at the age of seven and ten, respectively, had no clear preference for either language. Those who had immigrated at an earlier age, or were born in the Netherlands, had a preference for Dutch: Maryam, and
the Hamadi siblings Nawal, Abdelkrim, Younes and Samir. All remaining respondents had arrived at the age of 16 or older and were more confident in speaking Moroccan Arabic: Fatima, Mimoun, Maryam's mother Hayat, Zineb, Warda, and Mustafa. With the exception of Mustafa, all selected respondents were capable of sustaining a conversation in either language, though with varying degrees of fluency. Abdellah, Jamal, and Mimoun are of Soussi descent, and speak Tashelhit besides Moroccan Arabic. We may further note that the Hamadi siblings speak a distinctively eastern variety of Moroccan Arabic,⁸ although in the recorded material they actually oscillate between East Moroccan and Atlantic Coast Koine forms. The ensuing description of Moroccan Arabic/Dutch codeswitching will concentrate on the patterns that are most salient and characteristic of Moroccan Arabic/Dutch codeswitching generally, leaving out many idiosyncrasies. Dutch insertions in Moroccan Arabic matrices will be considered first, followed by Moroccan Arabic insertions in Dutch matrix structures.

2.2. Dutch insertions in Moroccan Arabic matrices

The Dutch elements that occur in Moroccan Arabic matrix clauses or constituents are classified into a number of types. In the first place, there is the insertion of content words that is common to most varieties of codeswitching discussed in the literature: nouns, (predicate) adjectives, verbs, and adverbs. Secondly, a number of Dutch constituent types are embedded with low frequency: noun phrases, prepositional phrases, predicate constituents with nominal or adjectival heads, adverbial constituents, and certain types of subordinate clauses that function as a constituent in a Moroccan Arabic matrix clause (e.g., relative and conditional clauses). In addition, some Dutch discourse markers are used in the context of Moroccan Arabic clauses. In view of the size of our contribution, the discussion will be limited to the major word categories nouns, adjectives and verbs, as well as the embedded constituents headed by these categories.

Grammatical gender

Both Dutch and Moroccan Arabic have a two-gender system. Moroccan Arabic distinguishes feminine and masculine, in analogy with natural gender. In Dutch, historical masculine and feminine have merged into one common class that is opposed to neuter. Dutch grammatical gender is marked only in the singular, in the definite article (de for common, het for neuter singular) and the agreement of most attributive adjectives (suffix -e [ə] for common gender, no marking for neuter singular). For all plural nouns, the form of the definite article and the attributive adjective is the same as for common gender singular. As a result, the formal distinction of common and neuter is neither very salient nor transparent, and learners of Dutch as a second language and even second generation immigrants tend to generalise the forms of the common gender to both genders. Embedded Dutch nouns are assigned either Moroccan Arabic masculine or feminine gender, as indicated by the agreement patterns. In the following examples, a subscript i indicates the Moroccan Arabic gender agreement.

(7)

(8)

ma hna sakn-in f dorpᵢ, hiyaᵢ ~yir-aᵢ, fi-haᵢ Vir I-hulan