Profiling Indo-Aryan in the Hindukush-Karakoram: A preliminary study of micro-typological patterns

Henrik Liljegren

doi:10.1515/jsall-2017-0004

Article Open Access

Published/Copyright: February 28, 2017

From the journal Journal of South Asian Languages and Linguistics Volume 4 Issue 1

Abstract

The study is a typological profile of 31 Indo-Aryan (IA) languages in the Hindukush-Karakoram-Western Himalayan region (covering NE Afghanistan, N Pakistan, and parts of Kashmir). Native speakers were recruited to provide comparative data. This data, supplemented by reputable descriptions or field notes, was evaluated against a number of WALS- or WALS-like features, enabling a fine-tuned characterization of each language, taking different linguistic domains into account (phonology, morphology, syntax, lexicon). The emerging patterns were compared with global distributions as well as with characteristic IA features and well-known areal patterns. Some features, mainly syntactic, turned out to be shared with IA in general, whereas others do have scattered reflexes in IA outside of the region but are especially prevalent in the region: large consonant inventories, tripartite pronominal case alignment, a high frequency of left-branching constructions, and multi-degree deictic systems. Yet other features display a high degree of diversity, often bundling subareally. Finally, there was a significant clustering of features that are not characterizing IA in general: tripartite affricate differentiation, retroflexion across several subsets, aspiration contrasts involving voiceless consonants only, tonal contrasts and 20-based numerals. This clustering forms a “hard core” at the centre of the region, gradually fading out toward its peripheries.

Keywords: Indo-Aryan; Dardic; Hindukush-Karakoram; typology

1 Introduction

The extremely mountainous and in many respects remote region where the ranges of the Hindukush, the Pamirs, the Karakoram and the westernmost extension of the Himalayas meet is home to a considerable number of mostly small and lesser-known Indo-Aryan languages. This northernmost outpost of Indo-Aryan is wedged in between Iranian and Tibeto-Burman, and in present times the region is politically divided between Pakistan, Afghanistan and India. For a long time the classification of these languages vis-à-vis more “typical” Indo-Aryan languages of the Northwest was subject to a great deal of discussion and controversy, and a term that at least in the past was applied collectively to most of them was “Dardic”. Few modern linguists hold on to this term other than as a possible cover term for a cluster of languages whose Indo-Aryan identity is beyond doubt but which are nevertheless recognized by a few salient, mainly phonological, retentions from Old Indo-Aryan (Morgenstierne 1974: 3) as well as by what has been suggested as contact-related developments (Bashir 2003: 821–822), in the latter case owing both to contact among the Indo-Aryan communities themselves and to significant interaction between Indo-Aryan and adjacent non-Indo-Aryan communities.

No doubt, these communities have historically and culturally often found themselves somewhat outside the sphere of the major developments of the subcontinent (Masica 1991: 20–21), and the present study was carried out in order to account in more precise terms for the typological characteristics of these languages, and whenever possible also offer an explanation for any significant traces of areality or clustering, or the lack thereof. For this purpose, it was seen as necessary to break free from any previous preoccupation with classificatory terms, such as “Dardic” or “Northwestern Indo-Aryan” and instead include all attested New Indo-Aryan languages, regardless of lower-level classification, spoken within a set geographical window, and to identify a number of relevant features from a wide variety of linguistic domains to be compared. While suggestions regarding areality have been put forward before (see Section 2), conclusions have often been based on loose sampling and scanty data. The present study attempts to produce an empirically sound and balanced typological profile, using tight sampling and, as far as possible, high quality data from recent documentation and descriptive work in the region, thus avoiding the pitfall of “cherry picking”.

In Section 2, the geographical demarcations of investigation are precisely defined and an overview is given of the linguistic landscape found within the region thus defined. In connection with that, a few relevant suggestions pertaining to linguistic areality in the region, as they occur in previous research, are noted and briefly discussed. Section 3 gives details on the language data used in the study and outlines the general methods applied. Sections 4–7 present the results of the study, as much as possible in the form of feature tables that can be related to large-scale typological findings, in relevant cases supplemented with maps showing the geographical distribution within the region. Section 4 is dedicated to phonology, Section 5 to morphology and grammatical categories, Section 6 to syntax, and Section 7 to a few features that are related to the lexicon and to lexical organization. In Section 8, the findings of the previous sections are summarized and some general conclusions are offered.

2 The Indo-Aryan languages of the Hindukush-Karakoram and their neighbours

Although the geographically most salient feature of the region of interest is its mountainous environment, especially vis-à-vis the plains of northern India and the rest of Pakistan situated south of it, it was, in the absence of a generally and consistently applied definition, deemed practical to demarcate it somewhat arbitrarily by means of geographical coordinates. Therefore, what henceforth will be referred to as the Hindukush-Karakoram region (HK) is the window between the latitudes 34 and 37 N and the longitudes 69 and 77 E. As far as Indo-Aryan is concerned, only the 34th parallel as a demarcation of its southern boundary is really crucial, as there is no significant presence of any Indo-Aryan communities north, west or east of this window, which comprises northeastern Afghanistan, northernmost Pakistan and the northern part of the territory of Kashmir on both sides of the India-Pakistan “line-of-control” (Map 1).
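The coordinate window can be expressed as a simple bounding-box test. The following is only a minimal sketch: the function name `in_hk_window`, the inclusive treatment of the boundaries, and the two sample points are assumptions made for illustration and are not taken from the study.

```python
# Bounds of the HK window in decimal degrees, as given in the text.
LAT_MIN, LAT_MAX = 34.0, 37.0  # degrees north
LON_MIN, LON_MAX = 69.0, 77.0  # degrees east


def in_hk_window(lat: float, lon: float) -> bool:
    """Return True if a point lies inside the HK window.

    Treating the bounds as inclusive is an assumption of this sketch.
    """
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


# Illustrative points only (not locations taken from the study).
print(in_hk_window(35.5, 72.0))  # a point inside the window
print(in_hk_window(33.5, 73.0))  # a point south of the 34th parallel
```

A check of this kind only decides membership in the window; deciding which language communities count as present inside it remains a matter of the sources listed in Table 2.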

Map 1:

Languages of the Hindukush-Karakoram region (Indo-Aryan languages encircled).

Within HK, 31 distinct Indo-Aryan languages have been identified, as defined by the language catalogue Ethnologue (Lewis et al. 2015), grouped into nine significant relatedness clusters, although the exact placement of a few of them (Dameli, Tirahi and Wotapuri-Katarqalai) remains uncertain (Table 1). The traditional label “Dardic” was collectively applied to the six first-mentioned clusters or groups, all with a longstanding presence in the region. Northern Hindko and Pahari-Pothwari are most naturally treated as part of a Punjabi macro-language or continuum with an extension far south of the region, and as such probably having more in common with the closest main Indo-Aryan languages of the Indo-Pakistani plains than with the six aforementioned groups. Gojri is the language of nomadic or semi-nomadic Gujurs, today spoken in pockets throughout the region and beyond, with a significant concentration in Kashmir, whereas its closest Rajasthani relatives are found at a considerable distance from the region itself, deep into the main belt of Indo-Aryan. Domaaki is a relative newcomer to the region. As the language of a small enclave of musicians and blacksmiths, it has during its 200–300 years in the area acquired a number of features typical of neighbouring and locally dominant languages (Weinreich 2011: 165–166). Like Gojri, its closest relatives are to be found in the plains of North India.

Table 1:

Indo-Aryan languages in the Hindukush-Karakoram according to sub-classification (with 3-letter codes and the areas where they are spoken). Afg=Afghanistan; Pak=Pakistan (or Pakistan-administered); Ind=India (or India-administered).

Group	Language	ISO 639-3 code	Area (Country)
Pashai	Northwest Pashai	[glh]	Kabul, Kapisa, Konar, Laghman, Nurestan (Afg)
	Southwest Pashai	[psh]	Kabul, Kapisa (Afg)
	Southeast Pashai	[psi]	Nangarhar, Laghman (Afg)
	Northeast Pashai	[aee]	Konar, Nangarhar (Afg)
Kunar	Shumashti	[sts]	Konar (Afg)
	Grangali	[nli]	Konar, Nangarhar (Afg)
	Gawarbati	[gwt]	Konar (Afg), Chitral (Pak)
	Dameli	[dml]	Chitral (Pak)
Chitral	Kalasha	[kls]	Chitral (Pak)
	Khowar	[khw]	Chitral, Gilgit-Baltistan (Pak)
Kohistani	Tirahi	[tra]	Nangarhar (Afg)
	Wotapuri-Katarqalai	[wsv]	Nurestan (Afg)
	Gawri (Kalami)	[gwc]	Upper Dir, Swat (Pak)
	Torwali	[trw]	Swat (Pak)
	Indus Kohistani	[mvy]	Kohistan (Pak)
	Gowro	[gwf]	Kohistan (Pak)
	Chilisso	[clh]	Kohistan (Pak)
	Bateri	[btv]	Kohistan (Pak)
Shina	Sawi	[sdg]	Konar (Afg)
	Palula	[phl]	Chitral (Pak)
	Kalkoti	[xka]	Upper Dir (Pak)
	Ushojo	[ush]	Swat (Pak)
	Kohistani Shina	[plk]	Kohistan (Pak)
	Kundal Shahi	[shd]	Jammu & Kashmir (Pak)
	Shina (Gilgiti)	[scl]	Gilgit-Baltistan (Pak), Jammu & Kashmir (Ind)
	Brokskat	[bkk]	Jammu & Kashmir (Ind)
Kashmiri	Standard Kashmiri	[kas]	Jammu & Kashmir (Ind), Jammu & Kashmir (Pak)
Western Punjabi	Northern Hindko	[hno]	Hazara, Jammu & Kashmir (Pak)
	Pahari-Pothwari	[phr]	Hazara, Rawalpindi, Gujarat, Jhelum, Jammu & Kashmir (Pak)
Rajasthani	Gojri (Gujari)	[gju]	Pockets throughout northern Pakistan, Jammu & Kashmir (Pak), Jammu & Kashmir (Ind) and beyond
Central	Domaaki	[dmk]	Gilgit-Baltistan (Pak)

Outside of Indo-Aryan, another 20 or so languages are spoken within HK, belonging to five separate genera. The heaviest component, both numerically (ten languages) and, in a couple of cases, as sub-regionally important lingua francas, is Iranian. Most of those languages are spoken in the western part of the region, in Afghanistan and Pakistan as well as in adjacent areas of Tajikistan and China. A number of them are spoken next to Indo-Aryan languages, and patterns of bilingualism involving Indo-Aryan and Iranian alike have most likely existed for a prolonged period. Another five or six languages are Nuristani, i. e. forming a third branch of Indo-Iranian (Strand 1973: 297–298). ^[1] All of them are spoken in a confined area of northeastern Afghanistan, with some minor spill-over into adjacent areas of Pakistan. Two Turkic languages are spoken at the very northern fringe of the region, although not immediately adjacent to any present-day Indo-Aryan community. In the eastern part of HK, we find at least two representatives of Tibeto-Burman languages, both with a significant number of speakers in Pakistan’s Gilgit-Baltistan region and in the adjacent Indian-controlled Kashmir. Finally, Burushaski, spoken in the extreme north of today’s Pakistan, not far from the border with China, stands on its own as a language isolate.

While this region, as already noted, does not have any fully agreed upon boundaries, it has nevertheless been referred to in an approximate way as a significant unit (at least partly overlapping with the HK region as defined here), either based on some cultural commonalities vis-à-vis surrounding regions (Cacopardo and Cacopardo 2001: 13–23; Jettmar 2002: 9–44) or as an areal-linguistically valid entity (Bashir 2003: 821–823; Èdel’man 1980; Fussman 1972; Tikkanen 1999; 2008; Toporov 1970). Regarding the latter, a number of individual features have been identified or suggested as indicators of areality, or subareality: tripartite affricate/fricative differentiation (Tikkanen 2008: 254–255), retroflex vowel phonemes (Heegård and Mørch 2004), lexically contrastive tone (Baart 2014), alignment with peculiar ergative-accusative splits (Liljegren 2014), the predominance of left-branching complex structures (Bashir 1988: 401–403; 1996a: 177), vigesimal numeral systems (Tikkanen 1988: 309), ‘say’-complementizers (Bashir 1996b), multi-valued deictic systems (Bashir 2003: 823), and grammatical evidentiality (Bashir 2006).

3 Methods and data

Building on previous areal observations, as mentioned in Section 2, and some preliminary findings in the course of my own field research in the region, a set of linguistic features was chosen as points of comparison, purposefully representing different linguistic domains (here grouped as phonology; morphology and grammatical categories; syntax; lexicon and lexical organization). Those features and the values set up correspond to a large extent to features/values found in WALS – the World Atlas of Language Structures (Dryer and Haspelmath 2013). ^[2] Subsequently, those were applied in order to categorize each individual language included in the study, arriving at a microtypology of Indo-Aryan in the Hindukush-Karakoram, mostly presented in table form showing the number of languages displaying each value of a feature, as well as the corresponding global distribution. This is also (when applicable) discussed against the display of values in Indo-Aryan in general, any significant geographical or sub-genealogical patterns, or known or suspected patterning with non-Indo-Aryan in the region. Apart from detecting apparent inter-variety similarities, even significant divergence was considered of importance for the profiling (Nichols 1999: 13–24).

However, following the research strategy outlined by Koptjevskaja-Tamm (2010: 582–589) for characterizing a geographical region in areal-typological terms, it was frequently deemed necessary to pay further attention to details in variation, going beyond features and values covered in WALS. Therefore a number of often more fine-tuned, non-WALS, features were added and relevant values established and defined, yet in their formal structure reminiscent of the former features. In a few other cases, where the characterization is tendential rather than categorical, or where the results are at best preliminary due to limitations of available data or lack of adequate analysis, I altogether abstained from quantifying or applying specific values to individual languages. Instead, I only describe and exemplify what appear to be striking characteristics or strong tendencies of languages or a subset of Indo-Aryan in the region.

Instead of identifying a representative sample, the study aimed at using data from as many as possible of the 31 Indo-Aryan languages of the region. However, the quality, scope and amount of documentation vary a great deal from language to language, which is why the total number of languages represented varies between the individual features investigated in the next few sections. In order to arrive at the tightest possible sampling, I have combined information extracted from reputable descriptions with my own (to a large extent unpublished) field material as well as data obtained in a number of collaborative elicitation workshops, involving native-speaker consultants recruited from some of the target communities. In Table 2, the sources of information for each language are specified.

Table 2:

Data sources for Indo-Aryan languages in the Hindukush-Karakoram.

Language	Sources	Language	Sources
Bateri	(Hallberg 1992: 207–225, 249–251); own data	Kundal Shahi	(Baart and Rehman 2005); own data
Brokskat	(Ramaswami 1982; Sharma 1998)	Pahari-Pothwari	(Kogan 2011; Khan and Bukhari 2011); own data
Chilisso	(Hallberg 1992: 207–225, 240–242)	Palula	(Liljegren 2016); own data
Dameli	(Morgenstierne 1942; Perder 2013); own data	Pashai, Northeast	(Morgenstierne 1967: 205–249)
Domaaki	(Weinreich 2011; Tikkanen 2011)	Pashai, Northwest	(Morgenstierne 1967: 143–203)
Gawarbati	(Morgenstierne 1950); own data	Pashai, Southeast	(Morgenstierne 1967: 251–297; Lehr 2014; Lamuwal and Baker 2013)
Gawri (Kalami)	(Baart 1997; 1999); own data	Pashai, Southwest	(Morgenstierne 1967: 45–142)
Gojri	(Losey 2002)	Sawi	(Buddruss 1967; Liljegren 2009: 43–48); own data
Gowro	(Hallberg 1992: 207–225, 243–248)	Shina, Gilgiti	(Bailey 1924; Degener 2008: 13–65; Radloff and Shakil 1998: 183–192); own data
Grangali	(Bashir 2003: 837–839; Grjunberg 1971)	Shina, Kohistani	(Schmidt and Kohistani 2008); own data
Hindko	(Rehman and Robinson 2011)	Shumashti	(Morgenstierne 1945)
Indus Kohistani	(Hallberg and Hallberg 1999; Bashir 2003: 874–877; Lubberger 2014); own data	Tirahi	(Morgenstierne 1934; Grierson 1927: 265–327)
Kalasha	(Heegård Petersen 2015: 35–49; Bashir 1988; Trail and Cooper 1999)	Torwali	(Lunsford 2001; Bashir 2003: 864–869; Grierson 1929); own data
Kalkoti	(Liljegren 2009: 43–48; 2013); own data	Ushojo	(Decker 1992); own data
Kashmiri, Standard	(Koul 2003; Koul and Bhat 2014; Verbeke 2011: 168–180); own data	Wotapuri-Katarqalai	(Buddruss 1960)
Khowar	(Bashir 2003: 844–849); own data

4 Phonology

4.1 Consonant inventories

Applying the same values as in W1A, the typical consonant inventory of Indo-Aryan in the Hindukush region is a large one (Table 3). While the average consonant inventory world-wide is in the lower twenties, the average of the present sample (in this case covering the totality of the region’s Indo-Aryan languages) is as high as 36, with nine of the languages falling within the “moderately large” category (26–33 consonant phonemes) and 22 within the “large” category (34 or more).

Table 3:

Consonant inventories.

Value	HK distribution	Global distribution (WALS)
Small (14 or less)	0 (0 %)	90 (16 %)
Moderately small (15–18)	0 (0 %)	121 (21 %)
Average (19–25)	0 (0 %)	182 (32 %)
Moderately large (26–33)	9 (29 %)	116 (21 %)
Large (34 or more)	22 (71 %)	54 (10 %)
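The size categories and the percentages in Table 3 can be reproduced with a short sketch. The function names `classify_consonant_inventory` and `percentage_distribution` are hypothetical, and rounding each share to the nearest whole percent is an assumption about how the table was compiled; the thresholds and counts are those of Table 3.

```python
def classify_consonant_inventory(n_consonants: int) -> str:
    """Map a consonant count onto the size values used in Table 3."""
    if n_consonants <= 14:
        return "Small"
    if n_consonants <= 18:
        return "Moderately small"
    if n_consonants <= 25:
        return "Average"
    if n_consonants <= 33:
        return "Moderately large"
    return "Large"


def percentage_distribution(counts: dict) -> dict:
    """Turn per-value language counts into whole-number percentages."""
    total = sum(counts.values())
    return {value: round(100 * n / total) for value, n in counts.items()}


# Counts copied from Table 3 (HK column).
hk_counts = {
    "Small": 0,
    "Moderately small": 0,
    "Average": 0,
    "Moderately large": 9,
    "Large": 22,
}

print(classify_consonant_inventory(36))  # the HK average reported in the text
print(percentage_distribution(hk_counts))
```

With the HK counts this yields 29 % and 71 % for the two largest categories, matching the table.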

Without exception, all the languages contrast five major places of articulation (bilabial, dental, retroflex, postalveolar/alveolo-palatal, and velar) and added to that, individual phonemes representing additional places (such as uvular and glottal) occur. The presence of a large number of plosives is characteristic, and to a varying extent the set involves contrasts both in voicing and aspiration. In a number of these languages, the same contrasts are extended to affricate and fricative sets. This general characterization holds even in the face of alternative analyses and counts (particularly the treatment of aspirated series is an issue which is not always entirely straightforward). In some cases, the distribution and manifestation of aspiration (at least when co-occurring with voicing) has led linguists to apply a suprasegmental or a cluster analysis to some of the individual languages rather than positing an entire set of e. g. aspirated voiced units.

The average Indo-Aryan language of this region has a phoneme inventory that is at least as large as that of Indo-Aryan languages in general (Masica 1991: 106–107). Many of the former have larger affricate/fricative sets, and make more use of the retroflex place of articulation than e. g. Hindi-Urdu. On the other hand, only a few of the languages of the region display a four-way contrast by combining the features +/-voice and +/-aspiration. The number of individual phonemes contrasting in voicing differs greatly between the Indo-Aryan languages of this region. While the plosive set shows almost complete consistency, this is much less so with the fricatives and affricates. The interaction between aspiration contrast and voicing contrast is intriguing and in need of further, more detailed, study. Also in many Indo-Aryan languages in general, there is a firm voicing opposition in the plosive set, but no corresponding contrast in the (mostly minimal) fricative set. When there is a contrast, it is often between s and z, where z has been introduced via Persian loans. It should be noted that there were no voiced fricatives in Old Indo-Aryan, only three voiceless fricatives.

Most of the genealogical subgroupings show a slight variation in relative size, but there is a tendency for the languages with the largest inventories to be found in the northern half of the region, and those with relatively small inventories in its southern half. Looking at non-Indo-Aryan languages in the region, all of them (regardless of their phylogenetic identities) also have large or moderately large inventories, with two notable exceptions, namely Dari and Hazaragi, two Iranian languages spoken at the region’s western periphery.

4.2 Vowel quality inventories

Most languages worldwide have an inventory of 5–6 vowels, which is also the case for Indo-Aryan in HK. What sets it apart from the global distribution (W2A) is the virtual lack of small systems, as can be seen in Table 4. Dameli is an apparent exception with its 4-vowel contrast.

Table 4:

Vowel quality inventories.

Value	HK distribution	Global distribution (WALS)
Small (4 or less)	1 (3 %)	93 (16 %)
Average (5–6)	24 (77 %)	287 (51 %)
Large (7 or more)	6 (19 %)	184 (33 %)

There is a general problem drawing conclusions from analyses of individual languages, as some descriptions focus on quality while others focus on quantity. It is also difficult to know whether that reflects a language-particular focus or rather an analytical preference of the scholar in question. A tendency that can be seen in e. g. a number of Kohistani languages is the development of a front vs. back contrast for open vowels, essentially expanding from a basic 5-vowel system to a 6-vowel system. Especially noteworthy is also the presence of rounded front vowels in Kundal Shahi and the presence of a contrast between unrounded and rounded back (or central) vowels in Kashmiri. Both of those are situated in the southeastern part of the region.

Generally, there is more group-internal variation in vowel inventories than in consonant inventories. Across Indo-Aryan outside of this region, too, there is a great deal of modern-day variation, ranging from five to at least nine qualitatively defined vowels (Masica 1991: 109–113), clearly contrasting with the ancestral system of a basic 5-vowel contrast.

4.3 Retroflexion

The display of retroflexion is not discussed as a dedicated feature in WALS. However, since it has turned out to be highly relevant for the characterization of the phonological systems of the region, and more specifically the extent to which the feature occurs in various phoneme subsets, a new feature was introduced, with the following classification values: a) no retroflex consonants, b) retroflex plosives but no fricatives, and c) retroflex plosives and fricatives. This does not mean there are no retroflex segments beyond those classes; there are indeed, and such instances will be touched on briefly in the discussion to follow. In order to arrive at a global distribution (for this and for some of the subsequent features that are not covered in WALS), data was extracted from UPSID, the UCLA Phonological Segment Inventory Database (with its 451-language sample), through an interface provided online. ^[3] The relevant distribution can be seen in Table 5.^[4] All of the languages in the HK region have retroflex plosives, and the large majority of them have retroflex fricatives in addition to the plosives.

Table 5:

Retroflex consonants.

Value	HK distribution	Global distribution (UPSID)
None	0 (0 %)	360 (80 %)
Retroflex plosives but no fricatives	9 (29 %)	35 (8 %)
Retroflex plosives and fricatives	22 (71 %)	7 (2 %)

In Table 6 the inventory of retroflex consonants in Gilgiti Shina is displayed as an example.

Table 6:

The inventory of retroflex consonants in Gilgiti Shina.

	Voiceless	Voiceless aspirated	Voiced
Plosives	ʈ	ʈʰ	ɖ
Affricates	ʈʂ	ʈʂʰ
Fricatives	ʂ		ʐ
Nasal			ɳ
Flap			ɽ
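The three classification values for retroflexion can be sketched as a function over a phoneme inventory. This is an illustration under stated assumptions, not the procedure actually used in the study: the symbol sets, the function name `classify_retroflexion`, and the fallback label for inventories outside the three values are all hypothetical. The example inventory is the Gilgiti Shina set from Table 6.

```python
# Assumed IPA symbol sets; real inventories may use other transcriptions.
RETROFLEX_PLOSIVES = {"ʈ", "ɖ"}
RETROFLEX_FRICATIVES = {"ʂ", "ʐ"}
RETROFLEX_OTHER = {"ʈʂ", "ɖʐ", "ɳ", "ɽ", "ɭ", "ɻ"}


def classify_retroflexion(inventory: set) -> str:
    """Assign one of the three values set up for the retroflexion feature."""
    # Strip the aspiration mark so that e.g. "ʈʰ" counts as a plosive.
    bases = {phoneme.replace("ʰ", "") for phoneme in inventory}
    has_plosive = bool(bases & RETROFLEX_PLOSIVES)
    has_fricative = bool(bases & RETROFLEX_FRICATIVES)
    has_other = bool(bases & RETROFLEX_OTHER)

    if not (has_plosive or has_fricative or has_other):
        return "None"
    if has_plosive and has_fricative:
        return "Retroflex plosives and fricatives"
    if has_plosive:
        return "Retroflex plosives but no fricatives"
    # Retroflex segments without plosives fall outside the three values.
    return "Not covered by the three values"


# Retroflex consonants of Gilgiti Shina, copied from Table 6.
gilgiti_shina = {"ʈ", "ʈʰ", "ɖ", "ʈʂ", "ʈʂʰ", "ʂ", "ʐ", "ɳ", "ɽ"}
print(classify_retroflexion(gilgiti_shina))
```

Affricates are deliberately kept apart from plosives and fricatives here, since the feature values mention only the latter two classes.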

As for the geographical distribution, all the Indo-Aryan languages spoken in the central parts of the region have retroflex fricatives as well as plosives (Map 2). Retroflexion as a feature is pointed out as typically Indo-Aryan and also as a pan-South Asian areal feature, well beyond Indo-Aryan (Masica 1991: 131). What singles out this particular region from the rest of South Asia, however, is the prevalence of fricative (and affricate) retroflexion (Tikkanen 2008: 255–257; Hock 2015: 122–124).

Map 2:

Retroflexion in the Hindukush-Karakoram.

4.4 Affricates

The occurrence and size of an affricate subset is another relevant phonological feature that is not part of WALS. The classification values set up for this feature are: a) no affricates, b) only palatal affricates, c) only palatal and dental affricates, and d) palatal, dental and retroflex affricates. It is obvious from the results shown in Table 7 that the majority of the Indo-Aryan languages in HK have a tripartite contrast in their affricate sets. Another four languages have affricates in two places of articulation, while the remaining seven only have palatal affricates.

Table 7:

Affricates.

Value	HK distribution	Global distribution (UPSID)
None	0 (0 %)	151 (33 %)
Palatal affricates only	7 (23 %)	147 (27 %)
Palatal and dental affricates only	4 (13 %)	82 (18 %)
Palatal, dental and retroflex affricates	20 (65 %)	10 (2 %)

The languages with affricates in all three positions seem to be clustered in the central parts of the region (Map 3). An example of such a language is Khowar, whose inventory of affricate consonants is displayed in Table 8.

Table 8:

The inventory of affricate consonants in Khowar.

	Dental	Retroflex (apical)	Palatal (laminal)
Voiceless	ts	ʈʂ	tɕ
Voiceless aspirated	tsʰ	ʈʂʰ	tɕʰ
Voiced	dz	ɖʐ	dʑ
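The affricate values can likewise be sketched as a lookup from the set of places of articulation found in an inventory. The symbol-to-place mapping, the function name `classify_affricates`, and the fallback label are assumptions made for this illustration; the example inventory is the Khowar set from Table 8.

```python
# Assumed mapping from affricate symbols to place of articulation.
AFFRICATE_PLACE = {
    "ts": "dental", "dz": "dental",
    "ʈʂ": "retroflex", "ɖʐ": "retroflex",
    "tɕ": "palatal", "dʑ": "palatal",
    "tʃ": "palatal", "dʒ": "palatal",
}

# The four values of the feature, keyed by the set of places attested.
VALUE_BY_PLACES = {
    frozenset(): "None",
    frozenset({"palatal"}): "Palatal affricates only",
    frozenset({"palatal", "dental"}): "Palatal and dental affricates only",
    frozenset({"palatal", "dental", "retroflex"}):
        "Palatal, dental and retroflex affricates",
}


def classify_affricates(inventory: set) -> str:
    """Assign one of the four values set up for the affricate feature."""
    places = set()
    for phoneme in inventory:
        base = phoneme.replace("ʰ", "")  # ignore aspiration
        if base in AFFRICATE_PLACE:
            places.add(AFFRICATE_PLACE[base])
    # Combinations not foreseen by the four values get a fallback label.
    return VALUE_BY_PLACES.get(frozenset(places), "Other combination")


# Affricates of Khowar, copied from Table 8.
khowar = {"ts", "ʈʂ", "tɕ", "tsʰ", "ʈʂʰ", "tɕʰ", "dz", "ɖʐ", "dʑ"}
print(classify_affricates(khowar))
```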

It should be noted that affricates were not part of the Old Indo-Aryan system, and that the precursor of the modern-day palatal affricate was part of the plosive set. While the addition of dental affricates (often a further development from a palatal pronunciation) is quite widespread in Indo-Aryan in general, the retroflex affricate seems to be a peculiarly HK development (Masica 1991: 94–95), and is as such a highly marked contrast (Tikkanen 2008: 255–256).

Map 3:

Affricates in the Hindukush-Karakoram.

4.5 Aspiration

Another important phonological feature for the region is the presence vs. absence of a contrast between aspirated and unaspirated consonants. The following values were set up: a) contrast absent, b) contrast present for voiceless consonants only, and c) contrast present for voiceless and voiced consonants. As shown in Table 9, only a few languages lack this contrast altogether, whereas the majority of them display a contrast between aspirated and unaspirated voiceless consonants (as in Table 10), and a smaller portion of the sample maintains a contrast for voiceless as well as voiced consonants, i. e. a four-way contrast, /p, pʰ, b, bʰ/, which is the normal pattern in Indo-Aryan in general (Masica 1991: 101).

Table 9:

Aspiration.

Value	HK distribution	Global distribution (UPSID)
Contrast absent	3 (10 %)	322 (71 %)
Contrast present for voiceless consonants	19 (61 %)	119 (26 %)
Contrast present for voiceless and voiced consonants	9 (29 %)	10 (2 %)

Although classification may seem relatively straightforward – and for virtually all Indo-Aryan languages in the region, phoneme inventories are available – it remains a difficult and somewhat puzzling feature. There is often conflicting evidence in one and the same language, and even when the feature is not contrastive in the modern languages, there are frequently traces of an earlier distinction in the form of tonal distinctions.

Table 10:

A partial consonant inventory in Domaaki, showing sets with contrasts in aspiration.

		Bilabial	Dental	Retroflex	Palatal	Velar	Uvular
Plosives	Voiceless	p	t	ʈ		k	q
	Voiceless aspirated	pʰ	tʰ	ʈʰ		kʰ	qʰ
	Voiced	b	d	ɖ		ɡ
Affricates	Voiceless		ts	ʈʂ	tɕ
	Voiceless aspirated		tsʰ	ʈʂʰ	tɕʰ
	Voiced			ɖʐ	dʑ
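The three aspiration values can be sketched by checking, separately for voiceless and voiced consonants, whether any base consonant occurs both with and without aspiration. The `Phoneme` record, the function names, and the fallback label are hypothetical; the example uses only the Domaaki plosives from Table 10.

```python
from typing import NamedTuple


class Phoneme(NamedTuple):
    base: str        # symbol without the aspiration mark
    voiced: bool
    aspirated: bool


def has_aspiration_contrast(inventory, voiced: bool) -> bool:
    """True if some base occurs both plain and aspirated at this voicing."""
    plain = {p.base for p in inventory
             if p.voiced == voiced and not p.aspirated}
    aspirated = {p.base for p in inventory
                 if p.voiced == voiced and p.aspirated}
    return bool(plain & aspirated)


def classify_aspiration(inventory) -> str:
    """Assign one of the three values set up for the aspiration feature."""
    voiceless = has_aspiration_contrast(inventory, voiced=False)
    voiced = has_aspiration_contrast(inventory, voiced=True)
    if voiceless and voiced:
        return "Contrast present for voiceless and voiced consonants"
    if voiceless:
        return "Contrast present for voiceless consonants"
    if not voiced:
        return "Contrast absent"
    return "Other"  # voiced-only contrast, not among the three values


# Plosives of Domaaki, copied from Table 10.
domaaki_plosives = (
    [Phoneme(b, voiced=False, aspirated=False) for b in "ptʈkq"]
    + [Phoneme(b, voiced=False, aspirated=True) for b in "ptʈkq"]
    + [Phoneme(b, voiced=True, aspirated=False) for b in "bdɖɡ"]
)
print(classify_aspiration(domaaki_plosives))
```

A mechanical check of this kind presupposes a settled segmental analysis; as the discussion makes clear, whether aspiration is treated as a property of single phonemes or as a cluster is exactly what is disputed for some of the languages.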

Generally, aspiration contrasts seem to be weak or waning in the western-most parts of the region (perhaps with new phonemes arising, such as in Gawarbati where there is evidence of a development f < ph). In other cases (e. g. in Palula), the problem lies in choosing between analyzing aspiration as a cluster of e. g. a plosive and h, or as a property of distinct phonemes.

4.6 Uvular consonants

While the presence of uvulars globally is a highly marked feature (W6A), they are present in the phoneme inventories of about half of the Indo-Aryan languages in the HK. Most of them only have uvular stops, but at least three of them also have uvular continuants (Table 11).

Table 11:

Uvular consonants.

Value	HK distribution	Global distribution (WALS)
None	14 (48 %)	470 (83 %)
Uvular stops only	12 (41 %)	38 (7 %)
Uvular continuants only	0 (0 %)	11 (2 %)
Uvular stops and continuants	3 (10 %)	48 (8 %)

For some of the languages, a voiceless uvular plosive is a marginal phoneme, susceptible to a great deal of intra-community variation. It often represents a high-prestige pronunciation of certain words of Perso-Arabic origin. However, particularly in the northern part of the region, such sounds are not only part of borrowed vocabulary, which is the normal situation in Indo-Aryan in general (Masica 1991: 105), but are equally found in basic vocabulary without any obvious loan origin (Bashir 2003: 844; Tikkanen 2008: 253), such as in Khowar /ɖɑq/ ‘boy’ or /qɑf/ ‘claw’.

4.7 Vowel nasalization

While in the world-wide sample (W10A), nasalization contrasts are absent in the majority of languages, the opposite holds for Indo-Aryan in HK, among which most languages display a phonemic contrast between oral and nasal vowels (Table 12).

Table 12:

Vowel nasalization.

Value	HK distribution	Global distribution (WALS)
Contrast present	17 (55 %)	64 (26 %)
Contrast absent	14 (45 %)	180 (74 %)

However, the exact contrasting function is hard to pin down. Some of the languages obviously have no nasalization beyond pure assimilation with the nasality feature of an adjacent nasal consonant (e. g. in Khowar), others have a nasalization contrast but no full set of nasalized vowels (e. g. Palula). Yet others appear to have a fully developed contrast with a set of oral vowels and an (almost complete) set of corresponding nasalized vowels (e. g. in Gawri, see Table 13). There is a tendency for nasalization to be of less importance in the western part of the region, and while it is a pervasive feature in languages in the Kohistani group and in Kashmiri, there is more internal variation within the Shina group.

Table 13:

Vowel phonemes in Gawri (Baart 1997: 31).

Oral vowels				Nasalized
Front		Back		Front		Back
Short	Long	Short	Long	Short	Long	Short	Long
i	iː	u	uː	ĩ	ĩː	ũ	ũː
e	eː	o	oː	ẽ	ẽː		õː
æ	æː	ɑ	ɑː	æ᷉	æ᷉ː	ɑ᷉	ɑ᷉ː

Vowel nasalization is also a prominent feature of Indo-Aryan in general (Masica 1991: 117), but in that case stronger in the western parts of the Indo-Aryan belt than in its eastern parts. Outside of Indo-Aryan we find it only in some Tibeto-Burman and Munda languages and in Iranian Baluchi (Masica 1991: 132).

4.8 Tones

Most languages in the world-wide sample (W13A) have no tones, whereas the majority of Indo-Aryan in HK show some indication of making tonal distinctions. It is in this case always a matter of relatively simple tone systems, i. e. at the most involving distinctions between high and low as far as relative tone levels are concerned, and always as the feature of a word as a whole rather than of each syllable (Table 14).

Table 14:

Tone.

Value	HK distribution	Global distribution (WALS)
No tones	6 (24 %)	307 (58 %)
Simple tone system	19 (76 %)	132 (25 %)
Complex tone system	0 (0 %)	88 (17 %)

This is possibly an eastern or central feature, prevalent in languages belonging to the Shina and Kohistani groups, as well as in the languages more closely linked to the main Indo-Aryan languages of the subcontinental plains. Tonal distinctions have previously been reported also for other Indo-Aryan languages, such as Punjabi, Bengali and Rajasthani.

In this case, we may be helped by a more fine-tuned characterization (Table 15) based on a taxonomy suggested by Baart (2014) with special reference to languages of this region.

Table 15:

Tonal types.

Value	HK distribution
No tones	6 (24 %)
2-way tonal contrast (“Shina-type tone”)	13 (52 %)
3-way tonal contrast (“Punjabi-type tone”)	3 (12 %)
4 or more-way contrast (“Kalami-type tone”)	3 (12 %)
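The percentages reported in the distribution tables are simple shares of each column total. As a minimal sketch (the function name `shares` and the variable names are illustrative and not part of the study), the figures in Table 14 can be recomputed from the raw counts and the HK sample set against the global sample:

```python
def shares(counts):
    """Return each count as a whole-number percentage of the column total."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


# Counts from Table 14, in the order: no tones, simple, complex.
labels = ["No tones", "Simple tone system", "Complex tone system"]
hk_counts = [6, 19, 0]
wals_counts = [307, 132, 88]

hk_pct = shares(hk_counts)
wals_pct = shares(wals_counts)

# Difference in percentage points between the HK sample and the global sample.
diff = [h - w for h, w in zip(hk_pct, wals_pct)]

for label, h, w, d in zip(labels, hk_pct, wals_pct, diff):
    print(f"{label}: HK {h} %, WALS {w} %, difference {d:+d} points")
```

The same helper applies to any of the other HK and WALS columns, provided that all values of a feature are included in the total.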

The 2-way contrast, often described as pitch accent, appears to be the most common type, found throughout the region, with the system described for Gilgiti Shina as a typical representative of it. This may be an inherited system rather than one emerging as a response to segmental changes. This 2-way contrast is illustrated in Figure 1, showing f0 graphs of two Palula lexical items that are segmentally identical but contrast in terms of their respective fundamental frequencies.

Figure 1:

In Palula, the two lexical items /raát/ ‘night’ (left) and /ráat/ ‘blood’ (right) contrast only in terms of a rising vs. a falling pitch pattern. The two items are pronounced by the same male speaker, occurring mid-sentence in an identical frame with the meaning ‘For this we say ___’.

The 3-way contrast seems to be related to historical loss of voiced aspiration and is typical of Hindko and some of its relatives within the Punjabi continuum (Bhatia 1975).

There is a possible correlation between the relatively complex Kalami-type tone and reduced syllable structure complexity in combination with the loss of voiced aspiration and apocope, diachronically added to the inherited contrast also reflected in the Shina-type tone. This complex system is found in Gawri (aka Kalami), its neighbour Torwali and in Kalkoti, the latter a Shina language spoken in an area overlapping with Gawri (Liljegren 2013).

4.9 Syllable structure

The majority of languages in the world-wide sample (W12A) are moderately complex (with CCVC as their most elaborate syllable), considerably fewer have complex patterns (with more consonants in the onset and/or coda than CCVC), and even fewer have a simple syllable structure (basically nothing beyond CV syllables). In the Indo-Aryan languages of the HK, however, the languages with moderately complex and the languages with complex structures are more or less equal in number, with a slight preponderance of the complex ones (Table 16).

Table 16:

Syllable structure.

Value	HK distribution	Global distribution (WALS)
Simple	0 (0 %)	61 (13 %)
Moderately complex	11 (42 %)	274 (56 %)
Complex	15 (58 %)	151 (31 %)
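The three-way division used for this feature can be made explicit. The sketch below follows only the coarse description given above (nothing beyond CV; at most CCVC; anything larger) and takes CV skeletons as input. The function name and the input format are assumptions for illustration, and the WALS chapter itself may apply finer criteria that are not modelled here.

```python
import re


def classify_syllable_structure(templates):
    """Classify a language from its attested syllable skeletons, e.g. 'CCVC'."""
    max_onset = 0
    max_coda = 0
    for template in templates:
        # A skeleton is a run of onset consonants, a nucleus, and coda consonants.
        match = re.fullmatch(r"(C*)V+(C*)", template)
        if match is None:
            raise ValueError(f"not a CV skeleton: {template!r}")
        max_onset = max(max_onset, len(match.group(1)))
        max_coda = max(max_coda, len(match.group(2)))

    if max_onset <= 1 and max_coda == 0:
        return "simple"
    if max_onset <= 2 and max_coda <= 1:
        return "moderately complex"
    return "complex"


print(classify_syllable_structure(["CV", "CVC", "CCVC"]))
print(classify_syllable_structure(["CV", "CCCVC"]))
```

A language is classified by its most elaborate onset and coda, so a single three-consonant onset or two-consonant coda is enough to place it in the complex group.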

A more complete picture emerges from certain typical co-occurrence patterns: initial C+r clusters are common, as are final nasal+C clusters. Only a few languages permit e. g. initial s+C or final sibilant+stop. It appears that greater restrictions on clusters are typical of the centrally located languages; this becomes particularly clear when comparing the Indus Valley with the Kunar Valley or Kashmir. The examples from Torwali, Khowar and Brokskat, respectively, in Table 17 illustrate this tendency. Torwali is a typical example of an IA language spoken in the central parts of the HK region showing a diachronic loss of clusters. Apart from a few surface clusters, such as /kʰwami/ ‘in the feet’ < /kʰu/ ‘foot’ + /a/ pl + /mi/ ‘in’ (Lunsford 2001: 33), no unambiguous consonant clusters occur at word boundaries. Khowar, spoken in the West, has preserved clusters, especially in coda position, while there are considerable restrictions as to what consonants can co-occur word initially. In Brokskat, spoken in the extreme East of our proposed region, consonant clusters are infrequent in coda position, while the language is unusually generous as far as initial clusters are concerned. In general, Indo-Aryan languages in this region have fewer co-occurrence constraints than are observed in most of the main Indo-Aryan languages of the Subcontinent (Masica 1991: 125–127).

Table 17:

Consonants and consonant clusters at word boundaries in Torwali, Khowar and Brokskat. (Ramaswami 1982: 22–31; Sharma 1998: 42–44).

	Torwali		Khowar		Brokskat
#CV	/ʐat/	‘blood’	/bi/	‘seed’	/kani/	‘ear’
VC#	/ek/	‘one’	/buk/	‘throat’	/baːr/	‘stream’
#CCV	–		/brɑr/	‘brother’	/kro/	‘chest’
					/ʂmul/	‘silver’
VCC#	–		/frɔsk/	‘straight’	/əʂʈ/	‘eight’
			/bɔht/	‘stone’	/roks/	‘help’
#CCCV	–		–		/rɡjəl/	‘conquer’
					/straŋ/	‘lane’
VCCC#	–		/brɔnt͡sk/	‘kind of grass’	–

The same diversity and subareal tendencies seem to hold for non-Indo-Aryan languages in the region. However, Tibeto-Burman Balti, in the East, allows very complex structures, as does Iranian Pashto, in the West.

5 Morphology and grammatical categories

5.1 Occurrence of nominal plurality

Whether nominal plurality is explicitly indicated varies between languages (W34A). Typically, at least some types of nouns can be or have to be plural marked, as shown in Table 18. The type represented by the largest group of languages is the one that indicates plural and where such marking is obligatory for all nouns. That is also the case in our sample, but interestingly there is a sizeable group that have a plural marking strategy but for which its use is optional, or at least variable. For another two languages (Khowar and Kalasha), plurality is a feature only of human nouns, and it is optional, and in one other language (SE Pashai), plurality is applicable to all nouns but is optional for inanimates.

Table 18:

Occurrence of nominal plurality.

Value	HK distribution	Global distribution (WALS)
No nominal plural	0 (0 %)	28 (10 %)
Only human nouns, optional	2 (8 %)	20 (7 %)
Only human nouns, obligatory	0 (0 %)	40 (14 %)
All nouns, always optional	6 (24 %)	55 (19 %)
All nouns, optional in inanimates	1 (4 %)	15 (5 %)
All nouns, always obligatory	16 (64 %)	133 (46 %)

Optionality is largely a feature of the western part of the region, particularly prominent in the Chitral, Kunar and Pashai subgroupings. As shown in example (1), Khowar nouns referring to non-human participants are not explicitly marked for plurality; the contrast between a singular (a) and a plural (b) referent is instead reflected in the number agreement on the verb. Nouns referring to human participants, (c) and (d), can, and often do, occur with a plural suffix.

(1)

Khowar

a.
reni	waqi-r-an.
dog	bark-3sg-prs/fut.spc

‘The dog is barking.’ (Own data: KHW-ElicLSG09Morph1-AA)

b.
reni	waqi-ni-an.
dog	bark-3pl-prs/fut.spc

‘The dogs are barking.’ (Own data: KHW-ElicLSG09Morph1-AA)

c.
moš	kui	baɣ-ai.
man	where	go.pst.act-3sg

‘Where did the man go?’ (Own data: KHW-ElicLSG09Morph1-AA)

d.
moš(-an)	kui	baɣ-ani.
man(-pl)	where	go.pst.act-3pl

‘Where did the men go?’ (Own data: KHW-ElicLSG09Morph1-AA)

Even Indo-Aryan languages in general display a fair amount of diversity in this respect, but it is primarily in the Eastern languages that plural marking, often by use of relatively new agglutinative material, tends to be optional (Masica 1991: 225–229).

5.2 Coding of nominal plurality

Although a number of coding strategies are attested in the world’s languages as far as nominal plurality is concerned (W33A), suffixation is the overall most common way of coding it. That is also the case in the Indo-Aryan languages of the HK, as seen in the Kohistani Shina and Dameli examples in Table 19.

Table 19:

Examples of singular and plural forms of nouns in Torwali, Kashmiri, Gawri, Kohistani Shina and Dameli. H=high; L=low; HL=high to low; LH=low to high; H(L)=delayed high to low. (Lunsford 2001: 42; Koul and Bhat 2014: 73; Baart 1999: 36–37; Schmidt and Kohistani 2008: 42–48; Perder 2013: 56–57).

	Singular	Plural
Torwali	šir (LH)	šir (L)	‘house’
	yap (H)	yəp (L)	‘irrigation canal’
	korsi (LH)	korsi (HL)	‘chair’
Kashmiri	kul	kulʸ	‘tree’
	moːl	məːlʸ	‘father’
	gagur	gagar	‘mouse’
Gawri	nār (H)	nēr (HL)	‘root’
	khan (LH)	khän (LH)	‘mountain’
	khāṭ (LH)	khǟṭ-ǟn (H[L])	‘tenant, labourer’
Kohistani Shina	hyúu	hyúu-i	‘heart’
	batshoó	batshó-e	‘calf’
	ǰip	ǰíb-a	‘tongue’
Dameli	ṭaaŋɡu	ṭaaŋɡu(-nam)	‘pear’
	baati	baati(-nam)	‘word’
	maasum	maasum(-aan)	‘child’

There are, however, a few notable exceptions. In one language (Kashmiri), stem changes are instead the method of coding, and in another language (Torwali), plurality is primarily indicated with tone. In another two languages (Gawri, Indus Kohistani), there is not one single strategy that could be said to be typical. Gawri applies tone as well as stem changes (Table 20).

Table 20:

Coding of nominal plurality.

Value	HK distribution	Global distribution (WALS)
Plural prefix	0 (0 %)	126 (12 %)
Plural suffix	24 (86 %)	513 (48 %)
Plural stem change	1 (4 %)	6 (1 %)
Plural tone	1 (4 %)	4 (0 %)
Plural complete reduplication	0 (0 %)	8 (1 %)
Mixed morphological plural	2 (7 %)	60 (6 %)
Plural word	0 (0 %)	170 (16 %)
Plural clitic	0 (0 %)	81 (8 %)
No plural	0 (0 %)	98 (9 %)

Even for a number of the languages that do add a suffix to code plurality, the scope of that is limited. For instance, in Kundal Shahi and in the Chitral languages, a morphological contrast between singular and plural is largely restricted to non-nominative cases – a phenomenon noted also in a number of Indo-Aryan languages outside of this region (Masica 1991: 229). For some of the lesser studied languages better contrastive data is also needed to conclusively decide to what extent tone plays a role in plural marking.

5.3 Number of cases

Languages can be grouped based on the complexity of their case systems. Globally (W49A), there is a great deal of diversity. The largest grouping consists of languages without any morphological case marking at all, but conversely, languages with extensive case-marking systems are not unusual (Table 21).

Table 21:

Number of cases.

Value	HK distribution	Global distribution (WALS)
No morphological case-marking	0 (0 %)	100 (38 %)
2 case categories	7 (27 %)	23 (9 %)
3 case categories	7 (27 %)	9 (3 %)
4 case categories	7 (27 %)	9 (3 %)
5 case categories	2 (8 %)	12 (5 %)
6–7 case categories	1 (4 %)	37 (14 %)
8–9 case categories	2 (8 %)	23 (9 %)
10 or more case categories	0 (0 %)	24 (9 %)
Exclusively borderline morphological case-marking	0 (0 %)	24 (9 %)

As for Indo-Aryan in HK, variation occurs, but within a smaller range. None of the languages are entirely void of case marking; instead it varies between two (as in Kalasha, see Table 22) and nine (as in Kohistani Shina, see Table 23), with only a few languages having more than four cases. All three of the languages with 6–9 case categories are spoken in the northeastern part of the region.

Table 22:

Kalasha case inflection: moč ‘man’. (Heegård Petersen 2015).

	Singular	Plural
Direct (nominative)	moč	moč(-an)
Genitive-oblique	moč-as	moč-an

Table 23:

Kohistani Shina case inflection: ẓáa ‘brother’. (Schmidt and Kohistani 2008: 45).

	Singular	Plural
Nominative	ẓáa	ẓáa-roe
Agentive (imperfective)	ẓáa-s	ẓáa-roes
Oblique	ẓaw-á	ẓáa-ro
Agentive (perfective)	ẓaw-í	ẓáa-roǰi
Possessive	ẓaw-ée	ẓáa-roo
Dative	ẓaw-áṛ	ẓáa-roṛ
Ablative-suppressive	ẓaw-ìǰ	ẓáa-roǰ
Sociative	ẓáw-ase	ẓáa-rose
Adessive	ẓaw-édi	ẓáa-rodi

When counting case-marking categories, strict criteria were applied, and therefore only case forms that animate nouns can take were included, whereas e. g. locative cases that are clearly restricted to inanimate nouns were not counted. However, as also noted before, for some of the lesser studied languages better contrastive data would be needed to decide to what extent tone plays a role in case differentiation.

Another complicating factor, pointed out by Masica (1991: 230–231) for Indo-Aryan in general, is the tendency for case marking to appear in successive layers in one and the same language. This makes cross-linguistic comparison particularly challenging.

5.4 Number of genders

Another grammatical category investigated in WALS is gender. In one of the articles (W30A), the focus is on the number of genders in each language. While gender differentiation (as evidenced by agreement) is altogether absent in a sizeable group of languages, the largest grouping with gender consists of languages with a two-gender system, in most cases making a masculine-feminine differentiation. The latter category is by far the most frequent one in the present sample (Table 24), thus representing an inherited system with many reflexes in the Indo-Aryan world in general, and there is not a single language, of those for which there is data to support it, without any gender differentiation at all. The situation in the Indo-Aryan world in general is one where gender is lacking in the northeast, a three-gender system prevails in the south, and a two-gender system dominates in the central and western parts of the Subcontinent (Masica 1991: 217–223).

Table 24:

Number of genders.

Value	HK distribution	Global distribution (WALS)
None	0 (0 %)	145 (56 %)
Two	23 (79 %)	50 (19 %)
Three	2 (7 %)	26 (10 %)
Four	4 (14 %)	12 (5 %)
Five or more	0 (0 %)	24 (9 %)

Ushojo, in ‎(2), is a typical example of a two-gender system, where the inherent gender of a noun is evidenced in agreement patterns; the finite verb agrees with the noun oóš ‘wind’ in feminine gender, whereas the masculine noun šídal ‘coldness’ triggers masculine verb agreement.

(2)

Ushojo

axeér	*oóš*	čóku	bíl-i.
finally	wind(f)	quiet	become.pfv-fsg

‘Finally the wind gave up.’ (Own data: USH-Northwind-AH:007)

maáti	*šídal*	bíl-u.
1sg.dat	coldness(m)	become.pfv-msg

‘I feel cold [lit. Coldness came to me].’ (Own data: USH-ValQuest-AH:060)

However, somewhat hidden behind these figures lie some intriguing systems that strongly diverge from the more typical masculine-feminine ones. That is the topic of the next feature to be discussed.

5.5 Sex-based and non-sex-based gender

Another fundamental issue pertaining to gender is what the individual systems are based on, and the primary distinction is whether they are sex-based or non-sex-based. As can be seen in Table 25, the more typical one is the sex-based, but there is also a fair number of non-sex-based systems. Often, animacy plays a crucial role in such systems. In the present sample, the sex-based system is the prevailing one, having a two-way, female vs. male, differentiation at its core.

Table 25:

Sex-based or non-sex-based gender.

Value	HK distribution	Global distribution (WALS)
No gender	0 (0 %)	145 (56 %)
Sex-based	27 (93 %)	84 (33 %)
Non-sex-based (only)	2 (7 %)	28 (11 %)

Significantly, however, the inherited sex-based system is altogether missing in two languages, Khowar and Kalasha, both belonging to the Chitral group (Map 4). Here, we find instead a two-way differentiation based on animacy, where animates (including humans and higher animals) are treated differently from inanimate nouns by at least one agreement target. In another six languages, all spoken in the western-most part of the region, the general part of the region where Khowar and Kalasha are also found, animacy differentiation occurs alongside sex-based differentiation (hence the 3 and 4-gender systems in Table 24), although sex-based differentiation is regarded as the primary distinction. An example of such an overlapping system is seen in ‎(3). In SE Pashai, the main verb agrees in masculine or feminine gender with the direct object (‘boy’ in [a], ‘prayer’ in [b], and ‘cup’ in [c]), while an auxiliary agrees with it in animate or inanimate gender.

(3)

SE Pashai

a.
pari-y	kel-aa	kaṭ-ee=šeer-a
Pari(f)-obl	boy-m	cot-obl=head-loc

ne-l-aw-aa-e	aas.
sit-trz-stv.ptc-m-poss.3sg	be.an.m.prs.3

‘Pari has seated the boy on the cot.’ (Lehr 2014: 290)

b.
miy	maada-y	doa	be
dem.sg.obl	woman-obl	prayer(m)	too

ka-w-aa-e	š-i.
do-stv.ptc-m-poss.3sg	be.inan.prs-3

‘This woman has made a prayer.’ (Lehr 2014: 297)

c.
mam	pelek	meez-ee=šeer-a	ǰe-w-i-m
I	cup(f)	table(f)-obl=on-loc	place-stv.ptc-f-poss.1sg

š-i.
be.inan.prs-3

‘I have placed the cup on the table.’ (Lehr 2014: 290)

Inherited feminine-masculine two-gender systems akin to that found in many Indo-Aryan languages exist in Nuristani as well, but are relatively rare in adjacent Iranian languages in that part of the region. Gender is not a general feature of Tibeto-Burman or Turkic, two of the other non-Indo-Aryan genera represented, whereas Burushaski, curiously, has a four-gender system not altogether different from what was noted in the languages in the region with overlapping sex-based and non-sex-based gender.

Map 4:

Number of genders and their bases in the Hindukush-Karakoram.

5.6 Alignment of case marking (full nouns)

Alignment is the topic of the next few features to be covered. First, alignment (W98A) that can be observed in the case marking of full nouns (as opposed to pronominal case) can take a number of values, of which three stand out as frequent in the global sample: a neutral one (i. e. no case marking of A, S or P arguments), a standard nominative-accusative one (different case marking of P vis-à-vis A and S), and an ergative-absolutive one (different case marking of A vis-à-vis S and P). The other three occur relatively rarely. As for Indo-Aryan in HK, we see a distribution markedly different from the global one (Table 26). By far the most common alignment type, displayed in 20 of the 27 languages included in this sample, is the ergative-absolutive, as exemplified with Indus Kohistani in (4), where the A argument in (b) receives ergative case marking vis-à-vis the absolutive form of ‘man’ in (a). Only one language, Bateri (and even that needing more careful research), displays a neutral system. Two languages – once again, Khowar and Kalasha – have nominative-accusative alignment, and four display (at least conditionally) a tripartite type of alignment (i. e. differentiating A vs. P vs. S), namely Gawarbati, Wotapuri-Katarqalai, Sawi and Hindko.

Table 26:

Alignment of case marking (full nouns).

Value	HK distribution	Global distribution (WALS)
Neutral	1 (4 %)	98 (52 %)
Nominative – accusative (standard)	2 (7 %)	46 (24 %)
Nominative – accusative (marked nominative)	0 (0 %)	6 (3 %)
Ergative – absolutive	20 (74 %)	32 (17 %)
Tripartite	4 (15 %)	4 (2 %)
Active – inactive	0 (0 %)	4 (2 %)

(4)

Indus Kohistani

a.
ṣuu	maaṣ	nahirii	thuu.
this	man	hunter	be.prs.msg

‘This man is a hunter.’ (Own data: MVY-ValQuest-HU:070)

b.
maaṣ-ee	kanzuuṭ	sand-il.
man-erg	stick	make-pfv

‘The man made a stick.’ (Own data: MVY-ValQuest-HU:085)

It should be clarified here that even within an individual language, alignment principles may be differently applied depending on e. g. the tense or aspect category that is expressed. In order to operationalize the distinction and make it more comparable, it was decided to categorize a language according to its maximum display of case differentiation, which in most instances was in a perfective or past tense-aspect category.
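The maximum principle can be sketched as follows. The representation is an assumption made for illustration: each tense-aspect category is given as a triple of case labels for the A, S and P arguments, and the language is classified by the category with the largest number of distinct labels. How ties between categories should be resolved is not specified above; in this sketch the first category listed wins. The toy paradigm is hypothetical, although its perfective row mirrors the Indus Kohistani pattern in (4).

```python
def alignment(a, s, p):
    """Classify alignment from the case labels of the A, S and P arguments."""
    if a == s == p:
        return "neutral"
    if a == s:
        return "nominative-accusative"
    if s == p:
        return "ergative-absolutive"
    if a == p:
        # A and P pattern together against S; not a type discussed here.
        return "other"
    return "tripartite"


def maximum_alignment(paradigm):
    """Apply the maximum principle to a mapping of category -> (A, S, P)."""
    # Pick the category with the most distinct case labels.
    # max() returns the first such category if there is a tie.
    best = max(paradigm.values(), key=lambda forms: len(set(forms)))
    return alignment(*best)


# Hypothetical paradigm: no differentiation in the imperfective,
# ergative marking of A in the perfective.
toy_paradigm = {
    "imperfective": ("nom", "nom", "nom"),
    "perfective": ("erg", "nom", "nom"),
}
print(maximum_alignment(toy_paradigm))
```

Under this procedure a language with any ergative marking of A in a single category is classified as ergative-absolutive, even if most of its categories are neutral.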

A similar distribution and relative diversity can be seen in Indo-Aryan in general (Verbeke 2011; Deo and Sharma 2006), where ergative case marking (applied in the perfective) represents the system inherited from earlier stages of Indo-Aryan, whereas languages (such as Bengali and Oriya) that consistently apply nominative case marking usually represent a later development (Masica 1991: 343–344).

5.7 Alignment of case marking (pronouns)

Defined identically to the previous feature, but now applied to case marking of pronouns, we find a similar global distributional pattern (W99A), with neutral, nominative-accusative and ergative-absolutive as the most frequently occurring patterns, especially the first two mentioned. But again, we find an intriguingly different distribution as far as Indo-Aryan in HK is concerned. The majority pattern (represented by 14 languages) is tripartite, another eight languages are ergative-absolutive, and two are nominative-accusative (see Map 5) (Table 27).

Table 27:

Alignment of case marking (pronouns).

Value	HK distribution	Global distribution (WALS)
Neutral	0 (0 %)	79 (47 %)
Nominative – accusative (standard)	2 (8 %)	61 (36 %)
Nominative – accusative (marked nominative)	0 (0 %)	3 (2 %)
Ergative – absolutive	8 (33 %)	20 (12 %)
Tripartite	14 (58 %)	3 (2 %)
Active – inactive	0 (0 %)	3 (2 %)
None	0 (0 %)	3 (2 %)

What we see is that the languages that display a tripartite differentiation with full nouns remain tripartite in their pronominal differentiation. Another nine languages (Gawri, Indus Kohistani, Torwali, Dameli, Grangali, Shumashti, Kalkoti, Palula, Pahari-Pothwari) that display an ergative (short for ergative-absolutive) differentiation with full nouns make a tripartite differentiation in their pronominal system (as in example (5)). Interestingly, Bateri, which displays neutral alignment with full nouns, is also tripartite as far as pronouns are concerned. I should hasten to add that it is, once again, the maximum principle that has been applied: if a language displays tripartite differentiation for third person but only ergative for first person, it has been classified as tripartite.

(5)

Gawarbati

aa	baazaar	d-im-em.
1sg.nom	bazaar	go-prs-1sg

‘I’m going to the bazaar.’ (Own data: GWT-ErgQuest-MS/FM:009)

mui	se	aama	ta-um.
1sg.erg	that	house	see.pst-1sg

‘I saw that house.’ (Own data: GWT-ErgQuest-MS/FM:101)

sui	dos	mo	ta-on.
3pl.erg	yesterday	1sg.obl	see.pst-3pl

‘Yesterday, they saw me.’ (Own data: GWT-ErgQuest-MS/FM:098)
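The make-up of the group of 14 languages with tripartite pronominal alignment can be checked with a simple tally. The language names and their full-noun alignment values are taken from the text above; the data structure and variable names are illustrative only.

```python
from collections import Counter

# Full-noun alignment of the languages with tripartite pronominal alignment.
noun_alignment = {
    "Gawarbati": "tripartite",
    "Wotapuri-Katarqalai": "tripartite",
    "Sawi": "tripartite",
    "Hindko": "tripartite",
    "Gawri": "ergative",
    "Indus Kohistani": "ergative",
    "Torwali": "ergative",
    "Dameli": "ergative",
    "Grangali": "ergative",
    "Shumashti": "ergative",
    "Kalkoti": "ergative",
    "Palula": "ergative",
    "Pahari-Pothwari": "ergative",
    "Bateri": "neutral",
}

tally = Counter(noun_alignment.values())
total = sum(tally.values())

for value, count in tally.items():
    print(f"{value} with full nouns: {count}")
print(f"tripartite with pronouns: {total}")
```

The total agrees with the 14 languages reported as tripartite in Table 27.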

Map 5:

Alignment of case marking (pronouns) in the Hindukush-Karakoram.

The tripartite differentiation can in many ways be explained as a result of two partly overlapping features, both with clearly geographical distributions (Liljegren 2014: 162–167). The first one is direct object marking. While object marking is altogether absent in four of the languages in the eastern-most part of the region, it is either restricted to pronominal objects or to definite objects in the rest of the region. In none of the Indo-Aryan languages investigated does it occur obligatorily with full nouns (Table 28).

Table 28:

Object marking.

Value	HK distribution
No object marking	4 (17 %)
Pronominal object marking	8 (35 %)
Definite object marking	11 (48 %)
Obligatory object marking	0 (0 %)

The second feature is agent marking. While it is altogether absent in two languages, both located in the northwestern part of the region, it is for the most part conditional, mostly meaning that it occurs with A arguments in the perfective only. Only in four of the languages, all spoken in the eastern-most part of the region, i. e. in the vicinity of Tibeto-Burman, does it appear regardless of the tense-aspect categories expressed. A similar tendency for agent marking to spread from the perfective to the imperfective has been noted for Nepali, another Indo-Aryan language spoken adjacent to Tibeto-Burman (Verbeke 2011: 163) (Table 29).

Table 29:

Agent marking.

Value	HK distribution
No agent marking	2 (7 %)
Conditional agent marking	21 (78 %)
Obligatory agent marking	4 (15 %)
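How the two features combine can be sketched in a few lines. The sketch rests on several simplifying assumptions that go beyond the description above: conditional agent marking is read as perfective only, pronominal objects are treated as definite, and S is taken to be unmarked throughout. The function and parameter names are illustrative.

```python
def agent_marked(agent_marking, perfective):
    """Is the A argument overtly marked in this clause?"""
    if agent_marking == "none":
        return False
    if agent_marking == "conditional":
        # Assumption: "conditional" is read as "in the perfective only".
        return perfective
    if agent_marking == "obligatory":
        return True
    raise ValueError(f"unknown agent marking value: {agent_marking!r}")


def object_marked(object_marking, pronominal, definite):
    """Is the P argument overtly marked in this clause?"""
    if object_marking == "none":
        return False
    if object_marking == "pronominal":
        return pronominal
    if object_marking == "definite":
        # Assumption: pronominal objects count as definite.
        return pronominal or definite
    if object_marking == "obligatory":
        return True
    raise ValueError(f"unknown object marking value: {object_marking!r}")


def clause_alignment(agent_marking, object_marking, perfective, pronominal, definite=False):
    """Derive the alignment displayed in one clause type (S assumed unmarked)."""
    a = agent_marked(agent_marking, perfective)
    p = object_marked(object_marking, pronominal, definite)
    if a and p:
        return "tripartite"
    if a:
        return "ergative-absolutive"
    if p:
        return "nominative-accusative"
    return "neutral"


# Hypothetical language: conditional agent marking, pronominal object marking.
print(clause_alignment("conditional", "pronominal", perfective=True, pronominal=True))
print(clause_alignment("conditional", "pronominal", perfective=True, pronominal=False))
```

For such a hypothetical language the perfective comes out as tripartite with pronouns but ergative-absolutive with full nouns, which is the kind of overlap the two features are meant to explain.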

5.8 Alignment of verbal person marking

Clearly related to, but yet morphologically distinct from, the immediately aforementioned features, alignment of verbal person marking is also part of the WALS set (W100A). In this case, the globally most frequent type is the accusative, followed by neutral alignment. All other alignment types are relatively infrequent, including ergative alignment or alignment types displaying ergativity in a subsystem of the language. Among the Indo-Aryan languages of HK, there is an even division between languages with accusative alignment and languages with a split alignment (Table 30).

Table 30:

Alignment of verbal person marking.

Value	HK distribution	Global distribution (WALS)
Neutral	0 (0 %)	84 (22 %)
Accusative	11 (50 %)	212 (56 %)
Ergative	0 (0 %)	19 (5 %)
Active	0 (0 %)	26 (7 %)
Hierarchical	0 (0 %)	11 (3 %)
Split	11 (50 %)	28 (7 %)

It should be particularly noted that accusative alignment of verbal person marking is not only a feature of languages with accusative case marking, such as Khowar, see (6). We also find languages, both in the eastern-most part and in the western-most part of the region, that have a consistent accusative alignment of verbal person marking while simultaneously displaying ergative case marking in one or all of their tense-aspect categories, such as Gilgiti Shina, see (8). However, in Palula, see (7), as in many of the other languages of the region, ergative case alignment lines up with ergative verbal agreement (in the perfective).

(6)

Khowar

ḍaq	keɬki-o	čhin-t-ai.
boy	window-acc	break-pst.act-3sg

‘The boy broke the window.’ (Own data: KHW-ValQuest-AA:025)

(7)

Palula

phoo-á	darúṛi	phooṭéel-i.
boy-erg	window(f)	break.pfv-f

‘The boy broke the window.’ (Own data: PHL-ValQuest-NH:025)

(8)

Gilgiti Shina

ro	baál-se	khiṛkí	phuṭ-eéɡ-u.
rem.msg	boy-erg	window(f)	break-pfv-3msg

‘The boy broke the window.’ (Own data: SCL-ValQuest-AH:025)

While the typical pattern in Indo-Aryan in general is the split ergative one (such as in Urdu-Hindi), there are other examples of Indo-Aryan languages that either combine non-nominative case marking of transitive subjects with nominative-accusative verbal agreement (e. g. Nepali), or that display verbal as well as nominal accusativity (e. g. Bengali) (Deo and Sharma 2006: 374–382).

5.9 Verbal person marking

Elaborating on the immediately aforementioned feature, the present feature (W102A) specifies the number and identity (whether A or P) of the person-marking on the verb. What is most significant here is that the value which stands for “person marking of the A or P but not of both”, is the least favoured in the global sample, whereas it occurs in as many as nine of the 25 Indo-Aryan languages included in this sample (Table 31).

Table 31:

Verbal person marking.

Value	HK distribution	Global distribution (WALS)
No person marking	0 (0 %)	82 (22 %)
Only the A argument	11 (44 %)	73 (19 %)
Only the P argument	0 (0 %)	24 (6 %)
A or P argument	9 (36 %)	6 (2 %)
Both the A and P arguments	5 (20 %)	193 (51 %)

This alternation is related, yet again, to a (primarily) aspectual split, not uncommon in Indo-Aryan in general, displayed also in a major language such as Urdu-Hindi, so that the verb agrees with the A in the imperfective, while the verb agrees with the P in the perfective.

Table 32:

Object agreement.

Value	HK distribution
No P agreement	12 (48 %)
Nominative P agreement only	3 (12 %)
Non-nominative and nominative P agreement alike	10 (40 %)

However, contrary to the situation in Urdu-Hindi, where differential (non-nominative) case marking of an object neutralizes verb agreement, most of the Indo-Aryan languages of our target region that display object agreement (in at least one of their tense-aspect categories) do so regardless of the case marking of the P argument (Table 32). In Sawi, as can be seen in (9), the verb agrees in masculine gender and plural number with the direct object pronoun teeno ‘them’, itself occurring in an oblique form (the corresponding nominative form is se ‘they’).

(9)

Sawi

ti	teeno	mor-il-ee.
3sg.erg	3pl.obl	kill-pfv-mpl

‘He killed them.’ (Own data: SDG-PronQuest-S:062)
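The three object agreement types in Table 32 can be contrasted in a small decision function. This is a deliberately simplified sketch: it assumes a purely aspectual split in which the verb agrees with A in the imperfective, and it returns None where agreement is neutralized. The value labels "none", "nominative only" and "all" are shorthand introduced here for the three rows of the table.

```python
def agreement_controller(perfective, p_case_marked, p_agreement):
    """Return 'A', 'P' or None (agreement neutralized) for a transitive clause."""
    if p_agreement not in {"none", "nominative only", "all"}:
        raise ValueError(f"unknown object agreement type: {p_agreement!r}")
    # Without P agreement, or outside the perfective, the verb agrees with A.
    if p_agreement == "none" or not perfective:
        return "A"
    # Urdu-Hindi type: a case-marked object neutralizes verb agreement.
    if p_agreement == "nominative only" and p_case_marked:
        return None
    return "P"


# Sawi type, as in (9): an oblique object pronoun still controls agreement.
print(agreement_controller(perfective=True, p_case_marked=True, p_agreement="all"))
# Urdu-Hindi type with a case-marked object.
print(agreement_controller(perfective=True, p_case_marked=True, p_agreement="nominative only"))
```

The only difference between the last two types lies in the clause with a case-marked object; with a nominative object both return P as the agreement controller.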

Again, the various patterns observed in the Indo-Aryan languages of HK have been noted for Indo-Aryan outside of the region, although it seems that it is primarily in the northwestern part of the Subcontinent where simultaneous A and P agreement can be found (Masica 1991: 343–346).

6 Syntax

6.1 Basic word order

The order of subject, object and verb in transitive, declarative clauses is considered a fundamental typological feature. Globally (W81A), the two types SOV (subject, followed by object, followed by verb) and SVO dominate greatly, while the remaining types are considerably less frequent. In the sample under study, the languages are almost exclusively SOV. The only notable exception to this pattern is Kashmiri (see [10]), which, at least superficially, is SVO but is better described in more detailed studies as a V2 language (Table 33).

Table 33:

The order of subject, object and verb.

Value	HK distribution	Global distribution (WALS)
SOV	30 (97 %)	565 (41 %)
SVO	1 (3 %)	488 (35 %)
VSO	0 (0 %)	95 (7 %)
VOS	0 (0 %)	25 (2 %)
OVS	0 (0 %)	11 (1 %)
OSV	0 (0 %)	4 (0 %)
No dominant order	0 (0 %)	189 (14 %)

(10)

Kashmiri

kɔṛi	ron	maaz.
woman.erg	cook.pst.msg	meat

‘The woman cooked meat.’ (Own data: KAS-ValQuest-KR:079)

In this respect, the Indo-Aryan languages of HK are in line with the typology of much of the rest of Indo-Aryan as well as with that of the languages of the entire surrounding macro-area (Masica 2001: 240–243).

6.2 Adpositions

The feature defined as the order of adposition and noun phrase has two globally frequent values, namely postpositions and prepositions (W85A). As for the present sample, postpositions are exclusively favoured (Table 34). Postpositions are often preceded or mediated by an oblique (or other non-nominative) case form of the noun, as in the Northern Hindko examples in Table 35.

Table 34:

The order of adposition and noun phrase.

Value	HK distribution	Global distribution (WALS)
Postpositions	31 (100 %)	576 (49 %)
Prepositions	0 (0 %)	511 (43 %)
Inpositions	0 (0 %)	8 (1 %)
No dominant order	0 (0 %)	58 (5 %)
No adpositions	0 (0 %)	30 (3 %)

This feature is, like the above-mentioned, widespread in South Asia and dominant as far as Indo-Aryan is concerned, although in this case it should be noted that the Indo-Aryan languages of HK are indeed spoken close to or immediately adjacent to Iranian languages with a dominant SOV/prepositional typology (Stilo 2009: 6).

Table 35:

Examples of postpositions and postpositional phrases in Northern Hindko.

Postpositions		Postpositional phrases
bič	‘in’	qasbe bič	‘in the town’
naal	‘with’	ḍanḍe naal	‘with a club’
suṇ	erg	kuṛi suṇ	‘the woman (A)’
ko	‘to’, acc/dat	daadi ko	‘to grandmother’
da/di/de	‘of’, gen	maasi de (peese)	‘the lady’s (money)’

6.3 Zero copula for predicate nominals

The possibility of expressing predicate nominals without an overt copula is investigated in WALS (W120A). There is an almost equal distribution between languages for which it is possible and those for which it is impossible. A similar distribution is also found in the HK sample, although it was only possible to determine this for 18 out of 31 languages, and even then with a measure of uncertainty (Table 36).

Table 36:

Zero copula for predicate nominals.

Value	HK distribution	Global distribution (WALS)
Impossible	10 (56 %)	211 (55 %)
Possible	8 (44 %)	175 (45 %)

The possibility of expressing predicate nominals with a zero copula has been noted for individual languages belonging to the Chitral, Shina and Kohistani groups. Often this is only an option in present tense, whereas with a past-time reference a copula is needed, as is shown with the Kalkoti examples in ‎(11). The findings do not allow for stipulating any particular geographical distribution.

(11)

Kalkoti

roo	mii	ɡan	dra.
3sg.dist.nom	1sg.gen	big	brother

‘He is my older brother.’ (Own data: XKA-SentQuest-MJ:046)

soo	ä	proofisär	aas.
3sg.rem.nom	idef	professor	be.pst

‘He was a professor.’ (Own data: XKA-SentQuest-MJ:048)

This feature does not allow for any systematic comparison with other Indo-Aryan languages, but it should be noted that while overt copulas are typical in many of the main Indo-Aryan languages of the Subcontinent, a zero copula is instead the norm in eastern Indo-Aryan and in Sinhalese (Masica 1991: 336–337).

6.4 Position of polar question particles

The use and the position of polar question particles are covered in WALS (W92A). While such particles are frequent in the world’s languages, with a preference for the clause-final position, they are also entirely absent in a large group of languages, in which polar questions can instead be coded by intonation or a different word order. The present sample, although with less-than-ideal coverage, indicates a clear preference for final-position particles, such as the question particle -a in Brokskat, cliticized to the finite verb: so ut-a? ‘Did he come?’ (Table 37).

Table 37:

Position of polar question particles.

Value	HK distribution	Global distribution (WALS)
Initial	0 (0 %)	129 (15 %)
Final	11 (73 %)	314 (36 %)
Second position	0 (0 %)	52 (6 %)
Other position	1 (7 %)	8 (1 %)
In either of two positions	0 (0 %)	26 (3 %)
No question particle	3 (20 %)	355 (40 %)
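
The percentages in Table 37 follow directly from the raw counts. As a minimal sketch, the two distributions can be recomputed and set side by side in Python. The function name shares and the dictionary layout are assumptions introduced here for illustration only; they are not part of WALS or of the tooling used in the study.

```python
# Counts copied from Table 37; all names below are illustrative assumptions.
HK = {"Initial": 0, "Final": 11, "Second position": 0, "Other position": 1,
      "In either of two positions": 0, "No question particle": 3}
WALS = {"Initial": 129, "Final": 314, "Second position": 52, "Other position": 8,
        "In either of two positions": 26, "No question particle": 355}


def shares(counts):
    """Return each value's share of its sample as a whole-number percentage."""
    total = sum(counts.values())
    return {value: round(100 * n / total) for value, n in counts.items()}


hk_pct, wals_pct = shares(HK), shares(WALS)
for value in HK:
    # Last column: difference in percentage points, regional minus global.
    print(f"{value:28} HK {hk_pct[value]:3d} %  WALS {wals_pct[value]:3d} %  "
          f"diff {hk_pct[value] - wals_pct[value]:+d}")
```

On these counts, final-position particles come out 37 percentage points above their global share, which is the contrast the text draws attention to.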

Question particles are found in most Indo-Aryan languages outside of the HK region too. However, initial markers (e. g. in Urdu-Hindi, perhaps influenced by Persian) may in fact be as frequent as final markers (e. g. in Marathi, a feature shared with Dravidian). Polar questions formed simply by a distinct intonational pattern (e. g. in Gujarati) also occur (Masica 1991: 388–389).

6.5 Clause chaining

A feature of a geographically similar scope to that of the basic word order SOV is the use of converb constructions (or conjunctive participle constructions, as they are sometimes referred to in the South Asian linguistic tradition) in complex sentences. An example from Gilgiti Shina (12) illustrates how such clause-final converbs link successive and closely related events in a string, before being concluded with a finite verb. This is extremely common in the Indo-Aryan languages of the HK region. Strikingly similar chaining constructions, often corresponding to coordinate clauses in English and other European languages, are found in a wider macro-area, from Dravidian languages in southern India (Krishnamurti 2003: 440) to the Caucasus (Haspelmath 1993: 375–378).

(12)

Gilgiti Shina

wa-ií	tóom	póoc̣-ey	hat-éč	lam-ií
come-cv	refl	grandson-gen	hand-by	catch-cv

ḍukúr-ir	har-iíɡ-u
hut-dat	take-pfv-pst.3msg

‘He came and took his grandson’s hand and brought him to the hut.’ (Radloff and Shakil 1998: 78)

As this feature is tendential rather than categorical, no attempt has been made to quantify or further specify the feature as it occurs in the region.

6.6 Complex constructions

Another tendential feature is the co-existence of left-branching and right-branching structures in the formation of complex constructions, a feature that appears to show some interesting sub-areal differences in distribution (Hook 1987). As shown in a smaller quantitative pilot study (Rönnqvist 2014), the further away a language is situated from the dominance of strongly right-branching languages of wider communication, in this case Persian and Urdu-Hindi, the higher is the proportion of left-branching structures. The Palula sentence in (13) is an example of a left-branching structure, in which an adverbial subordinate clause with a clause-final general subordination marker ta precedes the main clause.

(13)

Palula

[ǰaanɡul-á	ma	bhanǰ-óol-u	ta]	ru-áan-u.
Jangul-erg	1sg.nom	beat-pfv-msg	sub	weep-prs-msg

‘I’m weeping, because Jangul beat me.’ (Own data: PHL-Hunter:104)

7 Lexicon and lexical organization

7.1 Numeral bases

In the categorization of numeral bases, the global sample shows a definite preference for decimal systems, whereas all other types are significantly less frequent. In this case, the present sample is particularly interesting in its overwhelming favouring of hybrid vigesimal-decimal systems, such as the one in Dameli (Table 38).

Table 38:

Numerals in Dameli.

‘10’	daš		‘60’	traa bišii	= three 20
‘20’	bišii		‘70’	traa bišii oo daš	= three 20 and 10
‘30’	bišii oo daš	= 20 and 10	‘80’	čoor bišii	= four 20
‘40’	du bišii	= two 20	‘90’	čoor bišii oo daš	= four 20 and 10
‘50’	du bišii oo daš	= two 20 and 10	‘100’	pããč bišii, sawa	= five 20, 100
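
The hybrid vigesimal-decimal logic of Table 38 can be spelled out as a short sketch in Python: a multiple of ten is split into a number of twenties plus an optional remaining ten. The number words are taken from Table 38; the function dameli_tens and its restriction to the multiples of ten from 10 to 100 are assumptions made here for illustration, and the alternative form sawa for ‘100’ is not modelled.

```python
# Number words copied from Table 38; the function itself is an illustrative assumption.
MULTIPLIERS = {1: "", 2: "du", 3: "traa", 4: "čoor", 5: "pããč"}
TWENTY = "bišii"
TEN = "daš"
AND = "oo"


def dameli_tens(n: int) -> str:
    """Compose the Dameli word for a multiple of ten between 10 and 100."""
    if n % 10 != 0 or not 10 <= n <= 100:
        raise ValueError("only multiples of ten from 10 to 100 are covered")
    scores, remainder = divmod(n, 20)  # how many twenties, and whether a ten is left over
    parts = []
    if scores:
        # A single twenty takes no multiplier: 'bišii', not '* one bišii'.
        parts.append(f"{MULTIPLIERS[scores]} {TWENTY}".strip())
    if remainder:
        if parts:
            parts.append(AND)  # the linker only appears after a twenty
        parts.append(TEN)
    return " ".join(parts)


for n in range(10, 101, 10):
    print(n, dameli_tens(n))
```

The loop prints the ten forms of Table 38 in order, from 10 to 100.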

Only four languages show no traces of vigesimal organization, all four spoken in the southeastern corner of the region, thus bordering on the main Indo-Aryan distribution (Table 39).

Table 39:

Numeral bases.

Value	HK distribution	Global distribution (WALS)
Decimal	4 (13 %)	125 (64 %)
Hybrid vigesimal-decimal	26 (87 %)	22 (11 %)
Pure vigesimal	0 (0 %)	20 (10 %)
Other base	0 (0 %)	5 (3 %)
Extended body-part system	0 (0 %)	4 (2 %)
Restricted	0 (0 %)	20 (10 %)

Vigesimality is also reflected in many of the non-Indo-Aryan languages represented in the region, in Iranian, Nuristani and Tibeto-Burman as well as in Burushaski.

Elaborating further on the actual composition of numerals (hybrid vigesimal-decimal and decimal alike) reveals an interesting pattern with sub-areal aspects. Three different types were noted: one that consistently uses a 10+n and 20+n pattern, another that consistently uses the reverse pattern, n+10 and n+20, and a third that uses a mixed pattern, i. e. n+10 and 20+n (Table 40). The mixed pattern is used by most of the languages for which there is data, followed by the consistent n+base pattern, leaving two languages with a consistent base+n pattern (Map 6).

Table 40:

Numeral composition.

Value	Sample representation
10+n, 20+n	2 (7 %)
n+10, 20+n	15 (54 %)
n+10, n+20	11 (39 %)

Examples of each of the three patterns are displayed in Table 41.

Table 41:

The numerals ‘6’, ‘16’ and ‘26’ in Kalasha, Gawarbati and Bateri.

		‘6’	‘16’	‘26’
Kalasha	10+n, 20+n	ṣo	daš-že-ṣoa	biši-že-ṣo
Gawarbati	n+10, 20+n	ṣo	ṣo-ṛaas	iši-o-ṣo
Bateri	n+10, n+20	ṣu	ṣu-weeš	ṣu-yu-biiš

Map 6:

Numeral bases and composition in the Hindukush-Karakoram. (V=vigesimal(-decimal), D=decimal).

7.2 Contrasts in demonstratives

WALS sorts this topic (W41A) under nominal categories. However, it could also be argued that such contrasts are lexical rather than grammatical, which is why the feature appears in this section. Globally, the most common system is one with a two-way distance contrast, as in English ‘this’ vs. ‘that’, followed by systems with a three-way contrast. More elaborate systems are very infrequent in the world-wide sample. In the Indo-Aryan languages of the region, the most common system, described or verifiable for at least half of them, is one with three contrastive terms. Only for one language, namely Grangali, is there evidence, although far from firm or well described, for four basic contrastive demonstrative terms (Table 42).

Table 42:

Distance contrasts in demonstratives.

Value	HK distribution	Global distribution (WALS)
No distance contrast	0 (0 %)	7 (3 %)
Two-way contrast	12 (41 %)	127 (54 %)
Three-way contrast	16 (55 %)	88 (38 %)
Four-way contrast	1 (3 %)	8 (3 %)
Five (or more)-way contrast	0 (0 %)	4 (2 %)

In a majority of the languages with what appears to be a three-way contrast, two primary features intersect, and only the first-mentioned is in a strict sense related to distance: a) proximal (within reach) vs. non-proximal (out of reach), and b) visible vs. non-visible. The three terms can thus be defined as +proximal +visible, -proximal +visible, -proximal -visible (Table 43).

Table 43:

Basic demonstrative contrasts in Khowar, Palula and Kashmiri. (Bashir 2003: 845; Koul 2003: 911–912).

	+prox+vis	-prox+vis	-prox-vis
Khowar	(ha)yá	(h)es	(ha)sé
Palula	(ee)nú	(ee)ṛó	(ee)sé
Kashmiri	yi(hoy)	ho(hay)	su(y)

In addition to that basic differentiation, another cross-cutting dimension often enters into it, namely emphasis or accessibility; the forms within parentheses are the emphatic elements in each of the terms displayed in Table 43. In many cases, this results in two contrasting forms of each basic demonstrative, one emphatic and one non-emphatic, the latter often used as third person pronouns.
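
The intersection of the two features with the emphatic dimension can be made concrete as a small lookup, sketched here in Python. The forms are copied from Table 43; the dictionary layout, the function name demonstrative and the convention of returning None for the unattested combination +proximal -visible are assumptions introduced for illustration only.

```python
import re

# Forms copied from Table 43; parentheses enclose the emphatic element.
# Keys are (proximal, visible). Layout and names are illustrative assumptions.
DEMONSTRATIVES = {
    "Khowar":   {(True, True): "(ha)yá",  (False, True): "(h)es",   (False, False): "(ha)sé"},
    "Palula":   {(True, True): "(ee)nú",  (False, True): "(ee)ṛó",  (False, False): "(ee)sé"},
    "Kashmiri": {(True, True): "yi(hoy)", (False, True): "ho(hay)", (False, False): "su(y)"},
}


def demonstrative(language, proximal, visible, emphatic=False):
    """Look up a basic demonstrative; return None for an unattested feature combination."""
    form = DEMONSTRATIVES[language].get((proximal, visible))
    if form is None:
        return None  # +proximal -visible is not one of the three terms
    if emphatic:
        # Keep the emphatic element, dropping only the brackets.
        return form.replace("(", "").replace(")", "")
    # Drop the bracketed emphatic element altogether.
    return re.sub(r"\([^)]*\)", "", form)


print(demonstrative("Kashmiri", proximal=False, visible=True))                 # non-emphatic
print(demonstrative("Kashmiri", proximal=False, visible=True, emphatic=True))  # emphatic
```

Each cell of Table 43 thus yields two forms, one emphatic and one non-emphatic, giving six forms per language from three basic terms.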

Considerably complicating the picture is the availability of further specifications, especially applicable to the out of reach/visible category. Those specifications are frequently geomorphic, providing details as to e. g. the vertical position of a referent vis-à-vis the deictic centre.

7.3 Kinship systems

Although kinship systems are not treated in WALS, they are a highly promising domain for investigation, especially in the way they may reflect cross-community relationships in the region. As this is a vast field in itself, the present cross-linguistic comparison is very restricted and only one particular sub-domain has been selected for this study, namely kinship terms for one’s parents and their siblings. The values set up are the following: a) separate terms are used for all six relations; b) all three male relations have separate terms, whereas mother’s sister and father’s sister share a term distinct from that for mother; c) father and father’s brother share a term, as do mother and mother’s sister, whereas mother’s brother and father’s sister have separate terms; d) father and father’s brother share a term distinct from that for mother’s brother, whereas all three female relations have separate terms; e) father and father’s brother share a term distinct from that for mother’s brother, and mother’s sister and father’s sister share a term distinct from that for mother; f) father’s brother and mother’s brother share a term distinct from that for father, and mother’s sister and father’s sister share a term distinct from that for mother. Interestingly, all six configurations have been verified in the sample languages (Table 44).

Table 44:

Kinship terms for one’s father and mother and their respective siblings (F=father; FB=father’s brother; FZ=father’s sister; M=mother; MB=mother’s brother; MZ=mother’s sister).

Value	Sample representation
F≠FB≠MB/M≠MZ≠FZ	9 (45 %)
F≠FB≠MB/M≠MZ=FZ	1 (5 %)
F=FB≠MB/M=MZ≠FZ	3 (15 %)
F=FB≠MB/M≠MZ≠FZ	5 (25 %)
F=FB≠MB/M≠MZ=FZ	1 (5 %)
F≠FB=MB/M≠MZ=FZ	1 (5 %)
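
The value labels in Table 44 can be derived mechanically from a set of six kinship terms. The following Python sketch shows one way of doing so; the function name kinship_pattern and the input format are assumptions made here for illustration. English serves as the worked example, and the numbered terms in the second call are hypothetical placeholders, not data from any sample language. Like the notation itself, the sketch only compares neighbouring relations in each triple.

```python
NEQ = "≠"
MALE = ("F", "FB", "MB")    # father, father's brother, mother's brother
FEMALE = ("M", "MZ", "FZ")  # mother, mother's sister, father's sister


def kinship_pattern(terms):
    """Build a Table 44 style label from a dict mapping the six relations to terms."""
    def side(relations):
        a, b, c = relations
        first = "=" if terms[a] == terms[b] else NEQ
        second = "=" if terms[b] == terms[c] else NEQ
        # Only adjacent relations are compared, as in the labels of Table 44.
        return f"{a}{first}{b}{second}{c}"
    return side(MALE) + "/" + side(FEMALE)


english = {"F": "father", "FB": "uncle", "MB": "uncle",
           "M": "mother", "MZ": "aunt", "FZ": "aunt"}
print(kinship_pattern(english))

# Hypothetical placeholder terms for a maximally differentiating system.
six_terms = {"F": "t1", "FB": "t2", "MB": "t3", "M": "t4", "MZ": "t5", "FZ": "t6"}
print(kinship_pattern(six_terms))
```

The English terms yield the last value in Table 44, the “aunt” and “uncle” type, while six distinct terms yield the first.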

Although the present sample only gives an indication of distribution (Map 7), it appears that the pattern F=FB/M=MZ is an eastern or northeastern feature (Figure 2), also reflecting what has been posited as the ancestral terminology of Burushaski (Parkin 1987: 165) and the terminology used in Balti, the nearest Tibeto-Burman neighbour. The F=FB pattern on its own appears typical of the languages spoken in an uninterrupted central belt, stretching from the Kunar valley in the west to Gilgit-Baltistan in the east of the HK region.

Figure 2:

Palula kinship terminology (ɡaaḍ-/ɡeeḍ- ‘big’, lhook- ‘small’).

The maximally differentiating terminology, with six different terms, is instead common in a southern belt, thus aligning itself with Punjabi kinship systems, while an “aunt” and “uncle” terminology is found in the northwest (consistently so in Khowar), with reflexes in adjacent non-Indo-Aryan communities in the Pamir and its surroundings (e. g. in Iranian Yidgha). The other configurations are in various ways asymmetrical and are found in the central parts of the region.

Map 7:

Kinship systems (uncles, aunts) in the Hindukush-Karakoram.

8 Conclusions

The present study has produced an empirically sound and balanced, yet in many respects preliminary, typological profile of the 31 Indo-Aryan varieties spoken in the mountainous Hindukush-Karakoram-Western Himalayan region. For the sake of convenience, the sample was restricted to language varieties that have been identified with some measure of confidence as Indo-Aryan, regardless of their lower-level classification, and that are spoken by a community with a significant presence within the geographical coordinates of 34°-37° N and 69°-77° E (primarily north-eastern Afghanistan, northern-most Pakistan, and the disputed area of Kashmir). The resulting micro-typological patterns were compared with global distributions as well as with features characterizing Indo-Aryan in general and patterns that have attested areal distributions overlapping with this particular region. Not surprisingly, a number of features (or distributions of values) are to a large extent shared with Indo-Aryan in general. Those are e. g. a basic SOV order, the use of adpositions, the size of vowel (quality) inventories, sex-based two-gender systems (shared by a majority of the languages), alignment systems that in one way or another are characterized by ergativity, and clause chaining by means of converb constructions.

Other features do have scattered reflexes in Indo-Aryan outside of the HK region, but are dominant or especially prevalent (although not necessarily found in all of the languages) in the region: large consonant inventories, the (lexical) use of contrastive tone, tripartite pronominal case marking alignment, final-position question markers, a high frequency of left-branching subordinate constructions, and the presence of multi-degree and multi-dimensional deictic systems.

Yet other features display a high or relatively high degree of diversity, often bundling subareally. That is e. g. the case with syllable structures: languages allowing complex ones are mainly found at the geographical peripheries while syllable patterns tend to be simple in the languages spoken at the centre. Vowel nasalization is more prevalent in the eastern part of the region than in its western part. Grammaticalized animacy distinctions, non-sex-based gender, a waning (or non-existent) ergativity, optionality of plural marking, object marking and the preference for expressing non-verbal predication without a copula occur much more readily in the western-most part of the region, features that to a large extent are also shared with adjacent Iranian and Nuristani languages. Agent-marking in the imperfective as well as in the perfective, simultaneous A and P agreement and extensive case marking systems bundle in the East or Northeast. At least with regard to agent marking, the feature is shared with neighbouring Burushaski and Tibeto-Burman Balti. While object-agreement prevails in the South or Southeast, i. e. in languages that are in direct contact with or are spoken in the vicinity of larger and more influential Indo-Aryan languages, agent-agreement (whether or not case-marking alignment is ergative or accusative) is widespread elsewhere in the region. Possible northern (mainly Chitral) features, shared with languages in the Central Asian Pamirs and beyond, are the presence of uvular consonants (stops and continuants alike), numerals composed as base+n, and a kinship system in which uncles and aunts, whether they are the siblings of the father or of the mother, have terms entirely separate from those for father and mother, respectively.

Only in a few cases do we observe a significant (non-peripheral) clustering of features that are not characterizing Indo-Aryan in general: tripartite affricate – fricative differentiation, retroflexion across several manner-of-articulation sets, aspiration contrasts typically involving voiceless consonants only, tone used extensively in lexical differentiation, 20-based numeral systems and a characteristic polysemy pattern (father=father’s brother) in kinship terminology. This clustering, which at least partly extends areally also to non-Indo-Aryan languages, but nevertheless does not include the totality of the Indo-Aryan languages in the region (not even those languages that traditionally were classified as “Dardic”), seems to point in the direction of a “hard core” at the geographical centre of the region (Map 8). This core consists of languages, chiefly belonging to the Kohistani and Shina groups, that to a large extent share these specific as well as other less specific features, with the clustering gradually fading out toward the peripheries of the region. A similar feature clustering at the centre of a proposed linguistic area has for instance been observed with respect to Standard Average European (Auwera 1998: 814–826; Dahl 2001: 1458).

Map 8:

An approximate identification of a “hard core” of Hindukush Indo-Aryan.

This micro-typology therefore in a general sense affirms Morgenstierne’s analysis, namely that “[t]here is not a single common feature distinguishing Dardic, as a whole, from the rest of the IA languages” (1961: 139). On the one hand, it is obvious that these languages form a continuum together with the main Indo-Aryan languages of the northwestern Subcontinent, with a gradually increased clustering of more prototypical Hindukush-Karakoram features toward the central – but in relation to the rest of Indo-Aryan more peripheral – parts of this region. On the other hand, these languages also show a high degree of diversity, with individual languages taking part in various subareal configurations or transit zones that are represented in the region, further complicating any attempts at defining them collectively in more exact, or exclusive, areal terms. We must therefore bear in mind that any collective reference to these languages, be it Hindukush Indo-Aryan or any other term, will have to be interpreted as a highly gradient notion, acknowledging the apparent lack of any complete list of innovations, let alone retentions, that would cover more than a subset of them.

Funding statement: This work is part of the project Language contact and relatedness in the Hindukush region, supported by the Swedish Research Council (421-2014-631).

Acknowledgements

I would like to thank Alla-ud-din, Bahrain, Swat, for his help with digitizing questionnaire data, Noa Lange, Stockholm, for assistance in processing and annotating audio and video recordings, and Maria Koptjevskaja Tamm and Ekaterina Tavadze Melin, Stockholm, for making the contents of publications in Russian available to me. Thanks also to the two anonymous reviewers who offered a number of helpful suggestions.

Abbreviations

a: the most agent-like argument in a transitive clause (see p and s)
acc: accusative
act: actual
an: animate
cv: converb
dat: dative
dem: demonstrative
dist: distal
erg: ergative
f: feminine
fut: future
gen: genitive
idef: indefinite
inan: inanimate
loc: locative
m: masculine
nom: nominative
obl: oblique
p: the most patient-like argument in a transitive clause (see a and s)
pfv: perfective
pl: plural
poss: possessive
prs: present
pst: past
ptc: participle
refl: reflexive
rem: remote
s: the sole argument in an intransitive clause (see a and p)
sg: singular
spc: specific
stv: stative
sub: subordinator
trz: transitivizer
1: first person
3: third person

References

Auwera, Johan van der. 1998. Conclusions. In Johan van der Auwera (ed.), Adverbial constructions in the languages of Europe, 813–836. Berlin: Mouton de Gruyter. doi:10.1515/9783110802610.813.

Baart, Joan L. G. 1997. The sounds and tones of Kalam Kohistani: With wordlist and texts (Studies in Languages of Northern Pakistan 1). Islamabad: National Institute of Pakistan Studies and Summer Institute of Linguistics.

Baart, Joan L. G. 1999. A sketch of Kalam Kohistani grammar (Studies in Languages of Northern Pakistan 5). Islamabad: National Institute of Pakistan Studies and Summer Institute of Linguistics.

Baart, Joan L. G. 2014. Tone and stress in North-West Indo-Aryan: A survey. In Johanneke Caspers, Yiya Chen, Willemijn Heeren, Jos Pacilly, Niels O. Schiller & Ellen van Zanten (eds.), Above and Beyond the Segments, 1–13. Amsterdam: John Benjamins Publishing Company. https://benjamins.com/catalog/z.189.01baa (accessed 9 November 2015). doi:10.1075/z.189.01baa.

Baart, Joan L. G. & Khawaja A. Rehman. 2005. A first look at the language of Kundal Shahi in Azad Kashmir. SIL Electronic Working Papers 2005–8. 22. November, 2016).

Bailey, Thomas Grahame. 1924. Grammar of the Shina (ṣiṇā) language (Prize Publication Fund 8). London: Royal Asiatic Society.

Bashir, Elena. 1988. Topics in Kalasha syntax: An areal and typological perspective. University of Michigan PhD Dissertation.

Bashir, Elena. 1996a. The areal position of Khowar: South Asian and other affinities. In Elena Bashir & Israr-ud-Din (eds.), Proceedings of the Second International Hindukush Cultural Conference (Hindukush and Karakoram Studies v. 1), 167–179. Karachi: Oxford University Press.

Bashir, Elena. 1996b. Mosaic of tongues: Quotatives and complementizers in Northwest Indo-Aryan, Burushaski, and Balti. In William L. Hanaway & Wilma Heston (eds.), Studies in Pakistani popular culture, 187–286. Lahore: Lok Virsa Pub. House and Sang-e-Meel Publications.

Bashir, Elena. 2003. Dardic. In George Cardona & Danesh Jain (eds.), The Indo-Aryan Languages, 818–894. London: Routledge.

Bashir, Elena. 2006. Evidentiality in South Asian Languages. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG06 Conference. Stanford: CSLI Publications.

Bhatia, Tej K. 1975. The Evolution of Tones in Punjabi. Studies in the Linguistic Sciences 5/2(Fall 1975). 12–24.

Buddruss, Georg. 1960. Die Sprache von Wotapur und Katarqala: Linguistische Studien im afghanischen Hindukusch (Bonner Orientalische Studien 9). Bonn: Selbstverlag des Orientalischen Seminars der Universität Bonn.

Buddruss, Georg. 1967. Die Sprache von Sau in Ostafghanistan: Beiträge zur Kenntnis des dardischen Phalûra (Münchener Studien Zur Sprachwissenschaft Beiheft [Supplement] M). Munich: Kitzinger in Kommission.

Cacopardo, Alberto & Augusto Cacopardo. 2001. Gates of Peristan: History, religion and society in the Hindu Kush. (Reports and Memoirs 5). Rome: Istituto Italiano per l’Africa e l’Oriente (IsIAO).

Dahl, Östen. 2001. Principles of areal typology. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals: An international handbook, 1456–1470. Berlin: Walter de Gruyter.

Decker, Sandra J. 1992. Ushojo. In Calvin R. Rensch, Sandra J. Decker & Daniel G. Hallberg (eds.), Languages of Kohistan (Sociolinguistic Survey of Northern Pakistan 1), 65–80, 193–205. Islamabad: National Institute of Pakistani Studies and Summer Institute of Linguistics.

Degener, Almuth. 2008. Shina-Texte aus Gilgit (Nord-Pakistan): Sprichwörter und Materialien zum Volksglauben, gesammelt von Mohammad Amin Zia. Wiesbaden: Harrassowitz.

Deo, Ashwini & Devyani Sharma. 2006. Typological variation in the ergative morphology of Indo-Aryan languages. Linguistic Typology 10(3). 369–418. doi:10.1515/LINGTY.2006.012.

Dryer, Matthew S. & Martin Haspelmath (eds.). 2013. The world atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info (accessed 9 December 2015).

Èdel’man, Džoi Iosifovna. 1980. K substratnomu naslediju central’no-aziatskogo jazykovogo sojuza [On a substrate of the Central Asian Sprachbund]. Voprosy jazykoznanija 5. 21–32.

Fussman, Gérard. 1972. Atlas linguistique des parlers dardes et kafirs. Paris: École française d’Extrême-Orient; Dépositaire: Adrien-Maisónneuve.

Grierson, George A. 1927. Linguistic survey of India. Vol. 1. P. 1, Introductory. Calcutta: Government of India, Central Publication Branch.

Grierson, George A. 1929. Torwali: An account of a Dardic language of the Swat Kohistan (Prize Publications Fund; 11). London: Royal Asiatic Society. doi:10.1163/9789004656024.

Grjunberg, Aleksandr L. 1971. K dialektologii dardiskikh jazykov (glangali i zemiaki) [On the dialectology of Dardic languages (Glangali and Zemiaki)]. In N. A Dvoriankov (ed.), Indijskaja i iranskaja filologija: Voprosy dialektologii, 3–29. Moscow: Nauka.

Hallberg, Daniel G. 1992. The languages of Indus Kohistan. In Calvin R. Rensch, Sandra J. Decker & Daniel G. Hallberg (eds.), Languages of Kohistan (Sociolinguistic Survey of Northern Pakistan 1), 83–141, 207–257. Islamabad: National Institute of Pakistan Studies and Summer Institute of Linguistics.

Hallberg, Daniel G. & Calinda E. Hallberg. 1999. Indus Kohistani: A preliminary phonological and morphological analysis (Studies in Languages of Northern Pakistan 8). Islamabad: National Institute of Pakistan Studies and Summer Institute of Linguistics.

Haspelmath, Martin. 1993. A grammar of Lezgian. Berlin: Mouton de Gruyter. doi:10.1515/9783110884210.

Heegård, Jan & Ida Elisabeth Mørch. 2004. Retroflex vowels and other peculiarities in the Kalasha sound system. In Anju Saxena (ed.), Himalayan languages: Past and present (Trends in Linguistics. Studies and Monographs 149), 57–76. Berlin: Mouton de Gruyter.

Heegård Petersen, Jan. 2015. Kalasha texts – With introductory grammar. Acta Linguistica Hafniensia 47(sup1). 1–275. doi:10.1080/03740463.2015.1069049.

Hock, Hans Henrich. 2015. The Northwest of South Asia and beyond: The issue of Indo-Aryan retroflexion yet again. Journal of South Asian Languages and Linguistics 2(1). 111–135. doi:10.1515/jsall-2015-0005.

Hook, Peter Edwin. 1987. Linguistic areas: Getting at the grain of history. In George Cardona (ed.), Festschrift for Henry Hoenigswald, 155–168. Tübingen: Gunter Narr Verlag.

Jettmar, Karl. 2002. Beyond the gorges of the Indus: Archaeology before excavation. Karachi: Oxford University Press.

Khan, Abdul Qadir & Nadeem Haider Bukhari. 2011. An acoustic study of VOT in Pahari stops. Kashmir Journal of Language Research 14(1). 111–128.

Kogan, A. I. 2011. Pothohari iazyk [The Pothohari language]. In T. I Oranskaia, IU.V Mazurova & A. A Kibrik (eds.), IAzyki mira: novye indoariiskie iazyki, 516–527. Moskva: Academia.

Koptjevskaja-Tamm, Maria. 2010. Linguistic typology and language contact. In Jae Jung Song (ed.), The Oxford handbook of linguistic typology, 568–590. Oxford University Press. doi:10.1093/oxfordhb/9780199281251.013.0027.

Koul, Omkar N. 2003. Kashmiri. In George Cardona & Danesh Jain (eds.), The Indo-Aryan languages, 895–952. London: Routledge.

Koul, Omkar N. & Roop Krishen Bhat. 2014. Kashmiri. In Omkar N. Koul (ed.), The languages of Jammu & Kashmir (People’s Linguistic Survey of India 12), 68–161. New Delhi: Orient Blackswan Private Limited.

Krishnamurti, Bhadriraju. 2003. Dravidian languages. West Nyack, NY: Cambridge University Press. http://site.ebrary.com/lib/alltitles/docDetail.action?docID=10070010 (accessed 30 April 2014). doi:10.1017/CBO9780511486876.

Lamuwal, Abd-El-Malek & Adam Baker. 2013. Southeastern Pashayi. Journal of the International Phonetic Association 43(2). 243–246. doi:10.1017/S0025100313000133.

Lehr, Rachel. 2014. A descriptive grammar of Pashai: The language and speech community of Darrai Nur. Chicago: University of Chicago PhD Dissertation.

Lewis, M. Paul, Gary F. Simons & Charles D. Fennig (eds.). 2015. Ethnologue: Languages of the World, Eighteenth edition. Online version. Dallas, TX: SIL International. http://www.ethnologue.com (accessed 9 December 2015).

Liljegren, Henrik. 2009. The Dangari Tongue of Choke and Machoke: Tracing the proto-language of Shina enclaves in the Hindu Kush. Acta Orientalia 70. 7–62. doi:10.5617/ao.5341.

Liljegren, Henrik. 2013. Notes on Kalkoti: A Shina Language with Strong Kohistani Influences. Linguistic Discovery 11(1). 129–160. doi:10.1349/PS1.1537-0852.A.423.

Liljegren, Henrik. 2014. A survey of alignment features in the Greater Hindukush with special references to Indo-Aryan. In Pirkko Suihkonen & Lindsay J. Whaley (eds.), On diversity and complexity of languages spoken in Europe and North and Central Asia (Studies in Language Companion Series 164), 133–174. Amsterdam: John Benjamins Publishing Company. doi:10.1075/slcs.164.05lil.

Liljegren, Henrik. 2016. A grammar of Palula (Studies in Diversity Linguistics 8). Berlin: Language Science Press. doi:10.26530/OAPEN_611690.

Losey, Wayne E. 2002. Writing Gojri: Linguistic and sociolinguistic constraints on a standardized orthography for the Gujars of South Asia. Grand Forks, ND: University of North Dakota MA thesis.

Lubberger, Beate. 2014. A description and analysis of four metarepresentation markers of Indus Kohistani. Grand Forks, ND: University of North Dakota MA thesis.

Lunsford, Wayne A. 2001. An overview of linguistic structures in Torwali, a language of Northern Pakistan. University of Texas at Arlington MA thesis.

Masica, Colin P. 1991. The Indo-Aryan languages. Cambridge: Cambridge University Press.

Masica, Colin P. 2001. The definition and significance of linguistic areas: Methods, pitfalls, and possibilities (with special reference to the validity of South Asia as a linguistic area). In Peri Baskararao (ed.), The yearbook of South Asian languages and linguistics 2001, 205–267. London: SAGE. doi:10.1515/9783110245264.205.

Morgenstierne, Georg. 1934. Notes on Tirahi. Acta Orientalia 12. 161–189.

Morgenstierne, Georg. 1942. Notes on Dameli: A Kafir-Dardic language of Chitral. Norsk Tidsskrift for Sprogvidenskap 12. 115–198.

Morgenstierne, Georg. 1945. Notes on Shumashti: A Dardic dialect of the Gawar-Bati type. Norsk Tidsskrift for Sprogvidenskap 13. 239–281.

Morgenstierne, Georg. 1950. Notes on Gawar-Bati. Oslo: Det Norske Videnskaps-Akademi.

Morgenstierne, Georg. 1961. Dardic and Kafir Languages. Encyclopedia of Islam, vol. 2, Fasc. 25, 138–139. New Edition. Leiden: E.J. Brill.

Morgenstierne, Georg. 1967. Indo-Iranian frontier languages. Vol. 3, The Pashai language, 1, Grammar (Instituttet for Sammenlignende Kulturforskning. Serie B, Skrifter, 0332–6217; 40:3:1).

Morgenstierne, Georg. 1974. Languages of Nuristan and surrounding regions. In Karl Jettmar & Lennart Edelberg (eds.), Cultures of the Hindukush: Selected papers from the Hindu-Kush Cultural Conference held at Moesgård 1970 (Beiträge zur Südasienforschung), vol. 1, 1–10. Wiesbaden: Franz Steiner Verlag.

Nichols, Johanna. 1999. Linguistic diversity in space and time. Chicago: University of Chicago Press.

Parkin, Robert. 1987. Kin Classification in the Karakorum. Man (New Series) 22(1). 157–170. doi:10.2307/2802968.

Perder, Emil. 2013. A grammatical description of Dameli. Stockholm University PhD Dissertation.

Radloff, Carla F. & Shakil Ahmad Shakil. 1998. Folktales in the Shina of Gilgit: Text, grammatical analysis and commentary (Studies in Languages of Northern Pakistan 2). Islamabad: National Institute of Pakistan Studies and Summer Institute of Linguistics.

Ramaswami, N. 1982. Brokskat grammar. (CIIL Grammar Series; 8). Mysore: CIIL.Search in Google Scholar

Rehman, Khawaja A. & Mark A. Robinson. 2011. Khindko iazyk [The Hindko language]. In T. I Oranskaia, IU.V Mazurova & A. A Kibrik (eds.), IAzyki mira: novye indoariiskie iazyki, 527–537. Moskva: Academia.Search in Google Scholar

Rönnqvist, Hanna. 2014. From left to right and back again: The distribution of dependent clauses in the Hindukush. Stockholm: Stockholm University MA thesis.Search in Google Scholar

Schmidt, Ruth Laila & Razwal Kohistani. 2008. A grammar of the Shina language of Indus Kohistan (Beiträge Zur Kenntnis Südasiatischer Sprachen and Literaturen 17). Wiesbaden: Harrassowitz.Search in Google Scholar

Sharma, Devidatta. 1998. Tribal languages of Ladakh. Part one: A concise grammar and dictionary of Brok-skad (Studies in Tibeto-Himalayan Languages 6), 1st edn. New Delhi: Mittal Publications.Search in Google Scholar

Stilo, Donald L. 2009. Circumpositions as an areal response: The case study of the Iranian zone. Turkic Languages 13(1). 3–33.Search in Google Scholar

Strand, Richard F. 1973. Notes on the Nuristani and Dardic Languages. Journal of the American Oriental Society 93(3). 297–305. doi:10.2307/599462.Search in Google Scholar

Tikkanen, Bertil. 1988. On Burushaski and other ancient substrata in northwestern South Asia. Studia Orientalia 64. 303–325.Search in Google Scholar

Tikkanen, Bertil. 1999. Archaeological-linguistic correlations in the formation of retroflex typologies and correlating areal features in South Asia. In Roger Blench & Matthew Spriggs (eds.), Archaeology and language IV: Language change and cultural transformation, 138–148. London: Routledge. doi:10.4324/9780203208793_chapter_6.

Tikkanen, Bertil. 2008. Some areal phonological isoglosses in the transit zone between South and Central Asia. In Israr-ud-Din (ed.), Proceedings of the third International Hindu Kush cultural conference, 250–262. Karachi: Oxford University Press.

Tikkanen, Bertil. 2011. Domaki noun inflection and case syntax. Studia Orientalia 2011(110). 205–228.

Toporov, Vladimir Nikolayevich. 1970. About the phonological typology of Burushaski. In Roman Jakobson & Shigeo Kawamoto (eds.), Studies in General and Oriental Linguistics Presented to Shiro Hattori on the occasion of his sixtieth birthday, 632–647. Tokyo: TEC Corporation for Language and Educational Research.

Trail, Ronald L. & Gregory R. Cooper. 1999. Kalasha dictionary: With English and Urdu (Studies in Languages of Northern Pakistan 7). Islamabad: National Institute of Pakistan Studies and Summer Institute of Linguistics.

Verbeke, Saartje. 2011. Ergativity and alignment in Indo-Aryan. Gent: Universiteit Gent PhD Dissertation.

Weinreich, Matthias. 2011. Domaaki iazyk [The Domaaki language]. In T. I. Oranskaia, IU. V. Mazurova & A. A. Kibrik (eds.), IAzyki mira: novye indoariiskie iazyki, 165–194. Moskva: Academia.

Published Online: 2017-02-28

Published in Print: 2018-10-25

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

https://doi.org/10.1515/jsall-2017-0004
