| en:grammar:phonology_and_spelling [2022-11-21 12:44] – [Consonants] 'w' now doesn't occur anymore christian | en:grammar:phonology_and_spelling [2023-02-01 10:48] (current) – [Syllable structure and hyphenation] No diphthong–vowel sequences christian |
|---|
| //Notes:// | //Notes:// |
| |
| * To see diphthong frequencies, follow the LAPSyD link given above, then select "Aggregate Vowel inventory" instead of "Show Language list" and click "show visualization". To sort the results, click on the "count" column in the "Diphthongs" table. Five diphthongs occur in more than ten of the investigated languages. Two of these – /ei̯/ and /ou̯/ – are formed of vowels that are directly next to each other in the vowel chart given above. In the case of such related vowels the risk is higher that people will clearly articulate just one half of the diphthong (reducing /ei̯/ to /e/ or /ou̯/ to /o/), therefore we don't admit these diphthongs, but we accept the other three. | * To see diphthong frequencies, follow the LAPSyD link given above, then select "Aggregate Vowel inventory" instead of "Show Language list" and click "show visualization". To sort the results, click on the "count" column in the "Diphthongs" table. Four diphthongs occur in at least twelve of the investigated languages. The rarest of these – /ei̯/ – is formed of vowels that are directly next to each other in the vowel chart given above. In the case of such related vowels the risk is higher that people will clearly articulate just one half of the diphthong (reducing /ei̯/ to /e/), therefore we don't admit this diphthong, but we accept the other three. |
| * The use of the apostrophe as a vowel separator is inspired by [[wp>pinyin]]. | * The use of the apostrophe as a vowel separator is inspired by [[wp>pinyin]]. |
| * Some linguists distinguish between "falling diphthongs" – as described here – and "rising diphthongs" which are sequences of an approximant (or semivowel) followed by a vowel. The latter will be covered below. | * Some linguists distinguish between "falling diphthongs" – as described here – and "rising diphthongs" which are sequences of an approximant (or semivowel) followed by a vowel. The latter will be covered below. |
| //Notes:// | //Notes:// |
| |
| * /z/ and /v/ occur in 27–30% of the languages listed in PHOIBLE. But they are rarer than their voiceless equivalents /s/ and /f/, and a voicing contrast exists most typically in plosives, but not in fricatives, which include the sibilants (WALS 4). So, to avoid introducing such a voicing contrast, we don't admit these sounds as separate phonemes. /z/ is admitted as a variant pronunciation of its voiceless equivalent. /v/, on the other hand, is not considered an acceptable alternative of any other sound since it's unclear which should be the closest sound. Its voiceless equivalent /f/ would be one candidate, but speakers of languages exposing the widespread phenomenon known as [[wp>betacism]] might consider it most similar to /b/, and speakers of languages that treat /v/ and /w/ as allophones – [[wp>Hindustani phonology#Allophony of [v] and [w]|such as Hindustani]] – might consider it most similar to /w/. | * /z/ and /v/ occur in 27–30% of the languages listed in PHOIBLE. But they are rarer than their voiceless equivalents /s/ and /f/, and a voicing contrast exists most typically in plosives, but not in fricatives, which include the sibilants (WALS 4). So, to avoid introducing such a voicing contrast, we don't admit these sounds as separate phonemes. /z/ is admitted as a variant pronunciation of its voiceless equivalent. |
| * /v/ could conceivably be considered an acceptable alternative to several other consonants. Its voiceless equivalent /f/ would be one candidate, while speakers of languages exhibiting the widespread phenomenon known as [[wp>betacism]] might consider it most similar to /b/, and speakers of languages that treat /v/ and /w/ as allophones – [[wp>Hindustani phonology#Allophony_of_[v]_and_[w]|such as Hindustani]] – might consider it most similar to /w/. Historically, the letter **w** developed out of **v**. In the [[wp>Latin_alphabet#Classical_Latin_alphabet|Classical Latin alphabet]], **v** represented both the vowel /u/ and the semivowel /w/. Only in later times did the pronunciation of that semivowel [[wp>Romance languages#Lenition|shift]] to the fricative /β/ (still widespread in modern Spanish as an allophone of /b/) and in other Romance languages further to /v/. Because of the Hindustani allophony of /v/ and /w/, the historical development of /v/ in modern Romance (and related) languages out of classical /w/, and the visual similarity and joint origin of the letters **v** and **w**, we treat /v/ as an acceptable alternative to /w/. Accordingly, /v/ in the source languages typically becomes /w/ in Lugamun (written **v** for reasons that will be further explained below). |
| * All additional sounds occurring in at least 18 percent of the world's languages are admitted as alternative pronunciations of the sound or sound combination to which they can be considered most similar. | * All additional sounds occurring in at least 18 percent of the world's languages are admitted as alternative pronunciations of the sound or sound combination to which they can be considered most similar. |
| * As [[wp>rhotic consonant|rhotic consonants]] vary a lot between the world's languages and it might be hard for people to get used to new ones, we allow the three most common rhotic consonants as pronunciations of **r**. But which one should be the preferred pronunciation? According to the PHOIBLE data, the trill /r/ might be the most widespread pronunciation (44%), followed by the tap or flap /ɾ/ (26%). However, /r/ is probably overcounted at the cost of other rhotics because of transcribers sometimes using the simple letter /r/ instead of the IPA letters for other rhotics, which are less easy to type. Among our source languages, a tap or flap is actually somewhat more common than the trill. Japanese, Spanish, Swahili, and some widespread Arabic varieties such as Egyptian and Moroccan Arabic use the alveolar tap or flap /ɾ/, while Hindustani uses the voiced retroflex flap /ɽ/. On the other hand, Standard Arabic (but not all dialects), Indonesian, Russian, and Spanish use the alveolar or postalveolar trill /r/. While this is a close call, the tap or flap seems a bit more common cross-linguistically; moreover, it is arguably easier to learn for those not used to either than the trill, and more similar to the approximant /ɹ–ɻ/, which is used in English and (by some speakers) in Mandarin Chinese. Therefore we recommend the tap or flap as standard pronunciation, while admitting the other variants as alternatives. | * As [[wp>rhotic consonant|rhotic consonants]] vary a lot between the world's languages and it might be hard for people to get used to new ones, we allow the three most common rhotic consonants as pronunciations of **r**. But which one should be the preferred pronunciation? According to the PHOIBLE data, the trill /r/ might be the most widespread pronunciation (44%), followed by the tap or flap /ɾ/ (26%).
However, /r/ is probably overcounted at the cost of other rhotics because of transcribers sometimes using the simple letter /r/ instead of the IPA letters for other rhotics, which are less easy to type. Among our source languages, a tap or flap is actually somewhat more common than the trill. Japanese, Spanish, Swahili, and some widespread Arabic varieties such as Egyptian and Moroccan Arabic use the alveolar tap or flap /ɾ/, while Hindustani uses the voiced retroflex flap /ɽ/. On the other hand, Standard Arabic (but not all dialects), Indonesian, Russian, and Spanish use the alveolar or postalveolar trill /r/. While this is a close call, the tap or flap seems a bit more common cross-linguistically; moreover, it is arguably easier to learn for those not used to either than the trill, and more similar to the approximant /ɹ–ɻ/, which is used in English and (by some speakers) in Mandarin Chinese. Therefore we recommend the tap or flap as standard pronunciation, while admitting the other variants as alternatives. |
| * /d̠ʒ/ is written **j** in English, Hausa, Indonesian, Javanese, and Swahili. No two other considered languages share the same common representation, making this the obvious choice. | * /d̠ʒ/ is written **j** in English, Hausa, Indonesian, Javanese, and Swahili. No two other considered languages share the same common representation, making this the obvious choice. |
| * /k/ is written **k** in German, Indonesian, Javanese, pinyin, Swahili, and Turkish. In English and Vietnamese, it is usually **c** or **k**, depending on context (the sound that follows); in French, Portuguese, and Spanish it is usually **c** or **qu**, depending on context. **c** might be considered an alternative, but those languages that use **c** for /k/ use that spelling only in certain contexts, while **c** before front vowels such as **e** and **i** is typically pronounced /s/ or similar. This would make misreadings likely if **c** were used everywhere. **qu** would be a conceivable alternative, but it is much less common than **k** and uses one letter more without any obvious advantage. | * /k/ is written **k** in German, Indonesian, Javanese, pinyin, Swahili, and Turkish. In English and Vietnamese, it is usually **c** or **k**, depending on context (the sound that follows); in French, Portuguese, and Spanish it is usually **c** or **qu**, depending on context. **c** might be considered an alternative, but those languages that use **c** for /k/ use that spelling only in certain contexts, while **c** before front vowels such as **e** and **i** is typically pronounced /s/ or similar. This would make misreadings likely if **c** were used everywhere. **qu** would be a conceivable alternative, but it is much less common than **k** and uses one letter more without any obvious advantage. |
| | * /w/ is written **w** in English, Hausa, Indonesian, Javanese, and Swahili; **w** or **u** (after initials) in pinyin; **u** in Portuguese and Spanish (typically after /k/ or /g/ or in diphthongs); **u** or sometimes **o** in Vietnamese (in the same positions); **ou** or sometimes **w** in French. The most common spelling would therefore be **w**, closely followed by **u** (which, however, is needed for the vowel). But the allowed alternative pronunciation of /w/ is /v/ (most typically written **v** in languages using the Latin alphabet), and it would be somewhat odd to have **w**, but no **v**, considering that **w** is graphically "double v", so it takes a bit more space in printing and a bit more time in handwriting than a single **v** would. Historically, the letters **u**, **v**, and **w** all have developed out of **v**, which used to represent both /u/ and /w/ (but not /v/) in the Classical Latin pronunciation (as noted above). To avoid the "pointless" doubling that having just **w** would entail, and as a nod to the Classical Latin alphabet, we therefore write /w/ as **v**. This can also be considered a "compromise" between **w** (double v) and **u** (visually similar to **v** and of the same origin, but needed for the vowel), these being the two most common representations of this sound. Finally, writing **v** facilitates the integration of words from European languages that have /v/ (such as **eviden** 'obvious, evident'); while this sound is converted to /w/, the spelling is typically preserved. |
| * /ʃ/ is written as **sh** in English, Hausa, and Swahili; as **ch** in French; as **ch** or **x** in Portuguese. **x** is also used in several other Romance languages. Standard Chinese doesn't have /ʃ/, but pinyin uses both **x** and **sh** for quite similar sounds – **x** for /ɕ/, the voiceless alveolo-palatal sibilant fricative, and **sh** for /ʂ/, the voiceless retroflex sibilant fricative. We prefer the single letter **x** over the digraphs. | * /ʃ/ is written as **sh** in English, Hausa, and Swahili; as **ch** in French; as **ch** or **x** in Portuguese. **x** is also used in several other Romance languages. Standard Chinese doesn't have /ʃ/, but pinyin uses both **x** and **sh** for quite similar sounds – **x** for /ɕ/, the voiceless alveolo-palatal sibilant fricative, and **sh** for /ʂ/, the voiceless retroflex sibilant fricative. We prefer the single letter **x** over the digraphs. |
| * /j/ is **y** in English, Hausa, Indonesian, Javanese, Swahili, Turkish, and occasionally also in French, Portuguese, Spanish, and Vietnamese. No two other considered languages share the same common representation, making this the obvious choice. | * /j/ is **y** in English, Hausa, Indonesian, Javanese, Swahili, Turkish, and occasionally also in French, Portuguese, Spanish, and Vietnamese. No two other considered languages share the same common representation, making this the obvious choice. |
| * **bl, fl, gl, kl, pl, sl** | * **bl, fl, gl, kl, pl, sl** |
| * **br, dr, fr, gr, kr, pr, tr** | * **br, dr, fr, gr, kr, pr, tr** |
| * **by, cy, fy, ky, my, ny, py, xy** | |
| * **cv, dv, gv, hv, kv, sv, tv, xv** | * **cv, dv, gv, hv, kv, sv, tv, xv** |
| | * **by, cy, fy, ky, my, ny, py, xy** |
| |
| Note that **v** and **y** can be considered as consonantal equivalents of the vowels **u** and **i**. If you don't know how to pronounce them or have difficulties pronouncing them in any of these clusters, just pronounce the vowel quickly and without stress, followed by the actual vowel which forms the core of the syllable. | Note that **v** and **y** can be considered as consonantal equivalents of the vowels **u** and **i**. If you don't know how to pronounce them or have difficulties pronouncing them in any of these clusters, just pronounce the vowel quickly and without stress, followed by the actual vowel which forms the core of the syllable. |
| Though they would be allowed by the rules listed above, the consonant combinations **ry, sy, ty** are avoided in Lugamun. Instead the semivowel is replaced by the corresponding vowel **i** in such cases (**ri, si, ti**), for example in **nasion** 'nation' and **sosieti** 'social'. In these and other cases you may pronounce an unstressed **i** or **u** followed by another vowel as the corresponding semivowel (**y** or **v**) if you wish. Hence **nasion** may be pronounced as /nasiˈon/ or as /nasˈjon/, just as you prefer. | Though they would be allowed by the rules listed above, the consonant combinations **ry, sy, ty** are avoided in Lugamun. Instead the semivowel is replaced by the corresponding vowel **i** in such cases (**ri, si, ti**), for example in **nasion** 'nation' and **sosieti** 'social'. In these and other cases you may pronounce an unstressed **i** or **u** followed by another vowel as the corresponding semivowel (**y** or **v**) if you wish. Hence **nasion** may be pronounced as /nasiˈon/ or as /nasˈjon/, just as you prefer. |
| |
| //Notes:// | Within roots, a diphthong is never immediately followed by another vowel; in cases where this might be an option, the second part of the diphthong is instead replaced with the corresponding semivowel. For example, the Arabic numeral أَوَّل (ʾawwal) is adapted as **aval** /aˈwal/, not as //*aual// or //*auval//. Sequences of a diphthong followed by a vowel are, however, possible in compounds, e.g. the root **dau** and the suffix **-isme** form the compound **dauisme**. |
| | |
| | //Rationale:// |
| |
| * The rule for syllable-final consonants is inspired by APiCS, which notes that typical creole languages allow only a single consonant at the end of syllables (APiCS 119). The further restriction to the six allowed consonants was made by inspecting our source languages. Only consonants that commonly occur in a word-final position in at least half of them were accepted, with the further requirement that at least two of the source languages that allow them must be non-Indo-European. The latter restriction was motivated by the fact that Indo-European languages tend to be much more generous in the set of final consonants they accept than other languages, at least among our sources. As Japanese, Mandarin, and Swahili are particularly restrictive regarding final consonants, the practical result is that the final consonants that commonly occur in both Arabic and Indonesian are allowed in our phonology as well. | * The specific set of consonants allowed to end a syllable was chosen on the basis of our source languages. Only consonants that commonly occur in a word-final position in at least half of them were accepted, with the further requirement that at least two of the source languages that allow them must be non-Indo-European. The latter restriction was motivated by the fact that Indo-European languages tend to be much more generous in the set of final consonants they accept than other languages, at least among our sources. As Japanese, Mandarin, and Swahili are particularly restrictive regarding final consonants, the practical result is that the final consonants that commonly occur in both Arabic and Indonesian are allowed in our phonology as well. |
| * There is only one consonant allowed in two or more non-Indo-European source languages that fails the "half of all our source languages" criterion: the velar nasal /ŋ/, which in a word-final position can only be found in English, Hindi, Indonesian, and Mandarin. Since it is also rare at the beginning of syllables (WALS 9), this means that no position remains where it could occur as an independent sound. Hence we only accept it as an optional consonant into our phonology. | * There is only one consonant allowed in two or more non-Indo-European source languages that fails the "half of all our source languages" criterion: the velar nasal /ŋ/, which in a word-final position can only be found in English, Hindi, Indonesian, and Mandarin. Since it is also rare at the beginning of syllables (WALS 9), this means that no position remains where it could occur as an independent sound. Hence we only accept it as an optional consonant into our phonology. |
| * The consonant pairs allowed to start syllables are those that occur in this position (more frequently than as rare exceptions) in at least five of our ten source languages. Moreover, consonant pairs that occur in this position in Mandarin Chinese are also allowed even if they only occur in two or three other source languages. This adds six clusters ending in one of the semivowels **-v** and **-y** that would otherwise not be allowed (**cw, cy, hw, tw, xw, xy**). The reason for these additional admissions is that such consonant–semivowel pairs are very widespread in the Chinese vocabulary, where each core concept tends to be represented by a single syllable. Changing the semivowel to a vowel in such cases (hence dividing the single syllable into two) would make words of Chinese origin much less recognizable. | * The consonant pairs allowed to start syllables are those that occur in this position (more frequently than as rare exceptions) in at least five of our ten source languages. Moreover, consonant pairs that occur in this position in Mandarin Chinese are also allowed even if they only occur in two or three other source languages. This adds six clusters ending in one of the semivowels **-v** and **-y** that would otherwise not be allowed (**cv, cy, hv, tv, xv, xy**). The reason for these additional admissions is that such consonant–semivowel pairs are very widespread in the Chinese vocabulary, where each core concept tends to be represented by a single syllable. Changing the semivowel to a vowel in such cases (hence dividing the single syllable into two) would make words of Chinese origin much less recognizable. |
| * Two additional pairs that would fulfill the above criteria have, however, been excluded: **dy** because in rapid speech it can sound quite similar to **j**, and **ty** because it can sound similar to **c**. For the same reason, **ty** is also avoided between vowels, where it could otherwise still occur (since **t** is allowed to end a syllable). Likewise, **sy** is avoided between vowels because in rapid speech it can sound quite similar to **x**. The combination **ry** is avoided since it could be quite hard to pronounce, especially if one speaks the **r** as an approximant, as usual in English and Mandarin. | * Two additional pairs that would fulfill the above criteria have, however, been excluded: **dy** because in rapid speech it can sound quite similar to **j**, and **ty** because it can sound similar to **c**. For the same reason, **ty** is also avoided between vowels, where it could otherwise still occur (since **t** is allowed to end a syllable). Likewise, **sy** is avoided between vowels because in rapid speech it can sound quite similar to **x**. The combination **ry** is avoided since it could be quite hard to pronounce, especially if one speaks the **r** as an approximant, as usual in English and Mandarin. |
| |
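The syllable-onset rules in the newer revision can be checked mechanically. The following Python sketch is only an illustration, not part of the Lugamun specification: the function names are hypothetical, the vowel set `aeiou` is an assumption, and the only diphthong used is **au**, because it is the one attested in the examples above (**dau**); the other admitted diphthongs would have to be passed in explicitly. The input `nasyon` is a made-up pre-adaptation spelling used purely for demonstration.

```python
import re

# Onset clusters exactly as listed in the newer revision of this section.
ONSET_CLUSTERS = frozenset({
    "bl", "fl", "gl", "kl", "pl", "sl",
    "br", "dr", "fr", "gr", "kr", "pr", "tr",
    "cv", "dv", "gv", "hv", "kv", "sv", "tv", "xv",
    "by", "cy", "fy", "ky", "my", "ny", "py", "xy",
})

# Assumption: a five-vowel system written a, e, i, o, u.
VOWELS = "aeiou"


def is_listed_onset_cluster(pair: str) -> bool:
    """Return True if the two-consonant pair is in the list of allowed onsets."""
    return pair in ONSET_CLUSTERS


def replace_avoided_semivowel(word: str) -> str:
    """Rewrite the avoided combinations ry, sy, ty as ri, si, ti."""
    return re.sub(r"([rst])y", r"\1i", word)


def has_diphthong_before_vowel(root: str, diphthongs=("au",)) -> bool:
    """Return True if any given diphthong is immediately followed by a vowel.

    Within roots this should be False; in compounds it may be True.
    """
    alternatives = "|".join(re.escape(d) for d in diphthongs)
    pattern = "(?:" + alternatives + ")[" + VOWELS + "]"
    return re.search(pattern, root) is not None


if __name__ == "__main__":
    print(is_listed_onset_cluster("kv"))        # listed cluster
    print(is_listed_onset_cluster("dy"))        # excluded pair
    print(replace_avoided_semivowel("nasyon"))  # hypothetical input spelling
    print(has_diphthong_before_vowel("aual"))   # rejected root shape
    print(has_diphthong_before_vowel("aval"))   # accepted adaptation
```

Keeping the diphthong inventory as a parameter avoids hard-coding anything this section does not state; a compound such as **dauisme** is expected to trigger `has_diphthong_before_vowel`, which matches the exception for compounds described above.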