Lugamun

An easy and fair language for global communication


en:background:source_languages

Differences

This shows you the differences between two versions of the page.

en:background:source_languages [2022-09-27 15:59] – Are different source languages treated differently? christian
en:background:source_languages [2022-11-14 21:59] (current) – w -> v christian
Line 27:
 ===== Are different source languages treated differently? =====
  
-Yes. A distinction is made between the five most widely spoken language (the "top 5") and the other five source languages (the "next 5"). A candidate word must be from one of the top 5 languages //or// it must have a related candidate in another language to be eligible for selection.
+Yes. A distinction is made between the five most widely spoken languages (the "top 5") and the other five source languages (the "next 5"). A candidate word must be from one of the top 5 languages //or// it must have a related candidate in another language to be eligible for selection.
  
-This means that words from the "next 5" (French, Russian, Indonesian/Malay, Japanese, and Swahili) are not considered candidate words unless they have a related word (a true or false cognate) in any of the other nine source languages. For example, the word **to** 'that' is based on Japanese と (to) and related to Russian что (što) – without this related candidate, it would not have been eligible for selection and hence could not have made it into the language.
+This means that words from the "next 5" (French, Russian, Indonesian/Malay, Japanese, and Swahili) are not considered candidate words unless they have a related word (a true or false cognate) in any of the other nine source languages. For example, the word **to** 'that' is based on Japanese と (to) and related to Russian что (što) – without this related candidate, it would not have been eligible for selection and hence could not have made it into the dictionary.
  
-On the other hand, candidates from the "top 5" (English, Mandarin Chinese. Hindustani, Arabic, Spanish) are eligible for selection even if they don't have any related candidate. For example, **twi** 'leg' is from from Chinese 腿 (tuǐ); there are no related (similar) words in any of the other source languages.
+On the other hand, candidates from the "top 5" (English, Mandarin Chinese, Hindustani, Arabic, and Spanish) are eligible for selection even if they don't have any related candidate. For example, **tvi** 'leg' is from Chinese 腿 (tuǐ); there are no related (similar) words in any of the other source languages.
  
-All candidate words are sorted first by the number of related candidates and only then by their total penalty, which means that words that have at least one related candidate will always be preferred over those that have none. Hence the candidates from the "top 5" languages without any related candidates will be placed at the end of candidate list, after all candidates that do have related candidates. So they can be considered as "choices of last resort" that are only considered if no candidate word has (true or false) cognates in other source languages.
+All candidate words are sorted first by the number of related candidates and only then by their total penalty, which means that words that have at least one related candidate will always be preferred over those that have none. Hence the candidates from the "top 5" languages without any related candidates will be placed at the end of the candidate list, after all candidates that do have related candidates. So they can be considered as "choices of last resort" that are only considered if no candidate word has (true or false) cognates in other source languages.
  
-The reason for limiting these "choices of last resort" to the top 5 languages is that it helps to ensure that all of Lugamun's words will be recognizable for a considerable number of people. Even without related candidates, many people will recognize words from English or Mandarin – much more than would recognize a word from Japanese or Swahili. This is why only the former, but not the latter, become eligible as "choices of last resort".
+The reason for limiting these "choices of last resort" to the top 5 languages is that it helps to ensure that all of Lugamun's words will be recognizable to a considerable number of people. Even without related candidates, many people will recognize words from English or Mandarin – many more than would recognize a word from Japanese or Swahili. This is why only the former, but not the latter, become "choices of last resort."
  
 In cases where no word has any related candidates in other languages, words from the top 5 languages will therefore always be chosen, since only they are eligible. In all other cases, words from any of the source languages may be chosen, based on their overall penalty.
  
-Nevertheless, the fact that the top 5 languages are sometimes privileged given them a visible advantage in the [[en:statistics#influence distribution|influence statistics]], where the top 5 (plus French) all have a higher influence than the other 4. French, thought not a top 5 language and so never yielding "choices of last resort", manages to retain a very high influence because its words are very often related to the words used in other source languages (especially with Spanish and English, but also with Russian). On the other hand, Mandarin quite rarely shares words with any other source language, and since Lugamun's selection algorithm always prefers shared words, its influence is therefore lower than that of any other top 5 language.
+Nevertheless, the fact that the top 5 languages are sometimes privileged gives them a visible advantage in the [[en:statistics#influence distribution|influence statistics]], where the top 5 (plus French) all have a higher influence than the other 4. French, though not a top 5 language and so never yielding "choices of last resort," manages to retain a very high influence because its words are often related to the words used in other source languages (especially with Spanish and English, but also with Russian and even Indonesian). On the other hand, Mandarin quite rarely shares words with any other source language, and since Lugamun's selection algorithm always prefers shared words, its influence is therefore lower than that of any other top 5 language.
  
-(In a few exceptional cases, words from the next 5 languages may be chosen even without the support of a related word. But this is only the case in the rare situation that //none// of the normally eligible candidates, so that the usual selection criteria need to be relaxed. If this is the case, the exceptional choice is always explained and justified in the [[https://gitlab.com/ChristianSi/lugamun/-/raw/main/data/selectionlog.txt|selection log]]. One example where this is the case is the optional object preposition **o** (from Japanese), which was accepted because none of the top 5 languages has an object marker/preposition that could replace it.)
+(In a few exceptional cases, words from the next 5 languages may be chosen even without the support of a related word. But this is only the case in the rare situation that //none// of the normally eligible candidates is suitable, so that the usual selection criteria need to be relaxed. If this is the case, the exceptional choice is always explained and justified in the [[https://gitlab.com/ChristianSi/lugamun/-/raw/main/data/selectionlog.txt|selection log]]. One example where this is the case is the optional object preposition **o** (from Japanese), which was accepted because none of the top 5 languages has an object marker/preposition that could replace it.)
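
The eligibility and ordering rules described in this section can be illustrated with a minimal sketch. It is not Lugamun's actual selection script: the names ''Candidate'', ''is_eligible'' and ''rank'', the placeholder words and all penalty values are invented for illustration, and it assumes that a higher number of related candidates ranks first and that a lower penalty is better. The manual exceptions recorded in the selection log are not modeled.

```python
from dataclasses import dataclass

# The five most widely spoken source languages, as listed in the text.
TOP_5 = {"English", "Mandarin Chinese", "Hindustani", "Arabic", "Spanish"}


@dataclass
class Candidate:
    """Hypothetical record for one candidate word (illustration only)."""
    word: str
    language: str
    related: int    # number of related candidates in other source languages
    penalty: float  # total penalty; assumed here that lower is better


def is_eligible(c: Candidate) -> bool:
    # A candidate must come from a top 5 language OR have a related candidate.
    return c.language in TOP_5 or c.related > 0


def rank(candidates: list[Candidate]) -> list[Candidate]:
    # Sort first by number of related candidates (more first), then by penalty.
    eligible = [c for c in candidates if is_eligible(c)]
    return sorted(eligible, key=lambda c: (-c.related, c.penalty))


# Invented sample data: placeholder words, made-up penalties.
demo = [
    Candidate("a", "Mandarin Chinese", related=0, penalty=0.1),
    Candidate("b", "Japanese", related=0, penalty=0.0),  # not eligible
    Candidate("c", "Japanese", related=1, penalty=2.0),
    Candidate("d", "English", related=2, penalty=3.0),
    Candidate("e", "French", related=1, penalty=1.0),
]
ranked = rank(demo)
print([c.word for c in ranked])
```

In this sketch the unsupported top 5 candidate ''a'' ends up last despite its low penalty, which mirrors the "choice of last resort" behavior, while the unsupported next 5 candidate ''b'' is dropped entirely.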
  
 ===== Why do you consider Hindustani a single language? =====
en/background/source_languages.1664287174.txt.gz · Last modified: 2022-09-27 15:59 by christian

Except where otherwise noted, content on this wiki is licensed under the following license: CC0 1.0 Universal