Comparative method: Difference between revisions

Revision as of 03:17, 7 September 2023 by Treetoes023 (no edit summary; tag: Disambiguation links added) → Latest revision as of 02:47, 17 July 2025 by Citation bot (Added article-number. Removed URL that duplicated identifier. Removed parameters. Some additions/deletions were parameter name changes.)
(16 intermediate revisions by 11 users not shown)
Line 20: A relation is considered to be "established beyond a reasonable doubt" if a reconstruction of the common ancestor is feasible.{{Sfn\|Hock\|1991\|p=567}} {{Quote\|text=The ultimate proof of genetic relationship, and to many linguists' minds the only real proof, lies in a successful reconstruction of the ancestral forms from which the semantically corresponding cognates can be derived.\|author=[[Hans Henrich Hock]]\|title=''Principles of Historical Linguistics''\|source=1991, p. 567.}}In some cases, this reconstruction can only be partial, generally because the compared languages are too scarcely attested, the temporal distance between them and their proto-language is too great, or their internal evolution renders many of the sound laws obscure to researchers. In such cases, a relation is considered plausible, but uncertain.<ref>{{Cite journal \|last=Igartua \|first=Iván \|date=2015 \|title=From cumulative to separative exponence in inflection: Reversing the morphological cycle \|url=https://www.jstor.org/stable/24672169 \|journal=Language \|volume=91 \|issue=3 \|pages=676–722 \|doi=10.1353/lan.2015.0032 \|jstor=24672169 \|s2cid=122591029 \|issn=0097-8507\|url-access=subscription }}</ref> ===Terminology=== Line 32: ==Origin and development== In [[~~Classical~~classical antiquity~~\|Antiquity~~]], Romans were aware of the similarities between Greek and Latin, but did not study them systematically. 
They sometimes explained them mythologically, as the result of Rome being a Greek colony speaking a debased dialect.<ref>{{Cite journal \|last=Stevens \|first=Benjamin \|date=2006 \|title=Aeolism: Latin as a Dialect of Greek \|url=https://www.jstor.org/stable/30038039 \|journal=The Classical Journal \|volume=102 \|issue=2 \|pages=115–144 \|jstor=30038039 \|issn=0009-8353}}</ref> Even though grammarians of Antiquity had access to other languages around them ([[Oscan language\|Oscan]], [[Umbrian language\|Umbrian]], [[Etruscan language\|Etruscan]], [[Gaulish language\|Gaulish]], [[Ancient Egyptian\|Egyptian]], [[Parthian language\|Parthian]]...), they showed little interest in comparing, studying, or just documenting them. Comparison between languages really began after ~~Antiquity~~classical antiquity. ===Early works=== {{See also\|Uralic languages#Uralic studies}} In the 9th or 10th century AD, [[Yehuda Ibn Quraysh]] compared the phonology and morphology of Hebrew, Aramaic and Arabic but attributed the resemblance to the Biblical story of Babel, with Abraham, Isaac and Joseph retaining Adam's language, with other languages at various removes becoming more altered from the original Hebrew.<ref>"The reason for this similarity and the cause of this intermixture was their close neighboring in the land and their genealogical closeness, since Terah the father of Abraham was Syrian, and Laban was Syrian. Ishmael and Kedar were Arabized from the Time of Division, the time of the confounding [of tongues] at Babel, and Abraham and Isaac and Jacob (peace be upon them) retained the Holy Tongue from the original Adam." 
[http://lameen.googlepages.com/ibn-quraysh.html Introduction of Risalat Yehuda Ibn Quraysh – مقدمة رسالة يهوذا بن قريش] {{Webarchive\|url=https://web.archive.org/web/20090729093347/http://lameen.googlepages.com/ibn-quraysh.html \|date=29 July 2009 }}</ref> [[File:Sajnovics - Demonstratio.jpg\|thumb\|Title page of Sajnovics's 1770 work.\|alt=\|258x258px]] In publications of 1647 and 1654, [[Marcus Zuerius van Boxhorn]] first described a rigorous methodology for historical linguistic comparisons<ref name="Driem">George van Driem [~~http~~https://www.~~eastling~~isw.~~org~~unibe.ch/~~paper~~e41142/~~Driem~~e41180/e523709/e546670/2005d_ger.pdf The genesis of polyphyletic linguistics] {{webarchive\|url=https://web.archive.org/web/20110726012439/http://www.eastling.org/paper/Driem.pdf\|date=26 July 2011}}</ref> and proposed the existence of an [[Indo-European languages\|Indo-European]] proto-language, which he called "Scythian", unrelated to Hebrew but ancestral to Germanic, Greek, Romance, Persian, Sanskrit, Slavic, Celtic and Baltic languages. The Scythian theory was further developed by [[Andreas Jäger]] (1686) and [[William Wotton]] (1713), who made early forays into reconstructing the primitive common language. In 1710 and 1723, [[Lambert ten Kate]] first formulated the regularity of [[sound law]]s, introducing among others the term [[root vowel]].<ref name="Driem" /> Another early systematic attempt to prove the relationship between two languages on the basis of similarity of [[grammar]] and [[lexicon]] was made by the Hungarian [[János Sajnovics]] in 1770, when he attempted to demonstrate the relationship between [[Sami languages\|Sami]] and [[Hungarian language\|Hungarian]]. 
That work was later extended to all [[Finno-Ugric languages]] in 1799 by his countryman [[Samuel Gyarmathi]].<ref name="ssix">{{harvnb\|Szemerényi\|1996\|p=6}}.</ref> However, the origin of modern [[historical linguistics]] is often traced back to [[William Jones (philologist)\|Sir William Jones]], an English [[Philology\|philologist]] living in [[India]], who in 1786 made his famous {{nowrap\|observation:<ref>{{cite web\|last=Jones\|first=Sir William\|title=The Third Anniversary Discourse delivered 2 February 1786 By the President [on the Hindus]\|editor-first=Guido\|editor-last=Abbattista\|publisher=Eliohs Electronic Library of Historiography\|url=http://www.eliohs.unifi.it/testi/700/jones/Jones_Discourse_3.html\|access-date=18 December 2009}}</ref>}}<blockquote>The [[Sanskrit\|Sanscrit language]], whatever be its antiquity, is of a wonderful structure; more perfect than the [[Ancient Greek language\|Greek]], more copious than the [[Latin]], and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists. 
There is a similar reason, though not quite so forcible, for supposing that both the [[Germanic languages\|Gothick]] and the [[Celtic languages\|Celtick]], though blended with a very different idiom, had the same origin with the Sanscrit; and the [[Persian language\|old Persian]] might be added to the same family.</blockquote> Line 154: \|} [[Loanword\|Borrowings]] or [[false cognate]]s can skew or obscure the correct data.<ref>{{harvnb\|Lyovin\|1997\|pp=3–5}}.</ref> For example, English ''taboo'' ({{IPA\|[tæbu]}}) is like the six Polynesian forms because of borrowing from Tongan into English, not because of a genetic similarity.<ref>{{cite encyclopedia\|url=http://dictionary.reference.com/browse/taboo\|encyclopedia=Dictionary.com\|title=Taboo}}</ref> That problem can usually be overcome by using basic vocabulary, such as kinship terms, numbers, body parts and pronouns.<ref>{{harvnb\|Lyovin\|1997\|p=3}}.</ref> Nonetheless, even basic vocabulary can sometimes be borrowed. [[Finnish language\|Finnish]], for example, borrowed the word for "mother", ''{{lang\|fi\|äiti''}}, from Proto-Germanic *aiþį̄ (compare to [[Gothic language\|Gothic]] ''{{lang\|got\|aiþei''}}).<ref>{{harvnb\|Campbell\|2004\|pp=65, 300}}.</ref> [[English language\|English]] borrowed the pronouns "they", "them", and "their(s)" from [[Old Norse language\|Norse]].<ref>{{cite encyclopedia\|url=http://dictionary.reference.com/browse/they\|encyclopedia=Dictionary.com\|title=They}}</ref> [[Thai language\|Thai]] and various other [[East Asian languages]] borrowed their numbers from [[Chinese language\|Chinese]]. 
An extreme case is represented by [[Pirahã language\|Pirahã]], a [[Muran languages\|Muran language]] of South America, which has been controversially<ref>{{cite journal\|last1=Nevins\|first1=Andrew\|first2=David\|last2=Pesetsky\|first3=Cilene\|last3=Rodrigues\|year=2009\|url=http://www.people.fas.harvard.edu/%7Enevins/npr09.pdf\|url-status=dead\|archive-url=https://web.archive.org/web/20110604103305/http://www.people.fas.harvard.edu/~nevins/npr09.pdf\|archive-date=4 June 2011\|title=Pirahã Exceptionality: a Reassessment\|journal=Language\|volume=85\|issue=2\|pages=355–404\|doi=10.1353/lan.0.0107\|citeseerx=10.1.1.404.9474\|hdl=1721.1/94631\|s2cid=15798043}}</ref> claimed to have borrowed all of its [[pronoun]]s from [[Nheengatu language\|Nheengatu]].<ref>{{harvnb\|Thomason\|2005\|pp=8–12 in pdf}}; {{harvnb\|Aikhenvald\|1999\|p=355}}.</ref><ref>"Superficially, however, the Piraha pronouns don't look much like the Tupi–Guarani pronouns; so this proposal will not be convincing without some additional information about the phonology of Piraha that shows how the phonetic realizations of the Tupi–Guarani forms align with the Piraha phonemic system." [http://www-personal.umich.edu/~thomason/papers/pronborr.pdf "Pronoun borrowing" Sarah G. Thomason & Daniel L. Everett University of Michigan & University of Manchester]</ref> ===Step 2, establish correspondence sets=== The next step involves determining the regular sound-correspondences exhibited by the lists of potential cognates. For example, in the Polynesian data above, it is apparent that words that contain ''t'' in most of the languages listed have cognates in Hawaiian with ''k'' in the same position. That is visible in multiple cognate sets: the words glossed as 'one', 'three', 'man' and 'taboo' all show the relationship. The situation is called a "regular correspondence" between ''k'' in Hawaiian and ''t'' in the other Polynesian languages. 
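The tallying of regular correspondences described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the article's method statement: the word forms below are simplified spellings supplied for the example (the article's own Polynesian table is not shown in this excerpt), and the names `COGNATES`, `LANGUAGES` and `initial_correspondences` are hypothetical. It compares only the first letter of each form, which is a deliberate simplification.

```python
from collections import Counter

# Illustrative cognate sets in simplified spelling (assumed for this sketch;
# not copied from the article's table).
COGNATES = {
    "one":   {"Tongan": "taha",    "Samoan": "tasi",   "Maori": "tahi",    "Hawaiian": "kahi"},
    "three": {"Tongan": "tolu",    "Samoan": "tolu",   "Maori": "toru",    "Hawaiian": "kolu"},
    "man":   {"Tongan": "tangata", "Samoan": "tagata", "Maori": "tangata", "Hawaiian": "kanaka"},
    "taboo": {"Tongan": "tapu",    "Samoan": "tapu",   "Maori": "tapu",    "Hawaiian": "kapu"},
}
LANGUAGES = ("Tongan", "Samoan", "Maori", "Hawaiian")


def initial_correspondences(cognates, languages):
    """Count how often each tuple of word-initial letters recurs across cognate sets."""
    counts = Counter()
    for forms in cognates.values():
        # One tuple per gloss: the first letter of the word in each language.
        counts[tuple(forms[lang][0] for lang in languages)] += 1
    return counts


counts = initial_correspondences(COGNATES, LANGUAGES)
for sounds, n in counts.items():
    print(" : ".join(sounds), "recurs in", n, "cognate sets")
```

A pattern that recurs across many independent cognate sets is a candidate regular correspondence; a pattern seen only once may be due to chance or borrowing. A fuller treatment would align whole words segment by segment rather than inspecting the first letter only.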
Similarly, a regular correspondence can be seen between Hawaiian and Rapanui ''h'', Tongan and Samoan ''f'', Maori ''ɸ'', and Rarotongan ''ʔ''. Mere phonetic similarity, as between [[English language\|English]] ''day'' and [[Latin]] ''{{lang\|la\|dies''}} (both with the same meaning), has no probative value.<ref name="ltwo">{{harvnb\|Lyovin\|1997\|p=2}}.</ref> English initial ''d-'' does not ''regularly'' match {{nowrap\|Latin ''d-''<ref name="bonetwoseven">{{harvnb\|Beekes\|1995\|p=127}}</ref>}} since a large set of English and Latin non-borrowed cognates cannot be assembled such that English ''d'' repeatedly and consistently corresponds to Latin ''d'' at the beginning of a word, and whatever sporadic matches can be observed are due either to chance (as in the above example) or to [[loanword\|borrowing]] (for example, Latin ''{{lang\|la\|diabolus''}} and English ''devil'', both ultimately of Greek origin<ref>{{cite encyclopedia\|title=devil\|encyclopedia=Dictionary.com\|url=http://dictionary.reference.com/browse/devil}}</ref>). However, English and Latin exhibit a regular correspondence of ''t-'' : ''d-''<ref name="bonetwoseven"/> (in which "A : B" means "A corresponds to B"), as in the following examples:<ref>In Latin, {{angle bracket\|c}} represents {{IPA\|/k/}}; ''dingua'' is an [[Old Latin]] form of the word later attested as ''lingua'' ("tongue").</ref> {\| class="wikitable" Line 170: \|- \| align=left \|  '''Latin'''  \| align=center \|  {{lang\|la\|'''d'''ecem}}  \| align=center \|  {{lang\|la\|'''d'''uo}}  \| align=center \|  {{lang\|la\|'''d'''ūco}}  \| align=center \|  {{lang\|la\|'''d'''ingua}}  \| align=center \|  {{lang\|la\|'''d'''ent-}}  \|} Line 182: During the late 18th to late 19th century, two major developments improved the method's effectiveness. First, it was found~~{{by whom\|date=November 2017}}~~ that many sound changes are conditioned by a specific ''context''. 
For example, in both [[Ancient Greek\|Greek]] and [[Sanskrit]], an [[Aspiration (phonetics)\|aspirated]] [[stop consonant\|stop]] evolved into an unaspirated one, but only if a second aspirate occurred later in the same word;<ref>{{harvnb\|Beekes\|1995\|p=128}}.</ref> this is [[Grassmann's law]], first described for [[Sanskrit]] by [[Sanskrit grammarians\|Sanskrit grammarian]] [[Pāṇini]]<ref>{{harvnb\|Sag\|1974\|p=591}}; {{harvnb\|Janda\|1989}}.</ref> and promulgated by [[Hermann Grassmann]] in 1863. Second, it was found that sound changes sometimes occurred in contexts that were later lost. For instance, in Sanskrit [[velar consonant\|velars]] (''k''-like sounds) were replaced by [[palatal consonant\|palatals]] (''ch''-like sounds) whenever the following vowel was *''i'' or *''e''.<ref>The asterisk (*) indicates that the sound is inferred/reconstructed, rather than historically documented or attested.</ref> Subsequent to this change, all instances of ''e'' were replaced by ''a''.<ref>More accurately, earlier ''e'', ''o'', and ''a'' merged as ''a''.</ref> The situation could be reconstructed only because the original distribution of ''e'' and ''a'' could be recovered from the evidence of other [[Indo-European languages]].<ref>{{harvnb\|Beekes\|1995\|pp=60–61}}.</ref> For instance, the [[Latin]] suffix ''{{lang\|la\|que''}}, "and", preserves the original ''e'' vowel that caused the consonant shift in Sanskrit: {\| class="wikitable" Line 215: \|- \|  '''1.'''  \| align=center \|  {{lang\|it\|corpo}}  \| align=center \|  {{lang\|es\|cuerpo}}  \| align=center \|  {{lang\|pt\|corpo}}  \| align=center \|  {{lang\|fr\|corps}}  \| align=center \|  body  \|- \|  '''2.'''  \| align=center \|  {{lang\|it\|crudo}}  \| align=center \|  {{lang\|es\|crudo}}  \| align=center \|  {{lang\|pt\|cru}}  \| align=center \|  {{lang\|fr\|cru}}  \| align=center \|  raw  \|- \|  '''3.'''  \| align=center \|  {{lang\|it\|catena}}  \| align=center \|  {{lang\|es\|cadena}}  \| align=center \|  
{{lang\|pt\|cadeia}}  \| align=center \|  {{lang\|fr\|chaîne}}  \| align=center \|  chain  \|- \|  '''4.'''  \| align=center \|  {{lang\|it\|cacciare}}  \| align=center \|  {{lang\|es\|cazar}}  \| align=center \|  {{lang\|pt\|caçar}}  \| align=center \|  {{lang\|fr\|chasser}}  \| align=center \|  to hunt  \|} Line 265: \|} Since French ''{{IPA\|ʃ}}'' occurs only before ''a'' where the other languages also have ''a'', and French ''k'' occurs elsewhere, the difference is caused by different environments (being before ''a'' conditions the change), and the sets are complementary. They can, therefore, be assumed to reflect a single proto-phoneme (in this case ''k'', spelled ⟨c⟩ in [[Latin language\|Latin]]).<ref>{{harvnb\|Campbell\|2004\|p=26}}.</ref> The original Latin words are ''{{lang\|la\|corpus''}}, ''{{lang\|la\|crudus''}}, ''{{lang\|la\|catena''}} and ''{{lang\|la\|captiare''}}, all with an initial ''k''. If more evidence along those lines were given, one might conclude that an alteration of the original ''k'' took place because of a different environment. A more complex case involves consonant clusters in [[Proto-Algonquian]]. The Algonquianist [[Leonard Bloomfield]] used the reflexes of the clusters in four of the daughter languages to reconstruct the following correspondence sets:<ref>The table is modified from that in {{harvnb\|Campbell\|2004\|p=141}}.</ref> Line 356: \|} has only one [[Voiced bilabial stop\|voiced stop]], ''b'', and although it has an [[alveolar nasal\|alveolar]] and a [[velar nasal]], ''n'' and ''ŋ'', there is no corresponding [[Bilabial nasal\|labial nasal]]. 
However, languages generally maintain symmetry in their phonemic inventories.<ref>{{Cite journal \|last1=Tabain \|first1=Marija \|last2=Garellek \|first2=Marc \|last3=Hellwig \|first3=Birgit \|last4=Gregory \|first4=Adele \|last5=Beare \|first5=Richard \|date=2022-03-01 \|title=Voicing in Qaqet: Prenasalization and language contact \|journal=Journal of Phonetics \|language=en \|volume=91 \|~~pages~~article-number=101138 \|doi=10.1016/j.wocn.2022.101138 \|s2cid=247211541 \|issn=0095-4470\|doi-access=free }}</ref> In this case, a linguist might attempt to investigate the possibilities that either what was earlier reconstructed as ''b'' is in fact ''m'' or that the ''n'' and ''ŋ'' are in fact ''d'' and ''g''. Even a symmetrical system can be typologically suspicious. For example, here is the traditional [[Proto-Indo-European]] stop inventory:<ref>{{harvnb\|Beekes\|1995\|p=124}}.</ref> Line 404: <blockquote>The Comparative Method ''as such'' is not, in fact, historical; it provides evidence of linguistic relationships to which we may give a historical interpretation.... [Our increased knowledge about the historical processes involved] has probably made historical linguists less prone to equate the idealizations required by the method with historical reality.... Provided we keep [the interpretation of the results and the method itself] apart, the Comparative Method can continue to be used in the reconstruction of earlier stages of languages.</blockquote> Proto-languages can be verified in many historical instances, such as Latin.<ref>{{Cite book \|last=Kortlandt \|first=Frederik ~~\|url=https://www.worldcat.org/oclc/697534924~~ \|title=Studies in Germanic, Indo-European and Indo-Uralic \|date=2010 \|publisher=Rodopi \|isbn=978-90-420-3136-4 \|___location=Amsterdam \|oclc=697534924}}</ref><ref>{{Cite book \|last=Koerner \|first=E. F. K. ~~\|url=https://www.worldcat.org/oclc/742367480~~ \|title=Linguistic historiography : projects & prospects \|date=1999 \|publisher=J. 
Benjamins \|isbn=978-90-272-8377-1 \|___location=Amsterdam \|oclc=742367480}}</ref> Although no longer a law, settlement-archaeology is known to be essentially valid for some cultures that straddle history and prehistory, such as the Celtic Iron Age (mainly Celtic) and [[Mycenaean civilization]] (mainly Greek). None of those models can be or have been completely rejected, but none is sufficient alone. ===The Neogrammarian principle=== Line 438: The tree model features nodes that are presumed to be distinct proto-languages existing independently in distinct regions during distinct historical times. The reconstruction of unattested proto-languages lends itself to that illusion since they cannot be verified, and the linguist is free to select whatever definite times and places seem best. Right from the outset of Indo-European studies, however, [[Thomas Young (scientist)\|Thomas Young]] said:<ref>{{citation\|title=Miscellaneous works of the late Thomas Young\|first=Thomas\|last=Young\|contribution=Languages, From the Supplement to the Encyclopædia Britannica, vol. V, 1824\|volume=III, Hieroglyphical Essays and Correspondence, &c.\|editor-first=John\|editor-last=Leitch\|___location=London\|publisher=John Murray\|year=1855\|page=480}}</ref><blockquote>It is not, however, very easy to say what the definition should be that should constitute a separate language, but it seems most natural to call those languages distinct, of which the one cannot be understood by common persons in the habit of speaking the other.... Still, however, it may remain doubtfull whether the Danes and the Swedes could not, in general, understand each other tolerably well... nor is it possible to say if the twenty ways of pronouncing the sounds, belonging to the Chinese characters, ought or ought not to be considered as so many languages or dialects.... But,... 
the languages so nearly allied must stand next to each other in a systematic order…</blockquote> The assumption of uniformity in a proto-language, implicit in the comparative method, is problematic. Even small language communities ~~are~~ always have differences in [[dialect]], whether they are based on area, gender, class or other factors. The [[Pirahã language]] of [[Brazil]] is spoken by only several hundred people but has at least two different dialects, one spoken by men and one by women.<ref>{{harvnb\|Aikhenvald\|1999\|p=354}}; {{harvnb\|Ladefoged\|2003\|p=14}}.</ref> Campbell points out:<ref>{{harvnb\|Campbell\|2004\|pp=146–147}}</ref> <blockquote>It is not so much that the comparative method 'assumes' no variation; rather, it is just that there is nothing built into the comparative method which would allow it to address variation directly.... This assumption of uniformity is a reasonable idealization; it does no more damage to the understanding of the language than, say, modern reference grammars do which concentrate on a language's general structure, typically leaving out consideration of regional or social variation.</blockquote> Line 448: The reconstruction of unknown proto-languages is inherently subjective. In the [[Proto-Algonquian]] example above, the choice of ''m'' as the parent [[phoneme]] is only ''likely'', not ''certain''. It is conceivable that a Proto-Algonquian language with ''b'' in those positions split into two branches, one that preserved ''b'' and one that changed it to ''m'' instead, and while the first branch developed only into [[Arapaho language\|Arapaho]], the second spread out more widely and developed into all the other [[Algonquian peoples\|Algonquian]] tribes. It is also possible that the nearest common ancestor of the [[Algonquian languages]] used some other sound instead, such as ''p'', which eventually mutated to ''b'' in one branch and to ''m'' in the other. 
Examples of strikingly complicated and even circular developments are indeed known to have occurred (such as Proto-Indo-European ''t'' > Pre-Proto-Germanic ''þ'' > [[Proto-Germanic]] ''ð'' > Proto-West-Germanic ''d'' > [[Old High German]] ''{{lang\|goh\|t''}} in ''{{lang\|goh\|fater''}} > Modern German ''{{lang\|de\|Vater''}}), but in the absence of any evidence or other reason to postulate a more complicated development, the preference for a simpler explanation is justified by the principle of parsimony, also known as [[Occam's razor]]. Since reconstruction involves many such choices, some linguists{{who\|date=January 2020}} prefer to view the reconstructed features as abstract representations of sound correspondences, rather than as objects with a historical time and place.{{citation needed\|date=January 2020}} The existence of proto-languages and the validity of the comparative method are verifiable if the reconstruction can be matched to a known language, which may be known only as a shadow in the [[loanword]]s of another language. For example, [[Finnic languages]] such as [[Finnish language\|Finnish]] have borrowed many words from an early stage of [[Germanic languages\|Germanic]], and the shape of the loans matches the forms that have been reconstructed for [[Proto-Germanic]]. Finnish ''{{lang\|fi\|kuningas''}} 'king' and ''{{lang\|fi\|kaunis''}} 'beautiful' match the Germanic reconstructions ''kuningaz'' and ''skauniz'' (> German ''{{lang\|de\|König''}} 'king', ''{{lang\|de\|schön''}} 'beautiful').<ref>{{harvnb\|Kylstra\|1996\|p=62}} for KAUNIS, p. 122 for KUNINGAS.</ref> ====Additional models==== Line 511: [[Category:Historical linguistics]] [[Category:Comparative linguistics]] [[Category:Methods in linguistics]]