Wikipedia:Reference desk/Archives/Language/2025 July 29
July 29
Semitic roots and LLM tokenisation
I was recently sent an abstract about failures of LLMs to correctly answer questions about the Quran in Arabic. That got me pondering. As I understand the tokenizers used in LLMs, they identify relatively frequent character sequences as tokens. That is a good match for languages that work (mostly) with a word stem and various prefixes and suffixes for grammatical markers. But is this a good match for languages like Hebrew or Arabic that use multiliteral roots and modify words by injecting extra characters in between the root consonants? And is this the right desk or is this a computing question? --Stephan Schulz (talk) 13:17, 29 July 2025 (UTC)
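Editorial illustration of the question above: a minimal Python sketch, not a real tokenizer. It compares inflected forms of English "walk" with Latin-script transliterations of Arabic words built on the root k-t-b. The word lists and both helper functions are assumptions made up for this example. Note also that unvocalised Arabic script normally omits short vowels, so the written forms overlap more than these transliterations suggest; the sketch only shows why contiguous character sequences and interleaved roots are different things, and it does not settle how any actual LLM tokenizer behaves.

```python
def longest_common_substring(words):
    """Return the longest contiguous string that occurs in every word."""
    shortest = min(words, key=len)
    # Try candidate lengths from longest to shortest.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in word for word in words):
                return candidate
    return ""


def is_subsequence(root, word):
    """True if the letters of root occur in word in order, gaps allowed."""
    remaining = iter(word)
    # 'ch in remaining' consumes the iterator up to the first match.
    return all(ch in remaining for ch in root)


# Hypothetical sample data (stem plus suffixes).
ENGLISH_FORMS = ["walk", "walked", "walking", "walks"]
# Hypothetical sample data: transliterated forms of the root k-t-b.
KTB_FORMS = ["kataba", "kitab", "maktub", "katib", "kutub"]

print(longest_common_substring(ENGLISH_FORMS))            # the shared stem
print(len(longest_common_substring(KTB_FORMS)))           # only single letters are shared
print(all(is_subsequence("ktb", w) for w in KTB_FORMS))   # root is present, but interleaved
```

In this toy data the English forms share the whole contiguous stem, while the transliterated k-t-b forms share the root only as a non-contiguous subsequence, which is the pattern the question is asking about.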
Homage - a vs. an
Is it "a homage" or "an homage" in American English? The Cable Guy has: The fight sequence at Medieval Times between Chip (Jim Carrey) and Steven (Matthew Broderick) is an homage to the Star Trek episode "Amok Time"...
Jay 💬 13:19, 29 July 2025 (UTC)
- I would say "an". As a rule, the article follows pronunciation, not spelling, and the "h" is silent. --Stephan Schulz (talk) 14:15, 29 July 2025 (UTC)
- I would assert that the version with the silent h - usually along with the faux-French pronunciation with the stress on the second syllable - is a relatively recent arrival that has become a sort of buzz word that people use in mostly inappropriate places. All my life it was only ever HOMM-idge, until this weird o-MAHZH started cropping up about 20 years ago, particularly among pop culture people who think they're sounding sophisticated. -- Jack of Oz [pleasantries] 21:42, 29 July 2025 (UTC)
- Thanks. After posting this, I saw more usages - two at Sofia Coppola, so realized it was not a one-off. I didn't even know "homage" can be pronounced with a silent 'h'. When I asked Google to pronounce the word, it did not make the h silent. When I asked it to pronounce with the silent h, it did not, but gave me further links. Later on the AI must have kicked in, and it gave me a drop-down with British and American pronunciations, with the silent h for American. Jay 💬 10:05, 30 July 2025 (UTC)
- To me the om-aazh pronunciation is very much the language of pseuds and poseurs. I tend to associate it with an excessive admiration for Bloomsbury and Wagner, but that's probably the people I first heard it from, them and late-night BBC2/Channel 4 "culchure" programmes. DuncanHill (talk) 11:57, 30 July 2025 (UTC)
- I think that originally there were two different but related words. Firstly, there's homage (pronounced with an /h/, stressed on the first syllable and ending with the "j" sound), which meant "respect paid to someone" and in a historical context "the oath sworn by a subordinate to his lord in the Middle Ages"; this word was inherited from Middle English, which borrowed it from Old French in the 13th century or so. Secondly, there's the doublet hommage (pronounced without an /h/, stressed on the second syllable and ending with the "zh" sound), which means "a work of art done in respectful imitation of another artist"; this word was borrowed from modern French probably in the 20th century. However, the distinction between the two in both spelling and pronunciation has become blurred in recent years, so that among younger people at least the spelling homage and French-like pronunciation (no /h/, second syllable stress, ending with "zh") is being used for both the general meaning and the art-specific meaning. Language changes, cry me a river. —Mahāgaja · talk 16:42, 31 July 2025 (UTC)
- Either pronunciation and either spelling is found on either side of the Atlantic Divide.
- The prevalent form has a sounded /h/ in RP and a silent ⟨h⟩ in American English.
- For more, see Ben Zimmer's item 'Homage' in the New York Times Magazine column On Language. ‑‑Lambiam 19:01, 29 July 2025 (UTC)
- Ngram Viewer verifies it [7]. Modocc (talk) 20:46, 29 July 2025 (UTC)
It's complicated.
h (or eta) hasn't been used in the Greek language since it was replaced with the spiritus asper diacritic in the Athenian Spelling Reform of 403 B.C. Therefore, such words are treated as beginning with a vowel even though the first syllable is aspirated. Nevertheless, in Latin—and by extension, English—we to this day continue to spell Greek-based words such as "hero," "habit," and "history" with the 8th letter for reasons of etymology. (And don't even get me started on the whole "ydor/hydrogen" "nero/aneroid" mess.)
Thus, to widely varying degrees, prescriptive English grammarians have insisted on using an before such words so as to honor their origin. Viz., some always use an; some always use a; and still others use every possible combination in between.
In my personal writing style, however, it's quite simple.
A.) If the first letter is silent, then always use an.
Silent h |
---|
an hour |
an honor |
an heir |
and B.) If the first letter is not silent, then only use an if the word has a primary stress on the second syllable.
Primary stress on first syllable | Primary stress on second syllable | Primary stress on third syllable | Primary stress on fourth syllable |
---|---|---|---|
a hero | an heroic act | a hydroponic plant | a heterosexual man |
a history | an historic occasion | a hyperactive child | |
a habit | an habitual offender | a homogeneous mixture | |
Most 21st-Century writers consider this somewhat dated. I myself, however, find that it still makes quite an impression!
Pine (talk) 23:50, 29 July 2025 (UTC)
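Editorial illustration: the two-part rule in the post above can be sketched as a small Python function. This is only a sketch of that personal style rule, not a general a/an chooser. The function name and the example tuples are hypothetical, and the caller has to supply whether the h is silent and which syllable carries primary stress (the values below are taken from the tables in the post), since the code does no pronunciation lookup.

```python
def article_for_h_word(h_is_silent: bool, stressed_syllable: int) -> str:
    """Pick 'a' or 'an' for an h-initial word under the rule stated above.

    h_is_silent: True if the initial h is not pronounced (hour, honor, heir).
    stressed_syllable: 1-based index of the syllable with primary stress.
    """
    # Rule A: a silent initial letter always takes 'an'.
    if h_is_silent:
        return "an"
    # Rule B: a sounded h takes 'an' only when primary stress
    # falls on the second syllable.
    return "an" if stressed_syllable == 2 else "a"


# (word, h_is_silent, stressed_syllable) as classified in the post's tables.
EXAMPLES = [
    ("hour", True, 1),
    ("hero", False, 1),
    ("heroic act", False, 2),
    ("historic occasion", False, 2),
    ("hydroponic plant", False, 3),
    ("heterosexual man", False, 4),
]

for phrase, silent, stress in EXAMPLES:
    print(article_for_h_word(silent, stress), phrase)
```

Under this rule, a word whose stress placement is disputed would get a different article depending on which stress value the caller passes in.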
- This is an excellent answer! I learned something and will pay homage to you ;-). --Stephan Schulz (talk) 07:28, 30 July 2025 (UTC)
- A small addendum to Pine's answer (which corresponds exactly to my practice in writing and formal speech): in accents such as Cockney that drop initial 'h' – so that, for example, hedge becomes 'edge – speakers usually treat the result as vowel-initial and precede it by 'an' – a hedge, an 'edge – thus avoiding the insertion of a glottal stop. (I myself do this when speaking casually.) In writing, however, this would only be applied if attempting to represent the accent in spoken mono- or dialogue. {The poster formerly known as 87.81.230.195} 90.193.253.201 (talk) 11:50, 30 July 2025 (UTC)
- Some people think the primary stress in homogeneous is on the second syllable though. Card Zero (talk) 12:02, 30 July 2025 (UTC)
- Because they're pronouncing "-neous" as one syllable, as in the recent variant (and arguably incorrect) "homogenous"? That would make it the antepenultimate syllable, which (pace Pine above) is I think the underlying rule from original Greek pronunciation. {The poster formerly known as 87.81.230.195} 90.193.253.201 (talk) 13:55, 30 July 2025 (UTC)
- Indeed!
- And I'd like to add that, according to the Oxford American Dictionary, Third Edition, homogeneous is the only correct spelling when used to mean all of the same kind, as in "Iceland's homogeneous population."
- Said dictionary, however, also lists homogenous as a term once used in evolutionary biology meaning "having a common descent," but now more-or-less displaced by homologous.
- Pine (talk) 17:43, 30 July 2025 (UTC)
- OED says "The spelling homogenous is less common than the pronunciation /həˈmɒdʒɪnəs/, which perhaps owes its currency partly to the influence of the verb homogenize and its derivatives." It has citations from Websters, The Times, Elisabeth Palmer's translation of Andre Martinet, and Nature. As for the biological use of homogenous, it gives homogenetic, and in surgery homoplastic. DuncanHill (talk) 15:55, 31 July 2025 (UTC)