This content applies to:
v2.1 | Latest version:
v4.0 (GA)
Azure AI Document Intelligence models provide multilingual document processing support. Our language support capabilities enable your users to communicate with your applications in natural ways and empower global outreach. Document analysis models enable text extraction from forms and documents and return structured business-ready content ready for your organization's action, use, or progress. The following tables list the available language and locale support by model and feature:
- Read: The read model enables extraction and analysis of printed and handwritten text. This model is the underlying OCR engine for other Document Intelligence prebuilt models like layout, general document, invoice, receipt, identity (ID) document, health insurance card, tax documents and custom models. For more information, see Read model overview
- Layout: The layout model enables extraction and analysis of text, tables, document structure, and selection marks (like radio buttons and checkboxes) from forms and documents.
Note
Language code optional
Document Intelligence's deep learning based universal models extract all multi-lingual text in your documents, including text lines with mixed languages, and don't require specifying a language code.
Don't provide the language code as the parameter unless you are sure of the language and want to force the service to apply only the relevant model. Otherwise, the service may return incomplete and incorrect text.
Also, It's not necessary to specify a locale. This is an optional parameter. The Document Intelligence deep-learning technology will auto-detect the text language in your image.
Read model
Model ID: prebuilt-read
The following table lists read model language support for extracting and analyzing printed text.
Language |
Code (optional) |
Abaza |
abq |
Abkhazian |
ab |
Achinese |
ace |
Acoli |
ach |
Adangme |
ada |
Adyghe |
ady |
Afar |
aa |
Afrikaans |
af |
Akan |
ak |
Albanian |
sq |
Algonquin |
alq |
Angika (Devanagari) |
anp |
Arabic |
ar |
Asturian |
ast |
Asu (Tanzania) |
asa |
Avaric |
av |
Awadhi-Hindi (Devanagari) |
awa |
Aymara |
ay |
Azerbaijani (Latin) |
az |
Bafia |
ksf |
Bagheli |
bfy |
Bambara |
bm |
Bashkir |
ba |
Basque |
eu |
Belarusian (Cyrillic) |
be , be-cyrl |
Belarusian (Latin) |
be , be-latn |
Bemba (Zambia) |
bem |
Bena (Tanzania) |
bez |
Bhojpuri-Hindi (Devanagari) |
bho |
Bikol |
bik |
Bini |
bin |
Bislama |
bi |
Bodo (Devanagari) |
brx |
Bosnian (Latin) |
bs |
Brajbha |
bra |
Breton |
br |
Bulgarian |
bg |
Bundeli |
bns |
Buryat (Cyrillic) |
bua |
Catalan |
ca |
Cebuano |
ceb |
Chamling |
rab |
Chamorro |
ch |
Chechen |
ce |
Chhattisgarhi (Devanagari) |
hne |
Chiga |
cgg |
Chinese Simplified |
zh-Hans |
Chinese Traditional |
zh-Hant |
Choctaw |
cho |
Chukot |
ckt |
Chuvash |
cv |
Cornish |
kw |
Corsican |
co |
Cree |
cr |
Creek |
mus |
Crimean Tatar (Latin) |
crh |
Croatian |
hr |
Crow |
cro |
Czech |
cs |
Danish |
da |
Dargwa |
dar |
Dari |
prs |
Dhimal (Devanagari) |
dhi |
Dogri (Devanagari) |
doi |
Duala |
dua |
Dungan |
dng |
Dutch |
nl |
Efik |
efi |
English |
en |
Erzya (Cyrillic) |
myv |
Estonian |
et |
Faroese |
fo |
Fijian |
fj |
Filipino |
fil |
Finnish |
fi |
Language |
Code (optional) |
Fon |
fon |
French |
fr |
Friulian |
fur |
Ga |
gaa |
Gagauz (Latin) |
gag |
Galician |
gl |
Ganda |
lg |
Gayo |
gay |
German |
de |
Gilbertese |
gil |
Gondi (Devanagari) |
gon |
Greek |
el |
Greenlandic |
kl |
Guarani |
gn |
Gurung (Devanagari) |
gvr |
Gusii |
guz |
Haitian Creole |
ht |
Halbi (Devanagari) |
hlb |
Hani |
hni |
Haryanvi |
bgc |
Hawaiian |
haw |
Hebrew |
he |
Herero |
hz |
Hiligaynon |
hil |
Hindi |
hi |
Hmong Daw (Latin) |
mww |
Ho(Devanagiri) |
hoc |
Hungarian |
hu |
Iban |
iba |
Icelandic |
is |
Igbo |
ig |
Iloko |
ilo |
Inari Sami |
smn |
Indonesian |
id |
Ingush |
inh |
Interlingua |
ia |
Inuktitut (Latin) |
iu |
Irish |
ga |
Italian |
it |
Japanese |
ja |
Jaunsari (Devanagari) |
Jns |
Javanese |
jv |
Jola-Fonyi |
dyo |
Kabardian |
kbd |
Kabuverdianu |
kea |
Kachin (Latin) |
kac |
Kalenjin |
kln |
Kalmyk |
xal |
Kangri (Devanagari) |
xnr |
Kanuri |
kr |
Karachay-Balkar |
krc |
Kara-Kalpak (Cyrillic) |
kaa-cyrl |
Kara-Kalpak (Latin) |
kaa |
Kashubian |
csb |
Kazakh (Cyrillic) |
kk-cyrl |
Kazakh (Latin) |
kk-latn |
Khakas |
kjh |
Khaling |
klr |
Khasi |
kha |
K'iche' |
quc |
Kikuyu |
ki |
Kildin Sami |
sjd |
Kinyarwanda |
rw |
Komi |
kv |
Kongo |
kg |
Korean |
ko |
Korku |
kfq |
Koryak |
kpy |
Kosraean |
kos |
Kpelle |
kpe |
Kuanyama |
kj |
Kumyk (Cyrillic) |
kum |
Kurdish (Arabic) |
ku-arab |
Kurdish (Latin) |
ku-latn |
Kurukh (Devanagari) |
kru |
Kyrgyz (Cyrillic) |
ky |
Lak |
lbe |
Lakota |
lkt |
Language |
Code (optional) |
Latin |
la |
Latvian |
lv |
Lezghian |
lex |
Lingala |
ln |
Lithuanian |
lt |
Lower Sorbian |
dsb |
Lozi |
loz |
Lule Sami |
smj |
Luo (Kenya and Tanzania) |
luo |
Luxembourgish |
lb |
Luyia |
luy |
Macedonian |
mk |
Machame |
jmc |
Madurese |
mad |
Mahasu Pahari (Devanagari) |
bfz |
Makhuwa-Meetto |
mgh |
Makonde |
kde |
Malagasy |
mg |
Malay (Latin) |
ms |
Maltese |
mt |
Malto (Devanagari) |
kmj |
Mandinka |
mnk |
Manx |
gv |
Maori |
mi |
Mapudungun |
arn |
Marathi |
mr |
Mari (Russia) |
chm |
Masai |
mas |
Mende (Sierra Leone) |
men |
Meru |
mer |
Meta' |
mgo |
Minangkabau |
min |
Mohawk |
moh |
Mongolian (Cyrillic) |
mn |
Mongondow |
mog |
Montenegrin (Cyrillic) |
cnr-cyrl |
Montenegrin (Latin) |
cnr-latn |
Morisyen |
mfe |
Mundang |
mua |
Nahuatl |
nah |
Navajo |
nv |
Ndonga |
ng |
Neapolitan |
nap |
Nepali |
ne |
Ngomba |
jgo |
Niuean |
niu |
Nogay |
nog |
North Ndebele |
nd |
Northern Sami (Latin) |
sme |
Norwegian |
no |
Nyanja |
ny |
Nyankole |
nyn |
Nzima |
nzi |
Occitan |
oc |
Ojibwa |
oj |
Oromo |
om |
Ossetic |
os |
Pampanga |
pam |
Pangasinan |
pag |
Papiamento |
pap |
Pashto |
ps |
Pedi |
nso |
Persian |
fa |
Polish |
pl |
Portuguese |
pt |
Punjabi (Arabic) |
pa |
Quechua |
qu |
Ripuarian |
ksh |
Romanian |
ro |
Romansh |
rm |
Rundi |
rn |
Russian |
ru |
Rwa |
rwk |
Sadri (Devanagari) |
sck |
Sakha |
sah |
Samburu |
saq |
Samoan (Latin) |
sm |
Sango |
sg |
Language |
Code (optional) |
Sangu (Gabon) |
snq |
Sanskrit (Devanagari) |
sa |
Santali(Devanagiri) |
sat |
Scots |
sco |
Scottish Gaelic |
gd |
Sena |
seh |
Serbian (Cyrillic) |
sr-cyrl |
Serbian (Latin) |
sr , sr-latn |
Shambala |
ksb |
Shona |
sn |
Siksika |
bla |
Sirmauri (Devanagari) |
srx |
Skolt Sami |
sms |
Slovak |
sk |
Slovenian |
sl |
Soga |
xog |
Somali (Arabic) |
so |
Somali (Latin) |
so-latn |
Songhai |
son |
South Ndebele |
nr |
Southern Altai |
alt |
Southern Sami |
sma |
Southern Sotho |
st |
Spanish |
es |
Sundanese |
su |
Swahili (Latin) |
sw |
Swati |
ss |
Swedish |
sv |
Tabassaran |
tab |
Tachelhit |
shi |
Tahitian |
ty |
Taita |
dav |
Tajik (Cyrillic) |
tg |
Tamil |
ta |
Tatar (Cyrillic) |
tt-cyrl |
Tatar (Latin) |
tt |
Teso |
teo |
Tetum |
tet |
Thai |
th |
Thangmi |
thf |
Tok Pisin |
tpi |
Tongan |
to |
Tsonga |
ts |
Tswana |
tn |
Turkish |
tr |
Turkmen (Latin) |
tk |
Tuvan |
tyv |
Udmurt |
udm |
Uighur (Cyrillic) |
ug-cyrl |
Ukrainian |
uk |
Upper Sorbian |
hsb |
Urdu |
ur |
Uyghur (Arabic) |
ug |
Uzbek (Arabic) |
uz-arab |
Uzbek (Cyrillic) |
uz-cyrl |
Uzbek (Latin) |
uz |
Vietnamese |
vi |
Volapük |
vo |
Vunjo |
vun |
Walser |
wae |
Welsh |
cy |
Western Frisian |
fy |
Wolof |
wo |
Xhosa |
xh |
Yucatec Maya |
yua |
Zapotec |
zap |
Zarma |
dje |
Zhuang |
za |
Zulu |
zu |
The following table lists read model language support for extracting and analyzing printed text.
Language |
Code (optional) |
Afrikaans |
af |
Angika |
anp |
Arabic |
ar |
Asturian |
ast |
Awadhi |
awa |
Azerbaijani |
az |
Belarusian (Cyrillic) |
be , be-cyrl |
Belarusian (Latin) |
be-latn |
Bagheli |
bfy |
Mahasu Pahari |
bfz |
Bulgarian |
bg |
Haryanvi |
bgc |
Bhojpuri |
bho |
Bislama |
bi |
Bundeli |
bns |
Breton |
br |
Braj |
bra |
Bodo |
brx |
Bosnian |
bs |
Buriat |
bua |
Catalan |
ca |
Cebuano |
ceb |
Chamorro |
ch |
Montenegrin (Latin) |
cnr , cnr-latn |
Montenegrin (Cyrillic) |
cnr-cyrl |
Corsican |
co |
Crimean Tatar |
crh |
Czech |
cs |
Kashubian |
csb |
Welsh |
cy |
Danish |
da |
German |
de |
Dhimal |
dhi |
Dogri |
doi |
Lower Sorbian |
dsb |
English |
en |
Spanish |
es |
Estonian |
et |
Basque |
eu |
Persian |
fa |
Finnish |
fi |
Filipino |
fil |
Language |
Code (optional) |
Fijian |
fj |
Faroese |
fo |
French |
fr |
Friulian |
fur |
Western Frisian |
fy |
Irish |
ga |
Gagauz |
gag |
Scottish Gaelic |
gd |
Gilbertese |
gil |
Galician |
gl |
Gondi |
gon |
Manx |
gv |
Gurung |
gvr |
Hawaiian |
haw |
Hindi |
hi |
Halbi |
hlb |
Chhattisgarhi |
hne |
Hani |
hni |
Ho |
hoc |
Croatian |
hr |
Upper Sorbian |
hsb |
Haitian |
ht |
Hungarian |
hu |
Interlingua |
ia |
Indonesian |
id |
Icelandic |
is |
Italian |
it |
Inuktitut |
iu |
Japanese |
|
Jaunsari |
jns |
Javanese |
jv |
Kara-Kalpak (Latin) |
kaa , kaa-latn |
Kara-Kalpak (Cyrillic) |
kaa-cyrl |
Kachin |
kac |
Kabuverdianu |
kea |
Korku |
kfq |
Khasi |
kha |
Kazakh (Latin) |
kk , kk-latn |
Kazakh (Cyrillic) |
kk-cyrl |
Kalaallisut |
kl |
Khaling |
klr |
Malto |
kmj |
Language |
Code (optional) |
Korean |
|
Kosraean |
kos |
Koryak |
kpy |
Karachay-Balkar |
krc |
Kurukh |
kru |
Kölsch |
ksh |
Kurdish (Latin) |
ku , ku-latn |
Kurdish (Arabic) |
ku-arab |
Kumyk |
kum |
Cornish |
kw |
Kirghiz |
ky |
Latin |
la |
Luxembourgish |
lb |
Lakota |
lkt |
Lithuanian |
lt |
Maori |
mi |
Mongolian |
mn |
Marathi |
mr |
Malay |
ms |
Maltese |
mt |
Hmong Daw |
mww |
Erzya |
myv |
Neapolitan |
nap |
Nepali |
ne |
Niuean |
niu |
Dutch |
nl |
Norwegian |
no |
Nogai |
nog |
Occitan |
oc |
Ossetian |
os |
Panjabi |
pa |
Polish |
pl |
Dari |
prs |
Pushto |
ps |
Portuguese |
pt |
K'iche' |
quc |
Camling |
rab |
Romansh |
rm |
Romanian |
ro |
Russian |
ru |
Sanskrit |
sa |
Santali |
sat |
Language |
Code (optional) |
Sadri |
sck |
Scots |
sco |
Slovak |
sk |
Slovenian |
sl |
Samoan |
sm |
Southern Sami |
sma |
Northern Sami |
sme |
Lule Sami |
smj |
Inari Sami |
smn |
Skolt Sami |
sms |
Somali |
so |
Albanian |
sq |
Serbian (Latin) |
sr , sr-latn |
Sirmauri |
srx |
Swedish |
sv |
Swahili |
sw |
Tetum |
tet |
Tajik |
tg |
Thangmi |
thf |
Turkmen |
tk |
Tonga |
to |
Turkish |
tr |
Tatar |
tt |
Tuvinian |
tyv |
Uighur |
ug |
Urdu |
ur |
Uzbek (Latin) |
uz , uz-latn |
Uzbek (Cyrillic) |
uz-cyrl |
Uzbek (Arabic) |
uz-arab |
Volapük |
vo |
Walser |
wae |
Kangri |
xnr |
Yucateco |
yua |
Zhuang |
za |
Chinese (Han (Simplified variant)) |
zh , zh-hans |
Chinese (Han (Traditional variant)) |
zh-hant |
Zulu |
zu |
The following table lists read model language support for extracting and analyzing handwritten text.
Language |
Language code (optional) |
Language |
Language code (optional) |
English |
en |
Japanese |
ja |
Chinese Simplified |
zh-Hans |
Korean |
ko |
French |
fr |
Portuguese |
pt |
German |
de |
Spanish |
es |
Italian |
it |
Russian |
ru |
Thai |
th |
Arabic |
ar |
The following table lists read model language support for extracting and analyzing handwritten text.
Language |
Language code (optional) |
Language |
Language code (optional) |
English |
en |
Japanese |
ja |
Chinese Simplified |
zh-Hans |
Korean |
ko |
French |
fr |
Portuguese |
pt |
German |
de |
Spanish |
es |
Italian |
it |
|
|
The following table lists read model language support for extracting and analyzing handwritten text.
Language |
Language code (optional) |
Language |
Language code (optional) |
English |
en |
Japanese |
ja |
Chinese Simplified |
zh-Hans |
Korean |
ko |
French |
fr |
Portuguese |
pt |
German |
de |
Spanish |
es |
Italian |
it |
|
|
The Read model API supports language detection for the following languages in your documents. This list can include languages not currently supported for text extraction.
Important
Language detection
- Document Intelligence read model can detect the presence of languages and return language codes for languages detected.
Detected languages vs extracted languages
- This section lists the languages we can detect from the documents using the Read model, if present.
- Please note that this list differs from list of languages we support extracting text from, which is specified in the above sections for each model.
Language |
Code |
Afrikaans |
af |
Albanian |
sq |
Amharic |
am |
Arabic |
ar |
Armenian |
hy |
Assamese |
as |
Azerbaijani |
az |
Basque |
eu |
Belarusian |
be |
Bengali |
bn |
Bosnian |
bs |
Bulgarian |
bg |
Burmese |
my |
Catalan |
ca |
Central Khmer |
km |
Chinese |
zh |
Chinese Simplified |
zh_chs |
Chinese Traditional |
zh_cht |
Corsican |
co |
Croatian |
hr |
Czech |
cs |
Danish |
da |
Dari |
prs |
Divehi |
dv |
Dutch |
nl |
English |
en |
Esperanto |
eo |
Estonian |
et |
Fijian |
fj |
Finnish |
fi |
French |
fr |
Galician |
gl |
Georgian |
ka |
German |
de |
Greek |
el |
Gujarati |
gu |
Haitian |
ht |
Hausa |
ha |
Hebrew |
he |
Hindi |
hi |
Hmong Daw |
mww |
Hungarian |
hu |
Icelandic |
is |
Igbo |
ig |
Indonesian |
id |
Inuktitut |
iu |
Irish |
ga |
Italian |
it |
Japanese |
ja |
Javanese |
jv |
Kannada |
kn |
Kazakh |
kk |
Kinyarwanda |
rw |
Kirghiz |
ky |
Korean |
ko |
Kurdish |
ku |
Lao |
lo |
Latin |
la |
Language |
Code |
Latvian |
lv |
Lithuanian |
lt |
Luxembourgish |
lb |
Macedonian |
mk |
Malagasy |
mg |
Malay |
ms |
Malayalam |
ml |
Maltese |
mt |
Maori |
mi |
Marathi |
mr |
Mongolian |
mn |
Nepali |
ne |
Norwegian |
no |
Norwegian Nynorsk |
nn |
Odia |
or |
Pasht |
ps |
Persian |
fa |
Polish |
pl |
Portuguese |
pt |
Punjabi |
pa |
Queretaro Otomi |
otq |
Romanian |
ro |
Russian |
ru |
Samoan |
sm |
Serbian |
sr |
Shona |
sn |
Sindhi |
sd |
Sinhala |
si |
Slovak |
sk |
Slovenian |
sl |
Somali |
so |
Spanish |
es |
Sundanese |
su |
Swahili |
sw |
Swedish |
sv |
Tagalog |
tl |
Tahitian |
ty |
Tajik |
tg |
Tamil |
ta |
Tatar |
tt |
Telugu |
te |
Thai |
th |
Tibetan |
bo |
Tigrinya |
ti |
Tongan |
to |
Turkish |
tr |
Turkmen |
tk |
Ukrainian |
uk |
Urdu |
ur |
Uzbek |
uz |
Vietnamese |
vi |
Welsh |
cy |
Xhosa |
xh |
Yiddish |
yi |
Yoruba |
yo |
Yucatec Maya |
yua |
Zulu |
zu |
Layout
Model ID: prebuilt-layout
The following table lists the supported languages for printed text:
Language |
Code (optional) |
Abaza |
abq |
Abkhazian |
ab |
Achinese |
ace |
Acoli |
ach |
Adangme |
ada |
Adyghe |
ady |
Afar |
aa |
Afrikaans |
af |
Akan |
ak |
Albanian |
sq |
Algonquin |
alq |
Angika (Devanagari) |
anp |
Arabic |
ar |
Asturian |
ast |
Asu (Tanzania) |
asa |
Avaric |
av |
Awadhi-Hindi (Devanagari) |
awa |
Aymara |
ay |
Azerbaijani (Latin) |
az |
Bafia |
ksf |
Bagheli |
bfy |
Bambara |
bm |
Bashkir |
ba |
Basque |
eu |
Belarusian (Cyrillic) |
be , be-cyrl |
Belarusian (Latin) |
be , be-latn |
Bemba (Zambia) |
bem |
Bena (Tanzania) |
bez |
Bhojpuri-Hindi (Devanagari) |
bho |
Bikol |
bik |
Bini |
bin |
Bislama |
bi |
Bodo (Devanagari) |
brx |
Bosnian (Latin) |
bs |
Brajbha |
bra |
Breton |
br |
Bulgarian |
bg |
Bundeli |
bns |
Buryat (Cyrillic) |
bua |
Catalan |
ca |
Cebuano |
ceb |
Chamling |
rab |
Chamorro |
ch |
Chechen |
ce |
Chhattisgarhi (Devanagari) |
hne |
Chiga |
cgg |
Chinese Simplified |
zh-Hans |
Chinese Traditional |
zh-Hant |
Choctaw |
cho |
Chukot |
ckt |
Chuvash |
cv |
Cornish |
kw |
Corsican |
co |
Cree |
cr |
Creek |
mus |
Crimean Tatar (Latin) |
crh |
Croatian |
hr |
Crow |
cro |
Czech |
cs |
Danish |
da |
Dargwa |
dar |
Dari |
prs |
Dhimal (Devanagari) |
dhi |
Dogri (Devanagari) |
doi |
Duala |
dua |
Dungan |
dng |
Dutch |
nl |
Efik |
efi |
English |
en |
Erzya (Cyrillic) |
myv |
Estonian |
et |
Faroese |
fo |
Fijian |
fj |
Filipino |
fil |
Finnish |
fi |
Language |
Code (optional) |
Fon |
fon |
French |
fr |
Friulian |
fur |
Ga |
gaa |
Gagauz (Latin) |
gag |
Galician |
gl |
Ganda |
lg |
Gayo |
gay |
German |
de |
Gilbertese |
gil |
Gondi (Devanagari) |
gon |
Greek |
el |
Greenlandic |
kl |
Guarani |
gn |
Gurung (Devanagari) |
gvr |
Gusii |
guz |
Haitian Creole |
ht |
Halbi (Devanagari) |
hlb |
Hani |
hni |
Haryanvi |
bgc |
Hawaiian |
haw |
Hebrew |
he |
Herero |
hz |
Hiligaynon |
hil |
Hindi |
hi |
Hmong Daw (Latin) |
mww |
Ho(Devanagiri) |
hoc |
Hungarian |
hu |
Iban |
iba |
Icelandic |
is |
Igbo |
ig |
Iloko |
ilo |
Inari Sami |
smn |
Indonesian |
id |
Ingush |
inh |
Interlingua |
ia |
Inuktitut (Latin) |
iu |
Irish |
ga |
Italian |
it |
Japanese |
ja |
Jaunsari (Devanagari) |
Jns |
Javanese |
jv |
Jola-Fonyi |
dyo |
Kabardian |
kbd |
Kabuverdianu |
kea |
Kachin (Latin) |
kac |
Kalenjin |
kln |
Kalmyk |
xal |
Kangri (Devanagari) |
xnr |
Kanuri |
kr |
Karachay-Balkar |
krc |
Kara-Kalpak (Cyrillic) |
kaa-cyrl |
Kara-Kalpak (Latin) |
kaa |
Kashubian |
csb |
Kazakh (Cyrillic) |
kk-cyrl |
Kazakh (Latin) |
kk-latn |
Khakas |
kjh |
Khaling |
klr |
Khasi |
kha |
K'iche' |
quc |
Kikuyu |
ki |
Kildin Sami |
sjd |
Kinyarwanda |
rw |
Komi |
kv |
Kongo |
kg |
Korean |
ko |
Korku |
kfq |
Koryak |
kpy |
Kosraean |
kos |
Kpelle |
kpe |
Kuanyama |
kj |
Kumyk (Cyrillic) |
kum |
Kurdish (Arabic) |
ku-arab |
Kurdish (Latin) |
ku-latn |
Language |
Code (optional) |
Kurukh (Devanagari) |
kru |
Kyrgyz (Cyrillic) |
ky |
Lak |
lbe |
Lakota |
lkt |
Latin |
la |
Latvian |
lv |
Lezghian |
lex |
Lingala |
ln |
Lithuanian |
lt |
Lower Sorbian |
dsb |
Lozi |
loz |
Lule Sami |
smj |
Luo (Kenya and Tanzania) |
luo |
Luxembourgish |
lb |
Luyia |
luy |
Macedonian |
mk |
Machame |
jmc |
Madurese |
mad |
Mahasu Pahari (Devanagari) |
bfz |
Makhuwa-Meetto |
mgh |
Makonde |
kde |
Malagasy |
mg |
Malay (Latin) |
ms |
Maltese |
mt |
Malto (Devanagari) |
kmj |
Mandinka |
mnk |
Manx |
gv |
Maori |
mi |
Mapudungun |
arn |
Marathi |
mr |
Mari (Russia) |
chm |
Masai |
mas |
Mende (Sierra Leone) |
men |
Meru |
mer |
Meta' |
mgo |
Minangkabau |
min |
Mohawk |
moh |
Mongolian (Cyrillic) |
mn |
Mongondow |
mog |
Montenegrin (Cyrillic) |
cnr-cyrl |
Montenegrin (Latin) |
cnr-latn |
Morisyen |
mfe |
Mundang |
mua |
Nahuatl |
nah |
Navajo |
nv |
Ndonga |
ng |
Neapolitan |
nap |
Nepali |
ne |
Ngomba |
jgo |
Niuean |
niu |
Nogay |
nog |
North Ndebele |
nd |
Northern Sami (Latin) |
sme |
Norwegian |
no |
Nyanja |
ny |
Nyankole |
nyn |
Nzima |
nzi |
Occitan |
oc |
Ojibwa |
oj |
Oromo |
om |
Ossetic |
os |
Pampanga |
pam |
Pangasinan |
pag |
Papiamento |
pap |
Pashto |
ps |
Pedi |
nso |
Persian |
fa |
Polish |
pl |
Portuguese |
pt |
Punjabi (Arabic) |
pa |
Quechua |
qu |
Ripuarian |
ksh |
Romanian |
ro |
Romansh |
rm |
Rundi |
rn |
Russian |
ru |
Language |
Code (optional) |
Rwa |
rwk |
Sadri (Devanagari) |
sck |
Sakha |
sah |
Samburu |
saq |
Samoan (Latin) |
sm |
Sango |
sg |
Sangu (Gabon) |
snq |
Sanskrit (Devanagari) |
sa |
Santali(Devanagiri) |
sat |
Scots |
sco |
Scottish Gaelic |
gd |
Sena |
seh |
Serbian (Cyrillic) |
sr-cyrl |
Serbian (Latin) |
sr , sr-latn |
Shambala |
ksb |
Shona |
sn |
Siksika |
bla |
Sirmauri (Devanagari) |
srx |
Skolt Sami |
sms |
Slovak |
sk |
Slovenian |
sl |
Soga |
xog |
Somali (Arabic) |
so |
Somali (Latin) |
so-latn |
Songhai |
son |
South Ndebele |
nr |
Southern Altai |
alt |
Southern Sami |
sma |
Southern Sotho |
st |
Spanish |
es |
Sundanese |
su |
Swahili (Latin) |
sw |
Swati |
ss |
Swedish |
sv |
Tabassaran |
tab |
Tachelhit |
shi |
Tahitian |
ty |
Taita |
dav |
Tajik (Cyrillic) |
tg |
Tamil |
ta |
Tatar (Cyrillic) |
tt-cyrl |
Tatar (Latin) |
tt |
Teso |
teo |
Tetum |
tet |
Thai |
th |
Thangmi |
thf |
Tok Pisin |
tpi |
Tongan |
to |
Tsonga |
ts |
Tswana |
tn |
Turkish |
tr |
Turkmen (Latin) |
tk |
Tuvan |
tyv |
Udmurt |
udm |
Uighur (Cyrillic) |
ug-cyrl |
Ukrainian |
uk |
Upper Sorbian |
hsb |
Urdu |
ur |
Uyghur (Arabic) |
ug |
Uzbek (Arabic) |
uz-arab |
Uzbek (Cyrillic) |
uz-cyrl |
Uzbek (Latin) |
uz |
Vietnamese |
vi |
Volapük |
vo |
Vunjo |
vun |
Walser |
wae |
Welsh |
cy |
Western Frisian |
fy |
Wolof |
wo |
Xhosa |
xh |
Yucatec Maya |
yua |
Zapotec |
zap |
Zarma |
dje |
Zhuang |
za |
Zulu |
zu |
The following table lists layout model language support for extracting and analyzing printed text.
Language |
Code (optional) |
Afrikaans |
af |
Angika |
anp |
Arabic |
ar |
Asturian |
ast |
Awadhi |
awa |
Azerbaijani |
az |
Belarusian (Cyrillic) |
be , be-cyrl |
Belarusian (Latin) |
be-latn |
Bagheli |
bfy |
Mahasu Pahari |
bfz |
Bulgarian |
bg |
Haryanvi |
bgc |
Bhojpuri |
bho |
Bislama |
bi |
Bundeli |
bns |
Breton |
br |
Braj |
bra |
Bodo |
brx |
Bosnian |
bs |
Buriat |
bua |
Catalan |
ca |
Cebuano |
ceb |
Chamorro |
ch |
Montenegrin (Latin) |
cnr , cnr-latn |
Montenegrin (Cyrillic) |
cnr-cyrl |
Corsican |
co |
Crimean Tatar |
crh |
Czech |
cs |
Kashubian |
csb |
Welsh |
cy |
Danish |
da |
German |
de |
Dhimal |
dhi |
Dogri |
doi |
Lower Sorbian |
dsb |
English |
en |
Spanish |
es |
Estonian |
et |
Basque |
eu |
Persian |
fa |
Finnish |
fi |
Filipino |
fil |
Language |
Code (optional) |
Fijian |
fj |
Faroese |
fo |
French |
fr |
Friulian |
fur |
Western Frisian |
fy |
Irish |
ga |
Gagauz |
gag |
Scottish Gaelic |
gd |
Gilbertese |
gil |
Galician |
gl |
Gondi |
gon |
Manx |
gv |
Gurung |
gvr |
Hawaiian |
haw |
Hindi |
hi |
Halbi |
hlb |
Chhattisgarhi |
hne |
Hani |
hni |
Ho |
hoc |
Croatian |
hr |
Upper Sorbian |
hsb |
Haitian |
ht |
Hungarian |
hu |
Interlingua |
ia |
Indonesian |
id |
Icelandic |
is |
Italian |
it |
Inuktitut |
iu |
Japanese |
|
Jaunsari |
jns |
Javanese |
jv |
Kara-Kalpak (Latin) |
kaa , kaa-latn |
Kara-Kalpak (Cyrillic) |
kaa-cyrl |
Kachin |
kac |
Kabuverdianu |
kea |
Korku |
kfq |
Khasi |
kha |
Kazakh (Latin) |
kk , kk-latn |
Kazakh (Cyrillic) |
kk-cyrl |
Kalaallisut |
kl |
Khaling |
klr |
Malto |
kmj |
Language |
Code (optional) |
Korean |
|
Kosraean |
kos |
Koryak |
kpy |
Karachay-Balkar |
krc |
Kurukh |
kru |
Kölsch |
ksh |
Kurdish (Latin) |
ku , ku-latn |
Kurdish (Arabic) |
ku-arab |
Kumyk |
kum |
Cornish |
kw |
Kirghiz |
ky |
Latin |
la |
Luxembourgish |
lb |
Lakota |
lkt |
Lithuanian |
lt |
Maori |
mi |
Mongolian |
mn |
Marathi |
mr |
Malay |
ms |
Maltese |
mt |
Hmong Daw |
mww |
Erzya |
myv |
Neapolitan |
nap |
Nepali |
ne |
Niuean |
niu |
Dutch |
nl |
Norwegian |
no |
Nogai |
nog |
Occitan |
oc |
Ossetian |
os |
Panjabi |
pa |
Polish |
pl |
Dari |
prs |
Pushto |
ps |
Portuguese |
pt |
K'iche' |
quc |
Camling |
rab |
Romansh |
rm |
Romanian |
ro |
Russian |
ru |
Sanskrit |
sa |
Santali |
sat |
Language |
Code (optional) |
Sadri |
sck |
Scots |
sco |
Slovak |
sk |
Slovenian |
sl |
Samoan |
sm |
Southern Sami |
sma |
Northern Sami |
sme |
Lule Sami |
smj |
Inari Sami |
smn |
Skolt Sami |
sms |
Somali |
so |
Albanian |
sq |
Serbian (Latin) |
sr , sr-latn |
Sirmauri |
srx |
Swedish |
sv |
Swahili |
sw |
Tetum |
tet |
Tajik |
tg |
Thangmi |
thf |
Turkmen |
tk |
Tonga |
to |
Turkish |
tr |
Tatar |
tt |
Tuvinian |
tyv |
Uighur |
ug |
Urdu |
ur |
Uzbek (Latin) |
uz , uz-latn |
Uzbek (Cyrillic) |
uz-cyrl |
Uzbek (Arabic) |
uz-arab |
Volapük |
vo |
Walser |
wae |
Kangri |
xnr |
Yucateco |
yua |
Zhuang |
za |
Chinese (Han (Simplified variant)) |
zh , zh-hans |
Chinese (Han (Traditional variant)) |
zh-hant |
Zulu |
zu |
Language |
Language code |
Afrikaans |
af |
Albanian |
sq |
Asturian |
ast |
Basque |
eu |
Bislama |
bi |
Breton |
br |
Catalan |
ca |
Cebuano |
ceb |
Chamorro |
ch |
Chinese (Simplified) |
zh-Hans |
Chinese (Traditional) |
zh-Hant |
Cornish |
kw |
Corsican |
co |
Crimean Tatar (Latin) |
crh |
Czech |
cs |
Danish |
da |
Dutch |
nl |
English (printed and handwritten) |
en |
Estonian |
et |
Fijian |
fj |
Filipino |
fil |
Finnish |
fi |
French |
fr |
Friulian |
fur |
Galician |
gl |
German |
de |
Gilbertese |
gil |
Greenlandic |
kl |
Haitian Creole |
ht |
Hani |
hni |
Hmong Daw (Latin) |
mww |
Hungarian |
hu |
Indonesian |
id |
Interlingua |
ia |
Inuktitut (Latin) |
iu |
Irish |
ga |
Language |
Language code |
Italian |
it |
Japanese |
ja |
Javanese |
jv |
K'iche' |
quc |
Kabuverdianu |
kea |
Kachin (Latin) |
kac |
Kara-Kalpak |
kaa |
Kashubian |
csb |
Khasi |
kha |
Korean |
ko |
Kurdish (latin) |
kur |
Luxembourgish |
lb |
Malay (Latin) |
ms |
Manx |
gv |
Neapolitan |
nap |
Norwegian |
no |
Occitan |
oc |
Polish |
pl |
Portuguese |
pt |
Romansh |
rm |
Scots |
sco |
Scottish Gaelic |
gd |
Slovenian |
slv |
Spanish |
es |
Swahili (Latin) |
sw |
Swedish |
sv |
Tatar (Latin) |
tat |
Tetum |
tet |
Turkish |
tr |
Upper Sorbian |
hsb |
Uzbek (Latin) |
uz |
Volapük |
vo |
Walser |
wae |
Western Frisian |
fy |
Yucatec Maya |
yua |
Zhuang |
za |
Zulu |
zu |
The following table lists layout model language support for extracting and analyzing handwritten text.
Language |
Language code (optional) |
Language |
Language code (optional) |
English |
en |
Japanese |
ja |
Chinese Simplified |
zh-Hans |
Korean |
ko |
French |
fr |
Portuguese |
pt |
German |
de |
Spanish |
es |
Italian |
it |
Russian |
ru |
Thai |
th |
Arabic |
ar |
Model ID: prebuilt-layout
The following table lists layout model language support for extracting and analyzing handwritten text.
Language |
Language code (optional) |
Language |
Language code (optional) |
English |
en |
Japanese |
ja |
Chinese Simplified |
zh-Hans |
Korean |
ko |
French |
fr |
Portuguese |
pt |
German |
de |
Spanish |
es |
Italian |
it |
|
|
Note
Document Intelligence v2.1 does not support handwritten text extraction.
The following table lists layout model language support for extracting and analyzing handwritten text.
Language |
Language code (optional) |
Language |
Language code (optional) |
English |
en |
Japanese |
ja |
Chinese Simplified |
zh-Hans |
Korean |
ko |
French |
fr |
Portuguese |
pt |
German |
de |
Spanish |
es |
Italian |
it |
Russian |
ru |
Thai |
th |
Arabic |
ar |
General document
Important
With Document Intelligence v4.0:2024-11-30 (GA), the general document model (prebuilt-document) is being added to layout (prebuilt-layout). To extract key-value pairs, selection marks, text, tables, and structure from documents, use the following models:
Key value pairs |
version |
Model ID |
Layout model with query string features=keyValuePairs specified. |
• v4:2024-11-30 (GA) • v3.1:2023-07-31 (GA) |
prebuilt-layout |
General document model |
• v3.1:2023-07-31 (GA) • v3.0:2022-08-31 (GA) |
prebuilt-document |
Model ID: prebuilt-document
The following table lists general document model language support.
Model ID |
Language—Locale code |
Default |
prebuilt-document |
English (United States)—en-US |
English (United States)—en-US |