Help:Multilingual support

From Wikipedia, the free encyclopedia

Articles on the English Wikipedia may contain words or texts written in different languages and scripts. Viewing and editing these articles correctly requires that you have the appropriate fonts installed and that your operating system and browser are correctly configured. This guide will help you do so.

Overview

Unicode

Articles on Wikipedia are encoded using Unicode (specifically UTF-8)[a], an industry standard designed to allow text and symbols from all of the writing systems of the world to be consistently represented and manipulated by computers. Because UTF-8 is backward compatible with ASCII, and most modern browsers have at least basic Unicode support, most users will experience little difficulty reading and editing most of Wikipedia.
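
As a rough illustration of the backward compatibility described above, the following Python sketch encodes an ASCII-only string and the Armenian sample word from this guide as UTF-8. It is not part of MediaWiki, and the variable names are made up for this example. ASCII characters keep their single-byte values, while each Armenian letter takes two bytes.

```python
# Illustrative only: compare ASCII and non-ASCII text under UTF-8.
ascii_text = "Wikipedia"
armenian_text = "Հայաստան"  # sample word from the Armenian section of this guide

ascii_bytes = ascii_text.encode("utf-8")
armenian_bytes = armenian_text.encode("utf-8")

# ASCII characters are encoded to the same single bytes in UTF-8 and in ASCII.
same_as_ascii = ascii_bytes == ascii_text.encode("ascii")

print("UTF-8 equals ASCII for ASCII text:", same_as_ascii)
print("ASCII text:", len(ascii_text), "characters,", len(ascii_bytes), "bytes")
print("Armenian text:", len(armenian_text), "characters,", len(armenian_bytes), "bytes")
```

Characters outside the Basic Multilingual Plane, such as the Adlam or cuneiform samples further down, take four bytes each in UTF-8.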

Font

Most computers with Microsoft Windows, Apple's macOS and many Linux variants will already have fonts with support for Latin, Greek, Cyrillic, Hebrew, Arabic, Chinese, Japanese, Korean and the International Phonetic Alphabet installed. Many mobile devices, such as the iPhone and iPad, also include such fonts. However, several historic and accented characters (used in the transliteration of foreign scripts) may be missing.

Microsoft fonts

Other available Unicode fonts

Bolded fonts are recommended.

Font | Typeface | License | Format | Encoding
Aboriginal | Sans-serif, Serif | Freeware | OpenType | Unicode 5.2
Charis SIL | Serif | Open Source | OpenType, Graphite | Unicode 7.0
Code2002 (archived December 15, 2010) |  | Freeware (must not be altered) | TrueType | Unicode, plane 2
Code2001 0.919 (archived September 27, 2007) |  | Freeware (must not be altered) | TrueType | Unicode, plane 1
Code2000 1.171 (archived September 27, 2007) | Serif | Shareware (unrestricted) | TrueType | Unicode, plane 0
DejaVu | Sans-serif, Sans-mono, Serif | Open Source | OpenType | Unicode
Doulos SIL | Serif | Open Source | OpenType, Graphite | Unicode 7.0
Everson Mono 3.2b4 | Sans-mono | Shareware | TrueType | Unicode
Fonts for Ancient Scripts (Greek, Egyptian, cuneiform...) | Varying | No license, but may be used for any purpose | TrueType | Unicode
Google Noto (project to support all Unicode scripts) | Sans-serif, Serif | Open Source | OpenType | Unicode
Hanazono (80,000+ Chinese characters supported) | Ming (comparable to serifed typefaces) | Freeware (unrestricted) | TrueType | Unicode
Kurinto Font Folio (project to support all human languages) | 21 typefaces with variants | Open Source (OFL) | TrueType | Unicode 12.1
TH-Times (in TH-Tshyn) | Serif | Non-commercial | TrueType | Unicode 15.1
TITUS Cyberbit Basic | Serif | Non-commercial | TrueType, but requires Windows to install | Unicode 4.0
Quivira | Serif | Freeware | OpenType | Unicode 7.0
GNU Unifont | Mono | Freeware (GPL) | TrueType | Unicode 15.0
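
The "plane 0", "plane 1" and "plane 2" entries in the Encoding column refer to Unicode planes, each of which spans 65,536 code points. The Python sketch below shows one way to work out which plane a character falls in, and so which of the Code2000-series fonts would be relevant to it. The helper name `plane_of` and the sample characters are chosen for this example only.

```python
def plane_of(char: str) -> int:
    """Return the Unicode plane (0-16) that a single character belongs to."""
    # Each plane holds 0x10000 code points, so the plane is the code point
    # with its low 16 bits shifted away.
    return ord(char) >> 16


samples = {
    "Latin letter": "A",                      # U+0041
    "Old Persian sign": "\U000103A3",         # U+103A3
    "CJK Extension B ideograph": "\U00020000",  # U+20000
}

for label, char in samples.items():
    print(f"{label}: U+{ord(char):04X} is in plane {plane_of(char)}")
```

Many of the historic scripts listed in this guide, such as cuneiform or Egyptian hieroglyphs, are in plane 1, so a font that covers only plane 0 cannot display them.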

Browsers

Internet Explorer
supports Latin (though not all extended sets), Greek, Cyrillic, Arabic, and Hebrew. Support for East Asian and some Indic scripts is available if the corresponding language support has been installed in Windows. Because Internet Explorer uses only the default font for other scripts, those are usually not supported (unless the default font covers them).
Firefox
tries to render any character using all the fonts available on the system so multilingual support is generally good. The default rendering engine can support complex script rendering. Some Linux distributions ship with a Pango-based rendering engine which also does, although this may currently cause some display glitches with justified text.
Opera
tries to render any character using all the fonts available on the system so multilingual support is also good.[5] Opera uses the operating system to perform contextual glyph selection, ligature forming, character stacking, combining character support and other character shaping tasks.[6]
Chrome
does not directly support the scripts of several South and Southeast Asian languages and may render them as "tofu" (empty boxes) because of limitations in its font fallback mechanism; the Advanced Font Settings extension may help. It renders the Devanagari (used for Hindi), Bengali, Sinhala, Gurmukhi, and Tibetan scripts in the examples below, but not the scripts of some Southeast Asian languages.

Scripts

Adlam

Adlam is a right-to-left alphabetic script devised by the brothers Ibrahima and Abdoulaye Barry, in order to represent the Fula language (Fulani). It is supported by the following fonts:

Correct rendering Your browser/device
𞤀𞤣𞤤𞤢𞤥

Note: As of August 2018, this script is not being used on the Fula Wikipedia.

Aegean numerals

Aegean numerals were used by the Minoan and Mycenaean civilizations. They are supported by the following fonts:

Correct rendering Your browser/device
𐄢𐄡𐄗𐄌

Ahom

The Ahom script is used to write the Ahom language. It is supported by the following fonts:

Correct rendering Your browser/device
𑜇𑜞

Ancient South Arabian

Ancient South Arabian script (Old South Arabian) was used to write the Minean, Sabaean, Qatabanian, Hadramite, and Himyaritic languages of Yemen from the 8th century BCE to the 6th century CE. It is supported by the following fonts:

Correct rendering Your browser/device
𐩠𐩭𐩵𐩼𐩥

Armenian

The Armenian alphabet is only used to write the Armenian language. It is supported by the following fonts:

Correct rendering Your browser/device
Հայաստան

Avestan

The Avestan alphabet is used to write the Avestan language. It is supported by the following fonts:

Correct rendering Your browser/device
𐬯𐬭𐬀𐬊𐬔𐬁

Balinese

The Balinese script is used to write the Balinese language. The script is encoded in block "Balinese", code points 1B00–1B7F (Unicode.org chart). It is supported by the following fonts:

Correct rendering
Your browser/device ᭚ᬲ᭄ᬯᬲ᭄ᬢᬶ​ᬧ᭄ᬭᬧ᭄ᬢᬶ​ᬭᬶᬂ​ᬯᬶᬓᬶᬧᬾᬤᬶᬳ​ᬩᬲ​ᬩᬮᬶ᭟
Transliteration Swasti Prapti ring Wikipédia Basa Bali
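
Block ranges such as the one quoted for Balinese can be used to check which script a character belongs to, and therefore which font you may need. The following Python sketch uses only the block ranges stated in this guide. The `BLOCKS` table and the `block_of` helper are hypothetical names for this example, not an official API.

```python
# Block ranges as quoted in this guide (inclusive code point bounds).
BLOCKS = {
    "Balinese": (0x1B00, 0x1B7F),
    "Myanmar": (0x1000, 0x109F),
    "Buginese": (0x1A00, 0x1A1F),
    "Sundanese": (0x1B80, 0x1BBF),
    "Old Persian": (0x103A0, 0x103DF),
    "Pahawh Hmong": (0x16B00, 0x16B8F),
}


def block_of(char):
    """Return the name of the listed block containing char, or None."""
    code_point = ord(char)
    for name, (first, last) in BLOCKS.items():
        if first <= code_point <= last:
            return name
    return None


for ch in ("\u1B05", "\u1000", "\U000103A3", "A"):
    print(f"U+{ord(ch):04X}: {block_of(ch)}")
```

A character for which `block_of` returns `None` is simply outside the handful of blocks listed here; the full list of blocks is published by the Unicode Consortium.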

Bamum

Bamum is a series of scripts devised for the Bamum language by King Njoya of Cameroon between 1896 and 1918. It is supported by the following fonts:

Correct rendering Your browser/device
ꚩꚫꛑꚩꚳ ꛆꚧꛂ

Bassa Vah

Bassa Vah, also known simply as vah ('throwing a sign' in Bassa), is an alphabetic script for writing the Bassa language of Liberia that was invented by Thomas Flo Lewis. It is supported by the following fonts:

Correct rendering Your browser/device
𖫧

Batak

The Batak alphabet is used to write the Batak languages. It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
ᯀᯂ᯲ᯘᯒ aksara

Note: As of August 2018, this script is not in wide use on the Toba Batak test wiki at the Wikimedia Incubator (apart from a few images on the Main Page).

Baybayin / Old Tagalog

Baybayin (also known as the Tagalog script in Unicode and sometimes mistakenly referred to as Alibata) is a Brahmic writing system used for several Philippine languages before and early into the Spanish conquest. It is related to other Brahmic scripts currently in use in the Philippines. It is supported by the following fonts:

  • Kurinto Font Folio (9 typefaces that have "Aux" variant fonts)
  • Noto Sans Tagalog, a font made by Google
  • Paul Morrow's Baybayin Fonts. Offers the most extensive list of Baybayin fonts for Windows and Macintosh operating systems
  • Quivira is a proportional serif font that produces very readable text. Supports several scripts, among them the Baybayin script
Correct rendering
Your browser/device ᜀᜅ᜔ ᜊᜏᜆ᜔ ᜆᜂ ᜀᜌ᜔ ᜁᜐᜒᜈᜒᜎᜅ᜔ ᜈ ᜋᜌ᜔ ᜃᜇᜉᜆᜈ᜔,
ᜀᜆ᜔ ᜉᜈ᜔ᜆᜌ᜔ ᜐ ᜇᜒᜄ᜔ᜈᜒᜇᜇ᜔,
ᜀᜆ᜔ ᜃᜇᜉᜆᜈ᜔ ᜀᜅ᜔ ᜆᜂ ᜀᜌ᜔ ᜊᜒᜈᜒᜌᜌᜀᜈ᜔ ᜅ᜔ ᜉᜄᜒᜁᜐᜒᜉ᜔,
ᜀᜆ᜔ ᜃᜇᜓᜈᜓᜅᜈ᜔ ᜈ ᜃᜁᜎᜅᜅ᜔ ᜋᜄ᜔ᜃᜁᜐ ᜐ ᜃᜉᜆᜒᜇᜈ᜔
Transliteration Ang bawat tao ay isinilang na may karapatan, at pantay sa dignidad, at karapatan ang tao ay biniyayaan ng pag-iisip, at karapatan na kailangang magkaisa sa kapatiran.

Bhaiksuki

The Bhaiksuki script was historically used to write Buddhist literature in Sanskrit. It is supported by the following font:

Correct rendering Your browser/device
𑰥𑰹𑰎𑰿𑰬𑰲𑰎𑰱

Brahmi

The Brahmi script is one of the oldest writing systems of ancient India, used across what is now South and Central Asia from the 1st millennium BCE. It is supported by the following fonts:

Correct rendering Your browser/device
𑀤𑁂𑀯𑀸𑀦𑀧𑀺𑀬𑁂𑀦

Note: The Brahmi script should not be confused with the family of Brahmic scripts.

Buhid

The Buhid script is used to write the Buhid language. It is supported to varying extents by the following fonts:

  • Kurinto Font Folio (11 typefaces that have "Main" variant fonts)
  • Noto Sans Buhid, a font made by Google
  • Quivira NOT RECOMMENDED FOR BUHID: It contains basic Buhid letters but not the ligatures required to correctly render many Buhid syllables
  • Code2000 NOT RECOMMENDED FOR BUHID: It contains basic Buhid letters but not the ligatures required to correctly render many Buhid syllables
Correct rendering Your browser/device Sample syllables
ᝃᝒᝎᝒᝐᝓᝈᝓᝆ kilisunuta

Burmese

The Burmese alphabet is used to write the Burmese language. The script is encoded in block "Myanmar", code points 1000-109F (Unicode.org chart). It is supported by the following fonts:

Correct rendering Your browser/device
ဃ + ြ → ဃြ

Canadian Aboriginal Syllabics

Canadian Aboriginal syllabics are an abugida used to write a number of First Nations languages in Canada, including Cree, Ojibwe, Naskapi, Inuktitut, Blackfoot, Sayisi, and Carrier. They are supported by the following fonts:

Correct rendering Your browser/device
ᓀᐦᐃᔭᐍᐏᐣ

Note: As of August 2018, this script is not being used on the Atikamekw Wikipedia, or on the Ojibwe and Blackfoot test wikis at the Wikimedia Incubator.

Chakma

The Chakma script is used to write the Chakma language, and recently for the Pali language.

Correct rendering Your browser/device
𑄌𑄋𑄴𑄟𑄳𑄦 𑄃𑄧𑄏𑄛𑄖𑄴

Cham

The Cham alphabet is used to write the Cham language. It is supported by the following fonts:

Correct rendering Your browser/device

Note: As of August 2018, this script is not being used on the Eastern Cham and Western Cham test wikis at the Wikimedia Incubator.

Caucasian Albanian

The Caucasian Albanian script was an alphabetic writing system used by the Caucasian Albanians, one of the ancient Northeast Caucasian peoples whose territory comprised parts of present-day Azerbaijan and Dagestan. It is supported by the following fonts:

Correct rendering Your browser/device
𐔰

Cherokee

The Cherokee syllabary, used to write the Cherokee language, is supported by the following fonts:

Lowercase Cherokee letters were added in Unicode version 8.0 in June 2015. Font support for lowercase Cherokee is not yet widespread. The fonts that do support lowercase are:

Cherokee uppercase letters:

Correct rendering Your browser/device
ᎠᏂᏴᏫᏯ

Cherokee lowercase letters:

Correct rendering Your browser/device
Ꮳꮃꭹ Ꭶꮼꮒꭿꮝꮧ

Coptic

The Coptic alphabet is used to write the Coptic language, which was used in Egypt before Arabic. It is currently used solely as a liturgical language, and is supported by the following fonts:

Correct rendering Your browser/device
ⲙⲛⲧⲣⲙⲛⲕⲏⲙⲉ

Cuneiform

The cuneiform script was primarily used to write Akkadian (including Assyrian and Babylonian) and Sumerian. It is supported by the following fonts:

Correct rendering Your browser/device
𒅎𒀝𒂵𒌈

Deseret

The Deseret alphabet is an alternative alphabet for writing the English language. It is supported by the following fonts:

Correct rendering Your browser/device
𐐔𐐯𐑅𐐨𐑉𐐯𐐻 𐐈𐑊𐑁𐐰𐐺𐐯𐐻

Dives Akuru

Dives Akuru is a script that was historically used to write the Maldivian language. It is supported by the following font:

Correct rendering Your browser/device
𑤞𑤱𑤩𑤵𑤭𑤱 𑤀𑤌𑤳𑤧𑤳

Duployan Shorthand

The Duployan shorthand, or Duployan stenography (French: Sténographie Duployé), was created by Father Émile Duployé in 1860 for writing French. Historically, it was used for writing the Chinook Jargon language. It is supported by the following font:

Correct rendering Your browser/device
𛰚

East Asian

Script Correct rendering Your browser/device
Traditional Chinese 人人生來自由,
在尊嚴和權利上一律平等。
他們有理性和良心,
請以手足關係的精神相對待。
Simplified Chinese 人人生来自由,
在尊严和权利上一律平等。
他们有理性和良心,
请以手足关系的精神相对待。
Japanese すべての人間は、生まれながらにして自由であり、
かつ、尊厳と権利と について平等である。
人間は、理性と良心とを授けられており、
互いに同胞の精神をもって行動しなければならない。
Korean 모든 인간은 태어날 때부터
자유로우며 그 존엄과 권리에
있어 동등하다. 인간은 천부적으로
이성과 양심을 부여받았으며 서로
형제애의 정신으로 행동하여야 한다.

Several Wikipedias use these scripts, including Chinese, Classical Chinese, Cantonese (Yue), Gan, Japanese, and Korean. They are not used (widely) in the Min Nan, Zhuang, or Vietnamese Wikipedias, even though the scripts are sometimes used in those languages, as well.

Hentaigana

Hentaigana are obsolete or nonstandard hiragana used occasionally on signage in Japan. They are supported by the following fonts:

Correct rendering Your browser/device
𛂛

Egyptian hieroglyphs

Egyptian hieroglyphs are supported by the following fonts:

Glyph stacking and formatting are accomplished via Egyptian Hieroglyph Format Controls, which were added in version 12 of the Unicode standard in March 2019. However, the fonts above do not yet support this feature.

Correct rendering Your browser/device
it
n
ra
G25x
n
𓇋𓏏𓐰𓈖𓐰𓇳𓅜𓐍𓐰𓈖

See also Help:WikiHiero syntax.

Elbasan

The Elbasan script is a mid-18th-century alphabetic script used for the Albanian language. It is supported by the following fonts:

Correct rendering Your browser/device
𐔀

Ethiopic

The Ethiopic syllabary is used in central east Africa for Amharic, Bilen, Tigre, Tigrinya, and other languages. It evolved from the script for classical Ge'ez, which is now strictly a liturgical language. It is supported by the following fonts:

Correct rendering Your browser/device
ኢትዮጵያ

Note: As of August 2018, this script is not being used on the Oromo Wikipedia.

Gothic

The Gothic alphabet, which is used to write the Gothic language, is supported by the following fonts:

Correct rendering Your browser/device
𐌲𐌿𐍄𐌹𐍃𐌺

Grantha

The Grantha script, used in Tamil Nadu and Kerala to write Sanskrit, is supported by the following fonts:

Correct rendering Your browser/device
𑌗𑍍𑌰𑌨𑍍𑌥

Gunjala Gondi

The Gunjala Gondi script is used to write the Gondi language. It is supported by the following font:

Correct rendering Your browser/device
𑵶𑶍𑶕𑶀𑵵𑶊 𑵶𑶓𑶕𑶂𑶋
𑵵𑶋𑶅𑶋

Hanunó'o

Hanunó'o script is used to write the Hanunó'o language. It is supported to varying extents by the following fonts:

  • GNU FreeFont
  • Kurinto Font Folio (11 typefaces that have "Main" variant fonts)
  • Noto Sans Hanunoo, a font made by Google
  • Quivira NOT RECOMMENDED FOR HANUNÓ'O: It contains basic Hanunó'o letters but not the ligatures required to correctly render many Hanunó'o syllables.

After downloading and installing one or more of the fonts above, reload this page as a check. For example, the GNU FreeSans font might not render the characters in the following table correctly on your device and browser, whilst the Noto Sans Hanunoo font might.

Correct rendering Your browser/device Sample syllables
ᜥᜥᜲᜥᜳ nga ngi ngu

Imperial Aramaic

The ancient Aramaic alphabet was adapted by Arameans from the Phoenician alphabet and became a distinct script by the 8th century BC. It is supported by the following fonts:

Correct rendering Your browser/device
𐡀

Indic

The following table compares how a correctly enabled computer would render the following scripts with how your computer renders them:

Script Correct rendering Your browser/device Help page
Bengali–Assamese ক + িকি Wikipedia:Bangla script display help
Devanāgarī क + िकि Template:Devfonthelp
Gujarati ક + િકિ
Gurmukhī ਕ + ਿਕਿ
Kannada ಕ + ಿಕಿ
Malayalam ക + െകെ
Odia କ + େକେ
Sinhala ඵ + ේඵේ
Tibetan ར + ྐ + ྱརྐྱ
Tamil க + ேகே
Telugu య + ీయీ

These scripts are used in a great many Wikipedias, including the ones for Assamese, Bengali, Bhojpuri, Bishnupriya Manipuri, Central Tibetan, Dzongkha, Gujarati, Kannada, Kashmiri, Goan Konkani, Maithili, Malayalam, Marathi, Nepali, Newar, Odia, Pali, Eastern Punjabi, Sanskrit, Sinhalese, Tamil, Telugu, and Tulu.

They are also used in the Wikimedia Incubator test wikis for Angika, Awadhi, Badaga, Bodo, Chhattisgarhi, Haryanvi, Kanikkaran, Kutchi, Rajasthani, Saurashtra, and Tamang.

Inscriptional Parthian

Inscriptional Parthian was used for writing the Parthian language. It is supported by the following fonts:

Correct rendering Your browser/device
𐭀𐭅𐭎 𐭔𐭅𐭂𐭅𐭍 𐭋𐭍

Javanese

The Javanese script is used to write the Javanese language. It is supported by Unicode 5.2 and above. The script is a so-called SIL Graphite script and is best supported by Firefox. More recently, however, it can also be rendered using OpenType and TrueType fonts, provided the right font is used. It is supported by the following fonts:

Correct rendering
Your browser/device ꧋ꦱꦸꦒꦼꦁꦫꦮꦸꦃꦮꦺꦤ꧀ꦠꦼꦤ꧀ꦲꦶꦁꦮꦶꦏꦶꦥꦺꦝꦶꦪꦃꦗꦮꦶ꧉
Transliteration Sugeng Rawuh Wènten ing Wikipédia Jawi

Kaithi

Kaithi, also called "Kayathi" or "Kayasthi", is a historical script used widely in parts of North India. It is supported by the following fonts:

Correct rendering Your browser/device
𑂍𑂶𑂟𑂲

Kaktovik numerals

The Kaktovik numerals are a base-20 system of numerical digits created by Alaskan Iñupiat. They are supported by the following fonts:

Correct rendering Your browser/device
𝋄𝋈𝋌

Kawi

The Kawi script was used primarily in Java and across much of Maritime Southeast Asia between the 8th century and the 16th century.

Correct rendering Your browser/device
𑼒𑼮𑼶

Kharosthi

Kharosthi, also spelled Kharoshthi or Kharoṣṭhī, is an ancient script used in ancient Gandhara and ancient India. It is supported by the following fonts:

  • Noto Sans Kharosthi NOT RECOMMENDED FOR KHAROSTHI: Although it is a font made by Google, it does not render many necessary conjuncts, whereas Segoe UI does. It also misplaces vowel marks.
  • Segoe UI Historic (Microsoft Windows font, available in Windows 10 and later)
Correct rendering Your browser/device
𐨤𐨪𐨌𐨪𐨿𐨗𐨸𐨅𐨌𐨏

Khudabadi

Khudabadi, also spelled Khudawadi, or Sindhi, is a script used to write the Sindhi language. It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
𑋝𑋡𑋟𑋟𑋐𑋢 Sindhi

Note: As of August 2018, this script is not being used on the Sindhi Wikipedia.

Klingon

The Klingon script is used to write the Klingon language, an artistic language of the Star Trek franchise. The script is not encoded in Unicode but a range of code points defined in the ConScript Unicode Registry (CSUR) is in common use. The following fonts support these CSUR code points:


Correct rendering Your browser/device


Lanna

The Tai Tham script, also known as the Lanna script, is used to write the Northern Thai language, the Pali language and others. It is supported by the following fonts:

Correct rendering Your browser/device
ᨲ᩠ᩅᩫᨵᩢᨾ᩠ᨾ᩼

Lepcha

The Lepcha script is used to write Lepcha, a language spoken by 66,500 people in northern Nepal. It is supported by the following fonts:

Correct rendering Your browser/device
ᰛᰩᰵᰛᰧᰵᰶ

Limbu

The Limbu alphabet, used to write the Limbu language, is supported by the following fonts:

Correct rendering Your browser/device
ᤕᤠᤰᤌᤢᤱ

Linear A

The undeciphered Linear A script was used in ancient Greece. It is supported by the following fonts:

Correct rendering Your browser/device
𐘀  𐘏  𐘞  𐘮  𐘽  𐙌

Linear B

The Linear B script was used for writing Mycenaean Greek, the earliest attested form of the Greek language. It is supported by the following fonts:

Correct rendering Your browser/device
𐁂𐀐𐀷

Lisu (Fraser alphabet)

The Fraser alphabet is used only to write the Lisu language. It is supported by the following fonts:

  • DejaVu
  • Miao Unicode
  • Kurinto Font Folio (11 typefaces that have "Main" variant fonts)
  • Noto Sans Lisu, a font made by Google.
  • Segoe UI (Microsoft Windows font, available in Windows 7 and later, but only supports Lisu since Windows 8)
  • TH-Times (completely supports up to Unicode 15.1). The letters are designed as a serif style.
Correct rendering Your browser/device
ꓛꓬꓹ ꓡꓯꓺ ꓡꓯꓺ

Lontara

The Lontara script is used to write Buginese, Makassarese, and Mandar. The script is encoded in block "Buginese", code points 1A00–1A1F (Unicode.org chart). It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
ᨅᨔ ᨕᨘᨁᨗ Basa Ugi

Makasar

The Makasar script, also known as Ukiri' Jangang-jangang (bird's script) or Old Makasar script, is a historical Indonesian writing system that was used in South Sulawesi to write the Makassarese language between the 17th and 19th centuries until it was supplanted by the Lontara Bugis script. It is supported by the following font:

Noto Serif Makasar, a font made by Google

Correct rendering Your browser/device Transliteration
𑻪𑻢𑻪𑻢 Jangang-jangang

Mandaic

The Mandaic alphabet, used to write the Mandaic language and Neo-Mandaic, is supported by the following fonts:

Correct rendering Your browser/device
ࡀࡁࡀࡂࡀ

Marchen

The Marchen script, used to write the Zhang-Zhung language, is supported by the following fonts:

Correct rendering Your browser/device
𑲁𑲠𑱹𑲚

Masaram Gondi

Masaram Gondi is a Brahmi-based script devised by Munshi Mangal Singh Masaram in 1918. It is supported by the following font:

Correct rendering Your browser/device
𑴤𑴫𑴦𑴱𑴤 𑴎𑴽𑵀𑴘𑴳

Meitei

The Meitei script, used to write the Meetei language, is supported by the following fonts:

Correct rendering Your browser/device
ꯃꯩꯇꯩ ꯂꯣꯟ

Modi

The Modi script, used to write the Marathi and Sanskrit languages, is supported by the following font:

Correct rendering Your browser/device
𑘀

Mongolian

The Mongolian script is occasionally used to write the Mongolian language on the Internet, though Cyrillic is more common. It is also used to write the Manchu language and Xibe language. It is written from top to bottom in columns ordered from left to right. It is supported by the following fonts:

Correct rendering Your browser/device
ᠮᠣᠩᠭᠣᠯ ᠪᠢᠴᠢᠭ᠌

Note: As of August 2018, this script is not generally used on the Mongolian Wikipedia, which mostly uses Cyrillic.

Nag Mundari

Mundari Bani, also known as Nag Mundari, is a writing system used for the Mundari language, a Munda language spoken in eastern India. It is supported by the following fonts:

Correct rendering Your browser/device
𞓧𞓟𞓨𞓜𞓕𞓣𞓚

Newa

The Pracalit script is a native Nepalese writing system. It is supported by the following font:

Correct rendering Your browser/device
𑐥𑑂𑐬𑐔𑐮𑐶𑐟 𑐣𑐾𑐥𑐵𑐮

New Tai Lue

New Tai Lue script, also known as Simplified Tai Lue, is used to write the Tai Lue language (Tai Lü). It is supported by the following fonts:

Correct rendering Your browser/device
ᦟᦲᧅᦷᦎᦺᦑᦟᦹᧉ

Nüshu

Nüshu is a syllabic script derived from Chinese characters that was used exclusively among women in Jiangyong County in Hunan province of southern China. It is supported by the following fonts:

Correct rendering Your browser/device
𛆁𛈬 𛆁𛈬

Note: In this image, the Nüshu characters are written right-to-left.

Nyiakeng Puachue Hmong

Nyiakeng Puachue Hmong is an alphabetic script devised for White Hmong and Green Hmong in the 1980s by Reverend Chervang Kong for use within his United Christians Liberty Evangelical Church. It is supported by the following fonts:

Correct rendering Your browser/device
𞄀𞄩𞄰𞄁𞄦𞄱𞄂𞄤𞄳𞄬𞄃𞄥𞄳

Ogham

The Ogham alphabet was used to write the Old Irish language from the 1st to 9th century AD. It is supported by the following fonts:

Correct rendering Your browser/device
᚛ᚓᚅᚐᚁᚐᚏᚏ᚜

Ol Chiki

The Ol Chiki script was created in 1925 by Raghunath Murmu for the Santali language. It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
ᱚᱞ ᱪᱤᱠᱤ Ol Chiki

Old Hungarian (Hungarian Runes)

The Old Hungarian script is an historic script used to write the Hungarian language. It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
𐲥𐲋𐲓𐲉𐲗-𐲘𐲀𐲎𐲀𐲢 𐲢𐲛𐲮𐲀𐲤 SZÉKELY-MAGYAR ROVÁS

Old Permic

The Old Permic script was used to write the medieval Komi language. It is supported by the following font:

Correct rendering Your browser/device
𐍑

Old Persian cuneiform

The Old Persian cuneiform script was used to write the Old Persian language. The script is encoded in block "Old Persian", code points 103A0–103DF (Unicode.org chart). It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
𐎣𐎲𐎢𐎪𐎡𐎹 Kambujiya (Cambyses II)

Osage

The Osage alphabet is used to write Osage, a Native American language spoken in Oklahoma. It is supported by the following fonts:

Correct rendering Your browser/device
𐓏𐒰.𐓓𐒰.𐓓𐒷 𐒻.𐒷

Pahawh Hmong

The Pahawh Hmong alphabet is a semi-syllabary invented in 1959 by Shong Lue Yang to write the Hmong language (White Hmong and Green Hmong). The script is encoded in block "Pahawh Hmong", code points 16B00–16B8F. It is supported by the following fonts:

Correct rendering Your browser/device
𖬌𖬣𖬵 𖬓𖬤 𖬇𖬰𖬧𖬵 𖬀𖬶 𖬖𖬲𖬝 𖬁𖬲𖬬 𖬒𖬰𖬮𖬵 𖬖𖬲𖬤𖬵 𖬇𖬰𖬮𖬰 𖬆𖬞.

Phaistos Disc

The Phaistos disc is an artifact discovered on the island of Crete which contains as-yet undeciphered symbols. These symbols are supported by the following fonts:


Correct rendering Your browser/device
𐇑𐇛𐇪𐇝𐇯𐇡𐇪

Psalter Pahlavi

Psalter Pahlavi was used for writing Middle Persian on paper. It is supported by the following fonts:

Correct rendering Your browser/device
𐮁𐮃𐮉 𐮆𐮈 𐮌𐮐𐮈𐮈𐮋𐮈 𐮁𐮅𐮅𐮏𐮊𐮈 𐮁𐮅𐮄 𐮆𐮈 𐮌𐮈𐮐𐮈𐮃𐮏
𐮋𐮀𐮊𐮈𐮃𐮈 𐮆𐮈 𐮂𐮌𐮀𐮊𐮈 𐮆𐮈 𐮋𐮌 𐮉𐮌𐮈𐮐𐮈 𐮆𐮈 𐮇𐮊𐮈𐮃𐮈 𐮋𐮌𐮅
𐮎𐮅𐮌 𐮀𐮐𐮋𐮀𐮌𐮏 𐮊𐮀 𐮫 𐮀𐮎𐮅𐮈𐮃𐮂𐮊 𐮎𐮅𐮌
𐮅𐮊 𐮉𐮌𐮐𐮈𐮈 𐮆𐮈𐮋 𐮇𐮅 𐮀𐮋𐮅𐮉

Note: As of August 2018, this script is not being used on the Middle Persian test wiki at the Wikimedia Incubator.

Rohingya

The Rohingya alphabet, used to write the Rohingya language, is supported by the following fonts:

Correct rendering Your browser/device
𐴌𐴟𐴇𐴥𐴝𐴚𐴒𐴙𐴝

Runes

Runes are supported by the following fonts:

Script Correct rendering Your browser/device
Elder Futhark (2nd to 8th centuries) ᚠᚢᚦᚨᚱᚲ
Anglo-Saxon runes (5th to 11th centuries) ᚠᚢᚦᚩᚱᚳ
Medieval runes (12th to 15th centuries) ᚠᚢᚧᛆᚱᚴ

Sharada

The Sharada script is a Brahmic script that is almost extinct. It is used (rarely) to write the Kashmiri language and Sanskrit. It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
𑆑𑆾𑆯𑆶𑆫 Koshur

Note: As of August 2018, this script is not being used on the Kashmiri or Sanskrit Wikipedia.

Shavian

The Shavian alphabet is an alternative phonemic alphabet for the English language. It is supported by the following fonts:

Correct rendering Your browser/device
𐑖𐑱𐑝𐑾𐑯 𐑨𐑤𐑓𐑩𐑚𐑧𐑑

Siddham

The Siddham script is used to write the Sanskrit language. It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
𑖌𑖼𑖦𑖜𑖰𑖢𑖟𑖿𑖦𑖸𑖮𑗝𑖽 Om Mani Padme Hum

Sogdian

The Sogdian alphabet and the Old Sogdian alphabet were used to write the Sogdian language of Central Asia. They are supported by the following fonts:

Correct rendering Your browser/device
𐽓

Sora Sompeng

The Sora Sompeng alphabet is a Brahmic script. It is used to write the Sora language, a Munda language spoken by about 300,000 people. It is supported by the following fonts:

Correct rendering Your browser/device
𑃐

Sundanese

The Sundanese script is used to write the Sundanese language. The script is encoded in block "Sundanese", code points 1B80–1BBF (Unicode.org chart). It is supported by the following fonts:


Correct rendering Your browser/device Transliteration
ᮜᮓᮢᮀ
ᮃᮚ ᮠᮤᮏᮤ ᮛᮥᮕ ᮞᮒᮧ ᮜᮩᮒᮤᮊ᮪,
ᮆᮀᮊᮀ-ᮆᮀᮊᮀ, ᮆᮀᮊᮀ-ᮆᮀᮊᮀ,
ᮞᮧᮊ᮪ ᮜᮥᮜᮥᮙ᮪ᮎᮒᮔ᮪ ᮓᮤ ᮎᮄ,
ᮃᮛᮤ ᮘᮍᮥᮔ᮪ ᮃᮛᮦᮊ᮪ ᮞᮛᮥᮕ ᮏᮀ
ᮜᮔ᮪ᮎᮂ.
Ladrang Aya hiji rupa sato leutik,
Éngkang-éngkang, éngkang-éngkang,
Sok lulumcatan di cai,
Ari bangun arék sarupa jang lancah.

Sutton SignWriting

Sutton SignWriting is used to write any sign language. It is supported with the SignWriting 2010 Typeface which includes two TrueType fonts:

It is also supported by the Google Noto fonts (not thoroughly tested).

Correct rendering Your browser/device
𝧪𝪞𝪨 𝠀𝪛𝪩 𝠀𝪛𝪡 𝧪𝪤

Sylheti Nagari

Sylheti Nagari (Silôṭi Nagri) is an endangered script used for writing the Sylheti language. It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
ꠍꠤꠟꠐꠤ Silôṭi

Syriac / Aramaic script

The Syriac and Aramaic scripts are used to write the Syriac and Aramaic languages. As with most Semitic scripts, these scripts flow from right to left, which can cause letters to appear in the wrong order on some left-to-right systems. The template {{lang}} can fix this issue.[citation needed]
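
To find out whether a piece of text contains right-to-left characters at all, one option is to look at each character's Unicode bidirectional class. The Python sketch below is a minimal illustration that assumes the classes `R` and `AL` reported by the standard `unicodedata` module are a sufficient test. The helper names `is_rtl` and `has_rtl` are hypothetical, and this is not a description of how the {{lang}} template works.

```python
import unicodedata

# Bidirectional classes used for strongly right-to-left characters:
# "R" (right-to-left) and "AL" (right-to-left Arabic-type letter).
RTL_CLASSES = {"R", "AL"}


def is_rtl(char: str) -> bool:
    """Return True if a single character has a strong right-to-left class."""
    return unicodedata.bidirectional(char) in RTL_CLASSES


def has_rtl(text: str) -> bool:
    """Return True if any character in the text is strongly right-to-left."""
    return any(is_rtl(char) for char in text)


syriac_sample = "\u0712\u072A"  # two Syriac letters, written as escapes
print(has_rtl(syriac_sample))
print(has_rtl("Wikipedia"))
```

Text for which `has_rtl` is true is a candidate for explicit direction markup when it is embedded in a left-to-right page.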

Most operating systems provide support for Syriac scripts natively, but only the Maḏnḥāyā (ܡܕܢܚܝܐ‎) and ʾEsṭrangēlā (ܐܣܛܪܢܓܠܐ‎) varieties have correct rendering.[c] In order to render the Serṭā (ܣܪܛܐ‎) variety, additional fonts are needed. They are supported by the following fonts:

Script Correct rendering Your browser/device
Maḏnḥāyā (Eastern) ܒܪܹܝܼܫܝܼܬ݀ ܐܝܼܬ݂ܲܘܗ݇ܝ ܗ݇ܘܵܐ ܡܹܠܬܵ݀ܐ.
Serṭā (Western) ܒ݁ܪܺܝܫܺܝܬܼ ܐܻܝܬܼܰܘܗ̱ܝ ܗ̱ܘܳܐ ܡܶܠܬܼܳܐ.
ʾEsṭrangēlā ܒܪܝܫܝܬ ܐܝܬܗܘܝ ܗܘܐ ܡܠܬܐ.

Tai Le

The Tai Le alphabet is used for the Tai Nuea language (Tai Nüa). It is supported by the following fonts:

Correct rendering Your browser/device Transliteration
ᥖᥭᥰᥘᥫᥴ Tai Le ([tai˦.lə˧˥])

Tai Viet

Tai Viet script is used for writing the Tai languages Tai Dam, Tai Dón, and Thai Song. It is supported by the following fonts:

Correct rendering Your browser/device
ꪼꪕꪒꪾ

Tangsa

The Tangsa alphabet is used to write the Tangsa language, spoken by the Tangsa people of Myanmar and North-Eastern India. It is supported by the following font:

Correct rendering Your browser/device
𖪢𖩼𖪭𖩽

Tangut

The Tangut script was used to write the Tangut language, a Tibeto-Burman language once spoken in the Western Xia, also known as the Tangut Empire. It is supported by the following fonts:

Correct rendering Your browser/device
𗈁𗤻𗖰𗚩

Tifinagh script

The Tifinagh alphabet is used to write the Berber languages. IRCAM (Institut Royal de la Culture Amazighe) offers for download a software suite, developed for Windows XP, that contains a Tifinagh keyboard and a font. The script is supported by the following fonts:

Correct rendering Your browser/device Transliteration
ⵜⵉⴼⵉⵏⴰⵖ tifinagh

This script is used in several test wikis at the Wikimedia Incubator, including Central Atlas Tamazight, Tachelhit (Tasusiyt, Shilha), Riffian, and Shawiya.

Tirhuta script

The Tirhuta script is used for the Maithili and Sanskrit languages. It is supported by the following font:

Correct rendering Your browser/device
𑒞𑒱𑒩𑒯𑒳𑒞𑒰

Toto script

The Toto script was invented by Dhaniram Toto in 2015 to write the Toto language. It is supported by the following fonts:

Correct rendering Your browser/device
𞊒𞊪𞊒𞊪

Wancho

The Wancho script is a writing system for the Wancho language. It is supported by the following font:

Correct rendering Your browser/device
𞋒𞋀𞋉𞋃𞋕

Warang Citi

The Warang Citi script is a writing system for the Ho language. It is supported by the following font:

Correct rendering Your browser/device
𑢹𑢷𑢡𑢼𑢪
𑢯𑢢𑢵𑢢

Yezidi script

The Yezidi script was used for writing Kurdish, specifically the Kurmanji dialect (Northern Kurdish), for liturgical purposes in Iraq and Georgia. It is supported by the following font:

Correct rendering Your browser/device
𐺊𐺀𐺕𐺣𐺣𐺢𐺀 𐺙𐺦𐺊𐺍𐺀

Yi Syllabary

Modern Yi script is a standardized syllabary derived from the classic script in 1974 by the local Chinese government. It is used to write various Yi languages. It is supported by the following fonts:

Correct rendering Your browser/device
ꆈꌠꁱꂷ

Special cases

Romanian

The Romanian alphabet contains an S-comma (Ș ș) and T-comma (Ț ț). These characters were added to Unicode 3.0 (September 1999) at the request of the Romanian standardization institute. As font support for these characters has been poor in the past, many computer users use the similar characters S-cedilla (Ş ş) and T-cedilla (Ţ ţ) instead. However, on Wikipedia it is recommended to use the correct characters with comma below.
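
The substitution described above can be undone mechanically. The following Python sketch is one possible approach and not an official Wikipedia tool; the names `CEDILLA_TO_COMMA` and `fix_romanian` are made up for this example. It maps the four cedilla characters to their comma-below counterparts.

```python
# Map the look-alike cedilla letters to the comma-below letters.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # S-cedilla (capital) -> S-comma (capital)
    "\u015F": "\u0219",  # s-cedilla (small)   -> s-comma (small)
    "\u0162": "\u021A",  # T-cedilla (capital) -> T-comma (capital)
    "\u0163": "\u021B",  # t-cedilla (small)   -> t-comma (small)
})


def fix_romanian(text: str) -> str:
    """Replace cedilla forms with comma-below forms in Romanian text."""
    return text.translate(CEDILLA_TO_COMMA)


# A Romanian word typed with cedilla characters, written as escapes.
typed_with_cedillas = "\u015Ftiin\u0163\u0103"
print(fix_romanian(typed_with_cedillas))
```

Such a conversion should only be applied to text known to be Romanian, because S-cedilla is the correct letter in other languages such as Turkish.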

References

Notes

  a. Until June 2005, when MediaWiki 1.5 came into use on the Wikimedia projects, articles on the English Wikipedia were encoded using ISO/IEC 8859-1 (although the additional characters from the Windows-1252 character set were used in practice). All characters from the ISO/IEC 10646 Universal Character Set could be accessed through numerical entities, as specified by the HTML 4.01 specification. Since then, nearly all pages have been converted to use Unicode directly. Old discussion on the topic can be read at Wikipedia talk:Unicode.
  b. Not to be confused with MS Sans Serif
  c. Microsoft Windows supports the ʾEsṭrangēlā variety via Estrangelo Edessa and Segoe UI. Historically, some Linux distributions supported the Maḏnḥāyā variety via FreeSans.