[Glyph image for U+062B; source: Noto Sans Arabic]

U+062B Arabic Letter Theh

U+062B was added to Unicode in version 1.1 (1993). It belongs to the block U+0600 to U+06FF Arabic in the U+0000 to U+FFFF Basic Multilingual Plane.
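Block and plane membership follow from simple range checks on the code point value. A minimal sketch, assuming Python; the variable names are illustrative only:

```python
cp = 0x062B  # code point of Arabic Letter Theh

# The Arabic block spans U+0600 to U+06FF.
in_arabic_block = 0x0600 <= cp <= 0x06FF

# The plane number is the code point shifted right by 16 bits;
# plane 0 is the Basic Multilingual Plane (U+0000 to U+FFFF).
plane = cp >> 16

print(in_arabic_block, plane)
```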

This character is an Other Letter and is mainly used in the Arabic script.

The glyph is not a composition. It has a Neutral East Asian Width. In bidirectional context it acts as an Arabic Letter and is not mirrored. Under some circumstances the glyph can be confused with 1 other glyph. In text, U+062B behaves as Alphabetic regarding line breaks. It has type Other Letter for sentence breaks and Alphabetic Letter for word breaks. The Grapheme Cluster Break is Any.
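Several of these properties can be checked with Python's standard `unicodedata` module. This is a sketch rather than a complete property lookup: the `props` dictionary and its keys are illustrative names, and the line, word, sentence, and grapheme break properties are not exposed by this module.

```python
import unicodedata

ch = "\u062B"  # Arabic Letter Theh

props = {
    "name": unicodedata.name(ch),
    "general_category": unicodedata.category(ch),         # "Lo" = Other Letter
    "bidi_class": unicodedata.bidirectional(ch),          # "AL" = Arabic Letter
    "mirrored": unicodedata.mirrored(ch),                 # 0 = not mirrored
    "east_asian_width": unicodedata.east_asian_width(ch), # "N" = Neutral
    "decomposition": unicodedata.decomposition(ch),       # empty = no decomposition
}

for key, value in props.items():
    print(f"{key}: {value!r}")
```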

Wikipedia has the following information about this codepoint:

Ṯāʾ (ث) is one of the six letters the Arabic alphabet added to the twenty-two from the Phoenician alphabet (the others being ḫāʾ, ḏāl, ḍād, ẓāʾ, ġayn). In Modern Standard Arabic it represents the voiceless dental fricative [θ], also found in English as the "th" in words such as "thank" and "thin". In Persian, Urdu, and Kurdish it is pronounced [s], as in English "sister".

In name and shape, it is a variant of tāʾ (ت). Its numerical value is 500 (see Abjad numerals).

The Arabic letter ث is named ثَاءْ ṯāʾ. It is written in several ways depending on its position in the word (isolated, initial, medial, or final).
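In normal text the positional shapes are chosen by the rendering engine from U+062B and its neighbours. Unicode also encodes them as compatibility characters in the Arabic Presentation Forms-B block (U+FE99 to U+FE9C, code points not listed on this page). The sketch below, with illustrative names, shows that each of them normalizes back to U+062B under NFKC:

```python
import unicodedata

BASE = "\u062B"  # Arabic Letter Theh

# Compatibility presentation forms of Theh (Arabic Presentation Forms-B).
PRESENTATION_FORMS = ["\uFE99", "\uFE9A", "\uFE9B", "\uFE9C"]

# Map each form's Unicode name to whether NFKC folds it back to U+062B.
forms = {}
for form in PRESENTATION_FORMS:
    forms[unicodedata.name(form)] = unicodedata.normalize("NFKC", form) == BASE

for name, folds_to_base in forms.items():
    print(name, folds_to_base)
```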

In contemporary spoken Arabic, ṯāʾ is pronounced [θ] in the dialects of the Arabian Peninsula, Iraq, and Tunisia, among others, and in highly educated pronunciations of Modern Standard and Classical Arabic. Pronunciation of the letter varies between and within the varieties of Arabic. It is consistently pronounced as the voiceless dental plosive [t] in Maghrebi Arabic (except Tunisian and eastern Libyan). In the Arabic varieties of the Mashriq (in the broad sense, including Egyptian, Sudanese, and Levantine) and in Hejazi Arabic, it can be pronounced either as [t] or as the sibilant voiceless alveolar fricative [s], depending on the word in question; words pronounced with [s] are generally more technical or "sophisticated". Regardless of these regional differences, the pattern of the speaker's own variety of Arabic frequently intrudes into otherwise Modern Standard speech. This is widely accepted, and it is the norm when speaking the mesolect known alternately as lugha wusṭā ("middling/compromise language") or ʿAmmiyyat/Dārijat al-Muṯaqqafīn ("Educated/Cultured Colloquial"), used in the informal speech of educated Arabs from different countries.

When representing this sound in transliteration of Arabic into Hebrew, it is written as ת׳.


System Representation
UTF-16 06 2B
UTF-32 00 00 06 2B
URL-Quoted %D8%AB
HTML hex reference &#x062B;
Wrong windows-1252 Mojibake Ø«
Encoding: ISO-8859-6 (hex bytes) CB
Encoding: WINDOWS-1256 (hex bytes) CB
Adobe Glyph List afii57419
Adobe Glyph List theharabic
digraph tk
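
The byte-level rows of this table can be reproduced with Python's built-in codecs. This is a sketch using illustrative variable names; the UTF-8 bytes D8 AB are not listed as a row above but are implied by the URL-quoted form, and the mojibake row corresponds to decoding those UTF-8 bytes as windows-1252.

```python
from urllib.parse import quote

ch = "\u062B"  # Arabic Letter Theh

utf8 = ch.encode("utf-8")           # bytes D8 AB
utf16 = ch.encode("utf-16-be")      # bytes 06 2B (big-endian, no BOM)
utf32 = ch.encode("utf-32-be")      # bytes 00 00 06 2B
url_quoted = quote(ch)              # percent-encoded UTF-8
html_hex = f"&#x{ord(ch):04X};"     # HTML hexadecimal character reference
mojibake = utf8.decode("cp1252")    # UTF-8 bytes misread as windows-1252
iso_8859_6 = ch.encode("iso-8859-6")
windows_1256 = ch.encode("cp1256")

print(utf8.hex(" "), utf16.hex(" "), utf32.hex(" "))
print(url_quoted, html_hex, mojibake)
print(iso_8859_6.hex(), windows_1256.hex())
```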

Complete Record

Property Value
Age 1.1 (1993)
Block Arabic
General Category Other Letter
Script Arabic
Bidirectional Category Arabic Letter
Combining Class Not Reordered
Decomposition Type None
Decomposition Mapping U+062B Arabic Letter Theh
Simple Lowercase Mapping U+062B Arabic Letter Theh
Lowercase Mapping U+062B Arabic Letter Theh
Simple Uppercase Mapping U+062B Arabic Letter Theh
Uppercase Mapping U+062B Arabic Letter Theh
Simple Titlecase Mapping U+062B Arabic Letter Theh
Titlecase Mapping U+062B Arabic Letter Theh
Case Folding U+062B Arabic Letter Theh
ASCII Hex Digit
Bidi Control
Bidi Mirrored
Composition Exclusion
Case Ignorable
Changes When Casefolded
Changes When Casemapped
Changes When NFKC Casefolded
Changes When Lowercased
Changes When Titlecased
Changes When Uppercased
Full Composition Exclusion
Default Ignorable Code Point
Emoji Modifier Base
Emoji Component
Emoji Modifier
Emoji Presentation
Extended Pictographic
FC NFKC Closure U+062B Arabic Letter Theh
Grapheme Cluster Break Any
Grapheme Base
Grapheme Extend
Grapheme Link
Hex Digit
ID Continue
ID Start
IDS Binary Operator
IDS Trinary Operator
ID_Compat_Math_Continue 0
ID_Compat_Math_Start 0
InCB None
Indic Matra Category
Indic Positional Category NA
Indic Syllabic Category Other
Jamo Short Name
Join Control
Logical Order Exception
Noncharacter Code Point
NFC Quick Check Yes
NFD Quick Check Yes
NFKC Casefold U+062B Arabic Letter Theh
NFKC Quick Check Yes
NFKC_SCF U+062B Arabic Letter Theh
NFKD Quick Check Yes
Other Alphabetic
Other Default Ignorable Code Point
Other Grapheme Extend
Other ID Continue
Other ID Start
Other Lowercase
Other Math
Other Uppercase
Prepended Concatenation Mark
Pattern Syntax
Pattern White Space
Quotation Mark
Regional Indicator
Sentence Break Other Letter
Soft Dotted
Sentence Terminal
Terminal Punctuation
Unified Ideograph
Variation Selector
Word Break Alphabetic Letter
White Space
XID Continue
XID Start
Expands On NFC
Expands On NFD
Expands On NFKC
Expands On NFKD
Bidi Paired Bracket U+062B Arabic Letter Theh
Bidi Paired Bracket Type None
East Asian Width Neutral
Hangul Syllable Type Not Applicable
ISO 10646 Comment
Joining Group Beh
Joining Type Dual Joining
Line Break Alphabetic
Numeric Type None
Numeric Value not a number
Simple Case Folding U+062B Arabic Letter Theh
Script Extension
Vertical Orientation R
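
A few entries of this record can be cross-checked from Python. The sketch below uses the standard `unicodedata` module and built-in string methods; the variable names are illustrative. It covers only normalization, case mappings, combining class, numeric value, and identifier start status, because the standard library does not expose most of the other properties.

```python
import unicodedata

ch = "\u062B"  # Arabic Letter Theh

# All four Quick Check values are Yes: no normalization form changes the character.
unchanged_by_normalization = all(
    unicodedata.normalize(form, ch) == ch
    for form in ("NFC", "NFD", "NFKC", "NFKD")
)

# Every case mapping and case folding maps the character to itself.
unchanged_by_casing = (
    ch.lower() == ch.upper() == ch.title() == ch.casefold() == ch
)

combining_class = unicodedata.combining(ch)    # 0 = Not Reordered
numeric_value = unicodedata.numeric(ch, None)  # None = not a number
is_identifier_start = ch.isidentifier()        # based on XID Start

print(unchanged_by_normalization, unchanged_by_casing)
print(combining_class, numeric_value, is_identifier_start)
```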