Glyph: 、 (source font: Noto Sans Mongolian)

U+3001 Ideographic Comma

U+3001 was added in Unicode version 1.1 in 1993. It belongs to the block U+3000 to U+303F CJK Symbols and Punctuation in the U+0000 to U+FFFF Basic Multilingual Plane.
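As a small illustration of the block and plane membership stated above, the following sketch checks both with plain integer arithmetic. The helper names `plane_of` and `in_block` are hypothetical, chosen only for this example; the block boundaries are the ones given in the text.

```python
# Block boundaries as stated in the text: CJK Symbols and Punctuation.
CODEPOINT = 0x3001
BLOCK_START, BLOCK_END = 0x3000, 0x303F


def plane_of(cp: int) -> int:
    """Return the plane number; each plane spans 0x10000 code points."""
    return cp >> 16


def in_block(cp: int, start: int, end: int) -> bool:
    """Return True if cp lies inside the inclusive range start..end."""
    return start <= cp <= end


print(plane_of(CODEPOINT))                          # plane 0 is the Basic Multilingual Plane
print(in_block(CODEPOINT, BLOCK_START, BLOCK_END))
```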

This character is an Other Punctuation and belongs to the Common script, that is, it is not specific to any one script. It is also used in the scripts Bopomofo, Hangul, Han, Hiragana, Katakana, and Yi.

The glyph is not a composition. Its East Asian Width is wide. In bidirectional text it acts as Other Neutral. When changing direction it is not mirrored. It will not end a sentence. U+3001 prohibits a line break before it.
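Several of these properties can be read back with Python's standard `unicodedata` module. This is only a sketch: the values depend on the Unicode version bundled with the interpreter, and the module does not expose every property mentioned here (line breaking and sentence breaking, for instance, are not covered). The dictionary name `props` is an arbitrary choice for the example.

```python
import unicodedata

ch = "\u3001"  # IDEOGRAPHIC COMMA

props = {
    "name": unicodedata.name(ch),
    "category": unicodedata.category(ch),                  # "Po" = Other Punctuation
    "east_asian_width": unicodedata.east_asian_width(ch),  # "W" = wide
    "bidirectional": unicodedata.bidirectional(ch),        # "ON" = Other Neutral
    "mirrored": unicodedata.mirrored(ch),                  # 0 = not mirrored
    "decomposition": unicodedata.decomposition(ch),        # "" = no decomposition
}

for key, value in props.items():
    print(f"{key}: {value!r}")
```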

The CLDR project calls this character “ideographic comma” for use in screen reading software. It assigns these additional labels, e.g. for search in emoji pickers: comma, ideographic.

Wikipedia has the following information about this codepoint:

The comma , is a punctuation mark that appears in several variants in different languages. It has the same shape as an apostrophe or single closing quotation mark (’) in many typefaces, but it differs from them in being placed on the baseline of the text. Some typefaces render it as a small line, slightly curved or straight, but inclined from the vertical. Other fonts give it the appearance of a miniature filled-in figure 9 on the baseline.

The comma is used in many contexts and languages, mainly to separate parts of a sentence such as clauses, and items in lists, mainly when three or more items are listed. The word comma comes from the Greek κόμμα (kómma), which originally meant a cut-off piece, specifically in grammar, a short clause.

A comma-shaped mark is used as a diacritic in several writing systems and is considered distinct from the cedilla. In Byzantine and modern copies of Ancient Greek, the "rough" and "smooth breathings" (ἁ, ἀ) appear above the letter. In Latvian, Romanian, and Livonian, the comma diacritic appears below the letter, as in ș.

In spoken language, a common rule of thumb is that the function of a comma is generally performed by a pause.

Representations

System Representation
Decimal 12289
UTF-8 E3 80 81
UTF-16 30 01
UTF-32 00 00 30 01
URL-Quoted %E3%80%81
HTML hex reference &#x3001;
Wrong windows-1252 Mojibake ã€ (the third UTF-8 byte, 81, has no printable windows-1252 character)
Encoding: EUC-KR (hex bytes) A1 A2
Encoding: JIS0208 (hex bytes) A1 A2
Adobe Glyph List ideographiccomma
digraph ,_
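Most of the byte-level representations listed above can be reproduced with Python's standard codecs. The sketch below makes two assumptions: the UTF-16 and UTF-32 rows are big-endian without a byte order mark, and the JIS0208 row shows the EUC-JP byte form, so the `euc_jp` codec is used for it. The helper `hex_bytes` is a hypothetical name for this example.

```python
from urllib.parse import quote

ch = "\u3001"  # IDEOGRAPHIC COMMA


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


representations = {
    "Decimal": str(ord(ch)),
    "UTF-8": hex_bytes(ch.encode("utf-8")),
    "UTF-16": hex_bytes(ch.encode("utf-16-be")),   # assumption: big-endian, no BOM
    "UTF-32": hex_bytes(ch.encode("utf-32-be")),   # assumption: big-endian, no BOM
    "URL-Quoted": quote(ch),                       # percent-encodes the UTF-8 bytes
    "HTML hex reference": f"&#x{ord(ch):X};",
    "EUC-KR": hex_bytes(ch.encode("euc_kr")),
    "EUC-JP": hex_bytes(ch.encode("euc_jp")),      # assumption: matches the JIS0208 row
}

# Misreading the UTF-8 bytes as windows-1252 produces mojibake. Python's
# cp1252 codec has no mapping for byte 0x81, so errors="replace" puts
# U+FFFD in its place instead of raising UnicodeDecodeError.
mojibake = ch.encode("utf-8").decode("cp1252", errors="replace")

for label, value in representations.items():
    print(f"{label}: {value}")
print(f"Mojibake: {mojibake}")
```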

Complete Record

Property Value
Age (age) 1.1 (1993)
Unicode Name (na) IDEOGRAPHIC COMMA
Unicode 1 Name (na1)
Block (blk) CJK Symbols and Punctuation
General Category (gc) Other Punctuation
Script (sc) Common
Bidirectional Category (bc) Other Neutral
Combining Class (ccc) Not Reordered
Decomposition Type (dt) none
Decomposition Mapping (dm) 、 (U+3001)
Lowercase (Lower)
Simple Lowercase Mapping (slc) 、 (U+3001)
Lowercase Mapping (lc) 、 (U+3001)
Uppercase (Upper)
Simple Uppercase Mapping (suc) 、 (U+3001)
Uppercase Mapping (uc) 、 (U+3001)
Simple Titlecase Mapping (stc) 、 (U+3001)
Titlecase Mapping (tc) 、 (U+3001)
Case Folding (cf) 、 (U+3001)
ASCII Hex Digit (AHex)
Alphabetic (Alpha)
Bidi Control (Bidi_C)
Bidi Mirrored (Bidi_M)
Composition Exclusion (CE)
Case Ignorable (CI)
Changes When Casefolded (CWCF)
Changes When Casemapped (CWCM)
Changes When NFKC Casefolded (CWKCF)
Changes When Lowercased (CWL)
Changes When Titlecased (CWT)
Changes When Uppercased (CWU)
Cased (Cased)
Full Composition Exclusion (Comp_Ex)
Default Ignorable Code Point (DI)
Dash (Dash)
Deprecated (Dep)
Diacritic (Dia)
Emoji Modifier Base (EBase)
Emoji Component (EComp)
Emoji Modifier (EMod)
Emoji Presentation (EPres)
Emoji (Emoji)
Extender (Ext)
Extended Pictographic (ExtPict)
FC NFKC Closure (FC_NFKC) 、 (U+3001)
Grapheme Cluster Break (GCB) Any
Grapheme Base (Gr_Base)
Grapheme Extend (Gr_Ext)
Grapheme Link (Gr_Link)
Hex Digit (Hex)
Hyphen (Hyphen)
ID Continue (IDC)
ID Start (IDS)
IDS Binary Operator (IDSB)
IDS Trinary Operator (IDST)
IDSU (IDSU) 0
ID_Compat_Math_Continue (ID_Compat_Math_Continue) 0
ID_Compat_Math_Start (ID_Compat_Math_Start) 0
Ideographic (Ideo)
InCB (InCB) None
Indic Matra Category (InMC)
Indic Positional Category (InPC) NA
Indic Syllabic Category (InSC) Other
Jamo Short Name (JSN)
Join Control (Join_C)
Logical Order Exception (LOE)
Math (Math)
Noncharacter Code Point (NChar)
NFC Quick Check (NFC_QC) Yes
NFD Quick Check (NFD_QC) Yes
NFKC Casefold (NFKC_CF) 、 (U+3001)
NFKC Quick Check (NFKC_QC) Yes
NFKC_SCF (NFKC_SCF) 、 (U+3001)
NFKD Quick Check (NFKD_QC) Yes
Other Alphabetic (OAlpha)
Other Default Ignorable Code Point (ODI)
Other Grapheme Extend (OGr_Ext)
Other ID Continue (OIDC)
Other ID Start (OIDS)
Other Lowercase (OLower)
Other Math (OMath)
Other Uppercase (OUpper)
Prepended Concatenation Mark (PCM)
Pattern Syntax (Pat_Syn)
Pattern White Space (Pat_WS)
Quotation Mark (QMark)
Regional Indicator (RI)
Radical (Radical)
Sentence Break (SB) Sentence Continue
Soft Dotted (SD)
Sentence Terminal (STerm)
Terminal Punctuation (Term)
Unified Ideograph (UIdeo)
Variation Selector (VS)
Word Break (WB) Other
White Space (WSpace)
XID Continue (XIDC)
XID Start (XIDS)
Expands On NFC (XO_NFC)
Expands On NFD (XO_NFD)
Expands On NFKC (XO_NFKC)
Expands On NFKD (XO_NFKD)
Bidi Paired Bracket (bpb) 、 (U+3001)
Bidi Paired Bracket Type (bpt) None
East Asian Width (ea) wide
Hangul Syllable Type (hst) Not Applicable
ISO 10646 Comment (isc)
Joining Group (jg) No_Joining_Group
Joining Type (jt) Non Joining
Line Break (lb) Close Punctuation
Numeric Type (nt) none
Numeric Value (nv) not a number
Simple Case Folding (scf) 、 (U+3001)
Script Extension (scx) Bopomofo Hangul Han Hiragana Katakana Yi
Vertical Orientation (vo) Tu
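The normalization and case-mapping rows of the record say, in effect, that U+3001 maps to itself everywhere. A minimal sketch of how that could be confirmed with the Python standard library follows; it uses `str` methods for the full case mappings and `unicodedata` for normalization, and the results reflect the Unicode version shipped with the interpreter. The variable names are arbitrary.

```python
import unicodedata

ch = "\u3001"  # IDEOGRAPHIC COMMA

# All four quick-check properties are "Yes", so every normalization form
# should return the character unchanged.
forms = ("NFC", "NFD", "NFKC", "NFKD")
unchanged = {form: unicodedata.normalize(form, ch) == ch for form in forms}

# Lowercase, uppercase, titlecase and case folding all map to the character itself.
case_stable = ch.lower() == ch.upper() == ch.title() == ch.casefold() == ch

# Combining class "Not Reordered" corresponds to the numeric value 0.
combining_class = unicodedata.combining(ch)

# The character has no numeric value, so the supplied default is returned.
numeric_value = unicodedata.numeric(ch, None)

print(unchanged)
print(case_stable, combining_class, numeric_value)
```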