U+3001 Ideographic Comma

U+3001 was added to Unicode in version 1.1 (1993). It belongs to the block U+3000 to U+303F CJK Symbols and Punctuation in the U+0000 to U+FFFF Basic Multilingual Plane.
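The block and plane membership stated above can be checked with simple integer arithmetic. The following is a minimal sketch; the helper names `plane_of` and `in_block` are illustrative and not part of any library, and the block range is copied from the text above.

```python
CODEPOINT = 0x3001

# Block range as given in the text: CJK Symbols and Punctuation.
BLOCK_START, BLOCK_END = 0x3000, 0x303F


def plane_of(cp: int) -> int:
    """Return the plane number; each plane spans 0x10000 code points."""
    return cp >> 16


def in_block(cp: int, start: int, end: int) -> bool:
    """Return True if cp lies in the inclusive range start..end."""
    return start <= cp <= end


print(plane_of(CODEPOINT))                          # 0, the Basic Multilingual Plane
print(in_block(CODEPOINT, BLOCK_START, BLOCK_END))  # True
```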

This character is categorized as Other Punctuation and has the script Common, that is, it belongs to no specific script. It is also used in the scripts Bopomofo, Hangul, Han, Hiragana, Katakana, and Yi.

The character has no decomposition. Its East Asian Width is Wide. In bidirectional context it acts as Other Neutral and is not mirrored. For line breaking, U+3001 behaves as Close Punctuation. It has type Sentence Continue for sentence breaks and Other for word breaks. The Grapheme Cluster Break is Any.
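Several of these properties can be read from Python's standard `unicodedata` module. This is a minimal sketch: the module reports short property codes (`Po`, `W`, `ON`) rather than the long names used above, and it does not appear to expose the line, word, sentence, or grapheme break properties, so those are left out here.

```python
import unicodedata

ch = "\u3001"

props = {
    "name": unicodedata.name(ch),
    "category": unicodedata.category(ch),                  # general category
    "east_asian_width": unicodedata.east_asian_width(ch),
    "bidirectional": unicodedata.bidirectional(ch),        # bidi class
    "mirrored": unicodedata.mirrored(ch),                  # 0 means not mirrored
    "decomposition": unicodedata.decomposition(ch),        # "" means none
}

for key, value in props.items():
    print(f"{key}: {value!r}")
```

The short codes correspond to the text as follows: `Po` is Other Punctuation, `W` is Wide, and `ON` is Other Neutral.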

The CLDR project labels this character “ideographic comma” for use in screen reading software. It assigns additional tags, e.g. for search in emoji pickers: comma, ideographic.

Wikipedia has the following information about this codepoint:

The comma , is a punctuation mark that appears in several variants in different languages. It has the same shape as an apostrophe or single closing quotation mark (’) in many typefaces, but it differs from them in being placed on the baseline of the text. Some typefaces render it as a small line, slightly curved or straight, but inclined from the vertical. Other fonts give it the appearance of a miniature filled-in figure 9 on the baseline.

The comma is used in many contexts and languages, mainly to separate parts of a sentence such as clauses, and to separate items in lists, mainly when three or more items are listed. The word comma comes from the Greek κόμμα (kómma), which originally meant a cut-off piece, specifically in grammar, a short clause.

A comma-shaped mark is used as a diacritic in several writing systems and is considered distinct from the cedilla. In Byzantine and modern copies of Ancient Greek, the "rough" and "smooth breathings" (ἁ, ἀ) appear above the letter. In Latvian, Romanian, and Livonian, the comma diacritic appears below the letter, as in ș.


System                            Representation
UTF-8                             E3 80 81
UTF-16                            30 01
UTF-32                            00 00 30 01
URL-Quoted                        %E3%80%81
HTML hex reference                &#x3001;
Wrong windows-1252 Mojibake       ã€ (the third byte, 81, has no windows-1252 mapping)
Encoding: EUC-KR (hex bytes)      A1 A2
Encoding: JIS0208 (hex bytes)     A1 A2
Adobe Glyph List                  ideographiccomma
digraph                           ,_
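
Most rows of the table above can be reproduced with Python's standard codecs. This is a sketch under a few assumptions: the UTF-16 and UTF-32 rows are taken to be big-endian without a byte order mark, the JIS0208 row is assumed to correspond to Python's `euc_jp` codec, and `hex_bytes` is an illustrative helper rather than a library function.

```python
from urllib.parse import quote

ch = "\u3001"


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


utf8 = ch.encode("utf-8")

rows = {
    "UTF-8": hex_bytes(utf8),
    "UTF-16": hex_bytes(ch.encode("utf-16-be")),   # assumed big-endian, no BOM
    "UTF-32": hex_bytes(ch.encode("utf-32-be")),   # assumed big-endian, no BOM
    "URL-Quoted": quote(ch),                       # percent-encodes the UTF-8 bytes
    "HTML hex reference": f"&#x{ord(ch):X};",
    "EUC-KR": hex_bytes(ch.encode("euc_kr")),
    "JIS0208 (via euc_jp)": hex_bytes(ch.encode("euc_jp")),
}

# Mojibake: UTF-8 bytes misread as windows-1252. Byte 0x81 is undefined
# in Python's cp1252 codec, so errors="replace" substitutes U+FFFD for it.
mojibake = utf8.decode("cp1252", errors="replace")

for label, value in rows.items():
    print(f"{label}: {value}")
print(f"Mojibake: {mojibake!r}")
```

A strict `utf8.decode("cp1252")` raises `UnicodeDecodeError` because of the unmapped byte, which is why the mojibake row shows only two visible characters.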

