U+FF70 HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK

U+FF70 was added to Unicode in version 1.1 (1993). It belongs to the block Halfwidth and Fullwidth Forms in the Basic Multilingual Plane.

This character is a Modifier Letter. Its Script property is Common, meaning it is not assigned to one specific script; its Script Extensions list Hiragana and Katakana.

The glyph is a Narrow compatibility form of ー (U+30FC). Its East Asian Width is Halfwidth. In bidirectional context it acts as Left To Right and is not mirrored. For line breaking, U+FF70 behaves as a Conditional Japanese Starter. Its Sentence Break type is OLetter and its Word Break type is Katakana. The Grapheme Cluster Break is Any.
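Several of these properties can be read back from Python's standard `unicodedata` module. The following is a minimal sketch, not a full property dump: the variable names are illustrative, the values depend on the Unicode version bundled with the interpreter, and the line, word, sentence and grapheme break properties are not exposed by this module.

```python
import unicodedata

ch = "\uFF70"  # HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK

# Collect the properties that the standard library exposes directly.
props = {
    "name": unicodedata.name(ch),
    "category": unicodedata.category(ch),                  # "Lm" = Modifier Letter
    "east_asian_width": unicodedata.east_asian_width(ch),  # "H" = Halfwidth
    "bidirectional": unicodedata.bidirectional(ch),        # "L" = Left To Right
    "mirrored": unicodedata.mirrored(ch),                  # 0 = not mirrored
    "combining": unicodedata.combining(ch),                # 0 = Not Reordered
    "decomposition": unicodedata.decomposition(ch),        # "<narrow> 30FC"
}

for key, value in props.items():
    print(f"{key}: {value}")
```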

Wikipedia has the following information about this codepoint:

The chōonpu (長音符), also known as onbiki (音引き), bōbiki (棒引き), or Katakana-Hiragana Prolonged Sound Mark by the Unicode Consortium, is a Japanese symbol which indicates a chōon, or a long vowel of two morae in length. Its form is a horizontal or vertical line in the center of the text with the width of one kanji or kana character. It is written horizontally in horizontal text and vertically in vertical text. The chōonpu is usually used to indicate a long vowel sound in katakana writing, rarely in hiragana writing, and never in romanized Japanese. The chōonpu is a distinct mark from the dash, and in most Japanese typefaces it can easily be distinguished. In horizontal writing it is similar in appearance to, but should not be confused with, the kanji character 一 ("one").

The symbol is sometimes used with hiragana, for example in the signs of ramen restaurants, which are sometimes written らーめん in hiragana. However, usually, hiragana does not use the chōonpu but another vowel kana to express this sound. The following table shows the usual hiragana equivalents used to form a long vowel, using the ha-gyō (the ha, hi, fu, he, ho sequence) as an example.

When rendering foreign words into katakana, the chōonpu is often used to indicate a terminal "er", such as the English word "number" which becomes ナンバー (nanbaa).

In addition to Japanese, chōonpu are also used in Okinawan writing systems to indicate two morae. The Sakhalin dialect of Ainu also uses chōonpu in its katakana writing for long vowels.

In Unicode, the chōonpu has the value U+30FC (ー), which corresponds to JIS X 0208 kuten code point 01-28, encoded in Shift JIS as 815B. It is normally rendered fullwidth and automatically changes its glyph according to the writing direction. The halfwidth compatibility form has the value U+FF70 (ｰ), which is converted to Shift JIS value B0.
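The legacy encodings mentioned here can be checked with Python's built-in codecs. This is a rough sketch under two assumptions: that the interpreter's `shift_jis` and `euc_jp` codecs follow the usual JIS X 0208 / JIS X 0201 mappings, and that the kuten pair can be read off the EUC-JP bytes by subtracting 0xA0 from each byte. The helper name `kuten` is hypothetical.

```python
FULLWIDTH = "\u30FC"  # KATAKANA-HIRAGANA PROLONGED SOUND MARK
HALFWIDTH = "\uFF70"  # its halfwidth compatibility form

def kuten(ch: str) -> tuple[int, int]:
    """Return the JIS X 0208 (row, cell) pair derived from the EUC-JP bytes."""
    first, second = ch.encode("euc_jp")
    return first - 0xA0, second - 0xA0

sjis_full = FULLWIDTH.encode("shift_jis")  # two-byte JIS X 0208 code
sjis_half = HALFWIDTH.encode("shift_jis")  # single-byte halfwidth katakana code

print(kuten(FULLWIDTH))              # row and cell of the fullwidth mark
print(sjis_full.hex(" ").upper())    # Shift JIS bytes of U+30FC
print(sjis_half.hex(" ").upper())    # Shift JIS byte of U+FF70
```

Decoding the byte B0 with `shift_jis` gives U+FF70 back, so the round trip keeps the halfwidth form rather than folding it to U+30FC.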

Representations

System Representation
Decimal 65392
UTF-8 EF BD B0
UTF-16 FF 70
UTF-32 00 00 FF 70
URL-Quoted %EF%BD%B0
HTML-Escape &#xFF70;
Wrong windows-1252 Mojibake ï½°
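
The representations listed above can be reproduced with the Python standard library. This sketch assumes the UTF-16 and UTF-32 rows are big-endian without a byte order mark, and that the HTML escape is a hexadecimal numeric character reference; the dictionary keys are illustrative names only.

```python
import urllib.parse

ch = "\uFF70"

reps = {
    "decimal": ord(ch),
    "utf8": ch.encode("utf-8").hex(" ").upper(),
    "utf16": ch.encode("utf-16-be").hex(" ").upper(),  # big-endian, no BOM
    "utf32": ch.encode("utf-32-be").hex(" ").upper(),  # big-endian, no BOM
    "url": urllib.parse.quote(ch),                     # percent-encoded UTF-8
    "html": f"&#x{ord(ch):X};",                        # numeric character reference
    # Mojibake: the UTF-8 bytes misread as windows-1252 text.
    "mojibake": ch.encode("utf-8").decode("cp1252"),
}

for key, value in reps.items():
    print(f"{key}: {value}")
```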

Complete Record

Property Value
Age (age) 1.1
Unicode Name (na) HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
Unicode 1 Name (na1)
Block (blk) Half_And_Full_Forms
General Category (gc) Modifier Letter
Script (sc) Common
Bidirectional Category (bc) Left To Right
Combining Class (ccc) Not Reordered
Decomposition Type (dt) Narrow
Decomposition Mapping (dm) ー
Lowercase (Lower)
Simple Lowercase Mapping (slc) ｰ
Lowercase Mapping (lc) ｰ
Uppercase (Upper)
Simple Uppercase Mapping (suc) ｰ
Uppercase Mapping (uc) ｰ
Simple Titlecase Mapping (stc) ｰ
Titlecase Mapping (tc) ｰ
Case Folding (cf) ｰ
ASCII Hex Digit (AHex)
Alphabetic (Alpha)
Bidi Control (Bidi_C)
Bidi Mirrored (Bidi_M)
Bidi Paired Bracket (bpb) ー
Bidi Paired Bracket Type (bpt) None
Cased (Cased)
Composition Exclusion (CE)
Case Ignorable (CI)
Full Composition Exclusion (Comp_Ex)
Changes When Casefolded (CWCF)
Changes When Casemapped (CWCM)
Changes When NFKC Casefolded (CWKCF)
Changes When Lowercased (CWL)
Changes When Titlecased (CWT)
Changes When Uppercased (CWU)
Dash (Dash)
Deprecated (Dep)
Default Ignorable Code Point (DI)
Diacritic (Dia)
East Asian Width (ea) Halfwidth
Extender (Ext)
FC NFKC Closure (FC_NFKC) ー
Grapheme Cluster Break (GCB) Any
Grapheme Base (Gr_Base)
Grapheme Extend (Gr_Ext)
Hex Digit (Hex)
Hangul Syllable Type (hst) Not Applicable
Hyphen (Hyphen)
ID Continue (IDC)
Ideographic (Ideo)
ID Start (IDS)
IDS Binary Operator (IDSB)
IDS Trinary Operator (IDST)
InMC (InMC)
Indic Positional Category (InPC) NA
Indic Syllabic Category (InSC) Other
ISO 10646 Comment (isc)
Joining Group (jg) No_Joining_Group
Join Control (Join_C)
Jamo Short Name (JSN)
Joining Type (jt) Non Joining
Line Break (lb) Conditional Japanese Starter
Logical Order Exception (LOE)
Math (Math)
Noncharacter Code Point (NChar)
NFC Quick Check (NFC_QC) Yes
NFD Quick Check (NFD_QC) Yes
NFKC Casefold (NFKC_CF) ー
NFKC Quick Check (NFKC_QC) No
NFKD Quick Check (NFKD_QC) No
Numeric Type (nt) None
Numeric Value (nv) NaN
Other Alphabetic (OAlpha)
Other Default Ignorable Code Point (ODI)
Other Grapheme Extend (OGr_Ext)
Other ID Continue (OIDC)
Other ID Start (OIDS)
Other Lowercase (OLower)
Other Math (OMath)
Other Uppercase (OUpper)
Pattern Syntax (Pat_Syn)
Pattern White Space (Pat_WS)
Quotation Mark (QMark)
Radical (Radical)
Sentence Break (SB) OLetter
Simple Case Folding (scf) ｰ
Script Extension (scx) Hiragana Katakana
Soft Dotted (SD)
STerm (STerm)
Terminal Punctuation (Term)
Unified Ideograph (UIdeo)
Variation Selector (VS)
Word Break (WB) Katakana
White Space (WSpace)
XID Continue (XIDC)
XID Start (XIDS)
Expands On NFC (XO_NFC)
Expands On NFD (XO_NFD)
Expands On NFKC (XO_NFKC)
Expands On NFKD (XO_NFKD)
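
The normalization rows in this record (NFC and NFD Quick Check "Yes", NFKC and NFKD Quick Check "No", Decomposition Mapping to U+30FC) can be illustrated with Python's `unicodedata` module. This is a sketch only: `unicodedata.is_normalized` requires Python 3.8 or later, and the sample word is the halfwidth spelling of ナンバー, chosen here purely for illustration.

```python
import unicodedata

HALF = "\uFF70"  # halfwidth prolonged sound mark
FULL = "\u30FC"  # fullwidth prolonged sound mark

# Canonical forms leave the character alone; compatibility forms replace it.
nfc = unicodedata.normalize("NFC", HALF)
nfd = unicodedata.normalize("NFD", HALF)
nfkc = unicodedata.normalize("NFKC", HALF)
nfkd = unicodedata.normalize("NFKD", HALF)

# Quick-check style queries for the single character.
is_nfc = unicodedata.is_normalized("NFC", HALF)
is_nfkc = unicodedata.is_normalized("NFKC", HALF)

# Halfwidth "nanbaa": NA, N, HA, voiced sound mark, prolonged sound mark.
word_half = "\uFF85\uFF9D\uFF8A\uFF9E\uFF70"
word_full = unicodedata.normalize("NFKC", word_half)

print(nfc == HALF, nfkc == FULL, is_nfc, is_nfkc)
print(word_full)
```

NFKC also recomposes the halfwidth HA and the halfwidth voiced sound mark into the single fullwidth character バ, so the five halfwidth code points become four fullwidth ones.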