U+DFFFF NONCHARACTER *

U+DFFFF was added to Unicode in version 2.0 (1996). It does not belong to any named block; it is the last code point of Plane 13 (U+D0000 to U+DFFFF, counting planes from zero), which is unassigned.

Its General Category is Unassigned, and its Script property is Unknown.

The glyph is not a composition. It has a Neutral East Asian Width. In a bidirectional context it acts as Boundary Neutral and is not mirrored. For line breaking, U+DFFFF has the class Unknown. Its type is Other for both sentence breaks and word breaks. The Grapheme Cluster Break is Any.
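A few of these properties can be spot-checked with Python's standard `unicodedata` module. This is a minimal sketch: the module exposes only a subset of the properties listed on this page, its data depends on the Unicode version bundled with the interpreter, and the variable names are illustrative.

```python
import unicodedata

ch = chr(0xDFFFF)

# General Category: "Cn" is the short alias for Unassigned.
category = unicodedata.category(ch)

# Noncharacters have no character name, so supply a default
# instead of letting name() raise ValueError.
name = unicodedata.name(ch, None)

# Canonical combining class 0 corresponds to "Not Reordered".
combining_class = unicodedata.combining(ch)

# 0 means the code point is not mirrored in bidirectional text.
mirrored = unicodedata.mirrored(ch)

# An empty string means there is no decomposition mapping.
decomposition = unicodedata.decomposition(ch)

print(category, name, combining_class, mirrored, repr(decomposition))
```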

This is a so-called “noncharacter”, one of 66 in Unicode. These codepoints are reserved solely for internal use. For further information, see Unicode’s FAQ on noncharacters.
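The 66 noncharacters follow a simple pattern: the 32 code points U+FDD0 to U+FDEF, plus the last two code points of each of the 17 planes. A short sketch of a membership test follows; the function name `is_noncharacter` is illustrative and not part of any library.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # 32 contiguous noncharacters in the Basic Multilingual Plane.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE


# Count the noncharacters across the whole code space.
NONCHARACTER_COUNT = sum(is_noncharacter(cp) for cp in range(0x110000))

print(is_noncharacter(0xDFFFF), NONCHARACTER_COUNT)
```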

Wikipedia has the following information about this codepoint:

The Unicode Consortium and the ISO/IEC JTC 1/SC 2/WG 2 jointly collaborate on the list of the characters in the Universal Coded Character Set. The Universal Coded Character Set, most commonly called the Universal Character Set (abbr. UCS, official designation: ISO/IEC 10646), is an international standard to map characters, discrete symbols used in natural language, mathematics, music, and other domains, to unique machine-readable data values. By creating this mapping, the UCS enables computer software vendors to interoperate, and transmit—interchange—UCS-encoded text strings from one to another. Because it is a universal map, it can be used to represent multiple languages at the same time. This avoids the confusion of using multiple legacy character encodings, which can result in the same sequence of codes having multiple interpretations depending on the character encoding in use, resulting in mojibake if the wrong one is chosen.

UCS has a potential capacity of over 1 million characters. Each UCS character is abstractly represented by a code point, an integer between 0 and 1,114,111 (1,114,112 = 2^20 + 2^16 or 17 × 2^16 = 0x110000 code points), used to represent each character within the internal logic of text processing software. As of Unicode 15.0, released in September 2022, 293,168 (26%) of these code points are allocated, 149,251 (13%) have been assigned characters, 137,468 (12.3%) are reserved for private use, 2,048 are used to enable the mechanism of surrogates, and 66 are designated as noncharacters, leaving the remaining 820,944 (74%) unallocated. The number of encoded characters is made up as follows:

  • 149,014 graphical characters (some of which do not have a visible glyph, but are still counted as graphical)
  • 237 special purpose characters for control and formatting.
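The arithmetic behind these figures can be verified directly. The sketch below only recomputes numbers quoted in the text; the variable names are illustrative.

```python
# Size of the code space: 17 planes of 2^16 code points each.
TOTAL = 0x110000
assert TOTAL == 2**20 + 2**16 == 17 * 2**16 == 1_114_112

# Figures quoted for Unicode 15.0.
allocated = 293_168
assigned = 149_251
unallocated = TOTAL - allocated

# Private use: U+E000..U+F8FF in the BMP, plus planes 15 and 16
# minus the two noncharacters at the end of each of those planes.
private_use = (0xF8FF - 0xE000 + 1) + 2 * (0x10000 - 2)

# Assigned characters split into graphical and control/formatting.
graphical, special = 149_014, 237

allocated_pct = round(100 * allocated / TOTAL)
unallocated_pct = round(100 * unallocated / TOTAL)

print(unallocated, private_use, allocated_pct, unallocated_pct)
```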

ISO maintains the basic mapping of characters from character name to code point. Often, the terms character and code point will be used interchangeably. However, when a distinction is made, a code point refers to the integer of the character: what one might think of as its address. Meanwhile, a character in ISO/IEC 10646 includes the combination of the code point and its name. Unicode adds many other useful properties to the character set, such as block, category, script, and directionality.

In addition to the UCS, the supplementary Unicode Standard (not a joint project with ISO, but rather a publication of the Unicode Consortium) provides other implementation details such as:

  1. mappings between UCS and other character sets
  2. different collations of characters and character strings for different languages
  3. an algorithm for laying out bidirectional text ("the BiDi algorithm"), where text on the same line may shift between left-to-right ("LTR") and right-to-left ("RTL")
  4. a case-folding algorithm

Computer software end users enter these characters into programs through various input methods, for example, physical keyboards or virtual character palettes.

The UCS can be divided in various ways, such as by plane, block, character category, or character property.
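Division by plane is the simplest of these: each plane holds 2^16 code points, so the plane number (counting from zero) is the code point shifted right by 16 bits. A minimal sketch, with illustrative helper names:

```python
def plane_of(cp: int) -> int:
    """Return the zero-based plane number of a code point."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return cp >> 16


def plane_range(plane: int) -> tuple[int, int]:
    """Return the first and last code point of a plane (0 to 16)."""
    if not 0 <= plane <= 16:
        raise ValueError("Unicode has planes 0 through 16")
    first = plane << 16
    return first, first | 0xFFFF


first, last = plane_range(plane_of(0xDFFFF))
print(plane_of(0xDFFFF), f"U+{first:X}", f"U+{last:X}")
```

For U+DFFFF this gives plane 13, spanning U+D0000 to U+DFFFF.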

Representations

System Representation
Decimal 917503
UTF-8 F3 9F BF BF
UTF-16 DB 3F DF FF
UTF-32 00 0D FF FF
URL-Quoted %F3%9F%BF%BF
HTML-Escape &#xDFFFF;
Wrong windows-1252 Mojibake óŸ¿¿
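These representations can be reproduced with Python's built-in codecs. The following is a sketch rather than the method this page uses: the dictionary keys and the helper `hex_bytes` are illustrative, and the HTML escape and mojibake values are computed here, not copied from the page. The UTF-16 and UTF-32 rows are taken to be big-endian without a byte order mark.

```python
from urllib.parse import quote

cp = 0xDFFFF
ch = chr(cp)


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


representations = {
    "Decimal": str(cp),
    "UTF-8": hex_bytes(ch.encode("utf-8")),
    "UTF-16": hex_bytes(ch.encode("utf-16-be")),
    "UTF-32": hex_bytes(ch.encode("utf-32-be")),
    # quote() percent-encodes the UTF-8 bytes of the character.
    "URL-Quoted": quote(ch),
    # Hexadecimal numeric character reference.
    "HTML-Escape": f"&#x{cp:X};",
    # Mojibake: UTF-8 bytes misread as windows-1252.
    "Mojibake": ch.encode("utf-8").decode("windows-1252"),
}

for system, value in representations.items():
    print(f"{system}: {value}")
```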

Complete Record

Property Value
Age 2.0 (1996)
Unicode Name
Unicode 1 Name
General Category Unassigned
Script Unknown
Bidirectional Category Boundary Neutral
Combining Class Not Reordered
Decomposition Type None
Decomposition Mapping U+DFFFF
Lowercase
Simple Lowercase Mapping U+DFFFF
Lowercase Mapping U+DFFFF
Uppercase
Simple Uppercase Mapping U+DFFFF
Uppercase Mapping U+DFFFF
Simple Titlecase Mapping U+DFFFF
Titlecase Mapping U+DFFFF
Case Folding U+DFFFF
ASCII Hex Digit
Alphabetic
Bidi Control
Bidi Mirrored
Composition Exclusion
Case Ignorable
Changes When Casefolded
Changes When Casemapped
Changes When NFKC Casefolded
Changes When Lowercased
Changes When Titlecased
Changes When Uppercased
Cased
Full Composition Exclusion
Default Ignorable Code Point
Dash
Deprecated
Diacritic
Emoji Modifier Base
Emoji Component
Emoji Modifier
Emoji Presentation
Emoji
Extender
Extended Pictographic
FC NFKC Closure
Grapheme Cluster Break Any
Grapheme Base
Grapheme Extend
Grapheme Link
Hex Digit
Hyphen
ID Continue
ID Start
IDS Binary Operator
IDS Trinary Operator
Ideographic
Indic Positional Category NA
Indic Syllabic Category Other
Jamo Short Name
Join Control
Logical Order Exception
Math
Noncharacter Code Point
NFC Quick Check 1
NFD Quick Check 1
NFKC Casefold U+DFFFF
NFKC Quick Check 1
NFKD Quick Check 1
Other Alphabetic
Other Default Ignorable Code Point
Other Grapheme Extend
Other ID Continue
Other ID Start
Other Lowercase
Other Math
Other Uppercase
Prepended Concatenation Mark
Pattern Syntax
Pattern White Space
Quotation Mark
Regional Indicator
Radical
Sentence Break Other
Soft Dotted
Sentence Terminal
Terminal Punctuation
Unified Ideograph
Variation Selector
Word Break Other
White Space
XID Continue
XID Start
Expands On NFC
Expands On NFD
Expands On NFKC
Expands On NFKD
Bidi Mirrored Glyph
Bidi Paired Bracket Type None
East Asian Width Neutral
Hangul Syllable Type Not Applicable
ISO 10646 Comment
Joining Group No_Joining_Group
Joining Type Non Joining
Line Break Unknown
Numeric Type None
Numeric Value not a number
Simple Case Folding U+DFFFF
Script Extension Unknown
Vertical Orientation R