Nameprep
Nameprep is the process of preparing a string for use as a domain name or other canonical name: it case-folds the string to lowercase, removes certain generally invisible code points, and applies Unicode NFKC normalization. It is used by the Internationalizing Domain Names in Applications (IDNA) standard.
Nameprep is defined in RFC 3491, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", as a profile of stringprep, which is described in RFC 3454, "Preparation of Internationalized Strings ("stringprep")."
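The processing described above can be sketched in Python with the standard-library stringprep and unicodedata modules. This is a minimal illustration rather than a conforming implementation: the function name nameprep_sketch is hypothetical, the bidirectional-text checks and unassigned code point handling of RFC 3454 are omitted, and normalization uses the interpreter's current Unicode data rather than the Unicode 3.2 tables the RFCs assume.

```python
import stringprep
import unicodedata


def nameprep_sketch(label: str) -> str:
    """Simplified nameprep: map, normalize, then reject prohibited characters.

    Hypothetical helper for illustration only; bidi checks are not performed.
    """
    # Step 1 (map): drop code points that are "mapped to nothing" (table B.1,
    # e.g. the soft hyphen) and case-fold everything else (table B.2).
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)
    )

    # Step 2 (normalize): apply Unicode normalization form KC.
    normalized = unicodedata.normalize("NFKC", mapped)

    # Step 3 (prohibit): reject code points in the tables that RFC 3491 lists
    # as prohibited output (C.1.2, C.2.2, C.3 through C.9).
    prohibited_checks = (
        stringprep.in_table_c12,
        stringprep.in_table_c22,
        stringprep.in_table_c3,
        stringprep.in_table_c4,
        stringprep.in_table_c5,
        stringprep.in_table_c6,
        stringprep.in_table_c7,
        stringprep.in_table_c8,
        stringprep.in_table_c9,
    )
    for ch in normalized:
        if any(check(ch) for check in prohibited_checks):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")

    return normalized


if __name__ == "__main__":
    print(nameprep_sketch("EXAMPLE"))        # uppercase is case-folded
    print(nameprep_sketch("exam\u00adple"))  # soft hyphen (U+00AD) is removed
```

Python also ships a fuller version of this procedure as encodings.idna.nameprep, which additionally performs the bidirectional checks.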
Nameprep does not map lookalike characters to a single character, nor does it prohibit their use. There are good reasons for this: the same sets of characters may be lookalikes in some fonts but not in others, and any decision about which character to map to would bias the result towards users of one script. However, the omission has potentially grave security implications if the designers and administrators of systems based on nameprep do not take it into account; the best-known example is VeriSign's handling of IDNA names in .com and .net.
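The following short sketch illustrates the point using Python's standard-library encodings.idna.nameprep function. The label "paypal" is only a hypothetical example; the second string replaces the Latin letter "a" (U+0061) with the visually similar Cyrillic letter "а" (U+0430).

```python
from encodings.idna import nameprep

latin = "paypal"           # all Latin letters
mixed = "p\u0430ypal"      # second letter is CYRILLIC SMALL LETTER A (U+0430)

prepared_latin = nameprep(latin)
prepared_mixed = nameprep(mixed)

# Case differences are folded away by nameprep...
same_after_case_folding = nameprep("PAYPAL") == prepared_latin

# ...but the two lookalike labels remain distinct strings after processing.
lookalikes_unified = prepared_latin == prepared_mixed

if __name__ == "__main__":
    print("case folded to same label:", same_after_case_folding)
    print("lookalikes unified:", lookalikes_unified)
    print([f"U+{ord(ch):04X}" for ch in prepared_mixed])
```

Because both labels pass through nameprep unchanged in their distinguishing code point, any defence against such spoofing has to be added by the system that uses nameprep, for example through registration policy or display rules.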
See also
- Homoglyph
- Unicode
- Internationalization
- International Components for Unicode (ICU contains an implementation of nameprep)
- Internationalized domain name
- IDN homograph attack, or "lookalike" character spoofing based on a URL's appearance as read or entered by a web user (read in a page font, entered in the user's font of choice). Note: this is not URI ambiguity in encoding. Examples are provided in both of the above articles.