CJK Unified Ideographs Extension B
CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese that were submitted to the Ideographic Research Group between 1998 and 2000. Seven gongche characters for kunqu were added in Unicode 13.0, and two characters for the Macao Supplementary Character Set were added in Unicode 14.0.[3]
CJK Unified Ideographs Extension B | |
---|---|
Range | U+20000..U+2A6DF (42,720 code points) |
Plane | SIP |
Scripts | Han |
Assigned | 42,720 code points |
Unused | 0 reserved code points |
Unicode version history | |
3.1 (2001) | 42,711 (+42,711) |
13.0 (2020) | 42,718 (+7) |
14.0 (2021) | 42,720 (+2) |
Note: [1][2] |
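The figures in the table can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not an official tool: the per-version ranges are the ones listed in the History table of this article, and the helper names `in_extension_b` and `utf16_code_units` are illustrative assumptions.

```python
# Block boundaries as given in the table above.
EXT_B_FIRST = 0x20000
EXT_B_LAST = 0x2A6DF

# (Unicode version, first code point, last code point), as listed in the
# History table of this article.
ADDITIONS = [
    ("3.1", 0x20000, 0x2A6D6),
    ("13.0", 0x2A6D7, 0x2A6DD),
    ("14.0", 0x2A6DE, 0x2A6DF),
]


def in_extension_b(ch: str) -> bool:
    """Return True if the single character ch lies in the Extension B range."""
    return EXT_B_FIRST <= ord(ch) <= EXT_B_LAST


def utf16_code_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode ch in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


# Size of the whole block and of each version's addition.
block_size = EXT_B_LAST - EXT_B_FIRST + 1
per_version = {v: last - first + 1 for v, first, last in ADDITIONS}

# The plane number is the code point divided by 0x10000; plane 2 is the SIP.
plane = EXT_B_FIRST >> 16

if __name__ == "__main__":
    print(block_size, per_version, plane)
    print(in_extension_b("\U00020457"), utf16_code_units("\U00020457"))
```

Because the block lies outside the Basic Multilingual Plane, every character in it takes two code units (a surrogate pair) in UTF-16, which is what `utf16_code_units` reports.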
The block has dozens of variation sequences defined for standardized variants.[4]
It also has thousands of ideographic variation sequences registered in the Unicode Ideographic Variation Database (IVD).[5][6] These sequences specify the desired glyph variant for a given Unicode character.
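The shape of such sequences can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `make_ivs` and `split_variation_sequence` are assumptions, and the example builds a sequence purely to show its structure. Whether a particular sequence is actually registered has to be checked against the IVD or the standardized variants data, which this code does not consult.

```python
# Variation selector ranges defined by Unicode.
VS1_TO_VS16 = range(0xFE00, 0xFE10)      # VS1..VS16
VS17_TO_VS256 = range(0xE0100, 0xE01F0)  # VS17..VS256, used by the IVD


def make_ivs(base: str, selector_number: int) -> str:
    """Build base + VSn for 17 <= n <= 256 (the selectors used in the IVD)."""
    if len(base) != 1:
        raise ValueError("base must be a single character")
    if not 17 <= selector_number <= 256:
        raise ValueError("ideographic variation sequences use VS17..VS256")
    return base + chr(0xE0100 + selector_number - 17)


def split_variation_sequence(seq: str):
    """Return (base, selector_number), or (seq, None) if seq is not base + VS."""
    if len(seq) == 2:
        cp = ord(seq[1])
        if cp in VS17_TO_VS256:
            return seq[0], cp - 0xE0100 + 17
        if cp in VS1_TO_VS16:
            return seq[0], cp - 0xFE00 + 1
    return seq, None


if __name__ == "__main__":
    # Structural example only: U+20000 followed by VS17 (U+E0100).
    ivs = make_ivs("\U00020000", 17)
    print([f"U+{ord(c):04X}" for c in ivs])
    print(split_variation_sequence(ivs))
```

A renderer that does not support the sequence typically falls back to the default glyph of the base character, so the selector refines the display without changing the character's identity.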
Extension B was the only CJK Unified Ideographs Extension block with a UCS2003 source identifier. Because Extension B contained so many characters, the original code charts were produced with a single glyph per character for all regions. The glyphs were designed by Beijing Zhongyi Electronic Ltd. After multi-column code charts were introduced in Unicode 5.2, the original glyphs were retained under the UCS2003 source identifier; they were removed in Unicode 14.0 as redundant and misleading.[7] The glyphs are packaged in the "SimSun-ExtB" font distributed with the Simplified Chinese versions of Windows, and do not adhere to the glyph forms used for the Mainland China region.
Known issues
Unifiable variants and exact duplicates in Extension B
Hundreds of glyph variants were also encoded in CJK Unified Ideographs Extension B.[8] In addition to these deliberately encoded close glyph variants, six exact duplicates (where the same character was inadvertently encoded twice) and two semi-duplicates (where the CJK-B character represents a de facto disunification of two glyph forms unified in the corresponding BMP character) were encoded by mistake:[9]
- U+34A8 㒨 = U+20457 𠑗 : U+20457 is the same as the China-source glyph for U+34A8, but it is significantly different from the Taiwan-source glyph for U+34A8
- U+3DB7 㶷 = U+2420E 𤈎 : same glyph shapes
- U+8641 虁 = U+27144 𧅄 : U+27144 is the same as the Korean-source glyph for U+8641, but it is significantly different from the Chinese Mainland-, Taiwan- and Japan-source glyphs for U+8641
- U+204F2 𠓲 = U+23515 𣔕 : same glyph shapes, but ordered under different radicals
- U+249BC 𤦼 = U+249E9 𤧩 : same glyph shapes
- U+24BD2 𤯒 = U+2A415 𪐕 : same glyph shapes, but ordered under different radicals
- U+26842 𦡂 = U+26866 𦡦 : same glyph shapes
- U+FA23 﨣 = U+27EAF 𧺯 : same glyph shapes (U+FA23 﨣 is a unified CJK ideograph, despite its name "CJK COMPATIBILITY IDEOGRAPH-FA23.")
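Software that needs to treat these pairs as equal has to do so itself, because Unicode normalization does not fold them together. The sketch below, in Python, copies the pairs from the list above into a lookup table. The function `fold_duplicates` and the choice to map the second code point of each pair to the first are hypothetical application-level decisions, not something Unicode defines. The two pairs marked as non-exact are assumed to be the semi-duplicates, since their notes describe a match for only one source glyph.

```python
import unicodedata

# (first code point, second code point, exact duplicate?) from the list above.
# The "exact" flags are an assumption based on the notes for each pair.
DUPLICATE_PAIRS = [
    (0x34A8, 0x20457, False),
    (0x3DB7, 0x2420E, True),
    (0x8641, 0x27144, False),
    (0x204F2, 0x23515, True),
    (0x249BC, 0x249E9, True),
    (0x24BD2, 0x2A415, True),
    (0x26842, 0x26866, True),
    (0xFA23, 0x27EAF, True),
]


def fold_duplicates(text: str, include_semi: bool = False) -> str:
    """Map the second code point of each pair to the first (hypothetical policy)."""
    table = {
        second: first
        for first, second, exact in DUPLICATE_PAIRS
        if exact or include_semi
    }
    return text.translate(table)


def normalization_keeps_pair_distinct(first: int, second: int) -> bool:
    """True if NFKC leaves both characters unchanged, so they stay distinct."""
    return all(
        unicodedata.normalize("NFKC", chr(cp)) == chr(cp) for cp in (first, second)
    )


if __name__ == "__main__":
    for first, second, exact in DUPLICATE_PAIRS:
        kind = "exact" if exact else "semi"
        print(f"U+{first:04X} / U+{second:04X} ({kind})")
    # U+FA23 carries a compatibility name but has no decomposition mapping.
    print(unicodedata.name("\uFA23"), repr(unicodedata.decomposition("\uFA23")))
```

Folding the semi-duplicates is off by default because it discards a glyph distinction that some sources treat as meaningful.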
History
The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Unified Ideographs Extension B block:
| Version | Final code points | Count | L2 ID | WG2 ID | IRG ID | Document |
|---|---|---|---|---|---|---|
| 3.1 | U+20000..2A6D6 | 42,711 | L2/98-260 | | | Ng, Nelson; Kung, Michael (1998-05-26), "CJK UNIFIED IDEOGRAPHS EXTENSION B", Report on IRG meeting #11 |
| | | | L2/99-239 | | | Addition of three hundred and fourteen KANJIs (from JIS X0213), 1999-07-15 |
| | | | L2/99-310 | | | Addition of three hundred and thirteen KANJIs (from JIS X0213), 1999-08-23 |
| | | | L2/99-335 | N2109 | N674 | Zhang, Zhoucai (1999-09-03), SuperCJK, version 9.0 with Kangxi and HYD data |
| | | | L2/99-336 | N2105 | N675 | CJK Unified Ideographs Extension B WD 6.0, 1999-09-03 |
| | | | L2/99-316 | | | Whistler, Ken (1999-09-13), Comments on JCS proposal |
| | | | L2/99-312 | | | excerpt of usages and sources of proposed KANJIs in contemporary Japanese, 1999-10-06 |
| | | | L2/99-366 | | | Suignard, Michel (1999-11-24), Text for CD ballot of ISO/IEC 10646 part 2 |
| | | | L2/99-366.1 | | | Cover page for N3393, 1999-11-24 |
| | | | L2/99-366.2 | | | Suignard, Michel (1999-11-24), Text of CD 10646-2 |
| | | | L2/99-366.3 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 001-100 |
| | | | L2/99-366.4 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 101-200 |
| | | | L2/99-366.5 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 201-300 |
| | | | L2/99-366.6 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 301-335 |
| | | | L2/99-366.7 | | | Suignard, Michel (1999-11-24), Special Purpose Plane and Annexes |
| | | | L2/99-366.8 | | | Suignard, Michel (1999-11-24), Mapping of CJK Ext. B characters |
| | | | L2/99-385 | N2144 | N713R | Jenkins, John (1999-12-08), Clarification of the Non-Cognate Rule |
| | | | L2/00-010 | N2103 | | Umamaheswaran, V. S. (2000-01-05), "10.3", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13–16 |
| | | | L2/00-021R (pdf, rtf) | | | ISO CD 10646 Part-2 vote -- A proposal to move JIS X 0213 Kanji characters on Extension-B into BMP, 2000-01-21 |
| | | | L2/00-030 | | | Enomoto, Yoshi (2000-01-31), Background of the proposal (for encoding of 302 ideographs from JIS X 0213) |
| | | | L2/00-036 | | | Umamaheswaran, V. S.; Sargent, Murray (2000-02-03), Expert contribution on the placement of additional unified ideographs from JIS X0213, HK, and Korea |
| | | | L2/01-026 (pdf, doc) | N2298 | N758 | CJK Unified Ideographs Extension B, PreDIS R1 For ISO/IEC DIS 10646-2:2000, 2000-11-21 |
| | | | L2/01-136 | N2334 (pdf, doc) | | Sato, T. K. (2001-03-28), Notification of an error and request for a correction regarding mapping information for a particular JIS X 0213 character in CJK UNIFIED IDEOGRAPHS EXTENSION-B |
| | | | L2/01-163 | N2347 | N785 | CJK Unified Ideographs Extension B PreIS For ISO/IEC 10646-2:2000, 2001-03-30 |
| | | | L2/01-162 | N2349 (pdf, doc) | N787 | Zhang, Zhoucai (2001-04-02), Clarification On Versions of CJK Unified Ideographs Extension B As Well As SuperCJK |
| | | | L2/02-122 | N2427 | | Ksar, Mike (2002-03-18), Proposal to add 1 Hanja code of D P R of Korea into 10646-2:2001 |
| | | | L2/02-201 | N2448 | N924 | Error Correction, 2002-05-08 |
| | | | L2/02-416 | N2518 | | Proposal to add 2 hanja codes of D P R of Korea into 10646-2:2001, 2002-11-01 |
| | | | L2/03-017 | | | Late DPRK Comments on SC 2 N 3625, 10646-2: 2001/FPDAM 1, 2002-12-09 |
| | | | L2/03-287 | | | Cook, Richard (2003-08-24), 16 UniHan.txt errors |
| | | | L2/03-301 | | | Cook, Richard (2003-08-27), 24 more UniHan.txt errors |
| | | | L2/03-311 | | | West, Andrew (2003-09-17), Unicode 4.0.1 Beta Review, comments from Andrew C. West |
| | | | L2/03-399 | | | Fok, Anthony (2003-10-13), Unihan reported errors / changes re kHKSCS entries |
| | | | L2/03-398 | | | Nguyen, D. (2003-10-29), Unihan reported errors / changes re kCowles |
| | | | L2/03-453 | | | Minutes of the Editorial Group Ad Hoc Discussion, 2003-12-17 |
| | | | L2/04-008 | N2695 | N1026 | China's confirmation on fonts for CJK_B 21E2D and 21E45, 2004-01-05 |
| | | | L2/04-208 | N2774R | N1064 | Proposal to add 6 KP source references to existing CJK Unified Ideographs, 2004-05-25 |
| | | | L2/04-281 | N2830 | | Suignard, Michel (2004-06-23), CJK Ideograph source visual references information |
| | | | L2/04-417 | | | Cook, Richard (2004-11-18), Extension B font versioning: preliminary work |
| | | | L2/05-022 | | | Cook, Richard (2005-01-25), Extension B font versioning: follow-up report, part 1 [text] |
| | | | L2/05-023 | | | Cook, Richard (2005-01-25), Extension B font versioning: follow-up report, part 2 [tables] |
| | | | | N3353 (pdf, doc) | | Umamaheswaran, V. S. (2007-10-10), "M51.9", Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27 |
| | | | L2/07-208 | N3285 | | Proposal to replace 11 KP source references to existing ISO/IEC 10646:2003, 2007-07-18 |
| | | | L2/08-234 | | N1406 | Cook, Richard; Bishop, Thomas; Lunde, Ken (2008-06-06), Han Unification Issues |
| | | | L2/08-310 | | | Cook, Richard (2008-08-12), Fonts for Extension B and C and IRG |
| | | | L2/10-215 | | | Lunde, Ken (2010-06-22), "Hanyo-Denshi" IVD Collection (PRI 167) to Adobe-Japan1-6 Mapping Table |
| | | | | N3903 (pdf, doc) | | "M57.07 (CJK Ext. B glyphs from 2nd edition)", Unconfirmed minutes of WG2 meeting 57, 2011-03-31 |
| | | | L2/11-243 | N4111 | | Sources for Orphaned CJK Ideographs, 2011-06-14 |
| | | | L2/11-254 | | | Constable, Peter (2011-06-20), "Update to UTR #45 U-Source Ideographs requested", UTC Liaison Report from WG2 |
| | | | | N4103 | | "Resolution 58.05", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03 |
| | | | L2/14-260 | N4621 | | Suignard, Michel (2014-10-23), CJK chart and source references update |
| | | | L2/16-052 | N4603 (pdf, doc) | | Umamaheswaran, V. S. (2015-09-01), "M63.05", Unconfirmed minutes of WG 2 meeting 63 |
| | | | L2/17-180 | | N2202 | Chan, Eiso (2017-06-02), Request for consideration to add kIRG_GSource values to thirteen ideographs and change two G-source glyphs for the Table of General Standard Chinese Characters [Affects U+20164] |
| | | | L2/17-362 | | | Moore, Lisa (2018-02-02), "Consensus 153-C16", UTC #153 Minutes |
| | | | | N4974 | N2301 | Request of TCA's Horizontal Extension for Chemical Terminology [Affects U+20BBF, U+20C02, U+20CED, U+26B4C, U+26CBE, U+26E3D, U+28834, U+289A1, U+289C0, U+28A0F, and U+28B46], 2018-06-12 |
| | | | | N4987 | | Proposal on China's Horizontal Extension for 14 CJK Ideographs [Affects U+20164, 24A7D, 25ED7, 2677C and 26C21], 2018-06-13 |
| | | | | N4988 | | Proposal on Updating 11 G glyphs of CJK Unified Ideographs to ISO/IEC 10646 [Affects U+21D4C, 2278B, 23AB8 and 2459B], 2018-06-13 |
| | | | | | N2336 | Modify the G glyph for U+23517, 2018-09-10 |
| | | | | N5016 | N2349 | Shin, Sanghyun; Cho, Sungduk; Pyo, Seungju; Kim, Kyongsok (2018-12-13), Request to move character K6-1022 in Horizontal Extension of KS X 1027-5 from U+3EAC to U+248F2 |
| | | | | N5020 (pdf, doc) | | Umamaheswaran, V. S. (2019-01-11), "10.4.6, 10.4.8, and 10.4.9", Unconfirmed minutes of WG 2 meeting 67 |
| | | | | | N2369 | Chan, Eiso (2019-05-06), Feedback on IRGN2369 [Affects U+20219 U+21249, U+21827, U+22C3A, U+2327B, U+2363B, U+23839, U+23FD5, U+24261, U+2548E, and U+26C9E] |
| | | | | N5086 | N2379 | Proposal of China's horizontal extension for technical used characters [Affects U+23496, U+2355E, U+236ED, U+24726, U+26FE1, U+27334 and U+2A38C], 2019-05-10 |
| | | | L2/19-237 | N5068 | | Editorial Report on Miscellaneous Issues (meeting IRG#52) [Affects U+23517, U+248F2, and U+26657], 2019-05-17 |
| | | | L2/19-244 | N5107 | | TCA's UNC Proposal for WG2 submission [Affects U+27C0E], 2019-05-24 |
| | | | L2/19-241 | N5083 | N2391 | Errata report for WG2 submission_TCA [Affects U+26657], 2019-05-31 |
| | | | | N5082 | N2391 | Updated G Font of U+23517, 2019-05-31 |
| | | | L2/22-238 | | | Bai, Yi; Sim, CheonHyeong (2022-10-16), Proposal to consider adding CodeCharts support for kIRG_KPSource representative glyphs [Affects U+23CC0, 23CD9, 249D6, 249E8, and 24D6A] |
| | | | L2/22-256 | | N2580R | T-Source Glyph Correction and Horizontal Extension [Affects U+2486F], 2022-10-18 |
| | | | L2/22-259 | | N2556R2 | Chan, Eiso; Collins, Lee; Việt, Ngô Trung (2022-10-20), IRGN2556R2 V-Source Glyph and Codes Updates [Affects U+20302, 2087A, 20C00, 230B7, 2339E, 236EF, 237C3, 23B87, 23E5E, 2585E, 29516, 26A5A, 26A5B, 26A73, 26A82, 26A83, 26A90, 26AA6, 26AA8, 26AD8, 27350, 279F8, and 284A3] |
| | | | L2/22-247 | | | Lunde, Ken (2022-11-01), "03) 2022-08-08 07:25:17 CDT [Affects 29530], 24) L2/22-256, 26) L2/22-259, and 35) L2/22-238", CJK & Unihan Group Recommendations for UTC #173 Meeting |
| | | | L2/22-241 | | | Constable, Peter (2022-11-09), "E.1 03) 2022-08-08 07:25:17 CDT, E.1 24) L2/22-256, E.1 26) L2/22-259, and E.1 35) L2/22-238", Approved Minutes of UTC Meeting 173 |
| | | | L2/23-011 | | | Lunde, Ken (2023-01-11), "11) L2/22-238: Proposal to consider adding CodeCharts support for kIRG_KPSource representative glyphs [Affects U+23CC0, 23CD9, 249D6, 249E8, and 24D6A]", CJK & Unihan Group Recommendations for UTC #174 Meeting |
| | | | L2/23-089 | | N2609 | Chung, Jaemin (2023-03-15), G glyphs for U+25D89 and U+28BBA |
| | | | L2/23-076 | | | Constable, Peter (2023-05-01), "E.1 Section 27", UTC #175 Minutes, Accept the proposal to change the G-source representative glyphs for U+25D89 and U+28BBA |
| 13.0 | U+2A6D7..2A6DD | 7 | L2/17-087 | | | Chan, Eiso; Wang, Xiaolei; Le, Hou; You, Jerry (2017-04-03), Proposal to encode characters for Gongche Notation |
| | | | L2/17-103 | | | Moore, Lisa (2017-05-18), "E.5", UTC #151 Minutes |
| | | | L2/18-063 | | N2296 | Lunde, Ken (2018-02-22), Proposal to remove the UCS2003 representative glyphs from the Extension B code charts |
| | | | | | N2299 | Chan, Eiso (2018-04-22), Request to discuss how to handle seven unencoded Gongche characters for Kunqu Opera |
| | | | L2/18-168 | | | Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai; Chapman, Chris; Cook, Richard (2018-04-28), "24. Extension B", Recommendations to UTC #155 April-May 2018 on Script Proposals |
| | | | L2/18-245 | N4967 | | Chan, Eiso; You, Jerry; Wang, Xiaolei; Le, Hou (2018-06-01), Updated proposal on Gongche characters for Kunqu Opera |
| | | | L2/18-241 | | | Anderson, Deborah; et al. (2018-07-25), "17", Recommendations to UTC # 156 July 2018 on Script Proposals |
| | | | L2/18-183 | | | Moore, Lisa (2018-11-20), "B.4.1", UTC #156 Minutes |
| | | | | N5020 (pdf, doc) | | Umamaheswaran, V. S. (2019-01-11), "10.2.3", Unconfirmed minutes of WG 2 meeting 67 |
| | | | | N5122 | | "M68.01", Unconfirmed minutes of WG 2 meeting 68, 2019-12-31 |
| | | | L2/19-243 | N5106 | | Suignard, Michel (2019-06-20), "Gongche", Disposition of comments on ISO/IEC CD.2 10646 6th edition |
| | | | L2/19-270 | | | Moore, Lisa (2019-10-07), "Consensus 160-C9", UTC #160 Minutes |
| | | | L2/20-080 | | | Lunde, Ken (2020-03-16), Proposal to remove the UCS2003 representative glyphs from the Extension B code charts - Redux |
| | | | L2/20-102 | | | Moore, Lisa (2020-05-06), "Consensus 163-C12", UTC #163 Minutes |
| 14.0 | U+2A6DE..2A6DF | 2 | L2/20-203 | N5140 | N2437 | Submission of 5 Macao SARG UNC Characters and One TCA UNC character, 2020-08-08 |
| | | | L2/20-235 | | | Lunde, Ken (2020-09-22), "4)UAX #38 / Unihan Database Documents", Unihan Ad Hoc Recommendations for UTC #165 Meeting |
| | | | L2/20-237 | | | Moore, Lisa (2020-10-27), "Consensus 165-C9", UTC #165 Minutes |
References
1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
3. "18.1: Han (§ Blocks Containing Han Ideographs)" (PDF). The Unicode Standard: Core Specification. Version 15.0. Unicode Consortium. pp. 741–744. 2022. ISBN 978-1-936213-32-0.
4. "Unicode Character Database: Standardized Variation Sequences". The Unicode Consortium.
5. "Ideographic Variation Database". Unicode Consortium.
6. "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.
7. "The Unicode Standard, Version 14.0" (PDF). Unicode Consortium.
8. "unifiable glyph variants" (PDF). Archived from the original (PDF) on 2006-05-15. Retrieved 2017-12-01.
9. Cook, Richard (6 October 2003). "Defect Report on Duplicate Encoded CJK Forms" (PDF). ISO/IEC JTC1/SC2/WG2. Retrieved 2012-03-28.