CJK Compatibility Ideographs

Range	U+F900..U+FAFF (512 code points)
Plane	BMP
Scripts	Han
Assigned	472 code points
Unused	40 reserved code points
Source standards	KS X 1001, Big5, IBM 32, JIS X 0213, ARIB STD-B24, KPS 10721-2000
Unicode version history

1.0.1 (1992)	302 (+302)
3.2 (2002)	361 (+59)
4.1 (2005)	467 (+106)
5.2 (2009)	470 (+3)
6.1 (2012)	472 (+2)

Note:^[1]^[2] The range was initially part of the Private Use Area in Unicode 1.0.0^[3] and was removed from it in Unicode 1.0.1.

CJK Compatibility Ideographs is a Unicode block created mainly to hold Han characters that were encoded at more than one location in other established character encodings. These characters are encoded here in addition to their CJK Unified Ideographs assignments, so that text can be converted between Unicode and those encodings and back without loss (round-trip compatibility). The block also contains 12 unified ideographs sourced from IBM's Japanese character sets.
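The round-trip requirement can be illustrated with a minimal Python sketch. It assumes that the interpreter's built-in `euc_kr` codec follows the KS X 1001 mapping, duplicates included; the helper name `round_trips` is illustrative only. The compatibility ideograph and its unified counterpart are expected to map to different legacy byte sequences, which is why normalising text before converting it back loses the distinction.

```python
import unicodedata

def round_trips(ch, codec):
    """True if ch survives encoding to the legacy codec and decoding back."""
    return ch.encode(codec).decode(codec) == ch

compat = "\uF900"                               # first code point of the block
unified = unicodedata.normalize("NFC", compat)  # its unified counterpart

# Both characters should round-trip on their own ...
print(round_trips(compat, "euc_kr"), round_trips(unified, "euc_kr"))
# ... but they occupy different positions in the legacy encoding.
print(compat.encode("euc_kr") != unified.encode("euc_kr"))
```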

The block has dozens of ideographic variation sequences registered in the Unicode Ideographic Variation Database (IVD).^[4]^[5] These sequences specify the desired glyph variant for a given Unicode character.
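An ideographic variation sequence consists of a base character followed by one variation selector from the range U+E0100–U+E01EF (VS17 to VS256). The following Python sketch only builds and recognises the shape of such a sequence; the helper names and the sample sequence are hypothetical, and whether a given sequence is actually registered can only be determined from the IVD itself.

```python
# Variation selectors VS17..VS256, the range used by ideographic variation sequences.
IVS_FIRST, IVS_LAST = 0xE0100, 0xE01EF

def make_ivs(base, index):
    """Return base followed by the index-th selector of the range (0-based)."""
    selector = IVS_FIRST + index
    if not IVS_FIRST <= selector <= IVS_LAST:
        raise ValueError("selector index out of range")
    return base + chr(selector)

def is_ivs_shaped(text):
    """True if text is one character followed by one selector in VS17..VS256."""
    return len(text) == 2 and IVS_FIRST <= ord(text[1]) <= IVS_LAST

# Hypothetical example sequence; not checked against the IVD.
seq = make_ivs("\uFA11", 0)
print([f"U+{ord(c):04X}" for c in seq], is_ivs_shaped(seq))
```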

Character sources

Sources for the original collection of CJK Compatibility Ideographs include:

South Korean KS X 1001 (U+F900–U+FA0B, 268 characters; see the KS X 1001 article for the explanation)
Taiwanese Big5 (U+FA0C–U+FA0D, 2 characters)
"IBM 32": 32 Japanese characters from IBM (U+FA0E–U+FA2D; see below)

In ensuing versions of the standard, more characters have been added to the block from:

South Korean KS X 1001 (U+FA2E–U+FA2F, 2 characters)
Japanese JIS X 0213 (U+FA30–U+FA6A, 59 characters)
Japanese ARIB STD-B24 (U+FA6B–U+FA6D, 3 characters)
North Korean KPS 10721-2000 (U+FA70–U+FAD9, 106 characters)
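The character counts in the two lists above follow directly from the range endpoints. A short Python sketch that recomputes them (the `SOURCES` table and the helper `range_size` are illustrative names, not part of any Unicode data file):

```python
# Ranges copied from the two lists above: (source, first, last), inclusive.
SOURCES = [
    ("KS X 1001", 0xF900, 0xFA0B),
    ("Big5", 0xFA0C, 0xFA0D),
    ("IBM 32", 0xFA0E, 0xFA2D),
    ("KS X 1001 (later addition)", 0xFA2E, 0xFA2F),
    ("JIS X 0213", 0xFA30, 0xFA6A),
    ("ARIB STD-B24", 0xFA6B, 0xFA6D),
    ("KPS 10721-2000", 0xFA70, 0xFAD9),
]

def range_size(first, last):
    """Number of code points in an inclusive range."""
    return last - first + 1

sizes = {name: range_size(first, last) for name, first, last in SOURCES}
original = sum(range_size(first, last) for _, first, last in SOURCES[:3])
assigned = sum(sizes.values())
block_size = range_size(0xF900, 0xFAFF)

print(sizes)
# Original collection, total assigned, and reserved code points in the block.
print(original, assigned, block_size - assigned)
```

The three totals correspond to the 302 characters of Unicode 1.0.1, the 472 assigned code points, and the 40 reserved code points given in the infobox.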

The "IBM 32" characters

IBM Japanese double-byte EBCDIC includes several kanji which do not exist in, or do not round-trip from, JIS X 0208. These were included as gaiji in extensions to Shift JIS and EUC-JP from IBM (e.g. code page 942), NEC, the Open Software Foundation, and Microsoft (e.g. Windows code page 932). However, they were not used as a source for the original Unified Repertoire and Ordering (URO). Instead, 32 of the IBM extension kanji, those which had not been included in the URO from other sources, were included in the CJK Compatibility Ideographs block in the range U+FA0E–U+FA2D.
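The gap between JIS X 0208 and the vendor extensions can be observed with Python's bundled codecs. This is a sketch under the assumption that CPython's `shift_jis` codec covers only the JIS X 0208 repertoire while its `cp932` codec includes the IBM extension kanji; the helper name `encodable` is illustrative, and other implementations may behave differently.

```python
IBM32 = [chr(cp) for cp in range(0xFA0E, 0xFA2E)]  # U+FA0E..U+FA2D

def encodable(ch, codec):
    """True if the codec has a byte sequence for ch."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

in_cp932 = [c for c in IBM32 if encodable(c, "cp932")]
in_sjis = [c for c in IBM32 if encodable(c, "shift_jis")]

# Number of the 32 characters that each codec can encode.
print(len(in_cp932), len(in_sjis))
```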

Of these 32 characters:

19 are unifiable with characters in the URO, and are therefore compatibility ideographs in the strict sense.
12 are kokuji that are in fact unified ideographs: they have the Unified_Ideograph property and do not change under normalisation. Although they sit in the CJK Compatibility Ideographs block and have algorithmically generated character names beginning with "CJK COMPATIBILITY IDEOGRAPH", they do not duplicate any character in the original CJK Unified Ideographs block.^[6]^[7] Eleven of the 12 have no duplicate anywhere, while U+FA23 﨣 CJK COMPATIBILITY IDEOGRAPH-FA23 was later unintentionally duplicated in CJK Unified Ideographs Extension B as U+27EAF 𧺯 CJK UNIFIED IDEOGRAPH-27EAF. They were placed in this block because they have no URO encoding, yet the IBM 32 set is one of the sources for which duplicate encodings are a concern. All of them are rarely used or are variants of common kanji. They are as follows:

U+FA0E 﨎 CJK COMPATIBILITY IDEOGRAPH-FA0E
U+FA0F 﨏 CJK COMPATIBILITY IDEOGRAPH-FA0F
U+FA11 﨑 CJK COMPATIBILITY IDEOGRAPH-FA11
U+FA13 﨓 CJK COMPATIBILITY IDEOGRAPH-FA13
U+FA14 﨔 CJK COMPATIBILITY IDEOGRAPH-FA14
U+FA1F 﨟 CJK COMPATIBILITY IDEOGRAPH-FA1F
U+FA21 﨡 CJK COMPATIBILITY IDEOGRAPH-FA21
U+FA23 﨣 CJK COMPATIBILITY IDEOGRAPH-FA23
U+FA24 﨤 CJK COMPATIBILITY IDEOGRAPH-FA24
U+FA27 﨧 CJK COMPATIBILITY IDEOGRAPH-FA27
U+FA28 﨨 CJK COMPATIBILITY IDEOGRAPH-FA28
U+FA29 﨩 CJK COMPATIBILITY IDEOGRAPH-FA29

The remaining character, U+FA20 蘒 CJK COMPATIBILITY IDEOGRAPH-FA20, is a unique case. It is intended as the kyūjitai form of a kokuji whose (extended) shinjitai form is encoded separately as U+8612 蘒 CJK UNIFIED IDEOGRAPH-8612. The URO encoded only the shinjitai form and used that form's stroke count to position it. The character is in turn one of the many variants of the jinmeiyō kanji U+8429 萩 CJK UNIFIED IDEOGRAPH-8429 (i.e. Kummerowia). U+FA20 was assigned a normalisation to U+8612, even though the 龜 and 亀 components, while both forms of radical 213, are not usually considered unifiable.^[8]
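The split between the characters that normalise and those that do not can be checked against the Unicode data bundled with a Python interpreter. The sketch below uses only the standard `unicodedata` module; the helper name `is_stable` is illustrative, and the results reflect whatever Unicode version the interpreter ships (`unicodedata.unidata_version`).

```python
import unicodedata

IBM32 = [chr(cp) for cp in range(0xFA0E, 0xFA2E)]  # U+FA0E..U+FA2D

def is_stable(ch):
    """True if NFC leaves ch unchanged, i.e. it has no canonical decomposition."""
    return unicodedata.normalize("NFC", ch) == ch

stable = [c for c in IBM32 if is_stable(c)]
changed = [c for c in IBM32 if not is_stable(c)]
print(len(stable), len(changed))
print(" ".join(f"U+{ord(c):04X}" for c in stable))

# U+FA20: the decomposition string carries no "<tag>" prefix, so the mapping
# is canonical and applies in all four normalisation forms.
print(unicodedata.decomposition("\uFA20"))
forms = {f: unicodedata.normalize(f, "\uFA20") for f in ("NFC", "NFD", "NFKC", "NFKD")}
print({f: f"U+{ord(result):04X}" for f, result in forms.items()})
```

The characters that change are the 19 strict compatibility ideographs plus U+FA20; the stable ones are the 12 unified ideographs listed above.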

Block

CJK Compatibility Ideographs^[1]^[2]^[3] Official Unicode Consortium code chart (PDF)
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
U+F90x	豈	更	車	賈	滑	串	句	龜	龜	契	金	喇	奈	懶	癩	羅
U+F91x	蘿	螺	裸	邏	樂	洛	烙	珞	落	酪	駱	亂	卵	欄	爛	蘭
U+F92x	鸞	嵐	濫	藍	襤	拉	臘	蠟	廊	朗	浪	狼	郎	來	冷	勞
U+F93x	擄	櫓	爐	盧	老	蘆	虜	路	露	魯	鷺	碌	祿	綠	菉	錄
U+F94x	鹿	論	壟	弄	籠	聾	牢	磊	賂	雷	壘	屢	樓	淚	漏	累
U+F95x	縷	陋	勒	肋	凜	凌	稜	綾	菱	陵	讀	拏	樂	諾	丹	寧
U+F96x	怒	率	異	北	磻	便	復	不	泌	數	索	參	塞	省	葉	說
U+F97x	殺	辰	沈	拾	若	掠	略	亮	兩	凉	梁	糧	良	諒	量	勵
U+F98x	呂	女	廬	旅	濾	礪	閭	驪	麗	黎	力	曆	歷	轢	年	憐
U+F99x	戀	撚	漣	煉	璉	秊	練	聯	輦	蓮	連	鍊	列	劣	咽	烈
U+F9Ax	裂	說	廉	念	捻	殮	簾	獵	令	囹	寧	嶺	怜	玲	瑩	羚
U+F9Bx	聆	鈴	零	靈	領	例	禮	醴	隸	惡	了	僚	寮	尿	料	樂
U+F9Cx	燎	療	蓼	遼	龍	暈	阮	劉	杻	柳	流	溜	琉	留	硫	紐
U+F9Dx	類	六	戮	陸	倫	崙	淪	輪	律	慄	栗	率	隆	利	吏	履
U+F9Ex	易	李	梨	泥	理	痢	罹	裏	裡	里	離	匿	溺	吝	燐	璘
U+F9Fx	藺	隣	鱗	麟	林	淋	臨	立	笠	粒	狀	炙	識	什	茶	刺
U+FA0x	切	度	拓	糖	宅	洞	暴	輻	行	降	見	廓	兀	嗀	﨎	﨏
U+FA1x	塚	﨑	晴	﨓	﨔	凞	猪	益	礼	神	祥	福	靖	精	羽	﨟
U+FA2x	蘒	﨡	諸	﨣	﨤	逸	都	﨧	﨨	﨩	飯	飼	館	鶴	郞	隷
U+FA3x	侮	僧	免	勉	勤	卑	喝	嘆	器	塀	墨	層	屮	悔	慨	憎
U+FA4x	懲	敏	既	暑	梅	海	渚	漢	煮	爫	琢	碑	社	祉	祈	祐
U+FA5x	祖	祝	禍	禎	穀	突	節	練	縉	繁	署	者	臭	艹	艹	著
U+FA6x	褐	視	謁	謹	賓	贈	辶	逸	難	響	頻	恵	𤋮	舘
U+FA7x	並	况	全	侀	充	冀	勇	勺	喝	啕	喙	嗢	塚	墳	奄	奔
U+FA8x	婢	嬨	廒	廙	彩	徭	惘	慎	愈	憎	慠	懲	戴	揄	搜	摒
U+FA9x	敖	晴	朗	望	杖	歹	殺	流	滛	滋	漢	瀞	煮	瞧	爵	犯
U+FAAx	猪	瑱	甆	画	瘝	瘟	益	盛	直	睊	着	磌	窱	節	类	絛
U+FABx	練	缾	者	荒	華	蝹	襁	覆	視	調	諸	請	謁	諾	諭	謹
U+FACx	變	贈	輸	遲	醙	鉶	陼	難	靖	韛	響	頋	頻	鬒	龜	𢡊
U+FADx	𢡄	𣏕	㮝	䀘	䀹	𥉉	𥳐	𧻓	齃	龎
U+FAEx
U+FAFx
Notes
1. As of Unicode version 17.0
2. Yellow background: CJK unified ideographs (not compatibility ideographs)
3. Grey areas indicate non-assigned code points

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Compatibility Ideographs block:

Version	Final code points^[a]	Count	L2 ID	WG2 ID	IRG ID	Document
1.0.1	U+F900..FA2D	302		N782		Ksar, Mike (1991-10-12), Attachment to N 767 WG2-Paris meeting copies of working papers
			L2/03-399			Fok, Anthony (2003-10-13), Unihan reported errors / changes re kHKSCS entries
			L2/03-367	N2667		Suignard, Michel; Muller, Eric; Jenkins, John (2003-10-22), CJK Ideograph source references corrections
			L2/03-398			Nguyen, D. (2003-10-29), Unihan reported errors / changes re kCowles
			L2/03-417			Muller, Eric (2003-10-31), Variation sequences for CJK Compatibility characters
			L2/06-309R			Karlsson, Kent (2006-11-07), Bug in DerivedNumericValues.txt
			L2/06-324R2			Moore, Lisa (2006-11-29), "Consensus 109-C18", UTC #109 Minutes, Add numeric values to 8 compatibility ideographs to match their canonical characters.
			L2/08-238			Cook, Richard; Lunde, Ken (2008-06-09), Recommendation For IRG To Use IVD Collections
			L2/08-373	N3525		Lunde, Ken; Muller, Eric (2008-10-06), Handling CJK compatibility characters with variation sequences
			L2/08-425			Cook, Richard; Lunde, Ken (2008-11-18), IRG Use of IVD Collections
			L2/09-003R			Moore, Lisa (2009-02-12), "WG2 — Compatibility Ideographs", UTC #118 / L2 #215 Minutes
			L2/09-080	N3590		Muller, Eric (2009-03-11), Difficulties with compatibility ideographs
			L2/09-290			Muller, Eric (2009-08-07), Draft IVD registration for Compatibility Characters
			L2/11-243	N4111		Sources for Orphaned CJK Ideographs, 2011-06-14
			L2/11-254			Constable, Peter (2011-06-20), "Update to UTR #45 U-Source Ideographs requested", UTC Liaison Report from WG2
				N4103		"Resolution 58.05", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03
			L2/17-090			Chung, Jaemin (2017-04-07), Proposal to add informative notes and cross-reference to U+F92C and U+F9B8
			L2/17-103			Moore, Lisa (2017-05-18), "B.4.1 Proposal to add informative notes and cross-reference to U+F92C and U+F9B8", UTC #151 Minutes
3.2	U+FA30..FA6A	59	L2/99-016	N1935		Paterson, Bruce (1998-11-30), Editorial corrigenda on CJK compatibility ideographs, and other items
			L2/99-240			Addition of fifty six KANJIs for compatibility, 1999-07-15
			L2/99-232	N2003		Umamaheswaran, V. S. (1999-08-03), "7.2.2.1 Editorial corrigenda on CJK compatibility", Minutes of WG 2 meeting 36, Fukuoka, Japan, 1999-03-09--15
			L2/99-311			Addition of fifty six KANJIs for compatibility, 1999-08-23
			L2/99-313	N2095		Sato, T. K. (1999-09-08), Addition of CJK ideographs which are already "unified"
			L2/99-316			Whistler, Ken (1999-09-13), Comments on JCS proposal
			L2/99-322			Collins, Lee (1999-10-11), Comments on JCS compatibility characters in L2/99-310 through L2/99-313
			L2/99-365			Moore, Lisa (1999-11-23), Comments on JCS Proposals
			L2/99-383	N2142	N710	The response to WG2 resolution M37.16: CJK compatibility ideographs from JIS (WG2 N2104), 1999-12-09
			L2/00-010	N2103		Umamaheswaran, V. S. (2000-01-05), "8.8", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13—16
			L2/99-260R			Moore, Lisa (2000-02-07), "JCS Proposals", Minutes of the UTC/L2 meeting in Mission Viejo, October 26-28, 1999
			L2/00-101	N2197		Sato, T. K. (2000-03-15), Update: CJK COMPATIBILITY IDEOGRAPH request
			L2/00-172	N2221		Sato, T. K. (2000-04-20), JIS COMPATIBILITY IDEOGRAPHS (draft for ammendment-1) [sic]
				N2221R		JIS COMPATIBILITY IDEOGRAPHS (draft for ammendment-1) [sic] revised, 2000-06-01
			L2/00-190			Moore, Lisa (2000-06-22), UTC Rescinds Acceptance of Four Duplicate Radicals from JIS X 213
			L2/00-234	N2203 (rtf, txt)		Umamaheswaran, V. S. (2000-07-21), "7.3", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21 -- 24
			L2/00-337	N2273		JIS compatibility ideographs, 2000-09-19
			L2/00-378	N2295		Sato, T. K. (2000-10-26), Feedback from Japan on N2281 -- working draft on pDAM 1 -- CJK Compatibility
			L2/01-420			Whistler, Ken (2001-10-30), "1. SC2 M11-04", WG2 (Singapore) Resolution Consent Docket for UTC
			L2/01-405R			Moore, Lisa (2001-12-12), "Consensus 89-C20", Minutes from the UTC/L2 meeting in Mountain View, November 6-9, 2001
			L2/06-321			Whistler, Ken (2006-10-03), UCD Bug re JIS 0213
			L2/06-324R2			Moore, Lisa (2006-11-29), "Consensus 109-C16", UTC #109 Minutes, Give U+FA30..U+FA6A the ideographic property, and fix the wordbreak property.
4.1	U+FA70..FAD9	106	L2/01-050	N2253		Umamaheswaran, V. S. (2001-01-21), "7.2.4 Proposal to add the Hanja column to 10646-1", Minutes of the SC2/WG2 meeting in Athens, September 2000
			L2/01-350	N2375		Proposal to add 160 Compatibility Hanja code table of D P R of Korea into CJK Compatibility Ideographs, 2001-09-03
			L2/02-154	N2403		Umamaheswaran, V. S. (2002-04-22), "TC 2", Draft minutes of WG 2 meeting 41, Hotel Phoenix, Singapore, 2001-10-15/19
				N2478		"Korea (DPRK):T2, USA T5", Proposed Disposition of comments on SC2 N 3584 (PDAM text for Amendment 2 to ISO/IEC 10646-1:2000), 2002-05-08
			L2/02-232	N2493		Sato, T. K.; Kobayashi, Tatsuo; Pak, Tong Gi (2002-05-22), Proposal to add 122 compatibility Hanja code table of the D P R of Korea into the CJK Compatibility Ideographs of ISO/IEC 10646-1:2000
				N2541		"USA T.8", Proposed disposition of comments on SC2 N 3624 (FPDAM text for Amendment 2 to ISO/IEC 10646-1:2000), 2002-12-02
				N2540		Freytag, Asmus (2002-12-05), Corrections to CJK Compatibility Ideographs Table in FPDAM
			L2/02-465	N2566		Collins, Lee; Freytag, Asmus (2002-12-09), Review of DPRK Compatibility Ideographs
			L2/02-471	N2572		CJK Compatibility Ideographs (Unicode 3.2, page 399), 2002-12-18
			L2/02-472	N2573		Report of DPRK compatibility characters ad hoc meeting, 2002-12-11
			L2/02-468	N2569		Suignard, Michel (2002-12-12), "USA T.5 e, USA T.8", Proposed disposition of comments on SC2 N 3624 (FPDAM text for Amendment 2 to ISO/IEC 10646-1:2000)
			L2/03-023	N2569R		Suignard, Michel (2003-01-27), "USA T.5 e, USA T.8", Disposition of Comments Report on 10646-1/FPDAM 2
			L2/03-346			Chang, Cora (2003-10-20), Analysis of characters in WG2 documents N2572, N2573
			L2/03-346.1			Chang, Cora (2003-10-20), Analysis of characters in WG2 documents N2572, N2573 [spreadsheet without glyphs]
			L2/04-207	N2776	N1062	Proposal to add 106 Compatibility Hanjas of D P R of Korea to CJK Compatibility Ideographs, 2004-05-25
			L2/04-330			Whistler, Ken (2004-08-03), "E", WG2 Consent Docket
			L2/04-316			Moore, Lisa (2004-08-19), "100-C12", UTC #100 Minutes
			L2/05-050R	N2924R		Freytag, Asmus (2005-01-28), Charts - Amendments 1 and 2 to ISO/IEC 10646:2003
			L2/10-367	N3899		KP1-0000, 2010-09-30
			L2/11-243	N4111		Sources for Orphaned CJK Ideographs, 2011-06-14
			L2/11-254			Constable, Peter (2011-06-20), "Update to UTR #45 U-Source Ideographs requested", UTC Liaison Report from WG2
				N4103		"Resolution 58.05", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03
5.2	U+FA6B..FA6D	3		N3353 (pdf, doc)		Umamaheswaran, V. S. (2007-10-10), "M51.10", Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27
			L2/07-387			Proposal to encode six CJK Ideographs in UCS, 2007-10-17
			L2/08-184	N3318R (pdf, appendix)		Revised proposal to encode six CJK Ideographs in UCS, 2008-03-25
			L2/08-318	N3453 (pdf, doc)		Umamaheswaran, V. S. (2008-08-13), "M52.2k", Unconfirmed minutes of WG 2 meeting 52
			L2/08-161R2			Moore, Lisa (2008-11-05), "Consensus 115-C14", UTC #115 Minutes
6.1	U+FA2E..FA2F	2	L2/10-087	N3747		A solution proposed by R.O.Korea for incorrectly mapped compatibility chars, 2010-03-19
			L2/10-108			Moore, Lisa (2010-05-19), "Consensus 123-C8", UTC #123 / L2 #220 Minutes
				N3803 (pdf, doc)		"M56.08l", Unconfirmed minutes of WG 2 meeting no. 56, 2010-09-24
a. Proposed code points and character names may differ from the final code points and names.

References

1. "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
3. "3.5: Private Use Area" (PDF). The Unicode Standard, Version 1.0, Volume 1. Unicode Consortium. 1991. pp. 118–119. ISBN 0-201-56788-1.
4. "Ideographic Variation Database". Unicode Consortium.
5. "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.
6. "PropList.txt". Unicode Consortium.
7. Freytag, Asmus; McGowan, Rick; Whistler, Ken (2021-06-14). "Known Anomalies in Unicode Character Names". Unicode Consortium. Unicode Technical Note #27. These 12 characters are unified CJK ideographs, not compatibility ideographs, despite their names.
8. Ideographic Research Group (2024-11-19). "UCS Ideograph Non-Unifiable Component Variations Summary List (NUCV)". UCV & NUCV Lists (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2746.

CJK ideographs in Unicode^[a]

Block name	Plane	Chart range	Characters	Han unification	Scripts contained in block
CJK Unified Ideographs	0 BMP	4E00–9FFF	20,992	Unified	Han
CJK Unified Ideographs Extension A	0 BMP	3400–4DBF	6,592	Unified	Han
CJK Unified Ideographs Extension B	2 SIP	20000–2A6DF	42,720	Unified	Han
CJK Unified Ideographs Extension C	2 SIP	2A700–2B73F	4,160	Unified	Han
CJK Unified Ideographs Extension D	2 SIP	2B740–2B81F	222	Unified	Han
CJK Unified Ideographs Extension E	2 SIP	2B820–2CEAF	5,774	Unified	Han
CJK Unified Ideographs Extension F	2 SIP	2CEB0–2EBEF	7,473	Unified	Han
CJK Unified Ideographs Extension G	3 TIP	30000–3134F	4,939	Unified	Han
CJK Unified Ideographs Extension H	3 TIP	31350–323AF	4,192	Unified	Han
CJK Unified Ideographs Extension I	2 SIP	2EBF0–2EE5F	622	Unified	Han
CJK Unified Ideographs Extension J	3 TIP	323B0–3347F	4,298	Unified	Han
CJK Radicals Supplement	0 BMP	2E80–2EFF	115	Not unified	Han
Kangxi Radicals	0 BMP	2F00–2FDF	214	Not unified	Han
Ideographic Description Characters	0 BMP	2FF0–2FFF	16	Not unified	Common
CJK Symbols and Punctuation	0 BMP	3000–303F	64	Not unified	Han, Hangul, Common, Inherited
CJK Strokes	0 BMP	31C0–31EF	39	Not unified	Common
Enclosed CJK Letters and Months	0 BMP	3200–32FF	255	Not unified	Hangul, Katakana, Common
CJK Compatibility	0 BMP	3300–33FF	256	Not unified	Katakana, Common
CJK Compatibility Ideographs	0 BMP	F900–FAFF	472	12 are unified	Han
CJK Compatibility Forms	0 BMP	FE30–FE4F	32	Not unified	Common
Enclosed Ideographic Supplement	1 SMP	1F200–1F2FF	64	Not unified	Hiragana, Common
CJK Compatibility Ideographs Supplement	2 SIP	2F800–2FA1F	542	Not unified	Han
Totals		22	104,053

a. As of Unicode version 17.0
