Problems with Non-roman Character (Korean) Searching
Prepared by Young Ki Lee, Senior Cataloging Specialist, Korean/Chinese Team, RCCD, Library of Congress

Mar 27, 2015

Transcript
  • Slide 1

Problems with Non-roman Character (Korean) Searching
Prepared by Young Ki Lee, Senior Cataloging Specialist, Korean/Chinese Team, RCCD, Library of Congress

  • Slide 2

Topics to be covered
1. Non-roman script (Korean) searching under CJK data fields without spacing
2. No unified index (normalization) between Hangul (Korean) and Hancha (Chinese characters)
3. Microsoft Korean IME
4. Display of search results
5. CJK Compatibility Database

  • Slide 3

Title Word Search for [...]
Search ([...]: the border):
- The number of hits on this ti: search is 363.
- The ratio of relevant hits is only 13% (13 out of 99) in the first group (Books 1970-1993).
- The system picks up records that have the word in any position in the title fields (including between subfields), such as [...], etc.
- In Voyager (currently with spaces), the same search (tkey [...]) retrieves only 9 hits.

  • Slide 4

Search9

  • Slides 5-10

Title Word Search for [...]
Search ([...]: the border):
- The number of hits on this ti: search is 360.
- The ratio of relevant hits is only 13% (13 out of 99) in the first group (Books 1970-1993).
- Records that have the word in any position in the title fields (including between subfields) are retrieved, such as [...], etc.
- In Voyager (currently with spaces), the same search (tkey [...]) retrieves only 9 hits.

  • Slide 11

7

  • Slide 12

Title Word Search for [...]
Search ([...]: the border):
- The number of hits on this ti: search is 360.
- The ratio of relevant hits is only 13% (13 out of 99) in the first group (Books 1970-1993).
- Records that have the word in any position in the title fields (including between subfields) are retrieved, such as [...], etc.
- In the LC Online Catalog (currently with spaces), a title word search retrieves only 9 hits.

  • Slide 13

Title Word Search for [...]
Search ([...]: philology):
- In OCLC, the number of hits on the ti: search is 308.
- The ratio of relevant hits is only 37% (36 out of 95) in the first group (Books 1900-1991).
- The results include [...], etc.
- In Voyager (currently with spaces), the same search (tkey [...]) retrieves 32 hits.

  • Slide 14

Title Word Search for [...]
Search ([...]: name of ancient Korean country) retrieves irrelevant records, such as [...], etc.
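The gap between the two hit counts above can be illustrated with a small sketch. It is only a simplified model, not the actual OCLC or Voyager indexing logic: `unspaced_match` treats a title as one unbroken string (as in a CJK field without spacing), while `word_match` compares whole space-separated words. The search term and the three sample titles are hypothetical, chosen for illustration, because the original Korean examples did not survive in this transcript.

```python
def unspaced_match(term: str, title: str) -> bool:
    """Substring match after removing all whitespace.

    Simplified model of searching a CJK field stored without spacing.
    """
    return "".join(term.split()) in "".join(title.split())


def word_match(term: str, title: str) -> bool:
    """Whole-word match on space-separated tokens.

    Simplified model of searching a field that keeps word spacing.
    """
    return term in title.split()


# Hypothetical search term and titles (not taken from the slides).
term = "국경"  # "border"
titles = [
    "국경 의 밤",      # contains the word itself
    "한국 경제 론",    # 한국 + 경제: the term appears only across a word boundary
    "중국 경찰 제도",  # 중국 + 경찰: same problem
]

unspaced_hits = [t for t in titles if unspaced_match(term, t)]
word_hits = [t for t in titles if word_match(term, t)]

print(len(unspaced_hits), "hits without spacing")  # 3
print(len(word_hits), "hit with spacing")          # 1
```

In this model, the two extra hits come from syllables that belong to adjacent words, which is the same kind of false match the slides report for records where the word appears "in any position" in the title fields.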
  • Slide 15

2

  • Slide 16

4

  • Slide 17

7

  • Slide 18

Kochoson8

  • Slide 19

komunso1

  • Slide 20

Komunso2

  • Slide 21

Komunso3

  • Slide 22

Title Word Search for ([...]: Korean Economy): ti: search
- Search [...]: 300 hits
- Search [...]: 652 hits
- Search [...]: 3 hits
- Search [...]: 0 hits
- Search Hanguk kyongje: 1,490 hits
Title phrase search for [...]: ti= search

  • Slides 23-25

Title Word Search for ([...]: Korean Economy): ti: search
- Search [...]: 295 hits
- Search [...]: 652 hits
- Search [...]: 3 hits
- Search [...]: 0 hits
- Search Hanguk kyongje: 1,490 hits
Title phrase search for [...]: ti= search

  • Slide 26

Title Word Search for ([...]: Korean Economy): ti: search
- Search [...]: 295 hits
- Search [...]: 652 hits
- Search [...]: 3 hits
- Search [...]: 0 hits
- Search Hanguk kyongje: 1,499 hits
Title phrase search for [...]: ti= search

  • Slide 27

Title Phrase Search for ([...]: Korean Economy): ti: search
- Search [...]: 295 hits
- Search [...]: 652 hits
- Search [...]: 3 hits
- Search [...]: 0 hits
- Search Hanguk kyongje: 1,490 hits
- Search [...] # [...]: 461 hits (ti: [...] AND ti: [...])
Title phrase search for [...]: ti= search

  • Slide 28

Search ti: nodongja or [...] or [...] or [...]

  • Slide 29

  • Slide 30

Korean IME Problems
1. Personal name search with an invalid character from the Korean IME
- Search in pn: [...]: 0 hits. [...] (U+F9E1) is an invalid character from the Korean IME.
- Search in pn: [...]: 157 hits. [...] (U+674E) is a valid MARC 21 character.
2. Title search with an invalid character from the Korean IME
- Search in ti: [...]: 0 hits. [...] (U+F941) is an invalid character from the Korean IME.
- Search in ti: [...]: 21,393 hits. [...] (U+8AD6) is a valid MARC 21 character.
3. Korean family name [...]
- No MARC 21 equivalent

  • Slide 31

Display Order
1. Browse search: sorted by Unicode value, in the order number, roman, Japanese, Hancha, Hangul
2. Keyword search: sorted alphabetically by romanized form, in the order number, romanization
3. Display order: character by character on the designated value

  • Slide 32

sort2

Unicode   Total strokes   Radical (number, name)   Radical strokes
9280      14              167 (gold)               8
9580      8               169 (gate)               8
990A      15              184 (eat)                6
9B42      14              194 (ghost)              10
AC00

  • Slide 33

sort3

  • Slide 34

Display Order
1. Browse search: sorted by Unicode value, in the order number, roman, Japanese, Hancha, Hangul
2. Keyword search: sorted alphabetically by romanized form, in the order number, romanization
3. Display order: character by character on the designated value, NOT word by word

  • Slide 35

  • Slide 36

sort1
[...]: C9C4
[...]: CE68
[...]: C911
[...]: C778

  • Slide 37

Display Order
1. Browse search: sorted by Unicode value, in the order number, roman, Japanese, Hancha, Hangul
2. Keyword search: sorted alphabetically by romanized form, in the order number, romanization
3. Display order: character by character on the designated value, NOT word by word

  • Slide 38

CJK Compatibility Database
1. The CJK Compatibility Database includes more than 450 non-MARC 21 Chinese, Japanese and Korean characters, Hangul syllables and diacritic marks, matched with their MARC 21 equivalents.
2. The database is intended to enable catalogers to quickly and conveniently replace a non-MARC 21 character with its MARC 21 equivalent.
3. The list of characters in the database was initially identified by LC staff, and was supplemented by entries in a similar database at Yale University.
4. The database is a cooperative undertaking, and is intended for the use of all CJK catalogers. If you encounter a non-MARC 21 character in the course of your work, please report it to us so that it can be added to the database. Notify Young Ki Lee, Senior Cataloging Specialist, Korean/Chinese Team, Library of Congress, at [email protected].

  • Slide 39

Thank you
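The two character pairs on Slide 30 (F9E1 and 674E, F941 and 8AD6) are Unicode compatibility ideographs and their unified equivalents, so the kind of replacement the CJK Compatibility Database supports can be sketched with Unicode normalization. This is an assumption-laden stand-in, not the database itself: it uses Python's standard `unicodedata` module, the helper names `to_unified` and `flag_replaceable` are hypothetical, and it only handles characters that Unicode itself maps to another character. The database also covers characters, Hangul syllables and diacritics that normalization would not touch.

```python
import unicodedata


def to_unified(text: str) -> str:
    """Replace compatibility ideographs with their unified equivalents.

    CJK compatibility ideographs carry a canonical singleton decomposition,
    so NFC normalization maps each one to its unified ideograph.
    """
    return unicodedata.normalize("NFC", text)


def flag_replaceable(text: str) -> list[tuple[str, str]]:
    """List (original, replacement) code points for characters NFC would change."""
    flagged = []
    for ch in text:
        normalized = unicodedata.normalize("NFC", ch)
        if normalized != ch:
            original = f"U+{ord(ch):04X}"
            replacement = " ".join(f"U+{ord(c):04X}" for c in normalized)
            flagged.append((original, replacement))
    return flagged


# The two code points reported as invalid on Slide 30.
ime_input = "\uf9e1\uf941"

print(flag_replaceable(ime_input))
# [('U+F9E1', 'U+674E'), ('U+F941', 'U+8AD6')]

print(to_unified(ime_input) == "\u674e\u8ad6")
# True
```

Normalizing a search string this way before it is sent to the catalog would turn the 0-hit queries on Slide 30 into their valid-character forms, under the assumption that the catalog stores only the unified characters.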