Search (408 results, page 1 of 21)

Kleineberg, M.: Context analysis and context indexing : formal pragmatics in knowledge organization (2014) 0.33

0.3264357 = product of:
  0.9140199 = sum of:
    0.07947999 = product of:
      0.23843996 = sum of:
        0.23843996 = weight(_text_:3a in 1826) [ClassicSimilarity], result of:
          0.23843996 = score(doc=1826,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.93669677 = fieldWeight in 1826, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.078125 = fieldNorm(doc=1826)
      0.33333334 = coord(1/3)
    0.23843996 = weight(_text_:2f in 1826) [ClassicSimilarity], result of:
      0.23843996 = score(doc=1826,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.93669677 = fieldWeight in 1826, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.078125 = fieldNorm(doc=1826)
    0.11921998 = product of:
      0.23843996 = sum of:
        0.23843996 = weight(_text_:3a in 1826) [ClassicSimilarity], result of:
          0.23843996 = score(doc=1826,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.93669677 = fieldWeight in 1826, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.078125 = fieldNorm(doc=1826)
      0.5 = coord(1/2)
    0.23843996 = weight(_text_:2f in 1826) [ClassicSimilarity], result of:
      0.23843996 = score(doc=1826,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.93669677 = fieldWeight in 1826, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.078125 = fieldNorm(doc=1826)
    0.23843996 = weight(_text_:2f in 1826) [ClassicSimilarity], result of:
      0.23843996 = score(doc=1826,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.93669677 = fieldWeight in 1826, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.078125 = fieldNorm(doc=1826)
  0.35714287 = coord(5/14)

Source: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=5&ved=0CDQQFjAE&url=http%3A%2F%2Fdigbib.ubka.uni-karlsruhe.de%2Fvolltexte%2Fdocuments%2F3131107&ei=HzFWVYvGMsiNsgGTyoFI&usg=AFQjCNE2FHUeR9oQTQlNC4TPedv4Mo3DaQ&sig2=Rlzpr7a3BLZZkqZCXXN_IA&bvm=bv.93564037,d.bGg&cad=rja
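The leaf scores in the explain tree for doc 1826 can be reproduced by hand. The following is a minimal sketch, assuming the idf and tf definitions that the numbers above appear to follow (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq)). The function names are illustrative and not part of any library; queryNorm and fieldNorm are taken directly from the listing.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency in the form that matches the listing:
    # 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight
    tf = math.sqrt(freq)                   # tf(freq) = sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm        # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the term "3a" in doc 1826.
score = classic_term_score(
    freq=2.0, doc_freq=24, max_docs=44218,
    query_norm=0.03002521, field_norm=0.078125,
)
print(round(score, 4))  # 0.2384
```

The result agrees with the listed 0.23843996 up to rounding; small differences in the last digits are expected because the listing appears to use single-precision arithmetic.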

Popper, K.R.: Three worlds : the Tanner lecture on human values. Delivered at the University of Michigan, April 7, 1978 (1978) 0.26

0.26114854 = product of:
  0.7312159 = sum of:
    0.06358399 = product of:
      0.19075197 = sum of:
        0.19075197 = weight(_text_:3a in 230) [ClassicSimilarity], result of:
          0.19075197 = score(doc=230,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.7493574 = fieldWeight in 230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0625 = fieldNorm(doc=230)
      0.33333334 = coord(1/3)
    0.19075197 = weight(_text_:2f in 230) [ClassicSimilarity], result of:
      0.19075197 = score(doc=230,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.7493574 = fieldWeight in 230, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.0625 = fieldNorm(doc=230)
    0.095375985 = product of:
      0.19075197 = sum of:
        0.19075197 = weight(_text_:3a in 230) [ClassicSimilarity], result of:
          0.19075197 = score(doc=230,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.7493574 = fieldWeight in 230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0625 = fieldNorm(doc=230)
      0.5 = coord(1/2)
    0.19075197 = weight(_text_:2f in 230) [ClassicSimilarity], result of:
      0.19075197 = score(doc=230,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.7493574 = fieldWeight in 230, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.0625 = fieldNorm(doc=230)
    0.19075197 = weight(_text_:2f in 230) [ClassicSimilarity], result of:
      0.19075197 = score(doc=230,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.7493574 = fieldWeight in 230, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.0625 = fieldNorm(doc=230)
  0.35714287 = coord(5/14)

Source: https%3A%2F%2Ftannerlectures.utah.edu%2F_documents%2Fa-to-z%2Fp%2Fpopper80.pdf&usg=AOvVaw3f4QRTEH-OEBmoYr2J_c7H
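The way the five matching clauses of doc 230 combine into the final 0.26 can be sketched as follows. This is an illustration only: `combine` is a hypothetical helper, and the clause layout (two nested groups with their own coord factors, three plain term clauses) is read off the tree above.

```python
def combine(clause_scores, matched, total_clauses):
    # Sum the matching clause scores, then scale by coord(matched/total).
    return sum(clause_scores) * (matched / total_clauses)

leaf = 0.19075197  # score of one matching term clause in doc 230

clauses = [
    combine([leaf], 1, 3),  # nested group, coord(1/3)
    leaf,
    combine([leaf], 1, 2),  # nested group, coord(1/2)
    leaf,
    leaf,
]

# Five of the query's 14 top-level clauses matched: coord(5/14).
total = combine(clauses, 5, 14)
print(round(total, 4))  # 0.2611
```

The coord factor rewards documents that match more of the query's clauses, which is why the same leaf score contributes less inside the sparsely matched nested groups.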

British Library / FAST/Dewey Review Group: Consultation on subject indexing and classification standards applied by the British Library (2015) 0.06

0.061677463 = product of:
  0.17269689 = sum of:
    0.056933407 = weight(_text_:subject in 2810) [ClassicSimilarity], result of:
      0.056933407 = score(doc=2810,freq=10.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.5301652 = fieldWeight in 2810, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=2810)
    0.028549349 = weight(_text_:classification in 2810) [ClassicSimilarity], result of:
      0.028549349 = score(doc=2810,freq=4.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.29856625 = fieldWeight in 2810, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=2810)
    0.02849856 = product of:
      0.05699712 = sum of:
        0.05699712 = weight(_text_:schemes in 2810) [ClassicSimilarity], result of:
          0.05699712 = score(doc=2810,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.35474116 = fieldWeight in 2810, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.046875 = fieldNorm(doc=2810)
      0.5 = coord(1/2)
    0.030166224 = weight(_text_:bibliographic in 2810) [ClassicSimilarity], result of:
      0.030166224 = score(doc=2810,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.2580748 = fieldWeight in 2810, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.046875 = fieldNorm(doc=2810)
    0.028549349 = weight(_text_:classification in 2810) [ClassicSimilarity], result of:
      0.028549349 = score(doc=2810,freq=4.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.29856625 = fieldWeight in 2810, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=2810)
  0.35714287 = coord(5/14)

Abstract: A broad-based review of the subject and classification schemes used on British Library records began in late 2014. The review was undertaken in response to a number of drivers, including: (a) an increasing demand on available resources due to the rapidly expanding digital publishing arena and a continuing steady state in print publication patterns; (b) increased demands on metadata to meet changing audience expectations.
Content: The Library is consulting with stakeholders concerning the potential impact of these proposals. No firm decisions have yet been taken regarding either of these standards. FAST 1. The British Library proposes to adopt FAST selectively to extend the scope of subject indexing of current and legacy content. 2. The British Library proposes to implement FAST as a replacement for LCSH in all current cataloguing, subject to mitigation of the risks identified above, in particular the question of sustainability. DDC 3. The British Library proposes to implement Abridged DDC selectively to extend the scope of subject indexing of current and legacy content.
Source: http://www.bl.uk/bibliographic/pdfs/british-library-consultation-fast-abridged-dewey.pdf
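The idf values in the listings so far differ only because of docFreq, since maxDocs is 44218 throughout. A small sketch, assuming the same idf form as the numbers shown (1 + ln(maxDocs / (docFreq + 1))), makes the effect of term rarity visible; the dictionary simply collects the docFreq values printed above.

```python
import math

MAX_DOCS = 44218

# docFreq values as printed in the explain trees above.
DOC_FREQ = {
    "3a": 24,
    "schemes": 569,
    "bibliographic": 2449,
    "subject": 3361,
    "classification": 4974,
}

def classic_idf(doc_freq, max_docs=MAX_DOCS):
    # Rarer terms (small docFreq) receive a larger idf.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

idf_by_term = {term: classic_idf(df) for term, df in DOC_FREQ.items()}

# Print terms from rarest to most common.
for term, idf in sorted(idf_by_term.items(), key=lambda kv: -kv[1]):
    print(f"{term:15s} docFreq={DOC_FREQ[term]:5d} idf={idf:.4f}")
```

Because idf enters both queryWeight and fieldWeight, a rare token such as "3a" outweighs several occurrences of a common term such as "classification".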

Functional Requirements for Subject Authority Data (FRSAD) : a conceptual model (2009) 0.05

0.054242246 = product of:
  0.15187828 = sum of:
    0.06574103 = weight(_text_:subject in 3573) [ClassicSimilarity], result of:
      0.06574103 = score(doc=3573,freq=30.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.612182 = fieldWeight in 3573, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03125 = fieldNorm(doc=3573)
    0.013458292 = weight(_text_:classification in 3573) [ClassicSimilarity], result of:
      0.013458292 = score(doc=3573,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.14074548 = fieldWeight in 3573, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=3573)
    0.01899904 = product of:
      0.03799808 = sum of:
        0.03799808 = weight(_text_:schemes in 3573) [ClassicSimilarity], result of:
          0.03799808 = score(doc=3573,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.2364941 = fieldWeight in 3573, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03125 = fieldNorm(doc=3573)
      0.5 = coord(1/2)
    0.04022163 = weight(_text_:bibliographic in 3573) [ClassicSimilarity], result of:
      0.04022163 = score(doc=3573,freq=8.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.34409973 = fieldWeight in 3573, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03125 = fieldNorm(doc=3573)
    0.013458292 = weight(_text_:classification in 3573) [ClassicSimilarity], result of:
      0.013458292 = score(doc=3573,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.14074548 = fieldWeight in 3573, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=3573)
  0.35714287 = coord(5/14)

Abstract: Subject access to information has been the predominant approach of users to satisfy their information needs. Research demonstrates that the integration of controlled vocabulary information with an information retrieval system helps users perform more effective subject searches. This integration becomes possible when subject authority data (information about subjects from authority files) are linked to bibliographic files and are made available to users. The purpose of authority control is to ensure consistency in representing a value (a name of a person, a place name, or a subject term) in the elements used as access points in information retrieval. For example, "World War, 1939-1945" has been established as an authorized subject heading in the Library of Congress Subject Headings (LCSH). When using LCSH, in cataloging or indexing, all publications about World War II are assigned the established heading regardless of whether a publication refers to the war as the "European War, 1939-1945", "Second World War", "World War 2", "World War II", "WWII", "World War Two", or "2nd World War." The synonymous expressions are referred to by the authorized heading. This ensures that all publications about World War II can be retrieved by and displayed under the same subject heading, either in an individual institution's own catalog or database or in a union catalog that contains bibliographic records from a number of individual libraries or databases. In almost all large bibliographic databases, authority control is achieved manually or semi-automatically by means of an authority file. The file contains records of headings or access points - names, titles, or subjects - that have been authorized for use in bibliographic records. In addition to ensuring consistency in subject representation, a subject authority record also records and maintains semantic relationships among subject terms and/or their labels.
Records in a subject authority file are connected through semantic relationships, which may be expressed statically in subject authority records or generated dynamically according to the specific needs (e.g., presenting the broader and narrower terms) of printed or online display of thesauri, subject headings lists, classification schemes, and other knowledge organization systems.
Editor: IFLA Working Group on Functional Requirements for Subject Authority Records (FRSAR)
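The term "subject" occurs 30 times in doc 3573 but only 10 times in doc 2810, yet the two fieldWeight values are close. A brief sketch, using only figures from the two trees above and the fieldWeight = tf * idf * fieldNorm relation they display, shows why; `field_weight` is an illustrative helper name.

```python
import math

def field_weight(freq, idf, field_norm):
    # tf grows only with the square root of the raw frequency,
    # and fieldNorm scales the result down for longer fields.
    return math.sqrt(freq) * idf * field_norm

IDF_SUBJECT = 3.576596  # idf(docFreq=3361, maxDocs=44218)

w_2810 = field_weight(10.0, IDF_SUBJECT, 0.046875)  # British Library consultation
w_3573 = field_weight(30.0, IDF_SUBJECT, 0.03125)   # FRSAD

print(round(w_2810, 4), round(w_3573, 4))  # 0.5302 0.6122
```

Tripling the frequency raises the weight by only about 15 percent here, because the square-root tf dampens repetition and the smaller fieldNorm of doc 3573 offsets most of the remaining gain.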

Bourdon, F.; Landry, P.: Best practices for subject access to national bibliographies : interim report by the Working Group on Guidelines for Subject Access by National Bibliographic Agencies (2007) 0.05

0.04665659 = product of:
  0.16329806 = sum of:
    0.066422306 = weight(_text_:subject in 698) [ClassicSimilarity], result of:
      0.066422306 = score(doc=698,freq=10.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.61852604 = fieldWeight in 698, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.0546875 = fieldNorm(doc=698)
    0.023552012 = weight(_text_:classification in 698) [ClassicSimilarity], result of:
      0.023552012 = score(doc=698,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.24630459 = fieldWeight in 698, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0546875 = fieldNorm(doc=698)
    0.04977173 = weight(_text_:bibliographic in 698) [ClassicSimilarity], result of:
      0.04977173 = score(doc=698,freq=4.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.4258017 = fieldWeight in 698, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.0546875 = fieldNorm(doc=698)
    0.023552012 = weight(_text_:classification in 698) [ClassicSimilarity], result of:
      0.023552012 = score(doc=698,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.24630459 = fieldWeight in 698, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0546875 = fieldNorm(doc=698)
  0.2857143 = coord(4/14)

Abstract: The working group to establish guidelines for subject access by national bibliographic agencies was set up in 2005 in order to analyse the question of subject access and propose key elements for an indexing policy for national bibliographies. The group's mandate is to put forward recommendations based on best practices for subject access to national bibliographies. The group is presently assessing the elements which should be included in an indexing policy and will present an initial version of its recommendations in 2008.
Content: Paper presented at: WORLD LIBRARY AND INFORMATION CONGRESS: 73RD IFLA GENERAL CONFERENCE AND COUNCIL 19-23 August 2007, Durban, South Africa. - 89 - Bibliography with National Libraries and Classification and Indexing

Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2012) 0.05

0.045787815 = product of:
  0.16025734 = sum of:
    0.044100422 = weight(_text_:subject in 1967) [ClassicSimilarity], result of:
      0.044100422 = score(doc=1967,freq=6.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.41066417 = fieldWeight in 1967, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=1967)
    0.049448926 = weight(_text_:classification in 1967) [ClassicSimilarity], result of:
      0.049448926 = score(doc=1967,freq=12.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.5171319 = fieldWeight in 1967, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=1967)
    0.049448926 = weight(_text_:classification in 1967) [ClassicSimilarity], result of:
      0.049448926 = score(doc=1967,freq=12.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.5171319 = fieldWeight in 1967, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=1967)
    0.017259069 = product of:
      0.034518138 = sum of:
        0.034518138 = weight(_text_:22 in 1967) [ClassicSimilarity], result of:
          0.034518138 = score(doc=1967,freq=4.0), product of:
            0.10514317 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03002521 = queryNorm
            0.32829654 = fieldWeight in 1967, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1967)
      0.5 = coord(1/2)
  0.2857143 = coord(4/14)

Abstract: This paper reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The paper discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the DDC (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.
Source: Beyond libraries - subject metadata in the digital environment and semantic web. IFLA Satellite Post-Conference, 17-18 August 2012, Tallinn

Koch, T.; Ardö, A.: Automatic classification of full-text HTML-documents from one specific subject area : DESIRE II D3.6a, Working Paper 2 (2000) 0.04

0.038544483 = product of:
  0.17987426 = sum of:
    0.048010457 = weight(_text_:subject in 1667) [ClassicSimilarity], result of:
      0.048010457 = score(doc=1667,freq=4.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.4470745 = fieldWeight in 1667, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.0625 = fieldNorm(doc=1667)
    0.0659319 = weight(_text_:classification in 1667) [ClassicSimilarity], result of:
      0.0659319 = score(doc=1667,freq=12.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.6895092 = fieldWeight in 1667, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=1667)
    0.0659319 = weight(_text_:classification in 1667) [ClassicSimilarity], result of:
      0.0659319 = score(doc=1667,freq=12.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.6895092 = fieldWeight in 1667, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=1667)
  0.21428572 = coord(3/14)

Content: 1 Introduction / 2 Method overview / 3 Ei thesaurus preprocessing / 4 Automatic classification process: 4.1 Matching -- 4.2 Weighting -- 4.3 Preparation for display / 5 Results of the classification process / 6 Evaluations / 7 Software / 8 Other applications / 9 Experiments with universal classification systems / References / Appendix A: Ei classification service: Software / Appendix B: Use of the classification software as subject filter in a WWW harvester.
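The explain trees on this page are regular enough to be checked mechanically. The following is a rough sketch, not a parser for any official format: it assumes each line has the shape "value = description", that nesting is expressed purely by indentation, and that nodes ending in "product of:" or "sum of:" should equal the product or sum of their direct children. The sample is the "subject" clause of doc 1667 from the listing above.

```python
import math
import re

LINE = re.compile(r"^(\s*)([0-9.eE+-]+) = (.*)$")

def parse_explain(text):
    # Build a tree of nodes from indented 'value = description' lines.
    root = {"indent": -1, "children": []}
    stack = [root]
    for raw in text.splitlines():
        m = LINE.match(raw)
        if not m:
            continue  # skip blank or non-matching lines
        node = {"indent": len(m.group(1)), "value": float(m.group(2)),
                "desc": m.group(3).rstrip(), "children": []}
        # Pop back to the nearest ancestor with a smaller indent.
        while stack[-1]["indent"] >= node["indent"]:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]

def check(node, rel_tol=1e-5):
    # True when this node and all descendants are arithmetically consistent.
    values = [c["value"] for c in node["children"]]
    ok = True
    if values and node["desc"].endswith("product of:"):
        ok = math.isclose(node["value"], math.prod(values), rel_tol=rel_tol)
    elif values and node["desc"].endswith("sum of:"):
        ok = math.isclose(node["value"], sum(values), rel_tol=rel_tol)
    return ok and all(check(c, rel_tol) for c in node["children"])

SAMPLE = """\
0.048010457 = weight(_text_:subject in 1667) [ClassicSimilarity], result of:
  0.048010457 = score(doc=1667,freq=4.0), product of:
    0.10738805 = queryWeight, product of:
      3.576596 = idf(docFreq=3361, maxDocs=44218)
      0.03002521 = queryNorm
    0.4470745 = fieldWeight in 1667, product of:
      2.0 = tf(freq=4.0), with freq of:
        4.0 = termFreq=4.0
      3.576596 = idf(docFreq=3361, maxDocs=44218)
      0.0625 = fieldNorm(doc=1667)
"""

nodes = parse_explain(SAMPLE)
print(all(check(n) for n in nodes))  # True
```

The relative tolerance is a deliberate choice: the printed values look like single-precision floats, so exact equality against double-precision products would fail.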

Robbio, A. de; Maguolo, D.; Marini, A.: Scientific and general subject classifications in the digital world (2001) 0.04

0.03805627 = product of:
  0.13319694 = sum of:
    0.05092278 = weight(_text_:subject in 2) [ClassicSimilarity], result of:
      0.05092278 = score(doc=2,freq=18.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.4741941 = fieldWeight in 2, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03125 = fieldNorm(doc=2)
    0.026916584 = weight(_text_:classification in 2) [ClassicSimilarity], result of:
      0.026916584 = score(doc=2,freq=8.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.28149095 = fieldWeight in 2, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=2)
    0.028440988 = weight(_text_:bibliographic in 2) [ClassicSimilarity], result of:
      0.028440988 = score(doc=2,freq=4.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.24331525 = fieldWeight in 2, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03125 = fieldNorm(doc=2)
    0.026916584 = weight(_text_:classification in 2) [ClassicSimilarity], result of:
      0.026916584 = score(doc=2,freq=8.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.28149095 = fieldWeight in 2, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=2)
  0.2857143 = coord(4/14)

Abstract: In the present work we discuss opportunities, problems, tools and techniques encountered when interconnecting discipline-specific subject classifications, primarily organized as search devices in bibliographic databases, with general classifications originally devised for book shelving in public libraries. We first state the fundamental distinction between topical (or subject) classifications and object classifications. Then we trace the structural limitations that have constrained subject classifications since their library origins, and the devices that were used to overcome the gap with genuine knowledge representation. After recalling some general notions on structure, dynamics and interferences of subject classifications and of the objects they refer to, we sketch a synthetic overview on discipline-specific classifications in Mathematics, Computing and Physics, on one hand, and on general classifications on the other. In this setting we present The Scientific Classifications Page, which collects groups of Web pages produced by a pool of software tools for developing hypertextual presentations of single or paired subject classifications from sequential source files, as well as facilities for gathering information from KWIC lists of classification descriptions. Further we propose a concept-oriented methodology for interconnecting subject classifications, with the concrete support of a relational analysis of the whole Mathematics Subject Classification through its evolution since 1959. Finally, we recall a very basic method for interconnection provided by coreference in bibliographic records among index elements from different systems, and point out the advantages of establishing the conditions of a more widespread application of such a method.
A part of these contents was presented under the title Mathematics Subject Classification and related Classifications in the Digital World at the Eighth International Conference Crimea 2001, "Libraries and Associations in the Transient World: New Technologies and New Forms of Cooperation", Sudak, Ukraine, June 9-17, 2001, in a special session on electronic libraries, electronic publishing and electronic information in science chaired by Bernd Wegner, Editor-in-Chief of Zentralblatt MATH.
Object: INSPEC Classification

Choi, I.: Visualizations of cross-cultural bibliographic classification : comparative studies of the Korean Decimal Classification and the Dewey Decimal Classification (2017) 0.04

0.037514914 = product of:
  0.1313022 = sum of:
    0.041207436 = weight(_text_:classification in 3869) [ClassicSimilarity], result of:
      0.041207436 = score(doc=3869,freq=12.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.43094325 = fieldWeight in 3869, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3869)
    0.0237488 = product of:
      0.0474976 = sum of:
        0.0474976 = weight(_text_:schemes in 3869) [ClassicSimilarity], result of:
          0.0474976 = score(doc=3869,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.2956176 = fieldWeight in 3869, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3869)
      0.5 = coord(1/2)
    0.02513852 = weight(_text_:bibliographic in 3869) [ClassicSimilarity], result of:
      0.02513852 = score(doc=3869,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.21506234 = fieldWeight in 3869, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3869)
    0.041207436 = weight(_text_:classification in 3869) [ClassicSimilarity], result of:
      0.041207436 = score(doc=3869,freq=12.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.43094325 = fieldWeight in 3869, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3869)
  0.2857143 = coord(4/14)

Abstract: The changes in KO systems induced by sociocultural influences may include those in both classificatory principles and cultural features. The proposed study will examine the Korean Decimal Classification (KDC)'s adaptation of the Dewey Decimal Classification (DDC) by comparing the two systems. This case manifests the sociocultural influences on KOSs in a cross-cultural context. Therefore, the study aims at an in-depth investigation of sociocultural influences by situating a KOS in a cross-cultural environment and examining the dynamics between two classification systems designed to organize information resources in two distinct sociocultural contexts. As a preceding stage of the comparison, the analysis was conducted on the changes that result from the meeting of different sociocultural features in a descriptive method. The analysis aims to identify variations between the two schemes in comparison of the knowledge structures of the two classifications, in terms of the quantity of class numbers that represent concepts and their relationships in each of the individual main classes. The most effective analytic strategy to show the patterns of the comparison was visualizations of similarities and differences between the two systems. Increasing or decreasing tendencies in the class through various editions were analyzed. Comparing the compositions of the main classes and distributions of concepts in the KDC and DDC discloses the differences in their knowledge structures empirically. This phase of quantitative analysis and visualizing techniques generates empirical evidence leading to interpretation.

Automatic classification research at OCLC (2002) 0.04

0.03687797 = product of:
  0.12907289 = sum of:
    0.04079328 = weight(_text_:classification in 1563) [ClassicSimilarity], result of:
      0.04079328 = score(doc=1563,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.42661208 = fieldWeight in 1563, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1563)
    0.03324832 = product of:
      0.06649664 = sum of:
        0.06649664 = weight(_text_:schemes in 1563) [ClassicSimilarity], result of:
          0.06649664 = score(doc=1563,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.41386467 = fieldWeight in 1563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1563)
      0.5 = coord(1/2)
    0.04079328 = weight(_text_:classification in 1563) [ClassicSimilarity], result of:
      0.04079328 = score(doc=1563,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.42661208 = fieldWeight in 1563, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1563)
    0.014238005 = product of:
      0.02847601 = sum of:
        0.02847601 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
          0.02847601 = score(doc=1563,freq=2.0), product of:
            0.10514317 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03002521 = queryNorm
            0.2708308 = fieldWeight in 1563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1563)
      0.5 = coord(1/2)
  0.2857143 = coord(4/14)

Abstract: OCLC enlists the cooperation of the world's libraries to make the written record of humankind's cultural heritage more accessible through electronic media. Part of this goal can be accomplished through the application of the principles of knowledge organization. We believe that cultural artifacts are effectively lost unless they are indexed, cataloged and classified. Accordingly, OCLC has developed products, sponsored research projects, and encouraged participation in international standards communities, whose outcome has been improved library classification schemes, cataloging productivity tools, and new proposals for the creation and maintenance of metadata. Though cataloging and classification require expert intellectual effort, we recognize that at least some of the work must be automated if we hope to keep pace with cultural change.
Date: 5. 5.2003 9:22:09
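
The per-term figures in the explain tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function name `classic_term_score` is hypothetical, and the input values are copied from the `classification` clause of doc 1563.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one term clause of a ClassicSimilarity explain tree.

    Assumes the classic TF-IDF definitions; field_norm and query_norm
    are taken as given from the listing rather than derived.
    """
    tf = math.sqrt(freq)                              # 2.4494898 for freq=6.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # 3.1847067 for docFreq=4974
    query_weight = idf * query_norm                   # "queryWeight" line
    field_weight = tf * idf * field_norm              # "fieldWeight" line
    return query_weight * field_weight                # "score" line

# Values from the `classification` clause of doc 1563.
score = classic_term_score(
    freq=6.0,
    doc_freq=4974,
    max_docs=44218,
    query_norm=0.03002521,
    field_norm=0.0546875,
)
print(f"{score:.8f}")  # compare with 0.04079328 in the listing
```

Lucene computes these values in single precision, so the last digits may differ slightly from the listing.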

Louie, A.J.; Maddox, E.L.; Washington, W.: Using faceted classification to provide structure for information architecture (2003) 0.04

0.03587399 = product of:
  0.12555896 = sum of:
    0.02546139 = weight(_text_:subject in 2471) [ClassicSimilarity], result of:
      0.02546139 = score(doc=2471,freq=2.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.23709705 = fieldWeight in 2471, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=2471)
    0.03496567 = weight(_text_:classification in 2471) [ClassicSimilarity], result of:
      0.03496567 = score(doc=2471,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.3656675 = fieldWeight in 2471, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=2471)
    0.030166224 = weight(_text_:bibliographic in 2471) [ClassicSimilarity], result of:
      0.030166224 = score(doc=2471,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.2580748 = fieldWeight in 2471, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.046875 = fieldNorm(doc=2471)
    0.03496567 = weight(_text_:classification in 2471) [ClassicSimilarity], result of:
      0.03496567 = score(doc=2471,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.3656675 = fieldWeight in 2471, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=2471)
  0.2857143 = coord(4/14)

Abstract: This is a short, but very thorough and very interesting, report on how the writers built a faceted classification for some legal information and used it to structure a web site with navigation and searching. There is a good summary of why facets work well and how they fit into bibliographic control in general. The last section is about their implementation of a web site for the Washington State Bar Association's Council for Legal Public Education. Their classification uses three facets: Purpose (the general aim of the document, e.g. Resources for K-12 Teachers), Topic (the subject of the document), and Type (the legal format of the document). See Example Web Sites, below, for a discussion of the site and a problem with its design.

Goldberg, J.: Classification of religion in LCC (2000) 0.04

0.03527055 = product of:
  0.1645959 = sum of:
    0.047104023 = weight(_text_:classification in 5402) [ClassicSimilarity], result of:
      0.047104023 = score(doc=5402,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.49260917 = fieldWeight in 5402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.109375 = fieldNorm(doc=5402)
    0.070387855 = weight(_text_:bibliographic in 5402) [ClassicSimilarity], result of:
      0.070387855 = score(doc=5402,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.6021745 = fieldWeight in 5402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.109375 = fieldNorm(doc=5402)
    0.047104023 = weight(_text_:classification in 5402) [ClassicSimilarity], result of:
      0.047104023 = score(doc=5402,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.49260917 = fieldWeight in 5402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.109375 = fieldNorm(doc=5402)
  0.21428572 = coord(3/14)

Footnote: Paper, IFLA General Conference, Division IV Bibliographic Control, Jerusalem, 2000
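
The top-level score of this entry combines its clause scores in a simple way: the matching clauses are summed and the sum is scaled by the coordination factor, the fraction of query clauses that matched. A minimal sketch of that arithmetic follows; the helper name `combine_clauses` is hypothetical, and the numbers are copied from the explain tree for doc 5402.

```python
def combine_clauses(clause_scores, matched, total):
    """Sum the matching clause scores and apply coord(matched/total)."""
    coord = matched / total          # 3/14 = 0.21428572 in the listing
    return sum(clause_scores) * coord

# Clause scores for doc 5402: classification, bibliographic, classification.
clauses = [0.047104023, 0.070387855, 0.047104023]
total_score = combine_clauses(clauses, matched=3, total=14)
print(f"{total_score:.8f}")  # compare with 0.03527055 in the listing
```

Nested clauses, such as the `schemes` clause in other entries, follow the same pattern one level down: the inner sum is multiplied by its own coord factor (for example 0.5 = coord(1/2)) before it enters the outer sum.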

SKOS Simple Knowledge Organization System Primer (2009) 0.04

0.035095256 = product of:
  0.122833386 = sum of:
    0.02546139 = weight(_text_:subject in 4795) [ClassicSimilarity], result of:
      0.02546139 = score(doc=4795,freq=2.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.23709705 = fieldWeight in 4795, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=4795)
    0.02018744 = weight(_text_:classification in 4795) [ClassicSimilarity], result of:
      0.02018744 = score(doc=4795,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 4795, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=4795)
    0.05699712 = product of:
      0.11399424 = sum of:
        0.11399424 = weight(_text_:schemes in 4795) [ClassicSimilarity], result of:
          0.11399424 = score(doc=4795,freq=8.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.7094823 = fieldWeight in 4795, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.046875 = fieldNorm(doc=4795)
      0.5 = coord(1/2)
    0.02018744 = weight(_text_:classification in 4795) [ClassicSimilarity], result of:
      0.02018744 = score(doc=4795,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 4795, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=4795)
  0.2857143 = coord(4/14)

Abstract: SKOS (Simple Knowledge Organisation System) provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other types of controlled vocabulary. As an application of the Resource Description Framework (RDF) SKOS allows concepts to be documented, linked and merged with other data, while still being composed, integrated and published on the World Wide Web. This document is an implementors guide for those who would like to represent their concept scheme using SKOS. In basic SKOS, conceptual resources (concepts) can be identified using URIs, labelled with strings in one or more natural languages, documented with various types of notes, semantically related to each other in informal hierarchies and association networks, and aggregated into distinct concept schemes. In advanced SKOS, conceptual resources can be mapped to conceptual resources in other schemes and grouped into labelled or ordered collections. Concept labels can also be related to each other. Finally, the SKOS vocabulary itself can be extended to suit the needs of particular communities of practice.

Mai, F.; Galke, L.; Scherp, A.: Using deep learning for title-based semantic subject indexing to reach competitive performance to full-text (2018) 0.03

0.034503423 = product of:
  0.12076197 = sum of:
    0.030006537 = weight(_text_:subject in 4093) [ClassicSimilarity], result of:
      0.030006537 = score(doc=4093,freq=4.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.27942157 = fieldWeight in 4093, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4093)
    0.023791125 = weight(_text_:classification in 4093) [ClassicSimilarity], result of:
      0.023791125 = score(doc=4093,freq=4.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.24880521 = fieldWeight in 4093, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4093)
    0.023791125 = weight(_text_:classification in 4093) [ClassicSimilarity], result of:
      0.023791125 = score(doc=4093,freq=4.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.24880521 = fieldWeight in 4093, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4093)
    0.04317318 = product of:
      0.08634636 = sum of:
        0.08634636 = weight(_text_:texts in 4093) [ClassicSimilarity], result of:
          0.08634636 = score(doc=4093,freq=6.0), product of:
            0.16460659 = queryWeight, product of:
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03002521 = queryNorm
            0.524562 = fieldWeight in 4093, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4093)
      0.5 = coord(1/2)
  0.2857143 = coord(4/14)

Abstract: For (semi-)automated subject indexing systems in digital libraries, it is often more practical to use metadata such as the title of a publication instead of the full-text or the abstract. Therefore, it is desirable to have good text mining and text classification algorithms that already perform well on the title of a publication alone. So far, the classification performance on titles is not competitive with the performance on the full-texts if the same number of training samples is used for training. However, it is much easier to obtain title data in large quantities and to use it for training than full-text data. In this paper, we investigate the question of how models obtained from training on increasing amounts of title training data compare to models from training on a constant number of full-texts. We evaluate this question on a large-scale dataset from the medical domain (PubMed) and from economics (EconBiz). In these datasets, the titles and annotations of millions of publications are available, and they outnumber the available full-texts by a factor of 20 and 15, respectively. To exploit these large amounts of data to their full potential, we develop three strong deep learning classifiers and evaluate their performance on the two datasets. The results are promising. On the EconBiz dataset, all three classifiers outperform their full-text counterparts by a large margin. The best title-based classifier outperforms the best full-text method by 9.9%. On the PubMed dataset, the best title-based method almost reaches the performance of the best full-text classifier, with a difference of only 2.9%.

SKOS Core Guide (2005) 0.03

0.032913495 = product of:
  0.11519723 = sum of:
    0.02546139 = weight(_text_:subject in 4689) [ClassicSimilarity], result of:
      0.02546139 = score(doc=4689,freq=2.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.23709705 = fieldWeight in 4689, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=4689)
    0.02018744 = weight(_text_:classification in 4689) [ClassicSimilarity], result of:
      0.02018744 = score(doc=4689,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 4689, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=4689)
    0.049360957 = product of:
      0.098721914 = sum of:
        0.098721914 = weight(_text_:schemes in 4689) [ClassicSimilarity], result of:
          0.098721914 = score(doc=4689,freq=6.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.6144297 = fieldWeight in 4689, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.046875 = fieldNorm(doc=4689)
      0.5 = coord(1/2)
    0.02018744 = weight(_text_:classification in 4689) [ClassicSimilarity], result of:
      0.02018744 = score(doc=4689,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 4689, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=4689)
  0.2857143 = coord(4/14)

Abstract: SKOS Core provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, 'folksonomies', other types of controlled vocabulary, and also concept schemes embedded in glossaries and terminologies. The SKOS Core Vocabulary is an application of the Resource Description Framework (RDF), that can be used to express a concept scheme as an RDF graph. Using RDF allows data to be linked to and/or merged with other data, enabling data sources to be distributed across the web, but still be meaningfully composed and integrated. This document is a guide using the SKOS Core Vocabulary, for readers who already have a basic understanding of RDF concepts. This edition of the SKOS Core Guide [SKOS Core Guide] is a W3C Public Working Draft. It is the authoritative guide to recommended usage of the SKOS Core Vocabulary at the time of publication.

Godby, C. J.; Stuler, J.: ¬The Library of Congress Classification as a knowledge base for automatic subject categorization (2001) 0.03

0.032580506 = product of:
  0.15204236 = sum of:
    0.058800567 = weight(_text_:subject in 1567) [ClassicSimilarity], result of:
      0.058800567 = score(doc=1567,freq=6.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.5475522 = fieldWeight in 1567, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.0625 = fieldNorm(doc=1567)
    0.046620894 = weight(_text_:classification in 1567) [ClassicSimilarity], result of:
      0.046620894 = score(doc=1567,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.48755667 = fieldWeight in 1567, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=1567)
    0.046620894 = weight(_text_:classification in 1567) [ClassicSimilarity], result of:
      0.046620894 = score(doc=1567,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.48755667 = fieldWeight in 1567, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=1567)
  0.21428572 = coord(3/14)

Abstract: This paper describes a set of experiments in adapting a subset of the Library of Congress Classification for use as a database for automatic classification. A high degree of concept integrity was obtained when subject headings were mapped from OCLC's WorldCat database and filtered using the log-likelihood statistic
Footnote: Paper, IFLA Preconference "Subject Retrieval in a Networked Environment", Dublin, OH, August 2001.

Seeliger, F.: ¬A tool for systematic visualization of controlled descriptors and their relation to others as a rich context for a discovery system (2015) 0.03

0.032490525 = product of:
  0.09097347 = sum of:
    0.024005229 = weight(_text_:subject in 2547) [ClassicSimilarity], result of:
      0.024005229 = score(doc=2547,freq=4.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.22353725 = fieldWeight in 2547, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03125 = fieldNorm(doc=2547)
    0.013458292 = weight(_text_:classification in 2547) [ClassicSimilarity], result of:
      0.013458292 = score(doc=2547,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.14074548 = fieldWeight in 2547, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=2547)
    0.020110816 = weight(_text_:bibliographic in 2547) [ClassicSimilarity], result of:
      0.020110816 = score(doc=2547,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.17204987 = fieldWeight in 2547, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03125 = fieldNorm(doc=2547)
    0.013458292 = weight(_text_:classification in 2547) [ClassicSimilarity], result of:
      0.013458292 = score(doc=2547,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.14074548 = fieldWeight in 2547, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=2547)
    0.019940836 = product of:
      0.039881673 = sum of:
        0.039881673 = weight(_text_:texts in 2547) [ClassicSimilarity], result of:
          0.039881673 = score(doc=2547,freq=2.0), product of:
            0.16460659 = queryWeight, product of:
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03002521 = queryNorm
            0.2422848 = fieldWeight in 2547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03125 = fieldNorm(doc=2547)
      0.5 = coord(1/2)
  0.35714287 = coord(5/14)

Abstract: The discovery service (a search engine and service called WILBERT) used at our library at the Technical University of Applied Sciences Wildau (TUAS Wildau) is comprised of more than 8 million items. If we were to record all licensed publications in this tool to a higher level of articles, including their bibliographic records and full texts, we would have a holding estimated at a hundred million documents. A lot of features, such as ranking, autocompletion, multi-faceted classification, refining opportunities reduce the number of hits. However, it is not enough to give intuitive support for a systematic overview of topics related to documents in the library. John Naisbitt once said: "We are drowning in information, but starving for knowledge." This quote is still very true today. Two years ago, we started to develop micro thesauri for MINT topics in order to develop an advanced indexing of the library stock. We use iQvoc as a vocabulary management system to create the thesaurus. It provides an easy-to-use browser interface that builds a SKOS thesaurus in the background. The purpose of this is to integrate the thesauri in WILBERT in order to offer a better subject-related search. This approach especially supports first-year students by giving them the possibility to browse through a hierarchical alignment of a subject, for instance, logistics or computer science, and thereby discover how the terms are related. It also supports the students with an insight into established abbreviations and alternative labels. Students at the TUAS Wildau were involved in the developmental process of the software regarding the interface and functionality of iQvoc. The first steps have been taken and involve the inclusion of 3000 terms in our discovery tool WILBERT.

Vatant, B.; Dunsire, G.: Use case vocabulary merging (2010) 0.03
```
0.031378385 = product of:
  0.10982434 = sum of:
    0.044909675 = weight(_text_:subject in 4336) [ClassicSimilarity], result of:
      0.044909675 = score(doc=4336,freq=14.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.41819993 = fieldWeight in 4336, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03125 = fieldNorm(doc=4336)
    0.013458292 = weight(_text_:classification in 4336) [ClassicSimilarity], result of:
      0.013458292 = score(doc=4336,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.14074548 = fieldWeight in 4336, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=4336)
    0.03799808 = product of:
      0.07599616 = sum of:
        0.07599616 = weight(_text_:schemes in 4336) [ClassicSimilarity], result of:
          0.07599616 = score(doc=4336,freq=8.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.4729882 = fieldWeight in 4336, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03125 = fieldNorm(doc=4336)
      0.5 = coord(1/2)
    0.013458292 = weight(_text_:classification in 4336) [ClassicSimilarity], result of:
      0.013458292 = score(doc=4336,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.14074548 = fieldWeight in 4336, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=4336)
  0.2857143 = coord(4/14)
```
Abstract: The publication of library legacy includes publication of structuring vocabularies such as thesauri, classifications, subject headings. Different sources use different vocabularies, different in structure, width, depth and scope, and languages. Federated access to distributed data collections is currently possible if they rely on the same vocabularies. Mapping techniques and standards supporting them (such as SKOS mapping properties, OWL sameAs and equivalentClass) are still largely experimental, even in the linked data land. Libraries use a variety of controlled subject vocabularies and classification schemes to index items in their collections. Although most collections will employ only a single scheme, different schemes may be chosen to index different collections within a library or in separate libraries; schemes are chosen on the basis of language, subject focus (general or specific), granularity (specificity), user expectation, and availability and support (cost, currency, completeness, tools). For example, a typical academic library will operate separate metadata systems for the library's main collections, special collections (e.g. manuscripts, archives, audiovisual), digital collections, and one or more institutional repositories for teaching and research output; each of these systems may employ a different subject vocabulary, with little or no interoperability between terms and concepts. Users expect to have a single point-of-search in resource discovery services focussed on their local institutional collections. Librarians have to use complex and expensive resource discovery platforms to meet user expectations. Library communities continue to develop resource discovery services for consortia with a geographical, subject, sector (public, academic, school, special libraries), and/or domain (libraries, archives, museums) focus. Services are based on distributed searching (e.g. via Z39.50) or metadata aggregations (e.g. OCLC's WorldCat and OAIster). As a result, the number of different subject schemes encountered in such services is increasing. Trans-national consortia (e.g. Europeana) add to the complexity of the environment by including subject vocabularies in multiple languages. Users expect a single point-of-search in consortial resource discovery services involving multiple organisations and large-scale metadata aggregations. Users also expect to be able to search for subjects using their own language and terms in an unambiguous, contextualised manner.

Slavic, A.: Mapping intricacies : UDC to DDC (2010) 0.03
```
0.03063678 = product of:
  0.08578298 = sum of:
    0.02372225 = weight(_text_:subject in 3370) [ClassicSimilarity], result of:
      0.02372225 = score(doc=3370,freq=10.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.22090214 = fieldWeight in 3370, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.01953125 = fieldNorm(doc=3370)
    0.018808534 = weight(_text_:classification in 3370) [ClassicSimilarity], result of:
      0.018808534 = score(doc=3370,freq=10.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.19669779 = fieldWeight in 3370, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.01953125 = fieldNorm(doc=3370)
    0.0118744 = product of:
      0.0237488 = sum of:
        0.0237488 = weight(_text_:schemes in 3370) [ClassicSimilarity], result of:
          0.0237488 = score(doc=3370,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.1478088 = fieldWeight in 3370, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.01953125 = fieldNorm(doc=3370)
      0.5 = coord(1/2)
    0.01256926 = weight(_text_:bibliographic in 3370) [ClassicSimilarity], result of:
      0.01256926 = score(doc=3370,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.10753117 = fieldWeight in 3370, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.01953125 = fieldNorm(doc=3370)
    0.018808534 = weight(_text_:classification in 3370) [ClassicSimilarity], result of:
      0.018808534 = score(doc=3370,freq=10.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.19669779 = fieldWeight in 3370, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.01953125 = fieldNorm(doc=3370)
  0.35714287 = coord(5/14)
```
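The explain tree above can be checked by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that match the figures shown (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and are not part of Lucene's API.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency, as implied by the explain output
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 3370
MAX_DOCS = 44218
QUERY_NORM = 0.03002521
FIELD_NORM = 0.01953125

def clause(freq, doc_freq):
    return term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM)

clauses = [
    clause(10, 3361),       # subject
    clause(10, 4974),       # classification
    0.5 * clause(2, 569),   # schemes, with its nested coord(1/2)
    clause(2, 2449),        # bibliographic
    clause(10, 4974),       # classification (second clause)
]

# Outer coord factor: 5 of 14 query clauses matched
total = sum(clauses) * (5 / 14)
print(total)
```

The result should agree with the 0.03063678 reported for this record up to single-precision rounding, since Lucene computes these values as 32-bit floats.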
Content

"Last week, I received an email from Yulia Skora in Ukraine, who is working on the mapping between UDC Summary and BBK (Bibliographic Library Classification) Summary. It reminded me of yet another challenging area of work. When responding to Yulia I realised that the issues with mapping, for instance, UDC Summary to Dewey Summaries are often made more difficult because we have to deal with classification summaries in both systems and we cannot use a known exactMatch in many situations. In 2008, following advice received from colleagues in the HILT project, two of our colleagues quickly mapped 1000 classes of Dewey Summaries to the UDC Master Reference File as a whole. This appeared to be relatively simple. The mapping in this case is simply an answer to the question "and how would you say e.g. Art metal work in UDC?" But when in 2009 we realised that we were going to release 2000 classes of UDC Summary as linked data, we decided to wait until we had our UDC Summary set defined and completed to be able to publish it mapped to the Dewey Summaries. As we arrived at this stage, little did we realise how much more complex the reversed mapping of UDC Summary to Dewey Summaries would turn out to be. Mapping the Dewey Summaries to UDC highlighted situations in which the logic and structure of the two systems do not agree, especially because Dewey tends to enumerate combinations of subject and attributes that do not always logically belong together. For instance, 850 Literatures of Italian, Sardinian, Dalmatian, Romanian, Rhaeto-Romanic languages Italian literature. This class mixes languages from three different subgroups of Romance languages. Italian and Sardinian belong to the Italo-Romance sub-family; Romanian and Dalmatian are Balkan Romance languages; and Rhaeto-Romance is the third subgroup, which includes Friulian, Ladin and Romansh.
As UDC literature is based on a strict classification of language families, Dewey class 850 has to be mapped to three narrower UDC classes, 821.131 Literature of Italo-Romance Languages, 821.132 Literature of Rhaeto-Romance languages and 821.135 Literature of Balkan-Romance Languages, or to a broader class 821.13 Literature of Romance languages. Hence we have to be sure that we have all these classes listed in the UDC Summary to be able to express UDC-DDC many-to-one, specific-to-broader relationships.
Another challenge appears when, e.g., mapping Dewey class 890 Literatures of other specific languages and language families, which does not make sense in UDC, in which all languages and literatures have equal status. Standard UDC schedules do not have a selection of preferred literatures and other literatures. In principle, UDC does not allow classes entitled 'others' which do not have defined semantic content. If entities are subdivided and there is no provision for an item outside the listed subclasses, then this item is subsumed under a top class or a broader class where all unspecified or general members of that class may be expected. If specification is needed, this can be devised by adding an alphabetical extension to the broader class. Here we have to find and list in the UDC Summary all literatures that are 'unpreferred', i.e. lumped in the 890 classes, and map them again as a many-to-one, specific-to-broader match. The example below illustrates another interesting case. Classes Dewey 061 and UDC 06 cover roughly the same semantic field, but in the subdivision the Dewey Summaries list a combination of subject and place and, as an enumerative classification, provide ready-made numbers for combinations of place that are most common in an average (American?) library. This is a frequent approach in schemes created with the physical book arrangement, i.e. library shelves, in mind. UDC, designed as an indexing language for information retrieval, keeps subject and place in separate tables and allows for any concept of place, such as (7) North America, to be used in combination with any subject, as these may coincide in documents. Thus combinations such as Newspapers in North America or Organizations in North America would not be offered as ready-made combinations. There is no selection of 'preferred' or 'most needed' countries, languages or cultures in the standard UDC edition: <Table>
If we map the Dewey Summaries to UDC in general and do not have to worry about a reverse relationship, the situation is very simple, as shown above. Mapping of UDC Summary to Dewey Summaries requires more thought. Firstly, UDC class (7) North America (common auxiliary of place), which simply represents the place, has to be mapped to all occurrences in which this place is 'built in' to the Dewey subjects: 063 Organization of North America; 073 Journalism of North America; 917 Geography of North America; 970 History of North America; 277 Christianity in North America; 317 General Statistics in North America; 557 Earth Sciences of North America. The type of mapping from what is a general UDC concept of place, (7) North America, to a specific subject is clearly a broader-to-narrower match. Mapping of, for instance, UDC class 07 Newspapers. The press (includes journalism) to DDC class 073 Journalism of North America is again a broader-to-narrower match.
Precombined subjects, such as those shown above from Dewey, may be expressed in UDC Summary as examples of combination within various records. To express an exact match, UDC class 07 has to contain the example of combination 07(7) Journals. The Press - North America. In some cases we have, therefore, added examples to UDC Summary that represent an exact match to Dewey Summaries. It is unfortunate that DDC has so many classes on the top level that deal with a selection of countries or languages that are given a preferred status in the scheme, and repeating these preferences in examples of combinations of UDC emulates an unwelcome cultural bias which we have to balance out somehow. This brings us to another challenge. UDC 913(7) Regional Geography - North America [contains 2 concepts, each of which has its URI] is an exact match to Dewey 917 [represented as one concept, 1 URI]. It seems that, because they represent an exact match to Dewey numbers, these UDC examples of combinations may also need separate URIs so that they can be published as SKOS data. Albeit challenging, mapping proves to be a very useful exercise and I am looking forward to future work here, especially in relation to our plans to map UDC Summary to Colon Classification. We are discussing this project with colleagues from DRTC in Bangalore (India)."
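The mapping types described in this post can be written down as simple records. The sketch below is one possible reading, not the UDC Consortium's actual data: it uses the standard SKOS mapping property names (skos:exactMatch, skos:narrowMatch, skos:broadMatch), while the `Mapping` class, the notation strings and the helper functions are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mapping:
    source: str    # notation in the source scheme
    relation: str  # SKOS mapping property; "A narrowMatch B" means B is narrower than A
    target: str    # notation in the target scheme

MAPPINGS = [
    # Dewey 850 covers three UDC language-family classes
    Mapping("DDC 850", "skos:narrowMatch", "UDC 821.131"),
    Mapping("DDC 850", "skos:narrowMatch", "UDC 821.132"),
    Mapping("DDC 850", "skos:narrowMatch", "UDC 821.135"),
    # ...or can be attached to the broader Romance literatures class
    Mapping("DDC 850", "skos:broadMatch", "UDC 821.13"),
    # The UDC place auxiliary is broader than Dewey's precombined classes
    Mapping("UDC (7)", "skos:narrowMatch", "DDC 917"),
    Mapping("UDC (7)", "skos:narrowMatch", "DDC 970"),
    # An example of combination gives an exact match
    Mapping("UDC 913(7)", "skos:exactMatch", "DDC 917"),
]

INVERSE = {
    "skos:narrowMatch": "skos:broadMatch",
    "skos:broadMatch": "skos:narrowMatch",
    "skos:exactMatch": "skos:exactMatch",
}

def targets(source, relation):
    """Return all targets mapped from `source` with the given relation."""
    return [m.target for m in MAPPINGS
            if m.source == source and m.relation == relation]

def invert(mapping):
    """Express the same mapping from the target scheme's point of view."""
    return Mapping(mapping.target, INVERSE[mapping.relation], mapping.source)

print(targets("DDC 850", "skos:narrowMatch"))
print(invert(MAPPINGS[4]))
```

Inverting a record makes the asymmetry discussed above visible: a many-to-one, specific-to-broader match in one direction becomes a one-to-many, broader-to-narrower match in the other.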
Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.03
```
0.030435055 = product of:
  0.10652269 = sum of:
    0.018003922 = weight(_text_:subject in 1253) [ClassicSimilarity], result of:
      0.018003922 = score(doc=1253,freq=4.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.16765293 = fieldWeight in 1253, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1253)
    0.031919144 = weight(_text_:classification in 1253) [ClassicSimilarity], result of:
      0.031919144 = score(doc=1253,freq=20.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.33380723 = fieldWeight in 1253, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1253)
    0.024680478 = product of:
      0.049360957 = sum of:
        0.049360957 = weight(_text_:schemes in 1253) [ClassicSimilarity], result of:
          0.049360957 = score(doc=1253,freq=6.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.30721486 = fieldWeight in 1253, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1253)
      0.5 = coord(1/2)
    0.031919144 = weight(_text_:classification in 1253) [ClassicSimilarity], result of:
      0.031919144 = score(doc=1253,freq=20.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.33380723 = fieldWeight in 1253, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1253)
  0.2857143 = coord(4/14)
```
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increase, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC), within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR). Our work with the Alexandria Digital Library (ADL) Project focuses on geo-referenced information, whether text, maps, aerial photographs, or satellite images. As a result, we have emphasized techniques which work with both text and non-text, such as combined textual and graphical queries, multi-dimensional indexing, and IR methods which are not solely dependent on words or phrases. Part of this work involves locating relevant online sources of information. In particular, we have designed and are currently testing aspects of an architecture, Pharos, which we believe will scale up to 1,000,000 heterogeneous sources. Pharos accommodates heterogeneity in content and format, both among multiple sources and within a single source.
That is, we consider sources to include Web sites, FTP archives, newsgroups, and full digital libraries; all of these systems can include a wide variety of content and multimedia data formats. Pharos is based on the use of hierarchical classification schemes. These include not only well-known 'subject' (or 'concept') based schemes such as the Dewey Decimal System and the LCC, but also, for example, geographic classifications, which might be constructed as layers of smaller and smaller hierarchical longitude/latitude boxes. Pharos is designed to work with sophisticated queries which utilize subjects, geographical locations, temporal specifications, and other types of information domains. The Pharos architecture requires that hierarchically structured collection metadata be extracted so that it can be partitioned in such a way as to greatly enhance scalability. Automated classification is important to Pharos because it allows information sources to automatically extract the requisite collection metadata that must be distributed.
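A geographic classification built from "layers of smaller and smaller hierarchical longitude/latitude boxes" could look like the following. This is a minimal sketch assuming a plain quadrant subdivision; the abstract does not say how Pharos actually builds its boxes, and `box_path` and its quadrant labels are hypothetical.

```python
def box_path(lat, lon, depth):
    """Classify a point into nested latitude/longitude boxes.

    Each level halves the current box in both directions and records
    which quadrant (NW, NE, SW, SE) contains the point.
    """
    south, north, west, east = -90.0, 90.0, -180.0, 180.0
    path = []
    for _ in range(depth):
        mid_lat = (south + north) / 2
        mid_lon = (west + east) / 2
        if lat >= mid_lat:
            ns, south = "N", mid_lat
        else:
            ns, north = "S", mid_lat
        if lon >= mid_lon:
            ew, west = "E", mid_lon
        else:
            ew, east = "W", mid_lon
        path.append(ns + ew)
    # The path plays the role of a hierarchical class mark;
    # the tuple is the final bounding box (south, north, west, east).
    return path, (south, north, west, east)

# An arbitrary point on the Californian coast, three levels deep
path, box = box_path(34.4, -119.7, 3)
print(path, box)
```

As in a subject hierarchy, a shorter path is a broader class: every point whose path starts with the same prefix falls inside the same enclosing box.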
We are currently experimenting with newsgroups as collections. We have built an initial prototype which automatically classifies and summarizes newsgroups within the LCC. (The prototype can be tested below, and more details may be found at http://pharos.alexandria.ucsb.edu/). The prototype uses electronic library catalog records as a 'training set' and Latent Semantic Indexing (LSI) for IR. We use the training set to build a rich set of classification terminology, and associate these terms with the relevant categories in the LCC. This association between terms and classification categories allows us to relate users' queries to nodes in the LCC so that users can select appropriate query categories. Newsgroups are similarly associated with classification categories. Pharos then matches the categories selected by users to relevant newsgroups. In principle, this approach allows users to exclude newsgroups that might have been selected based on an unintended meaning of a query term, and to include newsgroups with relevant content even though the exact query terms may not have been used. This work is extensible to other types of classification, including geographical, temporal, and image feature. Before discussing the methodology of the collection summarization and selection, we first present an online demonstration below. The demonstration is not intended to be a complete end-user interface. Rather, it is intended merely to offer a view of the process to suggest the "look and feel" of the prototype. The demo works as follows. First supply it with a few keywords of interest. The system will then use those terms to try to return to you the most relevant subject categories within the LCC. Assuming that the system recognizes any of your terms (it has over 400,000 terms indexed), it will give you a list of 15 LCC categories sorted by relevancy ranking. From there, you have two choices.
The first choice, by clicking on the "News" links, is to get a list of newsgroups which the system has identified as relevant to the LCC category you select. The other choice, by clicking on the LCC ID links, is to enter the LCC hierarchy starting at the category of your choice and navigate the tree until you locate the best category for your query. From there, again, you can get a list of newsgroups by clicking on the "News" links. After having shown this demonstration to many people, we would like to suggest that you first give it easier examples before trying to break it. For example, "prostate cancer" (discussed below), "remote sensing", "investment banking", and "gershwin" all work reasonably well.
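The flow of the demo (keywords, then ranked categories, then newsgroups) can be illustrated with a toy example. This sketch deliberately swaps Latent Semantic Indexing for simple term counting, and every category label, record title and newsgroup name in it is invented for illustration; it is not the Pharos implementation.

```python
from collections import Counter

# Hypothetical training set: category -> catalog record titles
TRAINING = {
    "medicine": ["prostate cancer treatment", "cancer screening"],
    "geography": ["remote sensing of land cover", "satellite remote sensing"],
    "finance": ["investment banking handbook"],
}

# Hypothetical newsgroups already assigned to categories
NEWSGROUPS = {
    "medicine": ["example.med.cancer"],
    "geography": ["example.geo.sensing"],
    "finance": ["example.biz.invest"],
}

def build_index(training):
    """Associate each term with the categories whose records use it."""
    index = {}
    for category, titles in training.items():
        for title in titles:
            for term in title.lower().split():
                index.setdefault(term, Counter())[category] += 1
    return index

def rank_categories(query, index, limit=15):
    """Rank categories by how often the query terms occur in them."""
    scores = Counter()
    for term in query.lower().split():
        scores.update(index.get(term, {}))
    return [category for category, _ in scores.most_common(limit)]

def newsgroups_for(query, index):
    """Follow the 'News' links for every ranked category."""
    return [group
            for category in rank_categories(query, index)
            for group in NEWSGROUPS.get(category, [])]

INDEX = build_index(TRAINING)
print(rank_categories("prostate cancer", INDEX))
print(newsgroups_for("remote sensing", INDEX))
```

A query whose terms are all unknown returns an empty list, which mirrors the caveat that the system must recognize at least one of the supplied terms.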
