Search (82 results, page 1 of 5)

  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Gnoli, C.; Santis, R. de; Pusterla, L.: Commerce, see also Rhetoric : cross-discipline relationships as authority data for enhanced retrieval (2015) 0.04
    0.035463274 = product of:
      0.12412146 = sum of:
        0.03761707 = weight(_text_:classification in 2299) [ClassicSimilarity], result of:
          0.03761707 = score(doc=2299,freq=10.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.39339557 = fieldWeight in 2299, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
        0.0237488 = product of:
          0.0474976 = sum of:
            0.0474976 = weight(_text_:schemes in 2299) [ClassicSimilarity], result of:
              0.0474976 = score(doc=2299,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.2956176 = fieldWeight in 2299, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2299)
          0.5 = coord(1/2)
        0.02513852 = weight(_text_:bibliographic in 2299) [ClassicSimilarity], result of:
          0.02513852 = score(doc=2299,freq=2.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.21506234 = fieldWeight in 2299, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
        0.03761707 = weight(_text_:classification in 2299) [ClassicSimilarity], result of:
          0.03761707 = score(doc=2299,freq=10.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.39339557 = fieldWeight in 2299, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
      0.2857143 = coord(4/14)
    
    Abstract
    Subjects in a classification scheme are often related to other subjects belonging to different hierarchies. This problem was identified already by Hugh of Saint Victor (1096?-1141). Still with present-time bibliographic classifications, a user browsing the class of architecture under the hierarchy of arts may miss relevant items classified in building or in civil engineering under the hierarchy of applied sciences. To face these limitations we have developed SciGator, a browsable interface to explore the collections of all scientific libraries at the University of Pavia. Besides showing subclasses of a given class, the interface points users to related classes in the Dewey Decimal Classification, or in other local schemes, and allows for expanded queries that include them. This is made possible by using a special field for related classes in the database structure which models classification authority data. Ontologically, many relationships between classes in different hierarchies are cases of existential dependence. Dependence can occur between disciplines in such disciplinary classifications as Dewey (e.g. architecture existentially depends on building), or between phenomena in such phenomenon-based classifications as the Integrative Levels Classification (e.g. fishing as a human activity existentially depends on fish as a class of organisms). We provide an example of its representation in OWL and discuss some details of it.
    Source
    Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro
  2. Jun, W.: ¬A knowledge network constructed by integrating classification, thesaurus and metadata in a digital library (2003) 0.03
    0.030942764 = product of:
      0.10829967 = sum of:
        0.016974261 = weight(_text_:subject in 1254) [ClassicSimilarity], result of:
          0.016974261 = score(doc=1254,freq=2.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.15806471 = fieldWeight in 1254, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03125 = fieldNorm(doc=1254)
        0.035607297 = weight(_text_:classification in 1254) [ClassicSimilarity], result of:
          0.035607297 = score(doc=1254,freq=14.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.37237754 = fieldWeight in 1254, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=1254)
        0.020110816 = weight(_text_:bibliographic in 1254) [ClassicSimilarity], result of:
          0.020110816 = score(doc=1254,freq=2.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.17204987 = fieldWeight in 1254, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03125 = fieldNorm(doc=1254)
        0.035607297 = weight(_text_:classification in 1254) [ClassicSimilarity], result of:
          0.035607297 = score(doc=1254,freq=14.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.37237754 = fieldWeight in 1254, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=1254)
      0.2857143 = coord(4/14)
    
    Abstract
    Knowledge management in digital libraries is a universal problem. Keyword-based searching is applied everywhere no matter whether the resources are indexed databases or full-text Web pages. In keyword matching, the valuable content description and indexing of the metadata, such as the subject descriptors and the classification notations, are merely treated as common keywords to be matched with the user query. Without the support of vocabulary control tools, such as classification systems and thesauri, the intelligent labor of content analysis, description and indexing in metadata production are seriously wasted. New retrieval paradigms are needed to exploit the potential of the metadata resources. Could classification and thesauri, which contain the condensed intelligence of generations of librarians, be used in a digital library to organize the networked information, especially metadata, to facilitate their usability and change the digital library into a knowledge management environment? To examine that question, we designed and implemented a new paradigm that incorporates a classification system, a thesaurus and metadata. The classification and the thesaurus are merged into a concept network, and the metadata are distributed into the nodes of the concept network according to their subjects. The abstract concept node instantiated with the related metadata records becomes a knowledge node. A coherent and consistent knowledge network is thus formed. It is not only a framework for resource organization but also a structure for knowledge navigation, retrieval and learning. We have built an experimental system based on the Chinese Classification and Thesaurus, which is the most comprehensive and authoritative in China, and we have incorporated more than 5000 bibliographic records in the computing domain from the Peking University Library. The result is encouraging. In this article, we review the tools, the architecture and the implementation of our experimental system, which is called Vision.
  3. Wang, Z.; Khoo, C.S.G.; Chaudhry, A.S.: Evaluation of the navigation effectiveness of an organizational taxonomy built on a general classification scheme and domain thesauri (2014) 0.02
    0.021092122 = product of:
      0.0984299 = sum of:
        0.03496567 = weight(_text_:classification in 1251) [ClassicSimilarity], result of:
          0.03496567 = score(doc=1251,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.3656675 = fieldWeight in 1251, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=1251)
        0.02849856 = product of:
          0.05699712 = sum of:
            0.05699712 = weight(_text_:schemes in 1251) [ClassicSimilarity], result of:
              0.05699712 = score(doc=1251,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.35474116 = fieldWeight in 1251, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1251)
          0.5 = coord(1/2)
        0.03496567 = weight(_text_:classification in 1251) [ClassicSimilarity], result of:
          0.03496567 = score(doc=1251,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.3656675 = fieldWeight in 1251, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=1251)
      0.21428572 = coord(3/14)
    
    Abstract
    This paper presents an evaluation study of the navigation effectiveness of a multifaceted organizational taxonomy that was built on the Dewey Decimal Classification and several domain thesauri in the area of library and information science education. The objective of the evaluation was to detect deficiencies in the taxonomy and to infer problems of applied construction steps from users' navigation difficulties. The evaluation approach included scenario-based navigation exercises and postexercise interviews. Navigation exercise errors and underlying reasons were analyzed in relation to specific components of the taxonomy and applied construction steps. Guidelines for the construction of the hierarchical structure and categories of an organizational taxonomy using existing general classification schemes and domain thesauri were derived from the evaluation results.
  4. Gnoli, C.; Pusterla, L.; Bendiscioli, A.; Recinella, C.: Classification for collections mapping and query expansion (2016) 0.02
    0.021092122 = product of:
      0.0984299 = sum of:
        0.03496567 = weight(_text_:classification in 3102) [ClassicSimilarity], result of:
          0.03496567 = score(doc=3102,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.3656675 = fieldWeight in 3102, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=3102)
        0.02849856 = product of:
          0.05699712 = sum of:
            0.05699712 = weight(_text_:schemes in 3102) [ClassicSimilarity], result of:
              0.05699712 = score(doc=3102,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.35474116 = fieldWeight in 3102, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3102)
          0.5 = coord(1/2)
        0.03496567 = weight(_text_:classification in 3102) [ClassicSimilarity], result of:
          0.03496567 = score(doc=3102,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.3656675 = fieldWeight in 3102, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=3102)
      0.21428572 = coord(3/14)
    
    Abstract
    Dewey Decimal Classification has been used to organize materials owned by the three scientific libraries at the University of Pavia, and to allow integrated browsing in their union catalogue through SciGator, a home built web-based user interface. Classification acts as a bridge between collections located in different places and shelved according to different local schemes. Furthermore, cross-discipline relationships recorded in the system allow for expanded queries that increase recall. Advantages and possible improvements of such a system are discussed.
  5. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.02
    0.018778171 = product of:
      0.087631464 = sum of:
        0.03364573 = weight(_text_:classification in 3280) [ClassicSimilarity], result of:
          0.03364573 = score(doc=3280,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.35186368 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
        0.03364573 = weight(_text_:classification in 3280) [ClassicSimilarity], result of:
          0.03364573 = score(doc=3280,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.35186368 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
        0.020340007 = product of:
          0.040680014 = sum of:
            0.040680014 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
              0.040680014 = score(doc=3280,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.38690117 = fieldWeight in 3280, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3280)
          0.5 = coord(1/2)
      0.21428572 = coord(3/14)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  6. Fidel, R.; Efthimiadis, E.N.: Terminological knowledge structure for intermediary expert systems (1995) 0.02
    0.018342271 = product of:
      0.08559726 = sum of:
        0.028549349 = weight(_text_:classification in 5695) [ClassicSimilarity], result of:
          0.028549349 = score(doc=5695,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 5695, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=5695)
        0.02849856 = product of:
          0.05699712 = sum of:
            0.05699712 = weight(_text_:schemes in 5695) [ClassicSimilarity], result of:
              0.05699712 = score(doc=5695,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.35474116 = fieldWeight in 5695, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5695)
          0.5 = coord(1/2)
        0.028549349 = weight(_text_:classification in 5695) [ClassicSimilarity], result of:
          0.028549349 = score(doc=5695,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 5695, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=5695)
      0.21428572 = coord(3/14)
    
    Abstract
    To provide advice for online searching about term selection and query expansion, an intermediary expert system should indicate a terminological knowledge structure. Terminological attributes could provide the foundation of a knowledge base, and knowledge acquisition could rely on knowledge base techniques coupled with statistical techniques. The strategies of expert searchers would provide 1 source of knowledge. The knowledge structure would include 3 constructs for each term: frequency data, a hedge, and a position in a classification scheme. Switching vocabularies could provide a meta-scheme and facilitate the interoperability of databases in similar subjects. To develop such knowledge structure, research should focus on terminological attributes, word and phrase disambiguation, automated text processing, and the role of thesauri and classification schemes in indexing and retrieval. It should develop techniques that combine knowledge base and statistical methods and that consider user preferences
  7. ALA / Subcommittee on Subject Relationships/Reference Structures: Final Report to the ALCTS/CCS Subject Analysis Committee (1997) 0.02
    0.017777555 = product of:
      0.082961924 = sum of:
        0.059409913 = weight(_text_:subject in 1800) [ClassicSimilarity], result of:
          0.059409913 = score(doc=1800,freq=32.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.5532265 = fieldWeight in 1800, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1800)
        0.011776006 = weight(_text_:classification in 1800) [ClassicSimilarity], result of:
          0.011776006 = score(doc=1800,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.12315229 = fieldWeight in 1800, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1800)
        0.011776006 = weight(_text_:classification in 1800) [ClassicSimilarity], result of:
          0.011776006 = score(doc=1800,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.12315229 = fieldWeight in 1800, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1800)
      0.21428572 = coord(3/14)
    
    Abstract
    The SAC Subcommittee on Subject Relationships/Reference Structures was authorized at the 1995 Midwinter Meeting and appointed shortly before Annual Conference. Its creation was one result of a discussion of how (and why) to promote the display and use of broader-term subject heading references, and its charge reads as follows: To investigate: (1) the kinds of relationships that exist between subjects, the display of which are likely to be useful to catalog users; (2) how these relationships are or could be recorded in authorities and classification formats; (3) options for how these relationships should be presented to users of online and print catalogs, indexes, lists, etc. By the summer 1996 Annual Conference, make some recommendations to SAC about how to disseminate the information and/or implement changes. At that time assess the need for additional time to investigate these issues. The Subcommittee's work on each of the imperatives in the charge was summarized in a report issued at the 1996 Annual Conference (Appendix A). Highlights of this work included the development of a taxonomy of 165 subject relationships; a demonstration that, using existing MARC coding, catalog systems could be programmed to generate references they do not currently support; and an examination of reference displays in several CD-ROM database products. Since that time, work has continued on identifying term relationships and display options; on tracking research, discussion, and implementation of subject relationships in information systems; and on compiling a list of further research needs.
    Content
    Enthält: Appendix A: Subcommittee on Subject Relationships/Reference Structures - REPORT TO THE ALCTS/CCS SUBJECT ANALYSIS COMMITTEE - July 1996 Appendix B (part 1): Taxonomy of Subject Relationships. Compiled by Dee Michel with the assistance of Pat Kuhr - June 1996 draft (alphabetical display) (Separat in: http://web2.ala.org/ala/alctscontent/CCS/committees/subjectanalysis/subjectrelations/msrscu2.pdf) Appendix B (part 2): Taxonomy of Subject Relationships. Compiled by Dee Michel with the assistance of Pat Kuhr - June 1996 draft (hierarchical display) Appendix C: Checklist of Candidate Subject Relationships for Information Retrieval. Compiled by Dee Michel, Pat Kuhr, and Jane Greenberg; edited by Greg Wool - June 1997 Appendix D: Review of Reference Displays in Selected CD-ROM Abstracts and Indexes by Harriette Hemmasi and Steven Riel Appendix E: Analysis of Relationships in Six LC Subject Authority Records by Harriette Hemmasi and Gary Strawn Appendix F: Report of a Preliminary Survey of Subject Referencing in OPACs by Gregory Wool Appendix G: LC Subject Referencing in OPACs--Why Bother? by Gregory Wool Appendix H: Research Needs on Subject Relationships and Reference Structures in Information Access compiled by Jane Greenberg and Steven Riel with contributions from Dee Michel and others edited by Gregory Wool Appendix I: Bibliography on Subject Relationships compiled mostly by Dee Michel with additional contributions from Jane Greenberg, Steven Riel, and Gregory Wool
  8. Prieto-Díaz, R.: ¬A faceted approach to building ontologies (2002) 0.01
    0.014107771 = product of:
      0.065836266 = sum of:
        0.02546139 = weight(_text_:subject in 2259) [ClassicSimilarity], result of:
          0.02546139 = score(doc=2259,freq=2.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.23709705 = fieldWeight in 2259, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.046875 = fieldNorm(doc=2259)
        0.02018744 = weight(_text_:classification in 2259) [ClassicSimilarity], result of:
          0.02018744 = score(doc=2259,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 2259, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2259)
        0.02018744 = weight(_text_:classification in 2259) [ClassicSimilarity], result of:
          0.02018744 = score(doc=2259,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 2259, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2259)
      0.21428572 = coord(3/14)
    
    Abstract
    An ontology is "an explicit conceptualization of a domain of discourse, and thus provides a shared and common understanding of the domain." We have been producing ontologies for millennia to understand and explain our rationale and environment. From Plato's philosophical framework to modern day classification systems, ontologies are, in most cases, the product of extensive analysis and categorization. Only recently has the process of building ontologies become a research topic of interest. Today, ontologies are built very much ad-hoc. A terminology is first developed providing a controlled vocabulary for the subject area or domain of interest, then it is organized into a taxonomy where key concepts are identified, and finally these concepts are defined and related to create an ontology. The intent of this paper is to show that domain analysis methods can be used for building ontologies. Domain analysis aims at generic models that represent groups of similar systems within an application domain. In this sense, it deals with categorization of common objects and operations, with clear, unambiguous definitions of them and with defining their relationships.
  9. Morato, J.; Llorens, J.; Genova, G.; Moreiro, J.A.: Experiments in discourse analysis impact on information classification and retrieval algorithms (2003) 0.01
    0.010747734 = product of:
      0.07523414 = sum of:
        0.03761707 = weight(_text_:classification in 1083) [ClassicSimilarity], result of:
          0.03761707 = score(doc=1083,freq=10.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.39339557 = fieldWeight in 1083, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1083)
        0.03761707 = weight(_text_:classification in 1083) [ClassicSimilarity], result of:
          0.03761707 = score(doc=1083,freq=10.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.39339557 = fieldWeight in 1083, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1083)
      0.14285715 = coord(2/14)
    
    Abstract
    Researchers in indexing and retrieval systems have been advocating the inclusion of more contextual information to improve results. The proliferation of full-text databases and advances in computer storage capacity have made it possible to carry out text analysis by means of linguistic and extra-linguistic knowledge. Since the mid 80s, research has tended to pay more attention to context, giving discourse analysis a more central role. The research presented in this paper aims to check whether discourse variables have an impact on modern information retrieval and classification algorithms. In order to evaluate this hypothesis, a functional framework for information analysis in an automated environment has been proposed, where the n-grams (filtering) and the k-means and Chen's classification algorithms have been tested against sub-collections of documents based on the following discourse variables: "Genre", "Register", "Domain terminology", and "Document structure". The results obtained with the algorithms for the different sub-collections were compared to the MeSH information structure. These demonstrate that n-grams does not appear to have a clear dependence on discourse variables, though the k-means classification algorithm does, but only on domain terminology and document structure, and finally Chen's algorithm has a clear dependence on all of the discourse variables. This information could be used to design better classification algorithms, where discourse variables should be taken into account. Other minor conclusions drawn from these results are also presented.
  10. Hancock-Beaulieu, M.: Evaluating the impact of an online library catalogue on subject searching behaviour at the catalogue and at the shelves (1990) 0.01
    0.010328798 = product of:
      0.07230158 = sum of:
        0.03675035 = weight(_text_:subject in 5691) [ClassicSimilarity], result of:
          0.03675035 = score(doc=5691,freq=6.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.34222013 = fieldWeight in 5691, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5691)
        0.035551235 = weight(_text_:bibliographic in 5691) [ClassicSimilarity], result of:
          0.035551235 = score(doc=5691,freq=4.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.30414405 = fieldWeight in 5691, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5691)
      0.14285715 = coord(2/14)
    
    Abstract
    The second half of a 'before and after' study to evaluate the impact of an online catalogue on subject searching behaviour is reported. A holistic approach is adopted encompassing both catalogue use and browsing at the shelves for catalogue users and non-users. Verbal and non-verbal data were elicited from searchers using a combined methodology including talk-aloud technique, observation and a screen logging facility. An extensive qualitative analysis was carried out correlating expressed topics, search formulation strategies and documents retrieved at the shelves. The online catalogue environment does not appear to have increased the extent of subject searching nor the use of the bibliographic tool. The manual PRECIS index supported a contextual approach for broad and more interactive search formulations whereas the OPAC encouraged a matching approach and narrow formulations with fewer but user generated formulations. The success rate of the online catalogue was slightly better than that of the manual tools but fewer items were retrieved at the shelves. Non-users of the bibliographic tools seemed to be just as successful. To improve retrieval effectiveness it is suggested that online catalogues should cater for both matching and contextual approaches to searching. Recent research indicates that a more interactive process could be promoted by providing query expansion through a combination of searching aids for matching, for search formulation assistance and for structured contextual retrieval
  11. Ru, C.; Tang, J.; Li, S.; Xie, S.; Wang, T.: Using semantic similarity to reduce wrong labels in distant supervision for relation extraction (2018) 0.01
    0.009613066 = product of:
      0.06729146 = sum of:
        0.03364573 = weight(_text_:classification in 5055) [ClassicSimilarity], result of:
          0.03364573 = score(doc=5055,freq=8.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.35186368 = fieldWeight in 5055, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5055)
        0.03364573 = weight(_text_:classification in 5055) [ClassicSimilarity], result of:
          0.03364573 = score(doc=5055,freq=8.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.35186368 = fieldWeight in 5055, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5055)
      0.14285715 = coord(2/14)
    
    Abstract
    Distant supervision (DS) has the advantage of automatically generating large amounts of labelled training data and has been widely used for relation extraction. However, there are usually many wrong labels in the automatically labelled data in distant supervision (Riedel, Yao, & McCallum, 2010). This paper presents a novel method to reduce the wrong labels. The proposed method uses the semantic Jaccard with word embedding to measure the semantic similarity between the relation phrase in the knowledge base and the dependency phrases between two entities in a sentence to filter the wrong labels. In the process of reducing wrong labels, the semantic Jaccard algorithm selects a core dependency phrase to represent the candidate relation in a sentence, which can capture features for relation classification and avoid the negative impact from irrelevant term sequences that previous neural network models of relation extraction often suffer. In the process of relation classification, the core dependency phrases are also used as the input of a convolutional neural network (CNN) for relation classification. The experimental results show that compared with the methods using original DS data, the methods using filtered DS data performed much better in relation extraction. It indicates that the semantic similarity based method is effective in reducing wrong labels. The relation extraction performance of the CNN model using the core dependency phrases as input is the best of all, which indicates that using the core dependency phrases as input of CNN is enough to capture the features for relation classification and could avoid negative impact from irrelevant terms.
  12. Green, R.: See-also relationships in the Dewey Decimal Classification (2011) 0.01
    0.009516451 = product of:
      0.06661515 = sum of:
        0.033307575 = weight(_text_:classification in 4615) [ClassicSimilarity], result of:
          0.033307575 = score(doc=4615,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.34832728 = fieldWeight in 4615, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4615)
        0.033307575 = weight(_text_:classification in 4615) [ClassicSimilarity], result of:
          0.033307575 = score(doc=4615,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.34832728 = fieldWeight in 4615, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4615)
      0.14285715 = coord(2/14)
    
    Abstract
    This paper investigates the semantics of topical, associative see-also relationships in schedule and table entries of the Dewey Decimal Classification (DDC) system. Based on the see-also relationships in a random sample of 100 classes containing one or more of these relationships, a semi-structured inventory of sources of see-also relationships is generated, of which the most important are lexical similarity, complementarity, facet difference, and relational configuration difference. The premise that see-also relationships based on lexical similarity may be language-specific is briefly examined. The paper concludes with recommendations on the continued use of see-also relationships in the DDC.
  13. Moreira, W.; Martínez-Ávila, D.: Concept relationships in knowledge organization systems : elements for analysis and common research among fields (2018) 0.01
    0.009516451 = product of:
      0.06661515 = sum of:
        0.033307575 = weight(_text_:classification in 5166) [ClassicSimilarity], result of:
          0.033307575 = score(doc=5166,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.34832728 = fieldWeight in 5166, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5166)
        0.033307575 = weight(_text_:classification in 5166) [ClassicSimilarity], result of:
          0.033307575 = score(doc=5166,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.34832728 = fieldWeight in 5166, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5166)
      0.14285715 = coord(2/14)
    
    Abstract
    Knowledge organization systems have been studied in several fields and for different and complementary aspects. Among the aspects that concentrate common interests, in this article we highlight those related to the terminological and conceptual relationships among the components of any knowledge organization system. This research aims to contribute to the critical analysis of knowledge organization systems, especially ontologies, thesauri, and classification systems, by the comprehension of its similarities and differences when dealing with concepts and their ways of relating to each other as well as to the conceptual design that is adopted.
    Source
    Cataloging and classification quarterly. 56(2018) no.1, S.19-39
  14. Salaba, A.; Zeng, M.L.: Extending the "Explore" user task beyond subject authority data into the linked data sphere (2014) 0.01
    0.009384072 = product of:
      0.065688506 = sum of:
        0.0514505 = weight(_text_:subject in 1465) [ClassicSimilarity], result of:
          0.0514505 = score(doc=1465,freq=6.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.4791082 = fieldWeight in 1465, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1465)
        0.014238005 = product of:
          0.02847601 = sum of:
            0.02847601 = weight(_text_:22 in 1465) [ClassicSimilarity], result of:
              0.02847601 = score(doc=1465,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.2708308 = fieldWeight in 1465, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1465)
          0.5 = coord(1/2)
      0.14285715 = coord(2/14)
    
    Abstract
    "Explore" is a user task introduced in the Functional Requirements for Subject Authority Data (FRSAD) final report. Through various case scenarios, the authors discuss how structured data, presented based on Linked Data principles and using knowledge organisation systems (KOS) as the backbone, extend the explore task within and beyond subject authority data.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  15. Zeng, M.L.; Gracy, K.F.; Zumer, M.: Using a semantic analysis tool to generate subject access points : a study using Panofsky's theory and two research samples (2014) 0.01
    0.009018113 = product of:
      0.06312679 = sum of:
        0.05092278 = weight(_text_:subject in 1464) [ClassicSimilarity], result of:
          0.05092278 = score(doc=1464,freq=8.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.4741941 = fieldWeight in 1464, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.046875 = fieldNorm(doc=1464)
        0.0122040035 = product of:
          0.024408007 = sum of:
            0.024408007 = weight(_text_:22 in 1464) [ClassicSimilarity], result of:
              0.024408007 = score(doc=1464,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.23214069 = fieldWeight in 1464, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1464)
          0.5 = coord(1/2)
      0.14285715 = coord(2/14)
    
    Abstract
    This paper attempts to explore an approach of using an automatic semantic analysis tool to enhance the "subject" access to materials that are not included in the usual library subject cataloging process. Using two research samples the authors analyzed the access points supplied by OpenCalais, a semantic analysis tool. As an aid in understanding how computerized subject analysis might be approached, this paper suggests using the three-layer framework that has been accepted and applied in image analysis, developed by Erwin Panofsky.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  16. Caro Castro, C.; Travieso Rodríguez, C.: Ariadne's thread : knowledge structures for browsing in OPAC's (2003) 0.01
    0.008752408 = product of:
      0.061266854 = sum of:
        0.03638099 = weight(_text_:subject in 2768) [ClassicSimilarity], result of:
          0.03638099 = score(doc=2768,freq=12.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.33878064 = fieldWeight in 2768, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2768)
        0.024885865 = weight(_text_:bibliographic in 2768) [ClassicSimilarity], result of:
          0.024885865 = score(doc=2768,freq=4.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.21290085 = fieldWeight in 2768, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2768)
      0.14285715 = coord(2/14)
    
    Abstract
    Subject searching is the most common but also the most conflictive searching for end user. The aim of this paper is to check how users expressions match subject headings and to prove if knowledge structure used in online catalogs enhances searching effectiveness. A bibliographic revision about difficulties in subject access and proposed methods to improve it is also presented. For the empirical analysis, transaction logs from two university libraries, online catalogs (CISNE and FAMA) were collected. Results show that more than a quarter of user queries are effective due to an alphabetical subject index approach and browsing through hypertextual links. 1. Introduction Since the 1980's, online public access catalogs (OPAC's) have become usual way to access bibliographic information. During the last two decades the technological development has helped to extend their use, making feasible the access for a whole of users that is getting more and more extensive and heterogeneous, and also to incorporate information resources in electronic formats and to interconnect systems. However, technology seems to have developed faster than our knowledge about the tasks where it has been applied and than the evolution of our capacities for adapting to it. The conceptual model of OPAC has been hardly modified recently, and for interacting with them, users still need to combine the same skills and basic knowledge than at the beginning of its introduction (Borgman, 1986, 2000): a) conceptual knowledge to translate the information need into an appropriate query because of a well-designed mental model of the system, b) semantic and syntactic knowledge to be able to implement that query (access fields, searching type, Boolean logic, etc.) and c) basic technical skills in computing. At present many users have the essential technical skills to make use, with more or less expertise, of a computer. This number is substantially reduced when it is referred to the conceptual, semantic and syntactic knowledge that is necessary to achieve a moderately satisfactory search. An added difficulty arises in subject searching, as users should concrete their unknown information needs in terms that the information retrieval system can understand. Many researches have focused an unskilled searchers' difficulties to enter an effective query. The mental models influence, users assumption about characteristics, structure, contents and operation of the system they interact with have been analysed (Dillon, 2000; Dimitroff, 2000). Another issue that implies difficulties is vocabulary: how to find the right terms to implement a query and to modify it as the case may be. Terminology and expressions characteristics used in searching (Bates, 1993), the match between user terms and the subject headings from the catalog (Carlyle, 1989; Drabensttot, 1996; Drabensttot & Vizine-Goetz, 1994), the incidence of spelling errors (Drabensttot and Weller, 1996; Ferl and Millsap, 1996; Walker and Jones, 1987), users problems
  17. Poynder, R.: Web research engines? (1996) 0.01
    0.008156957 = product of:
      0.057098698 = sum of:
        0.028549349 = weight(_text_:classification in 5698) [ClassicSimilarity], result of:
          0.028549349 = score(doc=5698,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 5698, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=5698)
        0.028549349 = weight(_text_:classification in 5698) [ClassicSimilarity], result of:
          0.028549349 = score(doc=5698,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 5698, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=5698)
      0.14285715 = coord(2/14)
    
    Abstract
    Describes the shortcomings of search engines for the WWW comparing their current capabilities to those of the first generation CD-ROM products. Some allow phrase searching and most are improving their Boolean searching. Few allow truncation, wild cards or nested logic. They are stateless, losing previous search criteria. Unlike the indexing and classification systems for today's CD-ROMs, those for Web pages are random, unstructured and of variable quality. Considers that at best Web search engines can only offer free text searching. Discusses whether automatic data classification systems such as Infoseek Ultra can overcome the haphazard nature of the Web with neural network technology, and whether Boolean search techniques may be redundant when replaced by technology such as the Euroferret search engine. However, artificial intelligence is rarely successful on huge, varied databases. Relevance ranking and automatic query expansion still use the same simple inverted indexes. Most Web search engines do nothing more than word counting. Further complications arise with foreign languages
  18. Spiteri, L.F.: ¬The essential elements of faceted thesauri (1999) 0.01
    0.008156957 = product of:
      0.057098698 = sum of:
        0.028549349 = weight(_text_:classification in 5362) [ClassicSimilarity], result of:
          0.028549349 = score(doc=5362,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 5362, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=5362)
        0.028549349 = weight(_text_:classification in 5362) [ClassicSimilarity], result of:
          0.028549349 = score(doc=5362,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 5362, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=5362)
      0.14285715 = coord(2/14)
    
    Abstract
    The goal of this study is to evaluate, compare, and contrast how facet analysis is used to construct the systematic or faceted displays of a selection of information retrieval thesauri. More specifically, the study seeks to examine which principles of facet analysis are used in the thesauri, and the extent to which different thesauri apply these principles in the same way. A measuring instrument was designed for the purpose of evaluating the structure of faceted thesauri. This instrument was applied to fourteen faceted information retrieval thesauri. The study reveals that the thesauri do not share a common definition of what constitutes a facet. In some cases, the thesauri apply both enumerative-style classification and facet analysis to arrange their indexing terms. A number of the facets used in the thesauri are not homogeneous or mutually exclusive. The principle of synthesis is used in only 50% of the thesauri, and no one citation order is used consistently by the thesauri.
    Source
    Cataloging and classification quarterly. 28(1999) no.4, S.31-52
  19. Huang, L.; Milne, D.; Frank, E.; Witten, I.H.: Learning a concept-based document similarity measure (2012) 0.01
    0.008156957 = product of:
      0.057098698 = sum of:
        0.028549349 = weight(_text_:classification in 372) [ClassicSimilarity], result of:
          0.028549349 = score(doc=372,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 372, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=372)
        0.028549349 = weight(_text_:classification in 372) [ClassicSimilarity], result of:
          0.028549349 = score(doc=372,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 372, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=372)
      0.14285715 = coord(2/14)
    
    Abstract
    Document similarity measures are crucial components of many text-analysis tasks, including information retrieval, document classification, and document clustering. Conventional measures are brittle: They estimate the surface overlap between documents based on the words they mention and ignore deeper semantic connections. We propose a new measure that assesses similarity at both the lexical and semantic levels, and learns from human judgments how to combine them by using machine-learning techniques. Experiments show that the new measure produces values for documents that are more consistent with people's judgments than people are with each other. We also use it to classify and cluster large document sets covering different genres and topics, and find that it improves both classification and clustering performance.
  20. Wang, Y.-H.; Jhuo, P.-S.: ¬A semantic faceted search with rule-based inference (2009) 0.01
    0.008156957 = product of:
      0.057098698 = sum of:
        0.028549349 = weight(_text_:classification in 540) [ClassicSimilarity], result of:
          0.028549349 = score(doc=540,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 540, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=540)
        0.028549349 = weight(_text_:classification in 540) [ClassicSimilarity], result of:
          0.028549349 = score(doc=540,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 540, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=540)
      0.14285715 = coord(2/14)
    
    Abstract
    Semantic Search has become an active research of Semantic Web in recent years. The classification methodology plays a pretty critical role in the beginning of search process to disambiguate irrelevant information. However, the applications related to Folksonomy suffer from many obstacles. This study attempts to eliminate the problems resulted from Folksonomy using existing semantic technology. We also focus on how to effectively integrate heterogeneous ontologies over the Internet to acquire the integrity of domain knowledge. A faceted logic layer is abstracted in order to strengthen category framework and organize existing available ontologies according to a series of steps based on the methodology of faceted classification and ontology construction. The result showed that our approach can facilitate the integration of inconsistent or even heterogeneous ontologies. This paper also generalizes the principles of picking appropriate facets with which our facet browser completely complies so that better semantic search result can be obtained.

Authors

Years

Languages

  • e 76
  • d 4
  • chi 1
  • f 1
  • More… Less…

Types

  • a 72
  • el 11
  • r 4
  • m 3
  • s 1
  • x 1
  • More… Less…