Search (21 results, page 1 of 2)

  • × language_ss:"e"
  • × type_ss:"x"
  1. Farazi, M.: Faceted lightweight ontologies : a formalization and some experiments (2010) 0.25
    0.24532785 = product of:
      0.4906557 = sum of:
        0.039739996 = product of:
          0.11921998 = sum of:
            0.11921998 = weight(_text_:3a in 4997) [ClassicSimilarity], result of:
              0.11921998 = score(doc=4997,freq=2.0), product of:
                0.25455406 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03002521 = queryNorm
                0.46834838 = fieldWeight in 4997, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4997)
          0.33333334 = coord(1/3)
        0.11921998 = weight(_text_:2f in 4997) [ClassicSimilarity], result of:
          0.11921998 = score(doc=4997,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.46834838 = fieldWeight in 4997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4997)
        0.016822865 = weight(_text_:classification in 4997) [ClassicSimilarity], result of:
          0.016822865 = score(doc=4997,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 4997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4997)
        0.05960999 = product of:
          0.11921998 = sum of:
            0.11921998 = weight(_text_:3a in 4997) [ClassicSimilarity], result of:
              0.11921998 = score(doc=4997,freq=2.0), product of:
                0.25455406 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03002521 = queryNorm
                0.46834838 = fieldWeight in 4997, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4997)
          0.5 = coord(1/2)
        0.11921998 = weight(_text_:2f in 4997) [ClassicSimilarity], result of:
          0.11921998 = score(doc=4997,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.46834838 = fieldWeight in 4997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4997)
        0.016822865 = weight(_text_:classification in 4997) [ClassicSimilarity], result of:
          0.016822865 = score(doc=4997,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 4997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4997)
        0.11921998 = weight(_text_:2f in 4997) [ClassicSimilarity], result of:
          0.11921998 = score(doc=4997,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.46834838 = fieldWeight in 4997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4997)
      0.5 = coord(7/14)
    
    Abstract
    While classifications are heavily used to categorize web content, the evolution of the web foresees a more formal structure - ontology - which can serve this purpose. Ontologies are core artifacts of the Semantic Web which enable machines to use inference rules to conduct automated reasoning on data. Lightweight ontologies bridge the gap between classifications and ontologies. A lightweight ontology (LO) is an ontology representing a backbone taxonomy where the concept of the child node is more specific than the concept of the parent node. Formal lightweight ontologies can be generated from their informal ones. The key applications of formal lightweight ontologies are document classification, semantic search, and data integration. However, these applications suffer from the following problems: the disambiguation accuracy of the state-of-the-art NLP tools used in generating formal lightweight ontologies from their informal ones; the lack of background knowledge needed for the formal lightweight ontologies; and the limitation of ontology reuse. In this dissertation, we propose a novel solution to these problems in formal lightweight ontologies; namely, faceted lightweight ontology (FLO). FLO is a lightweight ontology in which terms, present in each node label, and their concepts, are available in the background knowledge (BK), which is organized as a set of facets. A facet can be defined as a distinctive property of the groups of concepts that can help in differentiating one group from another. Background knowledge can be defined as a subset of a knowledge base, such as WordNet, and often represents a specific domain.
    Content
    PhD Dissertation at International Doctorate School in Information and Communication Technology. Cf.: https%3A%2F%2Fcore.ac.uk%2Fdownload%2Fpdf%2F150083013.pdf&usg=AOvVaw2n-qisNagpyT0lli_6QbAQ.
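The score explanation for the first result can be checked by hand. The following is a minimal sketch, assuming the usual Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are hypothetical and chosen for illustration only, and the numeric inputs are copied from the explanation tree above.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as assumed for ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # One "weight(...)" leaf of the tree: queryWeight * fieldWeight.
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

def apply_coord(clause_sum, matched, total):
    # Coordination factor: fraction of query clauses that matched.
    return clause_sum * (matched / total)

# Leaf for the term "3a" in doc 4997 (values taken from the listing).
leaf_3a = classic_term_score(freq=2.0, doc_freq=24, max_docs=44218,
                             query_norm=0.03002521, field_norm=0.0390625)

# Top-level score: sum of matching clauses scaled by coord(7/14).
total_score = apply_coord(0.4906557, matched=7, total=14)

print(leaf_3a, total_score)
```

Under these assumptions the leaf value agrees with the reported 0.11921998 and the scaled sum with the reported 0.24532785, up to floating-point rounding.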
  2. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.22
    0.21602866 = product of:
      0.5040669 = sum of:
        0.031791996 = product of:
          0.095375985 = sum of:
            0.095375985 = weight(_text_:3a in 5820) [ClassicSimilarity], result of:
              0.095375985 = score(doc=5820,freq=2.0), product of:
                0.25455406 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03002521 = queryNorm
                0.3746787 = fieldWeight in 5820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5820)
          0.33333334 = coord(1/3)
        0.13488202 = weight(_text_:2f in 5820) [ClassicSimilarity], result of:
          0.13488202 = score(doc=5820,freq=4.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.5298757 = fieldWeight in 5820, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
        0.047687992 = product of:
          0.095375985 = sum of:
            0.095375985 = weight(_text_:3a in 5820) [ClassicSimilarity], result of:
              0.095375985 = score(doc=5820,freq=2.0), product of:
                0.25455406 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03002521 = queryNorm
                0.3746787 = fieldWeight in 5820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5820)
          0.5 = coord(1/2)
        0.13488202 = weight(_text_:2f in 5820) [ClassicSimilarity], result of:
          0.13488202 = score(doc=5820,freq=4.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.5298757 = fieldWeight in 5820, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
        0.13488202 = weight(_text_:2f in 5820) [ClassicSimilarity], result of:
          0.13488202 = score(doc=5820,freq=4.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.5298757 = fieldWeight in 5820, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
        0.019940836 = product of:
          0.039881673 = sum of:
            0.039881673 = weight(_text_:texts in 5820) [ClassicSimilarity], result of:
              0.039881673 = score(doc=5820,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.2422848 = fieldWeight in 5820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5820)
          0.5 = coord(1/2)
      0.42857143 = coord(6/14)
    
    Abstract
    This proposal includes plans to improve the quality of relevant entities with a co-learning framework that learns from both entity labels and document labels. We also plan to develop a hybrid ranking system that combines word-based and entity-based representations, taking their uncertainties into account. Finally, we plan to enrich the text representations with connections between entities. We propose several ways to infer entity graph representations for texts, and to rank documents using their structure representations. This dissertation overcomes the limitation of word-based representations with external and carefully curated information from knowledge bases. We believe this thesis research is a solid start towards the new generation of intelligent, semantic, and structured information retrieval.
    Content
    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Cf.: https%3A%2F%2Fwww.cs.cmu.edu%2F~cx%2Fpapers%2Fknowledge_based_text_representation.pdf&usg=AOvVaw0SaTSvhWLTh__Uz_HtOtl3.
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.21
    0.2064732 = product of:
      0.4817708 = sum of:
        0.14306398 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.14306398 = score(doc=563,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.02018744 = weight(_text_:classification in 563) [ClassicSimilarity], result of:
          0.02018744 = score(doc=563,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.14306398 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.14306398 = score(doc=563,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.02018744 = weight(_text_:classification in 563) [ClassicSimilarity], result of:
          0.02018744 = score(doc=563,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.14306398 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.14306398 = score(doc=563,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.0122040035 = product of:
          0.024408007 = sum of:
            0.024408007 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.024408007 = score(doc=563,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.42857143 = coord(6/14)
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
    Content
    A Thesis presented to The University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br%2F~ceramisch%2Fdownload_files%2Fpublications%2F2009%2Fp01.pdf.
    Date
    10. 1.2013 19:22:47
  4. Stojanovic, N.: Ontology-based Information Retrieval : methods and tools for cooperative query answering (2005) 0.13
    0.13057427 = product of:
      0.36560795 = sum of:
        0.031791996 = product of:
          0.095375985 = sum of:
            0.095375985 = weight(_text_:3a in 701) [ClassicSimilarity], result of:
              0.095375985 = score(doc=701,freq=2.0), product of:
                0.25455406 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03002521 = queryNorm
                0.3746787 = fieldWeight in 701, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03125 = fieldNorm(doc=701)
          0.33333334 = coord(1/3)
        0.095375985 = weight(_text_:2f in 701) [ClassicSimilarity], result of:
          0.095375985 = score(doc=701,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.3746787 = fieldWeight in 701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=701)
        0.047687992 = product of:
          0.095375985 = sum of:
            0.095375985 = weight(_text_:3a in 701) [ClassicSimilarity], result of:
              0.095375985 = score(doc=701,freq=2.0), product of:
                0.25455406 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03002521 = queryNorm
                0.3746787 = fieldWeight in 701, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03125 = fieldNorm(doc=701)
          0.5 = coord(1/2)
        0.095375985 = weight(_text_:2f in 701) [ClassicSimilarity], result of:
          0.095375985 = score(doc=701,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.3746787 = fieldWeight in 701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=701)
        0.095375985 = weight(_text_:2f in 701) [ClassicSimilarity], result of:
          0.095375985 = score(doc=701,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.3746787 = fieldWeight in 701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=701)
      0.35714287 = coord(5/14)
    
    Content
    Cf.: http%3A%2F%2Fdigbib.ubka.uni-karlsruhe.de%2Fvolltexte%2Fdocuments%2F1627&ei=tAtYUYrBNoHKtQb3l4GYBw&usg=AFQjCNHeaxKkKU3-u54LWxMNYGXaaDLCGw&sig2=8WykXWQoDKjDSdGtAakH2Q&bvm=bv.44442042,d.Yms.
  5. Slavic-Overfield, A.: Classification management and use in a networked environment : the case of the Universal Decimal Classification (2005) 0.02
    0.01956973 = product of:
      0.09132541 = sum of:
        0.035607297 = weight(_text_:classification in 2191) [ClassicSimilarity], result of:
          0.035607297 = score(doc=2191,freq=14.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.37237754 = fieldWeight in 2191, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=2191)
        0.020110816 = weight(_text_:bibliographic in 2191) [ClassicSimilarity], result of:
          0.020110816 = score(doc=2191,freq=2.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.17204987 = fieldWeight in 2191, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03125 = fieldNorm(doc=2191)
        0.035607297 = weight(_text_:classification in 2191) [ClassicSimilarity], result of:
          0.035607297 = score(doc=2191,freq=14.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.37237754 = fieldWeight in 2191, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=2191)
      0.21428572 = coord(3/14)
    
    Abstract
    In the Internet information space, advanced information retrieval (IR) methods and automatic text processing are used in conjunction with traditional knowledge organization systems (KOS). New information technology provides a platform for better KOS publishing, exploitation and sharing both for human and machine use. Networked KOS services are now being planned and developed as powerful tools for resource discovery. They will enable automatic contextualisation, interpretation and query matching to different indexing languages. The Semantic Web promises to be an environment in which the quality of semantic relationships in bibliographic classification systems can be fully exploited. Their use in the networked environment is, however, limited by the fact that they are not prepared or made available for advanced machine processing. The UDC was chosen for this research because of its widespread use and its long-term presence in online information retrieval systems. It was also the first system to be used for the automatic classification of Internet resources, and the first to be made available as a classification tool on the Web. The objective of this research is to establish the advantages of using UDC for information retrieval in a networked environment, to highlight the problems of automation and classification exchange, and to offer possible solutions. The first research question is: is there enough evidence of the use of classification on the Internet to justify further development with this particular environment in mind? The second question is: what are the automation requirements for the full exploitation of UDC and its exchange? The third question is: which areas are in need of improvement, and what specific recommendations can be made for implementing the UDC in a networked environment? A summary of changes required in the management and development of the UDC to facilitate its full adaptation for future use is drawn from this analysis.
  6. Francu, V.: Multilingual access to information using an intermediate language (2003) 0.01
    0.010173514 = product of:
      0.071214594 = sum of:
        0.035607297 = weight(_text_:classification in 1742) [ClassicSimilarity], result of:
          0.035607297 = score(doc=1742,freq=14.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.37237754 = fieldWeight in 1742, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=1742)
        0.035607297 = weight(_text_:classification in 1742) [ClassicSimilarity], result of:
          0.035607297 = score(doc=1742,freq=14.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.37237754 = fieldWeight in 1742, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=1742)
      0.14285715 = coord(2/14)
    
    Abstract
    Although information is theoretically widely available, linguistic barriers can restrict it from more general use. The linguistic aspects of information languages, and particularly the chances of enhanced access to information by means of multilingual access facilities, form the substance of this thesis. The main problem of this research is thus to demonstrate that information retrieval can be improved by using multilingual thesaurus terms based on an intermediate or switching language to search with. Universal classification systems in general can play the role of switching languages, for reasons dealt with in the forthcoming pages. The Universal Decimal Classification (UDC) in particular is the classification system used as an example of a switching language for our objectives. The question may arise: why a universal classification system and not another thesaurus? Because the UDC, like most classification systems, uses symbols. Therefore, it is language independent, and the problems of compatibility between such a thesaurus and other thesauri in different languages are avoided. Another question may still arise: why not, then, assign running numbers to the descriptors in a thesaurus and make a switching language out of the resulting enumerative system? Because of some other characteristics of the UDC: hierarchical structure and terminological richness, consistency and control. One big problem to find an answer to is: can a thesaurus be made having as a basis a classification system in any and all its parts? To what extent can this question be given an affirmative answer? This depends much on the attributes of the universal classification system which can be favourably used for this purpose. Examples of different situations will be given and discussed, beginning with those classes of UDC which are best fitted for building a thesaurus structure out of them (classes which are both hierarchical and faceted)...
  7. Tzitzikas, Y.: Collaborative ontology-based information indexing and retrieval (2002) 0.01
    0.009839063 = product of:
      0.045915626 = sum of:
        0.013458292 = weight(_text_:classification in 2281) [ClassicSimilarity], result of:
          0.013458292 = score(doc=2281,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.14074548 = fieldWeight in 2281, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=2281)
        0.01899904 = product of:
          0.03799808 = sum of:
            0.03799808 = weight(_text_:schemes in 2281) [ClassicSimilarity], result of:
              0.03799808 = score(doc=2281,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.2364941 = fieldWeight in 2281, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2281)
          0.5 = coord(1/2)
        0.013458292 = weight(_text_:classification in 2281) [ClassicSimilarity], result of:
          0.013458292 = score(doc=2281,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.14074548 = fieldWeight in 2281, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=2281)
      0.21428572 = coord(3/14)
    
    Abstract
    An information system like the Web is a continuously evolving system consisting of multiple heterogeneous information sources, covering a wide domain of discourse, and a huge number of users (human or software) with diverse characteristics and needs, that produce and consume information. The challenge nowadays is to build a scalable information infrastructure enabling the effective, accurate, content based retrieval of information, in a way that adapts to the characteristics and interests of the users. The aim of this work is to propose formally sound methods for building such an information network based on ontologies which are widely used and are easy to grasp by ordinary Web users. The main results of this work are: - A novel scheme for indexing and retrieving objects according to multiple aspects or facets. The proposed scheme is a faceted scheme enriched with a method for specifying the combinations of terms that are valid. We give a model-theoretic interpretation to this model and we provide mechanisms for inferring the valid combinations of terms. This inference service can be exploited for preventing errors during the indexing process, which is very important especially in the case where the indexing is done collaboratively by many users, and for deriving "complete" navigation trees suitable for browsing through the Web. The proposed scheme has several advantages over the hierarchical classification schemes currently employed by Web catalogs, namely, conceptual clarity (it is easier to understand), compactness (it takes less space), and scalability (the update operations can be formulated more easily and be performed more efficiently). - A flexible and efficient model for building mediators over ontology based information sources. The proposed mediators support several modes of query translation and evaluation which can accommodate various application needs and levels of answer quality.
The proposed model can be used for providing users with customized views of Web catalogs. It can also complement the techniques for building mediators over relational sources so as to support approximate translation of partially ordered domain values.
  8. Pepper, S.: ¬The typology and semantics of binominal lexemes : noun-noun compounds and their functional equivalents (2020) 0.01
    0.006660128 = product of:
      0.046620894 = sum of:
        0.023310447 = weight(_text_:classification in 104) [ClassicSimilarity], result of:
          0.023310447 = score(doc=104,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.24377833 = fieldWeight in 104, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=104)
        0.023310447 = weight(_text_:classification in 104) [ClassicSimilarity], result of:
          0.023310447 = score(doc=104,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.24377833 = fieldWeight in 104, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=104)
      0.14285715 = coord(2/14)
    
    Abstract
    The dissertation establishes 'binominal lexeme' as a comparative concept and discusses its cross-linguistic typology and semantics. Informally, a binominal lexeme is a noun-noun compound or functional equivalent; more precisely, it is a lexical item that consists primarily of two thing-morphs between which there exists an unstated semantic relation. Examples of binominals include Mandarin Chinese 铁路 (tiělù) [iron road], French chemin de fer [way of iron] and Russian железная дорога (železnaja doroga) [iron:adjz road]. All of these combine a word denoting 'iron' and a word denoting 'road' or 'way' to denote the meaning railway. In each case, the unstated semantic relation is one of composition: a railway is conceptualized as a road that is composed (or made) of iron. However, three different morphosyntactic strategies are employed: compounding, prepositional phrase and relational adjective. This study explores the range of such strategies used by a worldwide sample of 106 languages to express a set of 100 meanings from various semantic domains, resulting in a classification consisting of nine different morphosyntactic types. The semantic relations found in the data are also explored and a classification called the Hatcher-Bourque system is developed that operates at two levels of granularity, together with a tool for classifying binominals, the Bourquifier. The classification is extended to other subfields of language, including metonymy and lexical semantics, and beyond language to the domain of knowledge representation, resulting in a proposal for a general model of associative relations called the PHAB model.
The many findings of the research include universals concerning the recruitment of anchoring nominal modification strategies, a method for comparing non-binary typologies, the non-universality (despite its predominance) of compounding, and a scale of frequencies for semantic relations which may provide insights into the associative nature of human thought.
  9. Geisriegler, E.: Enriching electronic texts with semantic metadata : a use case for the historical Newspaper Collection ANNO (Austrian Newspapers Online) of the Austrian National Library (2012) 0.01
    0.0050137215 = product of:
      0.0701921 = sum of:
        0.0701921 = sum of:
          0.04985209 = weight(_text_:texts in 595) [ClassicSimilarity], result of:
            0.04985209 = score(doc=595,freq=2.0), product of:
              0.16460659 = queryWeight, product of:
                5.4822793 = idf(docFreq=499, maxDocs=44218)
                0.03002521 = queryNorm
              0.302856 = fieldWeight in 595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4822793 = idf(docFreq=499, maxDocs=44218)
                0.0390625 = fieldNorm(doc=595)
          0.020340007 = weight(_text_:22 in 595) [ClassicSimilarity], result of:
            0.020340007 = score(doc=595,freq=2.0), product of:
              0.10514317 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03002521 = queryNorm
              0.19345059 = fieldWeight in 595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=595)
      0.071428575 = coord(1/14)
    
    Date
    3. 2.2013 18:00:22
  10. Furniss, P.: A study of the compatibility of two subject catalogues (1980) 0.00
    0.004849789 = product of:
      0.067897044 = sum of:
        0.067897044 = weight(_text_:subject in 1945) [ClassicSimilarity], result of:
          0.067897044 = score(doc=1945,freq=2.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.63225883 = fieldWeight in 1945, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.125 = fieldNorm(doc=1945)
      0.071428575 = coord(1/14)
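
    Note on the score breakdown
    The explain tree above follows the pattern of Lucene's ClassicSimilarity (TF-IDF). As a minimal sketch, assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), the figures for doc 1945 can be recomputed as follows. The function and variable names are illustrative only and are not part of Lucene; small differences in the last digits are expected because Lucene computes in 32-bit floats.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by the explain output (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # 1.4142135 for freq=2.0
    idf = classic_idf(doc_freq, max_docs)      # about 3.576596 for docFreq=3361
    query_weight = idf * query_norm            # about 0.10738805
    field_weight = tf * idf * field_norm       # about 0.63225883
    return query_weight * field_weight


# Values copied from the explain tree for the term "subject" in doc 1945.
term_score = classic_term_score(
    freq=2.0, doc_freq=3361, max_docs=44218,
    query_norm=0.03002521, field_norm=0.125,
)

# coord(1/14): only 1 of the 14 query clauses matched this document.
total_score = term_score * (1 / 14)

print(f"term score:  {term_score:.9f}")
print(f"total score: {total_score:.9f}")
```

    The same arithmetic applies to the other entries; where a nested coord(1/2) appears, the term score is multiplied by that factor before the outer coord is applied.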
    
  11. Sebastian, Y.: Literature-based discovery by learning heterogeneous bibliographic information networks (2017) 0.00
    0.0032120824 = product of:
      0.044969153 = sum of:
        0.044969153 = weight(_text_:bibliographic in 535) [ClassicSimilarity], result of:
          0.044969153 = score(doc=535,freq=10.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.3847152 = fieldWeight in 535, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03125 = fieldNorm(doc=535)
      0.071428575 = coord(1/14)
    
    Abstract
    Literature-based discovery (LBD) research aims at finding effective computational methods for predicting previously unknown connections between clusters of research papers from disparate research areas. Existing methods encompass two general approaches. The first approach searches for these unknown connections by examining the textual contents of research papers. In addition to the existing textual features, the second approach incorporates structural features of scientific literature, such as citation structures. These approaches, however, have not considered research papers' latent bibliographic metadata structures as important features that can be used for predicting previously unknown relationships between them. This thesis investigates a new graph-based LBD method that exploits the latent bibliographic metadata connections between pairs of research papers. The heterogeneous bibliographic information network is proposed as an efficient graph-based data structure for modeling the complex relationships between these metadata. In contrast to previous approaches, this method seamlessly combines textual and citation information in the form of path-based metadata features for predicting future co-citation links between research papers from disparate research fields. The results reported in this thesis provide evidence that the method is effective for reconstructing the historical literature-based discovery hypotheses. This thesis also investigates the effects of semantic modeling and topic modeling on the performance of the proposed method. For semantic modeling, a general-purpose word sense disambiguation technique is proposed to reduce the lexical ambiguity in the title and abstract of research papers. The experimental results suggest that the reduced lexical ambiguity did not necessarily lead to a better performance of the method. This thesis discusses some of the possible contributing factors to these results.
Finally, topic modeling is used for learning the latent topical relations between research papers. The learned topic model is incorporated into the heterogeneous bibliographic information network graph and allows new predictive features to be learned. The results in this thesis suggest that topic modeling improves the performance of the proposed method by increasing the overall accuracy for predicting the future co-citation links between disparate research papers.
  12. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.00
    0.0030528049 = product of:
      0.042739265 = sum of:
        0.042739265 = product of:
          0.08547853 = sum of:
            0.08547853 = weight(_text_:texts in 1536) [ClassicSimilarity], result of:
              0.08547853 = score(doc=1536,freq=12.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.51928985 = fieldWeight in 1536, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1536)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    Multiword expressions (MWEs) are lexical items that can be decomposed into single words and display lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasy (Sag et al., 2002; Kim, 2008; Calzolari et al., 2002). The proper treatment of multiword expressions such as rock 'n' roll and make a decision is essential for many natural language processing (NLP) applications like information extraction and retrieval, terminology extraction and machine translation, and it is important to identify multiword expressions in context. For example, in machine translation we must know that MWEs form one semantic unit, hence their parts should not be translated separately. For this, multiword expressions should be identified first in the text to be translated. The chief aim of this thesis is to develop machine learning-based approaches for the automatic detection of different types of multiword expressions in English and Hungarian natural language texts. In our investigations, we pay attention to the characteristics of different types of multiword expressions such as nominal compounds, multiword named entities and light verb constructions, and we apply novel methods to identify MWEs in raw texts. In the thesis it will be demonstrated that nominal compounds and multiword named entities may require a similar approach for their automatic detection as they behave in the same way from a linguistic point of view. Furthermore, it will be shown that the automatic detection of light verb constructions can be carried out using two effective machine learning-based approaches.
    In this thesis, we focused on the automatic detection of multiword expressions in natural language texts. On the basis of the main contributions, we can argue that:
    - Supervised machine learning methods can be successfully applied for the automatic detection of different types of multiword expressions in natural language texts.
    - Machine learning-based multiword expression detection can be successfully carried out for English as well as for Hungarian.
    - Our supervised machine learning-based model was successfully applied to the automatic detection of nominal compounds from English raw texts.
    - We developed a Wikipedia-based dictionary labeling method to automatically detect English nominal compounds.
    - A prior knowledge of nominal compounds can enhance Named Entity Recognition, while previously identified named entities can assist the nominal compound identification process.
    - The machine learning-based method can also provide acceptable results when it is trained on an automatically generated silver standard corpus.
    - As named entities form one semantic unit and may consist of more than one word and function as a noun, we can treat them in a similar way to nominal compounds.
    - Our sequence labelling-based tool can be successfully applied for identifying verbal light verb constructions in two typologically different languages, namely English and Hungarian.
    - Domain adaptation techniques may help diminish the distance between domains in the automatic detection of light verb constructions.
    - Our syntax-based method can be successfully applied for the full-coverage identification of light verb constructions. As a first step, a data-driven candidate extraction method can be utilized. Afterwards, a machine learning approach that makes use of an extended and rich feature set selects LVCs among the extracted candidates.
    - When a precise syntactic parser is available for the actual domain, the full-coverage identification can be performed better. In other cases, the usage of the sequence labeling method is recommended.
  13. Ziemba, L.: Information retrieval with concept discovery in digital collections for agriculture and natural resources (2011) 0.00
    0.0020143287 = product of:
      0.028200602 = sum of:
        0.028200602 = product of:
          0.056401204 = sum of:
            0.056401204 = weight(_text_:texts in 4728) [ClassicSimilarity], result of:
              0.056401204 = score(doc=4728,freq=4.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.34264246 = fieldWeight in 4728, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4728)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    The amount and complexity of information available in a digital form is already huge and new information is being produced every day. Retrieving information relevant to a particular need becomes a significant issue. This work utilizes knowledge organization systems (KOS), such as thesauri and ontologies, and applies information extraction (IE) and computational linguistics (CL) techniques to organize, manage and retrieve information stored in digital collections in the agricultural domain. Two real-world applications of the approach have been developed and are available and actively used by the public. An ontology is used to manage the Water Conservation Digital Library holding a dynamic collection of various types of digital resources in the domain of urban water conservation in Florida, USA. The ontology-based back-end powers a fully operational web interface, available at http://library.conservefloridawater.org. The system has demonstrated numerous benefits of the ontology application, including accurate retrieval of resources, information sharing and reuse, and has proved to effectively facilitate information management. The major difficulty encountered with the approach is that the large and dynamic number of concepts makes it difficult to keep the ontology consistent and to accurately catalog resources manually. To address the aforementioned issues, a combination of IE and CL techniques, such as Vector Space Model and probabilistic parsing, with the use of the Agricultural Thesaurus, was adapted to automatically extract concepts important for each of the texts in the Best Management Practices (BMP) Publication Library, a collection of documents in the domain of agricultural BMPs in Florida available at http://lyra.ifas.ufl.edu/LIB. A new approach of domain-specific concept discovery with the use of an Internet search engine was developed. Initial evaluation of the results indicates significant improvement in precision of information extraction.
    The approach presented in this work focuses on problems unique to the agriculture and natural resources domain, such as domain-specific concepts and vocabularies, but should be applicable to any collection of texts in digital format. It may be of potential interest for anyone who needs to effectively manage a collection of digital resources.
  14. Schwarz, K.: Domain model enhanced search : a comparison of taxonomy, thesaurus and ontology (2005) 0.00
    0.001919193 = product of:
      0.026868701 = sum of:
        0.026868701 = product of:
          0.053737402 = sum of:
            0.053737402 = weight(_text_:schemes in 4569) [ClassicSimilarity], result of:
              0.053737402 = score(doc=4569,freq=4.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.33445317 = fieldWeight in 4569, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4569)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    The results of this thesis are intended to support the information architect in designing a solution for improved search in a corporate environment. Specifically we have examined the type of search problems that require a domain model to enhance the search process. There are several approaches to modeling a domain. We have considered 3 different types of domain modeling schemes: taxonomy, thesaurus and ontology. The intention is to support the information architect in making an informed choice between one or more of these schemes. In our opinion the main criteria for this choice are the modeling characteristics of a scheme and the suitability for application in the search process. The second chapter is a discussion of modeling characteristics of each scheme, followed by a comparison between them. This should give an information architect an idea of which aspects of a domain can be modeled with each scheme. What is missing here is an indication of the effort required to model a domain with each scheme. There are too many factors that influence the amount of required effort, ranging from measurable factors like domain size and resource characteristics to cultural matters such as the willingness to share knowledge and the existence of a project champion in the team to keep the project running. The third chapter shows what role domain models can play in each part of the search process. This gives an idea of the problems that domain models can solve. We have split the search process into individual parts to show that domain models can be applied very differently in the process. The fourth chapter makes recommendations about the suitability of each individual domain modeling scheme for improving search. Each scheme has particular characteristics that make it especially suitable for a domain or a search problem. In the appendix each case study is described in detail. These descriptions are intended to serve as a benchmark.
The current problem of the enterprise can be compared to those described to see which case study is most similar, which solution was chosen, which problems arose and how they were dealt with. An important issue that we have not touched upon in this thesis is that of maintenance. The real problems of a domain model are revealed when it is applied in a search system and its deficits and wrong assumptions become clear. Adaptation and maintenance are always required. Unfortunately we have not been able to glean sufficient information about maintenance issues from our case studies to draw any meaningful conclusions.
  15. Gordon, T.J.; Helmer-Hirschberg, O.: Report on a long-range forecasting study (1964) 0.00
    0.0016437208 = product of:
      0.02301209 = sum of:
        0.02301209 = product of:
          0.04602418 = sum of:
            0.04602418 = weight(_text_:22 in 4204) [ClassicSimilarity], result of:
              0.04602418 = score(doc=4204,freq=4.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.4377287 = fieldWeight in 4204, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4204)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    22. 6.2018 13:24:08
    22. 6.2018 13:54:52
  16. Onofri, A.: Concepts in context (2013) 0.00
    0.0015003269 = product of:
      0.021004576 = sum of:
        0.021004576 = weight(_text_:subject in 1077) [ClassicSimilarity], result of:
          0.021004576 = score(doc=1077,freq=4.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.1955951 = fieldWeight in 1077, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1077)
      0.071428575 = coord(1/14)
    
    Abstract
    My thesis discusses two related problems that have taken center stage in the recent literature on concepts: 1) What are the individuation conditions of concepts? Under what conditions is a concept C₁ the same concept as a concept C₂? 2) What are the possession conditions of concepts? What conditions must be satisfied for a thinker to have a concept C? The thesis defends a novel account of concepts, which I call "pluralist-contextualist": 1) Pluralism: Different concepts have different kinds of individuation and possession conditions: some concepts are individuated more "coarsely", have less demanding possession conditions and are widely shared, while other concepts are individuated more "finely" and not shared. 2) Contextualism: When a speaker ascribes a propositional attitude to a subject S, or uses his ascription to explain/predict S's behavior, the speaker's intentions in the relevant context determine the correct individuation conditions for the concepts involved in his report. In chapters 1-3 I defend a contextualist, non-Millian theory of propositional attitude ascriptions. Then, I show how contextualism can be used to offer a novel perspective on the problem of concept individuation/possession. More specifically, I employ contextualism to provide a new, more effective argument for Fodor's "publicity principle": if contextualism is true, then certain specific concepts must be shared in order for interpersonally applicable psychological generalizations to be possible. In chapters 4-5 I raise a tension between publicity and another widely endorsed principle, the "Fregean constraint" (FC): subjects who are unaware of certain identity facts and find themselves in so-called "Frege cases" must have distinct concepts for the relevant object x. For instance: the ancient astronomers had distinct concepts (HESPERUS/PHOSPHORUS) for the same object (the planet Venus).
First, I examine some leading theories of concepts and argue that they cannot meet both of our constraints at the same time. Then, I offer principled reasons to think that no theory can satisfy (FC) while also respecting publicity. (FC) appears to require a form of holism, on which a concept is individuated by its global inferential role in a subject S and can thus only be shared by someone who has exactly the same inferential dispositions as S. This explains the tension between publicity and (FC), since holism is clearly incompatible with concept shareability. To solve the tension, I suggest adopting my pluralist-contextualist proposal: concepts involved in Frege cases are holistically individuated and not public, while other concepts are more coarsely individuated and widely shared; given this "plurality" of concepts, we will then need contextual factors (speakers' intentions) to "select" the specific concepts to be employed in our intentional generalizations in the relevant contexts. In chapter 6 I develop the view further by contrasting it with some rival accounts. First, I examine a very different kind of pluralism about concepts, which has been recently defended by Daniel Weiskopf, and argue that it is insufficiently radical. Then, I consider the inferentialist accounts defended by authors like Peacocke, Rey and Jackson. Such views, I argue, are committed to an implausible picture of reference determination, on which our inferential dispositions fix the reference of our concepts: this leads to wrong predictions in all those cases of scientific disagreement where two parties have very different inferential dispositions and yet seem to refer to the same natural kind.
  17. Styltsvig, H.B.: Ontology-based information retrieval (2006) 0.00
    0.0014243455 = product of:
      0.019940836 = sum of:
        0.019940836 = product of:
          0.039881673 = sum of:
            0.039881673 = weight(_text_:texts in 1154) [ClassicSimilarity], result of:
              0.039881673 = score(doc=1154,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.2422848 = fieldWeight in 1154, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1154)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    In this thesis, we will present methods for introducing ontologies in information retrieval. The main hypothesis is that the inclusion of conceptual knowledge such as ontologies in the information retrieval process can contribute to the solution of major problems currently found in information retrieval. This utilization of ontologies has a number of challenges. Our focus is on the use of similarity measures derived from the knowledge about relations between concepts in ontologies, the recognition of semantic information in texts and the mapping of this knowledge into the ontologies in use, as well as how to fuse together the ideas of ontological similarity and ontological indexing into a realistic information retrieval scenario. To achieve the recognition of semantic knowledge in a text, shallow natural language processing is used during indexing that reveals knowledge to the level of noun phrases. Furthermore, we briefly cover the identification of semantic relations inside and between noun phrases, as well as discuss which kind of problems are caused by an increase in compoundness with respect to the structure of concepts in the evaluation of queries. Measuring similarity between concepts based on distances in the structure of the ontology is discussed. In addition, a shared nodes measure is introduced and, based on a set of intuitive similarity properties, compared to a number of different measures. In this comparison the shared nodes measure appears to be superior, though more computationally complex. Some of the major problems of shared nodes which relate to the way relations differ with respect to the degree they bring the concepts they connect closer are discussed. A generalized measure called weighted shared nodes is introduced to deal with these problems. Finally, the utilization of concept similarity in query evaluation is discussed. 
A semantic expansion approach that incorporates concept similarity is introduced and a generalized fuzzy set retrieval model that applies expansion during query evaluation is presented. While not commonly used in present information retrieval systems, it appears that the fuzzy set model comprises the flexibility needed when generalizing to an ontology-based retrieval model and, with the introduction of a hierarchical fuzzy aggregation principle, compound concepts can be handled in a straightforward and natural manner.
  18. Markó, K.G.: Foundation, implementation and evaluation of the MorphoSaurus system (2008) 0.00
    0.0012463024 = product of:
      0.017448233 = sum of:
        0.017448233 = product of:
          0.034896467 = sum of:
            0.034896467 = weight(_text_:texts in 4415) [ClassicSimilarity], result of:
              0.034896467 = score(doc=4415,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.21199921 = fieldWeight in 4415, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4415)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    The proper handling of acronyms plays a crucial role in medical texts, e.g. in patient records, as well as in scientific literature. Chapter six presents an approach in which acronyms are automatically acquired from (bio-)medical literature. Furthermore, acronyms and their definitions in different languages are linked to each other using the MorphoSaurus text processing system. Automatic word sense disambiguation is still one of the most challenging tasks in Natural Language Processing. In Chapter seven, cross-lingual considerations lead to a new methodology for automatic disambiguation applied to subwords. Beginning with Chapter eight, a series of applications based on MorphoSaurus are introduced. Firstly, the implementation of the subword approach within a cross-language information retrieval setting for the medical domain is described and evaluated on standard test document collections. In Chapter nine, this methodology is extended to multilingual information retrieval in the Web, for which user queries are translated into target languages based on the segmentation into subwords and their interlingual mappings. The cross-lingual, automatic assignment of document descriptors to documents is the topic of Chapter ten. A large-scale evaluation of a heuristic, as well as a statistical algorithm is carried out using a prominent medical thesaurus as a controlled vocabulary. In Chapter eleven, it will be shown how MorphoSaurus can be used to map monolingual, lexical resources across different languages. As a result, a large multilingual medical lexicon with high coverage and complete lexical information is built and evaluated against a comparable, already available and commonly used lexical repository for the medical domain. Chapter twelve sketches a few applications based on MorphoSaurus. The generality and applicability of the subword approach to other domains is outlined, and proofs of concept in real-world scenarios are presented.
    Finally, Chapter thirteen recapitulates the most important aspects of MorphoSaurus, and the potential benefit of its employment in medical information systems is carefully assessed, both for medical experts in their everyday life and for health care consumers and their existential information needs.
  19. Haslhofer, B.: A Web-based mapping technique for establishing metadata interoperability (2008) 0.00
    0.0011994956 = product of:
      0.016792938 = sum of:
        0.016792938 = product of:
          0.033585876 = sum of:
            0.033585876 = weight(_text_:schemes in 3173) [ClassicSimilarity], result of:
              0.033585876 = score(doc=3173,freq=4.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.20903322 = fieldWeight in 3173, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=3173)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    The integration of metadata from distinct, heterogeneous data sources requires metadata interoperability, which is a qualitative property of metadata information objects that is not given by default. The technique of metadata mapping allows domain experts to establish metadata interoperability in a certain integration scenario. Mapping solutions, as a technical manifestation of this technique, are already available for the intensively studied domain of database system interoperability, but they rarely exist for the Web. If we consider the amount of steadily increasing structured metadata and corresponding metadata schemes on the Web, we can observe a clear need for a mapping solution that can operate in a Web-based environment. To achieve that, we first need to build its technical core, which is a mapping model that provides the language primitives to define mapping relationships. Existing Semantic Web languages such as RDFS and OWL define some basic mapping elements (e.g., owl:equivalentProperty, owl:sameAs), but do not address the full spectrum of semantic and structural heterogeneities that can occur among distinct, incompatible metadata information objects. Furthermore, it is still unclear how to process defined mapping relationships during run-time in order to deliver metadata to the client in a uniform way. As the main contribution of this thesis, we present an abstract mapping model, which reflects the mapping problem on a generic level and provides the means for reconciling incompatible metadata. Instance transformation functions and URIs take a central role in that model. The former cover a broad spectrum of possible structural and semantic heterogeneities, while the latter bind the complete mapping model to the architecture of the World Wide Web.
    On the concrete, language-specific level we present a binding of the abstract mapping model for the RDF Vocabulary Description Language (RDFS), which allows us to create mapping specifications among incompatible metadata schemes expressed in RDFS. The mapping model is embedded in a cyclic process that categorises the requirements a mapping solution should fulfil into four subsequent phases: mapping discovery, mapping representation, mapping execution, and mapping maintenance. In this thesis, we mainly focus on mapping representation and on the transformation of mapping specifications into executable SPARQL queries. For mapping discovery support, the model provides an interface for plugging in schema and ontology matching algorithms. For mapping maintenance we introduce the concept of a simple, but effective mapping registry. Based on the mapping model, we propose a Web-based mediator-wrapper architecture that allows domain experts to set up mediation endpoints that provide a uniform SPARQL query interface to a set of distributed metadata sources. The involved data sources are encapsulated by wrapper components that expose the contained metadata and the schema definitions on the Web and provide a SPARQL query interface to these metadata. In this thesis, we present the OAI2LOD Server, a wrapper component for integrating metadata that are accessible via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). In a case study, we demonstrate how mappings can be created in a Web environment and how our mediator-wrapper architecture can easily be configured in order to integrate metadata from various heterogeneous data sources without the need to install any mapping solution or metadata integration solution in a local system environment.
  20. Makewita, S.M.: Investigating the generic information-seeking function of organisational decision-makers : perspectives on improving organisational information systems (2002) 0.00
    7.264289E-4 = product of:
      0.010170003 = sum of:
        0.010170003 = product of:
          0.020340007 = sum of:
            0.020340007 = weight(_text_:22 in 642) [ClassicSimilarity], result of:
              0.020340007 = score(doc=642,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.19345059 = fieldWeight in 642, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=642)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    22. 7.2022 12:16:58