Search (155 results, page 2 of 8)

  • × theme_ss:"Computerlinguistik"
  • × type_ss:"a"
  • × year_i:[2000 TO 2010}
  1. Nait-Baha, L.; Jackiewicz, A.; Djioua, B.; Laublet, P.: Query reformulation for information retrieval on the Web using the point of view methodology : preliminary results (2001) 0.01
    0.010169638 = product of:
      0.047458313 = sum of:
        0.020922182 = weight(_text_:web in 249) [ClassicSimilarity], result of:
          0.020922182 = score(doc=249,freq=2.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.21634221 = fieldWeight in 249, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=249)
        0.00856136 = weight(_text_:information in 249) [ClassicSimilarity], result of:
          0.00856136 = score(doc=249,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16457605 = fieldWeight in 249, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=249)
        0.01797477 = weight(_text_:retrieval in 249) [ClassicSimilarity], result of:
          0.01797477 = score(doc=249,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.20052543 = fieldWeight in 249, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=249)
      0.21428572 = coord(3/14)
    
    Abstract
    The work presented here is devoted to information collected on the WWW. By "collected" we mean the whole process of retrieving, extracting and presenting results to the user. This research is part of the RAP (Research, Analyze, Propose) project, in which we propose to combine two methods: (i) query reformulation using linguistic markers according to a given point of view; and (ii) text semantic analysis by means of contextual exploration results (Descles, 1991). The general project architecture, describing the interactions between the users, the RAP system and the WWW search engines, is presented in Nait-Baha et al. (1998). This paper focuses on showing how we use linguistic markers to reformulate queries according to a given point of view.
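The indented tree under each hit is Lucene's "explain" output for the ClassicSimilarity (TF-IDF) scorer. As a sanity check, the first hit's score of 0.010169638 can be reproduced from the numbers in the dump; the Python sketch below uses only Lucene's documented ClassicSimilarity formula and the queryNorm, fieldNorm, and document-frequency values shown above.

```python
import math

# Lucene ClassicSimilarity, per matching term:
#   tf          = sqrt(termFreq)
#   idf         = 1 + ln(maxDocs / (docFreq + 1))
#   queryWeight = idf * queryNorm
#   fieldWeight = tf * idf * fieldNorm
# Term score = queryWeight * fieldWeight; the hit score is the sum over
# matching terms, scaled by coord(matched clauses / total clauses).

QUERY_NORM = 0.029633347   # queryNorm from the dump above
FIELD_NORM = 0.046875      # fieldNorm(doc=249)
MAX_DOCS = 44218

def term_score(term_freq, doc_freq):
    tf = math.sqrt(term_freq)
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (idf * QUERY_NORM) * (tf * idf * FIELD_NORM)

# (termFreq, docFreq) for the three matched terms of hit 1:
terms = {"web": (2.0, 4597), "information": (4.0, 20772), "retrieval": (2.0, 5836)}
total = sum(term_score(tf, df) for tf, df in terms.values())
print(total * 3 / 14)   # coord(3/14) -> ~0.010169638, as displayed
```

The same arithmetic accounts for every explain tree on this page; only termFreq, docFreq, fieldNorm, and the coord factor differ per document.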
  2. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O.; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.01
    0.010053988 = product of:
      0.07037791 = sum of:
        0.013536699 = weight(_text_:information in 2502) [ClassicSimilarity], result of:
          0.013536699 = score(doc=2502,freq=10.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.2602176 = fieldWeight in 2502, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2502)
        0.05684121 = weight(_text_:retrieval in 2502) [ClassicSimilarity], result of:
          0.05684121 = score(doc=2502,freq=20.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.63411707 = fieldWeight in 2502, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2502)
      0.14285715 = coord(2/14)
    
    Abstract
    Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system will lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform well when applied to this problem. Detailed results and analyses are included to support our conclusions.
    Source
    Journal of the American Society for Information Science and Technology. 55(2004) no.10, S.859-868
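Beitzel et al. fuse result sets from retrieval strategies running inside a single system. For readers unfamiliar with data fusion, here is a sketch of CombMNZ (Fox and Shaw), one of the classic fusion methods of the kind the paper analyzes; the two runs at the end are hypothetical toy data, not the paper's experiments.

```python
from collections import defaultdict

def comb_mnz(result_sets):
    """CombMNZ: sum of min-max-normalized scores, multiplied by the
    number of result sets that retrieved the document, so documents
    returned by several strategies are boosted."""
    fused, hits = defaultdict(float), defaultdict(int)
    for results in result_sets:
        lo, hi = min(results.values()), max(results.values())
        span = (hi - lo) or 1.0
        for doc, score in results.items():
            fused[doc] += (score - lo) / span
            hits[doc] += 1
    return sorted(((fused[d] * hits[d], d) for d in fused), reverse=True)

run_a = {"d1": 2.1, "d2": 1.4, "d3": 0.3}   # hypothetical strategy A
run_b = {"d1": 0.9, "d4": 0.8}              # hypothetical strategy B
print(comb_mnz([run_a, run_b]))  # d1 leads: retrieved by both runs
```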
  3. Kunze, C.: Lexikalisch-semantische Wortnetze in Sprachwissenschaft und Sprachtechnologie (2006) 0.01
    0.009520924 = product of:
      0.066646464 = sum of:
        0.058574736 = weight(_text_:elektronische in 6023) [ClassicSimilarity], result of:
          0.058574736 = score(doc=6023,freq=2.0), product of:
            0.14013545 = queryWeight, product of:
              4.728978 = idf(docFreq=1061, maxDocs=44218)
              0.029633347 = queryNorm
            0.41798657 = fieldWeight in 6023, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.728978 = idf(docFreq=1061, maxDocs=44218)
              0.0625 = fieldNorm(doc=6023)
        0.008071727 = weight(_text_:information in 6023) [ClassicSimilarity], result of:
          0.008071727 = score(doc=6023,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1551638 = fieldWeight in 6023, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=6023)
      0.14285715 = coord(2/14)
    
    Abstract
    This article describes the structuring principles and application contexts of lexical-semantic wordnets, in particular the German wordnet GermaNet. Wordnets are currently especially popular electronic lexicon resources that contain broad coverage of semantically structured data for various languages and language groups. Wordnets represent the most frequent and most important concepts of a language together with their elementary semantic relations. Central applications of wordnets include word-sense disambiguation and subject indexing. The article sketches the most recent scenarios in which GermaNet is being used: semantic information indexing and the integration of general-language wordnets with terminological resources against the background of data conversion to OWL.
    Source
    Information - Wissenschaft und Praxis. 57(2006) H.6/7, S.309-314
  4. Perez-Carballo, J.; Strzalkowski, T.: Natural language information retrieval : progress report (2000) 0.01
    0.008845377 = product of:
      0.061917633 = sum of:
        0.019976506 = weight(_text_:information in 6421) [ClassicSimilarity], result of:
          0.019976506 = score(doc=6421,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3840108 = fieldWeight in 6421, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.109375 = fieldNorm(doc=6421)
        0.04194113 = weight(_text_:retrieval in 6421) [ClassicSimilarity], result of:
          0.04194113 = score(doc=6421,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.46789268 = fieldWeight in 6421, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=6421)
      0.14285715 = coord(2/14)
    
    Source
    Information processing and management. 36(2000) no.1, S.155-205
  5. Blair, D.C.: Information retrieval and the philosophy of language (2002) 0.01
    0.008405258 = product of:
      0.058836803 = sum of:
        0.015630832 = weight(_text_:information in 4283) [ClassicSimilarity], result of:
          0.015630832 = score(doc=4283,freq=30.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3004734 = fieldWeight in 4283, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=4283)
        0.04320597 = weight(_text_:retrieval in 4283) [ClassicSimilarity], result of:
          0.04320597 = score(doc=4283,freq=26.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.48200315 = fieldWeight in 4283, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=4283)
      0.14285715 = coord(2/14)
    
    Abstract
    Information retrieval - the retrieval, primarily, of documents or textual material - is fundamentally a linguistic process. At the very least we must describe what we want and match that description with descriptions of the information that is available to us. Furthermore, when we describe what we want, we must mean something by that description. This is a deceptively simple act, but such linguistic events have been the grist for philosophical analysis since Aristotle. Although there are complexities involved in referring to authors, document types, or other categories of information retrieval context, here I wish to focus on one of the most problematic activities in information retrieval: the description of the intellectual content of information items. And even though I take information retrieval to involve the description and retrieval of written text, what I say here is applicable to any information item whose intellectual content can be described for retrieval: books, documents, images, audio clips, video clips, scientific specimens, engineering schematics, and so forth. For convenience, though, I will refer only to the description and retrieval of documents. The description of intellectual content can go wrong in many obvious ways. We may describe what we want incorrectly; we may describe it correctly but in such general terms that its description is useless for retrieval; or we may describe what we want correctly, but misinterpret the descriptions of available information, and thereby match our description of what we want incorrectly. From a linguistic point of view, we can be misunderstood in the process of retrieval in many ways. Because the philosophy of language deals specifically with how we are understood and misunderstood, it should have some use for understanding the process of description in information retrieval. First, however, let us examine more closely the kinds of misunderstandings that can occur in information retrieval. We use language in searching for information in two principal ways. We use it to describe what we want and to discriminate what we want from other information that is available to us but that we do not want. Description and discrimination together articulate the goals of the information search process; they also delineate the two principal ways in which language can fail us in this process. Van Rijsbergen (1979) was the first to make this distinction, calling them "representation" and "discrimination."
    Source
    Annual review of information science and technology. 37(2003), S.3-50
  6. Kettunen, K.: Reductive and generative approaches to management of morphological variation of keywords in monolingual information retrieval : an overview (2009) 0.01
    0.007787785 = product of:
      0.054514494 = sum of:
        0.0104854815 = weight(_text_:information in 2835) [ClassicSimilarity], result of:
          0.0104854815 = score(doc=2835,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.20156369 = fieldWeight in 2835, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2835)
        0.044029012 = weight(_text_:retrieval in 2835) [ClassicSimilarity], result of:
          0.044029012 = score(doc=2835,freq=12.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.49118498 = fieldWeight in 2835, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2835)
      0.14285715 = coord(2/14)
    
    Abstract
    Purpose - The purpose of this article is to discuss the advantages and disadvantages of various means of managing the morphological variation of keywords in monolingual information retrieval.
    Design/methodology/approach - The authors present a compilation of query results from 11 mostly European languages and a new general classification of the language-dependent techniques for managing morphological variation. Variants of the different techniques are compared in some detail in terms of retrieval effectiveness and other criteria. The paper consists mainly of an overview of the different management methods for keyword variation in information retrieval; typical IR results for the 11 languages and a new classification of keyword management methods are also presented.
    Findings - The main results of the paper are an overall comparison of reductive and generative keyword management methods in terms of retrieval effectiveness and other, broader criteria.
    Originality/value - The paper is of value to anyone who wants an overall picture of the keyword management techniques used in IR.
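Kettunen's reductive/generative distinction is easy to see in code: reductive methods normalize word forms at indexing and query time, while generative methods expand the query and leave the index untouched. In the toy sketch below, the suffix lists are illustrative English stand-ins, not any of the stemmers or variant generators the article actually compares across its 11 languages.

```python
SUFFIXES = ("ations", "ation", "ings", "ing", "es", "s")

def reduce_keyword(word):
    """Reductive family (stemming/lemmatization): collapse surface
    variants to one index key at both indexing and query time."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def generate_variants(word):
    """Generative family: keep the index exact and expand the query
    into an OR-group of inflected surface forms."""
    return {word} | {word + suffix for suffix in ("s", "es", "ing")}

print(reduce_keyword("indexings"))   # -> 'index': one key for all variants
print(generate_variants("index"))    # -> query expansion over surface forms
```

Reductive methods pay their cost once at indexing time and shrink the index; generative methods keep the index exact and move the cost to query expansion, which is why the two families trade off differently across morphologically rich languages.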
  7. Ding, Y.; Chowdhury, G.C.; Foo, S.: Incorporating the results of co-word analyses to increase search variety for information retrieval (2000) 0.01
    0.007581752 = product of:
      0.05307226 = sum of:
        0.01712272 = weight(_text_:information in 6328) [ClassicSimilarity], result of:
          0.01712272 = score(doc=6328,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3291521 = fieldWeight in 6328, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6328)
        0.03594954 = weight(_text_:retrieval in 6328) [ClassicSimilarity], result of:
          0.03594954 = score(doc=6328,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 6328, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=6328)
      0.14285715 = coord(2/14)
    
    Source
    Journal of information science. 26(2000) no.6, S.429-451
  8. Figuerola, C.G.; Gomez, R.; Lopez de San Roman, E.: Stemming and n-grams in Spanish : an evaluation of their impact in information retrieval (2000) 0.01
    0.007581752 = product of:
      0.05307226 = sum of:
        0.01712272 = weight(_text_:information in 6501) [ClassicSimilarity], result of:
          0.01712272 = score(doc=6501,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3291521 = fieldWeight in 6501, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6501)
        0.03594954 = weight(_text_:retrieval in 6501) [ClassicSimilarity], result of:
          0.03594954 = score(doc=6501,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 6501, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=6501)
      0.14285715 = coord(2/14)
    
    Source
    Journal of information science. 26(2000) no.6, S.461-467
  9. Kummer, N.: Indexierungstechniken für das japanische Retrieval (2006) 0.01
    0.007560872 = product of:
      0.0529261 = sum of:
        0.011415146 = weight(_text_:information in 5979) [ClassicSimilarity], result of:
          0.011415146 = score(doc=5979,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.21943474 = fieldWeight in 5979, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=5979)
        0.041510954 = weight(_text_:retrieval in 5979) [ClassicSimilarity], result of:
          0.041510954 = score(doc=5979,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.46309367 = fieldWeight in 5979, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=5979)
      0.14285715 = coord(2/14)
    
    Abstract
    This article describes the challenges that the Japanese language, owing to the particular structure of its writing system, poses for information retrieval, and presents strategies and approaches for indexing Japanese documents. In particular, it examines the effectiveness of pronunciation-based (yomi-based) indexing as well as the fusion of several individual indexing approaches.
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
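A common baseline for the segmentation problem Kummer describes is overlapping character n-gram indexing, since Japanese is written without spaces between words. The sketch below shows the generic bigram technique; it is one of the standard indexing approaches for Japanese, not specifically the yomi-based indexing the article concentrates on.

```python
def char_ngrams(text, n=2):
    """Index Japanese text as overlapping character n-grams, avoiding
    the need for a morphological word segmenter."""
    text = text.replace(" ", "")
    return [text[i : i + n] for i in range(len(text) - n + 1)]

print(char_ngrams("情報検索システム"))
# ['情報', '報検', '検索', '索シ', 'シス', 'ステ', 'テム']
```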
  10. Liu, S.; Liu, F.; Yu, C.; Meng, W.: An effective approach to document retrieval via utilizing WordNet and recognizing phrases (2004) 0.01
    0.0074937996 = product of:
      0.052456595 = sum of:
        0.010089659 = weight(_text_:information in 4078) [ClassicSimilarity], result of:
          0.010089659 = score(doc=4078,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.19395474 = fieldWeight in 4078, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=4078)
        0.042366937 = weight(_text_:retrieval in 4078) [ClassicSimilarity], result of:
          0.042366937 = score(doc=4078,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.47264296 = fieldWeight in 4078, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=4078)
      0.14285715 = coord(2/14)
    
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, u.a.
  11. Ponte, J.M.: Language models for relevance feedback (2000) 0.01
    0.007471486 = product of:
      0.0523004 = sum of:
        0.012107591 = weight(_text_:information in 35) [ClassicSimilarity], result of:
          0.012107591 = score(doc=35,freq=8.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23274569 = fieldWeight in 35, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=35)
        0.04019281 = weight(_text_:retrieval in 35) [ClassicSimilarity], result of:
          0.04019281 = score(doc=35,freq=10.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.44838852 = fieldWeight in 35, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=35)
      0.14285715 = coord(2/14)
    
    Abstract
    The language modeling approach to Information Retrieval (IR) is a conceptually simple model of IR originally developed by Ponte and Croft (1998). In this approach, the query is treated as a random event and documents are ranked according to the likelihood that the query would be generated via a language model estimated for each document. The intuition behind this approach is that users have a prototypical document in mind and will choose query terms accordingly. The intuitive appeal of this method is that inferences about the semantic content of documents do not need to be made, which keeps the model conceptually simple. In this paper, techniques for relevance feedback and routing are derived from the language modeling approach in a straightforward manner and their effectiveness is demonstrated empirically. These experiments provide further proof of concept for the language modeling approach to retrieval.
    Series
    The Kluwer international series on information retrieval; 7
    Source
    Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft
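Ponte's abstract summarizes the query-likelihood idea: rank documents by the probability that each document's language model would generate the query. A minimal sketch follows; it uses Jelinek-Mercer interpolation with the collection model for smoothing, a common simplification rather than Ponte and Croft's original estimator.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc_terms, collection_terms, lam=0.5):
    """log P(Q | M_d): each query term's probability under the document
    model, interpolated with the collection model so unseen terms do
    not zero out the whole score."""
    doc_tf, col_tf = Counter(doc_terms), Counter(collection_terms)
    d_len, c_len = len(doc_terms), len(collection_terms)
    score = 0.0
    for t in query:
        p = lam * doc_tf[t] / d_len + (1 - lam) * col_tf[t] / c_len
        score += math.log(p or 1e-12)   # guard against log(0)
    return score

docs = [["language", "models", "for", "retrieval"],
        ["boolean", "retrieval", "systems"]]
collection = [t for d in docs for t in d]
query = ["language", "retrieval"]
print(max(docs, key=lambda d: query_log_likelihood(query, d, collection)))
```

Relevance feedback, the paper's topic, fits this framework naturally: the query model is re-estimated from terms found in documents the user marks relevant.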
  12. Rapke, K.: Automatische Indexierung von Volltexten für die Gruner+Jahr Pressedatenbank (2001) 0.01
    0.007154687 = product of:
      0.050082806 = sum of:
        0.0060537956 = weight(_text_:information in 6386) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=6386,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 6386, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=6386)
        0.044029012 = weight(_text_:retrieval in 6386) [ClassicSimilarity], result of:
          0.044029012 = score(doc=6386,freq=12.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.49118498 = fieldWeight in 6386, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=6386)
      0.14285715 = coord(2/14)
    
    Abstract
    Retrieval tests are the most widely accepted method for justifying new subject indexing methods against traditional ones. As part of a diploma thesis, two fundamentally different systems for automatic subject indexing were tested and evaluated on the press database of the publishing house Gruner + Jahr (G+J). Natural-language retrieval was examined in comparison with Boolean retrieval. The two systems are Autonomy, by Autonomy Inc., and DocCat, which was adapted by IBM to the database structure of the G+J press database. The former is a probabilistic system based on natural-language retrieval. DocCat, by contrast, is based on Boolean retrieval and is a learning system that indexes on the basis of an intellectually created training template. Methodologically, the evaluation proceeds from the real application context of text documentation at G+J. The tests are assessed from both statistical and qualitative points of view. One result of the tests is that DocCat shows some shortcomings compared with intellectual subject indexing that still have to be remedied, while Autonomy's natural-language retrieval, in this setting and for the specific requirements of the G+J text documentation, is not usable as it stands.
    Source
    nfd Information - Wissenschaft und Praxis. 52(2001) H.5, S.251-262
  13. Granitzer, M.: Statistische Verfahren der Textanalyse (2006) 0.01
    0.0070486804 = product of:
      0.049340762 = sum of:
        0.042278 = weight(_text_:web in 5809) [ClassicSimilarity], result of:
          0.042278 = score(doc=5809,freq=6.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.43716836 = fieldWeight in 5809, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5809)
        0.0070627616 = weight(_text_:information in 5809) [ClassicSimilarity], result of:
          0.0070627616 = score(doc=5809,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13576832 = fieldWeight in 5809, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5809)
      0.14285715 = coord(2/14)
    
    Abstract
    This article gives an overview of statistical methods of text analysis in the context of the Semantic Web. By way of introduction, it discusses methods and common techniques for preprocessing texts, such as stemming and part-of-speech tagging. The representations introduced in this way serve as the basis for statistical feature analyses as well as for more advanced techniques such as information extraction and machine learning. These specialized techniques are presented as an overview, with the most important aspects relating to the Semantic Web treated in detail. The article closes with the application of the presented techniques to the creation and maintenance of ontologies and with pointers to further reading.
    Source
    Semantic Web: Wege zur vernetzten Wissensgesellschaft. Hrsg.: T. Pellegrini, u. A. Blumauer
    Theme
    Semantic Web
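The preprocessing steps Granitzer's overview starts from (tokenization, stopword removal, reduction to term-frequency vectors) can be sketched in a few lines. The regex tokenizer and the tiny stopword list below are stand-ins for the proper linguistic tooling, such as stemming and part-of-speech tagging, that the article discusses.

```python
import re
from collections import Counter

STOPWORDS = frozenset({"der", "die", "das", "und", "von", "zur", "the", "of"})

def term_vector(text):
    """Tokenize, drop stopwords, and count term frequencies: the minimal
    representation on which statistical feature analysis operates."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

docs = ["Statistische Verfahren der Textanalyse",
        "Verfahren zur Analyse von Texten"]
print([term_vector(d) for d in docs])   # sparse term-frequency vectors
```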
  14. Chen, K.-H.: Evaluating Chinese text retrieval with multilingual queries (2002) 0.01
    0.007000556 = product of:
      0.04900389 = sum of:
        0.0070627616 = weight(_text_:information in 1851) [ClassicSimilarity], result of:
          0.0070627616 = score(doc=1851,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13576832 = fieldWeight in 1851, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1851)
        0.04194113 = weight(_text_:retrieval in 1851) [ClassicSimilarity], result of:
          0.04194113 = score(doc=1851,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.46789268 = fieldWeight in 1851, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1851)
      0.14285715 = coord(2/14)
    
    Abstract
    This paper reports the design of a Chinese test collection with multilingual queries and the application of this test collection to evaluate information retrieval systems. The effective indexing units, IR models, translation techniques, and query expansion for Chinese text retrieval are identified. The collaboration of East Asian countries on the construction of test collections for cross-language multilingual text retrieval is also discussed in this paper. In addition, a tool is designed to help assessors judge relevance and record relevance judgment events. The log file created by this tool will be used to analyze the behavior of assessors in the future.
  15. Vilar, P.; Dimec, J.: Krnjenje kot osnova nekaterih nekonvencionalnih metod poizvedovanja (2000) 0.01
    0.006865305 = product of:
      0.04805713 = sum of:
        0.012107591 = weight(_text_:information in 6331) [ClassicSimilarity], result of:
          0.012107591 = score(doc=6331,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23274569 = fieldWeight in 6331, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6331)
        0.03594954 = weight(_text_:retrieval in 6331) [ClassicSimilarity], result of:
          0.03594954 = score(doc=6331,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 6331, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=6331)
      0.14285715 = coord(2/14)
    
    Footnote
    Translated title: Stemming as a basis for some non-conventional methods of information retrieval
  16. Ruchimskaya, E.M.: Yavlenie variativnosti estestevennogo yazyka i sposoby ee ustraneniya v verbal'nykh IPYA (2000) 0.01
    0.006865305 = product of:
      0.04805713 = sum of:
        0.012107591 = weight(_text_:information in 6472) [ClassicSimilarity], result of:
          0.012107591 = score(doc=6472,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23274569 = fieldWeight in 6472, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6472)
        0.03594954 = weight(_text_:retrieval in 6472) [ClassicSimilarity], result of:
          0.03594954 = score(doc=6472,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 6472, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=6472)
      0.14285715 = coord(2/14)
    
    Footnote
    Translated title: Natural language variations and their handling in information retrieval languages
  17. Chieu, H.L.; Lee, Y.K.: Query based event extraction along a timeline (2004) 0.01
    0.006865305 = product of:
      0.04805713 = sum of:
        0.012107591 = weight(_text_:information in 4108) [ClassicSimilarity], result of:
          0.012107591 = score(doc=4108,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23274569 = fieldWeight in 4108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=4108)
        0.03594954 = weight(_text_:retrieval in 4108) [ClassicSimilarity], result of:
          0.03594954 = score(doc=4108,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 4108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=4108)
      0.14285715 = coord(2/14)
    
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, u.a.
  18. Mustafa el Hadi, W.: Human language technology and its role in information access and management (2003) 0.01
    0.006558729 = product of:
      0.0459111 = sum of:
        0.015953152 = weight(_text_:information in 5524) [ClassicSimilarity], result of:
          0.015953152 = score(doc=5524,freq=20.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.30666938 = fieldWeight in 5524, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5524)
        0.029957948 = weight(_text_:retrieval in 5524) [ClassicSimilarity], result of:
          0.029957948 = score(doc=5524,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.33420905 = fieldWeight in 5524, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5524)
      0.14285715 = coord(2/14)
    
    Abstract
    The role of linguistics in information access, extraction and dissemination is essential. Radical changes in the techniques of information and communication at the end of the twentieth century have had a significant effect on the function of the linguistic paradigm and its applications in all forms of communication. The introduction of new technical means has deeply changed the possibilities for the distribution of information. In this situation, what is the role of the linguistic paradigm and its practical applications, i.e., natural language processing (NLP) techniques, when applied to information access? What solutions can linguistics offer in human-computer interaction, extraction and management? Many fields show the relevance of the linguistic paradigm through the various technologies that require NLP, such as document and message understanding, information detection, extraction, and retrieval, question answering, cross-language information retrieval (CLIR), text summarization, filtering, and spoken document retrieval. This paper focuses on the central role of human language technologies in the information society, surveys the current situation, describes the benefits of the above-mentioned applications, outlines successes and challenges, and discusses solutions. It reviews the resources and means needed to advance information access and dissemination across language boundaries in the twenty-first century. Multilingualism, which is a natural result of globalization, requires more effort in the direction of language technology. The scope of human language technology (HLT) is large, so we limit our review to applications that involve multilinguality.
    Content
    Beitrag eines Themenheftes "Knowledge organization and classification in international information retrieval"
  19. Ballesteros, L.A.: Cross-language retrieval via transitive relation (2000) 0.01
    0.0064898217 = product of:
      0.04542875 = sum of:
        0.008737902 = weight(_text_:information in 30) [ClassicSimilarity], result of:
          0.008737902 = score(doc=30,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16796975 = fieldWeight in 30, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=30)
        0.036690846 = weight(_text_:retrieval in 30) [ClassicSimilarity], result of:
          0.036690846 = score(doc=30,freq=12.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40932083 = fieldWeight in 30, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=30)
      0.14285715 = coord(2/14)
    
    Abstract
    The growth in availability of multi-lingual data in all areas of the public and private sector is driving an increasing need for systems that facilitate access to multi-lingual resources. Cross-language Retrieval (CLR) technology is a means of addressing this need. A CLR system must address two main hurdles to effective cross-language retrieval. First, it must address the ambiguity that arises when trying to map the meaning of text across languages. That is, it must address both within-language ambiguity and cross-language ambiguity. Second, it has to incorporate multilingual resources that will enable it to perform the mapping across languages. The difficulty here is that there is a limited number of lexical resources and virtually none for some pairs of languages. This work focuses on a dictionary approach to addressing the problem of limited lexical resources. A dictionary approach is taken since bilingual dictionaries are more prevalent and simpler to apply than other resources. We show that a transitive translation approach, where a third language is employed as an interlingua between the source and target languages, is a viable means of performing CLR between languages for which no bilingual dictionary is available.
    Series
    The Kluwer international series on information retrieval; 7
    Source
    Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft
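The transitive approach Ballesteros describes is mechanically simple: when no source-to-target dictionary exists, route every query term through a pivot (interlingua) language. The dictionaries below are tiny hypothetical samples; a real system needs the disambiguation the abstract emphasizes, because ambiguity compounds across the two hops, as the fan-out in the example shows.

```python
def transitive_translate(term, src_to_pivot, pivot_to_tgt):
    """Translate source -> pivot -> target via two bilingual
    dictionaries; every pivot sense fans out to all its target senses."""
    candidates = set()
    for pivot_term in src_to_pivot.get(term, ()):
        candidates.update(pivot_to_tgt.get(pivot_term, ()))
    return candidates

# Hypothetical Spanish -> English -> German sample entries:
es_en = {"banco": ["bank", "bench"]}
en_de = {"bank": ["Bank", "Ufer"], "bench": ["Bank", "Sitzbank"]}
print(transitive_translate("banco", es_en, en_de))
# {'Bank', 'Ufer', 'Sitzbank'} -- one source term, three candidates
```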
  20. Benoit, G.: Data discretization for novel relationship discovery in information retrieval (2002) 0.01
    0.006472671 = product of:
      0.045308694 = sum of:
        0.011415146 = weight(_text_:information in 5197) [ClassicSimilarity], result of:
          0.011415146 = score(doc=5197,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.21943474 = fieldWeight in 5197, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=5197)
        0.033893548 = weight(_text_:retrieval in 5197) [ClassicSimilarity], result of:
          0.033893548 = score(doc=5197,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.37811437 = fieldWeight in 5197, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=5197)
      0.14285715 = coord(2/14)
    
    Abstract
    A sample of 600 Dialog and Swiss-Prot full-text records in genetics and molecular biology was parsed and term frequencies were calculated to provide data for a test of Benoit's visualization model for retrieval. A retrieved set is displayed graphically, allowing document and concept relationships to be manipulated in real time, which will hopefully reveal unanticipated relationships.
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.9, S.736-746

Languages

  • e 123
  • d 29
  • ru 2
  • slv 1