Search (8 results, page 1 of 1)

  • × theme_ss:"Computerlinguistik"
  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Colace, F.; Santo, M. De; Greco, L.; Napoletano, P.: Weighted word pairs for query expansion (2015) 0.00
    0.0029000505 = product of:
      0.005800101 = sum of:
        0.005800101 = product of:
          0.011600202 = sum of:
            0.011600202 = weight(_text_:a in 2687) [ClassicSimilarity], result of:
              0.011600202 = score(doc=2687,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.21843673 = fieldWeight in 2687, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2687)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper proposes a novel query expansion method to improve accuracy of text retrieval systems. Our method makes use of a minimal relevance feedback to expand the initial query with a structured representation composed of weighted pairs of words. Such a structure is obtained from the relevance feedback through a method for pairs of words selection based on the Probabilistic Topic Model. We compared our method with other baseline query expansion schemes and methods. Evaluations performed on TREC-8 demonstrated the effectiveness of the proposed method with respect to the baseline.
    Type
    a
  2. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.00
    0.0025370158 = product of:
      0.0050740317 = sum of:
        0.0050740317 = product of:
          0.010148063 = sum of:
            0.010148063 = weight(_text_:a in 1338) [ClassicSimilarity], result of:
              0.010148063 = score(doc=1338,freq=18.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.19109234 = fieldWeight in 1338, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1338)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations that infer two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
    Type
    a
  3. Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.00
    0.0023919214 = product of:
      0.0047838427 = sum of:
        0.0047838427 = product of:
          0.009567685 = sum of:
            0.009567685 = weight(_text_:a in 3223) [ClassicSimilarity], result of:
              0.009567685 = score(doc=3223,freq=16.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.18016359 = fieldWeight in 3223, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3223)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The aim of the study was to test whether query expansion by approximate string matching methods is beneficial in retrieval from historical newspaper collections in a language rich with compounds and inflectional forms (Finnish). First, approximate string matching methods were used to generate lists of index words most similar to contemporary query terms in a digitized newspaper collection from the 1800s. Top index word variants were categorized to estimate the appropriate query expansion ranges in the retrieval test. Second, the effectiveness of approximate string matching methods, automatically generated inflectional forms, and their combinations were measured in a Cranfield-style test. Finally, a detailed topic-level analysis of test results was conducted. In the index of historical newspaper collection the occurrences of a word typically spread to many linguistic and historical variants along with optical character recognition (OCR) errors. All query expansion methods improved the baseline results. Extensive expansion of around 30 variants for each query word was required to achieve the highest performance improvement. Query expansion based on approximate string matching was superior to using the inflectional forms of the query words, showing that coverage of the different types of variation is more important than precision in handling one type of variation.
    Type
    a
  4. Magennis, M.: Expert rule-based query expansion (1995) 0.00
    0.0020506454 = product of:
      0.004101291 = sum of:
        0.004101291 = product of:
          0.008202582 = sum of:
            0.008202582 = weight(_text_:a in 5181) [ClassicSimilarity], result of:
              0.008202582 = score(doc=5181,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1544581 = fieldWeight in 5181, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5181)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Examines how, for term based free text retrieval, Interactive Query Expansion (IQE) provides better retrieval performance tahn Automatic Query Expansion (AQE) but the performance of IQE depends on the strategy employed by the user to select expansion terms. The aim is to build an expert query expansion system using term selection rules based on expert users' strategies. It is expected that such a system will achieve better performance for novice or inexperienced users that either AQE or IQE. The procedure is to discover expert IQE users' term selection strategies through observation and interrogation, to construct a rule based query expansion (RQE) system based on these and to compare the resulting retrieval performance with that of comparable AQE and IQE systems
    Type
    a
  5. Li, N.; Sun, J.: Improving Chinese term association from the linguistic perspective (2017) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 3381) [ClassicSimilarity], result of:
              0.008118451 = score(doc=3381,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 3381, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3381)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The study aims to solve how to construct the semantic relations of specific domain terms by applying linguistic rules. The semantic structure analysis at the morpheme level was used for semantic measure, and a morpheme-based term association model was proposed by improving and combining the literal-based similarity algorithm and co-occurrence relatedness methods. This study provides a novel insight into the method of semantic analysis and calculation by morpheme parsing, and the proposed solution is feasible for the automatic association of compound terms. The results show that this approach could be used to construct appropriate term association and form a reasonable structural knowledge graph. However, due to linguistic differences, the viability and effectiveness of the use of our method in non-Chinese linguistic environments should be verified.
    Type
    a
  6. Niemi, T.; Jämsen , J.: ¬A query language for discovering semantic associations, part I : approach and formal definition of query primitives (2007) 0.00
    0.0018909799 = product of:
      0.0037819599 = sum of:
        0.0037819599 = product of:
          0.0075639198 = sum of:
            0.0075639198 = weight(_text_:a in 591) [ClassicSimilarity], result of:
              0.0075639198 = score(doc=591,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14243183 = fieldWeight in 591, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=591)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In contemporary query languages, the user is responsible for navigation among semantically related data. Because of the huge amount of data and the complex structural relationships among data in modern applications, it is unrealistic to suppose that the user could know completely the content and structure of the available information. There are several query languages whose purpose is to facilitate navigation in unknown structures of databases. However, the background assumption of these languages is that the user knows how data are related to each other semantically in the structure at hand. So far only little attention has been paid to how unknown semantic associations among available data can be discovered. We address this problem in this article. A semantic association between two entities can be constructed if a sequence of relationships expressed explicitly in a database can be found that connects these entities to each other. This sequence may contain several other entities through which the original entities are connected to each other indirectly. We introduce an expressive and declarative query language for discovering semantic associations. Our query language is able, for example, to discover semantic associations between entities for which only some of the characteristics are known. Further, it integrates the manipulation of semantic associations with the manipulation of documents that may contain information on entities in semantic associations.
    Type
    a
  7. Niemi, T.; Jämsen, J.: ¬A query language for discovering semantic associations, part II : sample queries and query evaluation (2007) 0.00
    0.0014647468 = product of:
      0.0029294936 = sum of:
        0.0029294936 = product of:
          0.005858987 = sum of:
            0.005858987 = weight(_text_:a in 580) [ClassicSimilarity], result of:
              0.005858987 = score(doc=580,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.11032722 = fieldWeight in 580, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=580)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In our query language introduced in Part I (Journal of the American Society for Information Science and Technology. 58(2007) no.11, S.1559-1568) the user can formulate queries to find out (possibly complex) semantic relationships among entities. In this article we demonstrate the usage of our query language and discuss the new applications that it supports. We categorize several query types and give sample queries. The query types are categorized based on whether the entities specified in a query are known or unknown to the user in advance, and whether text information in documents is utilized. Natural language is used to represent the results of queries in order to facilitate correct interpretation by the user. We discuss briefly the issues related to the prototype implementation of the query language and show that an independent operation like Rho (Sheth et al., 2005; Anyanwu & Sheth, 2002, 2003), which presupposes entities of interest to be known in advance, is exceedingly inefficient in emulating the behavior of our query language. The discussion also covers potential problems, and challenges for future work.
    Type
    a
  8. Frederichs, A.: Natürlichsprachige Abfrage und 3-D-Visualisierung von Wissenszusammenhängen (2007) 0.00
    0.0011959607 = product of:
      0.0023919214 = sum of:
        0.0023919214 = product of:
          0.0047838427 = sum of:
            0.0047838427 = weight(_text_:a in 566) [ClassicSimilarity], result of:
              0.0047838427 = score(doc=566,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.090081796 = fieldWeight in 566, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=566)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a