Search (116 results, page 1 of 6)

  • × theme_ss:"Computerlinguistik"
  1. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.16
    0.16141582 = product of:
      0.24212372 = sum of:
        0.09994029 = weight(_text_:query in 4436) [ClassicSimilarity], result of:
          0.09994029 = score(doc=4436,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.43571556 = fieldWeight in 4436, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.14218344 = sum of:
          0.10206425 = weight(_text_:page in 4436) [ClassicSimilarity], result of:
            0.10206425 = score(doc=4436,freq=2.0), product of:
              0.27565226 = queryWeight, product of:
                5.5854197 = idf(docFreq=450, maxDocs=44218)
                0.049352113 = queryNorm
              0.37026453 = fieldWeight in 4436, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5854197 = idf(docFreq=450, maxDocs=44218)
                0.046875 = fieldNorm(doc=4436)
          0.040119182 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
            0.040119182 = score(doc=4436,freq=2.0), product of:
              0.1728227 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049352113 = queryNorm
              0.23214069 = fieldWeight in 4436, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=4436)
      0.6666667 = coord(2/3)
    
    Abstract
    Language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable tranlated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between the speed performance and the translation performance, and what from the translated result is presented in. About 100.000 Web pages translated in the last 4 months of 1997 are used for quantitative study of online and real-time Web page translation
    Date
    16. 2.2000 14:22:39
  2. Kajanan, S.; Bao, Y.; Datta, A.; VanderMeer, D.; Dutta, K.: Efficient automatic search query formulation using phrase-level analysis (2014) 0.10
    0.10210096 = product of:
      0.15315144 = sum of:
        0.09422461 = weight(_text_:query in 1264) [ClassicSimilarity], result of:
          0.09422461 = score(doc=1264,freq=8.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.41079655 = fieldWeight in 1264, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.03125 = fieldNorm(doc=1264)
        0.058926824 = product of:
          0.11785365 = sum of:
            0.11785365 = weight(_text_:page in 1264) [ClassicSimilarity], result of:
              0.11785365 = score(doc=1264,freq=6.0), product of:
                0.27565226 = queryWeight, product of:
                  5.5854197 = idf(docFreq=450, maxDocs=44218)
                  0.049352113 = queryNorm
                0.42754465 = fieldWeight in 1264, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.5854197 = idf(docFreq=450, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1264)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Over the past decade, the volume of information available digitally over the Internet has grown enormously. Technical developments in the area of search, such as Google's Page Rank algorithm, have proved so good at serving relevant results that Internet search has become integrated into daily human activity. One can endlessly explore topics of interest simply by querying and reading through the resulting links. Yet, although search engines are well known for providing relevant results based on users' queries, users do not always receive the results they are looking for. Google's Director of Research describes clickstream evidence of frustrated users repeatedly reformulating queries and searching through page after page of results. Given the general quality of search engine results, one must consider the possibility that the frustrated user's query is not effective; that is, it does not describe the essence of the user's interest. Indeed, extensive research into human search behavior has found that humans are not very effective at formulating good search queries that describe what they are interested in. Ideally, the user should simply point to a portion of text that sparked the user's interest, and a system should automatically formulate a search query that captures the essence of the text. In this paper, we describe an implemented system that provides this capability. We first describe how our work differs from existing work in automatic query formulation, and propose a new method for improved quantification of the relevance of candidate search terms drawn from input text using phrase-level analysis. We then propose an implementable method designed to provide relevant queries based on a user's text input. We demonstrate the quality of our results and performance of our system through experimental studies. Our results demonstrate that our system produces relevant search terms with roughly two-thirds precision and recall compared to search terms selected by experts, and that typical users find significantly more relevant results (31% more relevant) more quickly (64% faster) using our system than self-formulated search queries. Further, we show that our implementation can scale to request loads of up to 10 requests per second within current online responsiveness expectations (<2-second response times at the highest loads tested).
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.07
    0.07229989 = product of:
      0.21689966 = sum of:
        0.21689966 = sum of:
          0.17678048 = weight(_text_:page in 563) [ClassicSimilarity], result of:
            0.17678048 = score(doc=563,freq=6.0), product of:
              0.27565226 = queryWeight, product of:
                5.5854197 = idf(docFreq=450, maxDocs=44218)
                0.049352113 = queryNorm
              0.641317 = fieldWeight in 563, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.5854197 = idf(docFreq=450, maxDocs=44218)
                0.046875 = fieldNorm(doc=563)
          0.040119182 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
            0.040119182 = score(doc=563,freq=2.0), product of:
              0.1728227 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049352113 = queryNorm
              0.23214069 = fieldWeight in 563, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=563)
      0.33333334 = coord(1/3)
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
    Date
    10. 1.2013 19:22:47
  4. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.07
    0.06562922 = product of:
      0.09844383 = sum of:
        0.078384236 = product of:
          0.2351527 = sum of:
            0.2351527 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.2351527 = score(doc=562,freq=2.0), product of:
                0.41840777 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.049352113 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.020059591 = product of:
          0.040119182 = sum of:
            0.040119182 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.040119182 = score(doc=562,freq=2.0), product of:
                0.1728227 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049352113 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    Vgl.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  5. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.06
    0.062075913 = product of:
      0.18622774 = sum of:
        0.18622774 = weight(_text_:query in 1338) [ClassicSimilarity], result of:
          0.18622774 = score(doc=1338,freq=20.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.811908 = fieldWeight in 1338, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
      0.33333334 = coord(1/3)
    
    Abstract
    A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations that infer two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
  6. Magennis, M.: Expert rule-based query expansion (1995) 0.06
    0.061452016 = product of:
      0.18435605 = sum of:
        0.18435605 = weight(_text_:query in 5181) [ClassicSimilarity], result of:
          0.18435605 = score(doc=5181,freq=10.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.8037479 = fieldWeight in 5181, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5181)
      0.33333334 = coord(1/3)
    
    Abstract
    Examines how, for term based free text retrieval, Interactive Query Expansion (IQE) provides better retrieval performance tahn Automatic Query Expansion (AQE) but the performance of IQE depends on the strategy employed by the user to select expansion terms. The aim is to build an expert query expansion system using term selection rules based on expert users' strategies. It is expected that such a system will achieve better performance for novice or inexperienced users that either AQE or IQE. The procedure is to discover expert IQE users' term selection strategies through observation and interrogation, to construct a rule based query expansion (RQE) system based on these and to compare the resulting retrieval performance with that of comparable AQE and IQE systems
  7. Niemi, T.; Jämsen, J.: ¬A query language for discovering semantic associations, part II : sample queries and query evaluation (2007) 0.06
    0.05889038 = product of:
      0.17667113 = sum of:
        0.17667113 = weight(_text_:query in 580) [ClassicSimilarity], result of:
          0.17667113 = score(doc=580,freq=18.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.7702435 = fieldWeight in 580, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=580)
      0.33333334 = coord(1/3)
    
    Abstract
    In our query language introduced in Part I (Journal of the American Society for Information Science and Technology. 58(2007) no.11, S.1559-1568) the user can formulate queries to find out (possibly complex) semantic relationships among entities. In this article we demonstrate the usage of our query language and discuss the new applications that it supports. We categorize several query types and give sample queries. The query types are categorized based on whether the entities specified in a query are known or unknown to the user in advance, and whether text information in documents is utilized. Natural language is used to represent the results of queries in order to facilitate correct interpretation by the user. We discuss briefly the issues related to the prototype implementation of the query language and show that an independent operation like Rho (Sheth et al., 2005; Anyanwu & Sheth, 2002, 2003), which presupposes entities of interest to be known in advance, is exceedingly inefficient in emulating the behavior of our query language. The discussion also covers potential problems, and challenges for future work.
  8. Fernández, R.T.; Losada, D.E.: Effective sentence retrieval based on query-independent evidence (2012) 0.06
    0.057700556 = product of:
      0.17310166 = sum of:
        0.17310166 = weight(_text_:query in 2728) [ClassicSimilarity], result of:
          0.17310166 = score(doc=2728,freq=12.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.75468147 = fieldWeight in 2728, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.046875 = fieldNorm(doc=2728)
      0.33333334 = coord(1/3)
    
    Abstract
    In this paper we propose an effective sentence retrieval method that consists of incorporating query-independent features into standard sentence retrieval models. To meet this aim, we apply a formal methodology and consider different query-independent features. In particular, we show that opinion-based features are promising. Opinion mining is an increasingly important research topic but little is known about how to improve retrieval algorithms with opinion-based components. In this respect, we consider here different kinds of opinion-based features to act as query-independent evidence and study whether this incorporation improves retrieval performance. On the other hand, information needs are usually related to people, locations or organizations. We hypothesize here that using these named entities as query-independent features may also improve the sentence relevance estimation. Finally, the length of the retrieval unit has been shown to be an important component in different retrieval scenarios. We therefore include length-based features in our study. Our evaluation demonstrates that, either in isolation or in combination, these query-independent features help to improve substantially the performance of state-of-the-art sentence retrieval methods.
  9. Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.06
    0.055522382 = product of:
      0.16656715 = sum of:
        0.16656715 = weight(_text_:query in 3223) [ClassicSimilarity], result of:
          0.16656715 = score(doc=3223,freq=16.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.7261926 = fieldWeight in 3223, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3223)
      0.33333334 = coord(1/3)
    
    Abstract
    The aim of the study was to test whether query expansion by approximate string matching methods is beneficial in retrieval from historical newspaper collections in a language rich with compounds and inflectional forms (Finnish). First, approximate string matching methods were used to generate lists of index words most similar to contemporary query terms in a digitized newspaper collection from the 1800s. Top index word variants were categorized to estimate the appropriate query expansion ranges in the retrieval test. Second, the effectiveness of approximate string matching methods, automatically generated inflectional forms, and their combinations were measured in a Cranfield-style test. Finally, a detailed topic-level analysis of test results was conducted. In the index of historical newspaper collection the occurrences of a word typically spread to many linguistic and historical variants along with optical character recognition (OCR) errors. All query expansion methods improved the baseline results. Extensive expansion of around 30 variants for each query word was required to achieve the highest performance improvement. Query expansion based on approximate string matching was superior to using the inflectional forms of the query words, showing that coverage of the different types of variation is more important than precision in handling one type of variation.
  10. Patrick, J.; Zhang, J.; Artola-Zubillaga, X.: ¬An architecture and query language for a federation of heterogeneous dictionary databases (2000) 0.05
    0.054964356 = product of:
      0.16489306 = sum of:
        0.16489306 = weight(_text_:query in 339) [ClassicSimilarity], result of:
          0.16489306 = score(doc=339,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.71889395 = fieldWeight in 339, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.109375 = fieldNorm(doc=339)
      0.33333334 = coord(1/3)
    
  11. Colace, F.; Santo, M. De; Greco, L.; Napoletano, P.: Weighted word pairs for query expansion (2015) 0.05
    0.054964356 = product of:
      0.16489306 = sum of:
        0.16489306 = weight(_text_:query in 2687) [ClassicSimilarity], result of:
          0.16489306 = score(doc=2687,freq=8.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.71889395 = fieldWeight in 2687, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2687)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper proposes a novel query expansion method to improve accuracy of text retrieval systems. Our method makes use of a minimal relevance feedback to expand the initial query with a structured representation composed of weighted pairs of words. Such a structure is obtained from the relevance feedback through a method for pairs of words selection based on the Probabilistic Topic Model. We compared our method with other baseline query expansion schemes and methods. Evaluations performed on TREC-8 demonstrated the effectiveness of the proposed method with respect to the baseline.
  12. Airio, E.: Who benefits from CLIR in web retrieval? (2008) 0.05
    0.052673157 = product of:
      0.15801947 = sum of:
        0.15801947 = weight(_text_:query in 2342) [ClassicSimilarity], result of:
          0.15801947 = score(doc=2342,freq=10.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.68892676 = fieldWeight in 2342, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.046875 = fieldNorm(doc=2342)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary-based system. In English-German, also machine translation was utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query-translation were better than in the traditional laboratory tests. Originality/value - This research shows that query translation in web is beneficial especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.
  13. Niemi, T.; Jämsen , J.: ¬A query language for discovering semantic associations, part I : approach and formal definition of query primitives (2007) 0.05
    0.048083793 = product of:
      0.14425138 = sum of:
        0.14425138 = weight(_text_:query in 591) [ClassicSimilarity], result of:
          0.14425138 = score(doc=591,freq=12.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.6289012 = fieldWeight in 591, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=591)
      0.33333334 = coord(1/3)
    
    Abstract
    In contemporary query languages, the user is responsible for navigation among semantically related data. Because of the huge amount of data and the complex structural relationships among data in modern applications, it is unrealistic to suppose that the user could know completely the content and structure of the available information. There are several query languages whose purpose is to facilitate navigation in unknown structures of databases. However, the background assumption of these languages is that the user knows how data are related to each other semantically in the structure at hand. So far only little attention has been paid to how unknown semantic associations among available data can be discovered. We address this problem in this article. A semantic association between two entities can be constructed if a sequence of relationships expressed explicitly in a database can be found that connects these entities to each other. This sequence may contain several other entities through which the original entities are connected to each other indirectly. We introduce an expressive and declarative query language for discovering semantic associations. Our query language is able, for example, to discover semantic associations between entities for which only some of the characteristics are known. Further, it integrates the manipulation of semantic associations with the manipulation of documents that may contain information on entities in semantic associations.
  14. Bedathur, S.; Narang, A.: Mind your language : effects of spoken query formulation on retrieval effectiveness (2013) 0.05
    0.04760053 = product of:
      0.14280158 = sum of:
        0.14280158 = weight(_text_:query in 1150) [ClassicSimilarity], result of:
          0.14280158 = score(doc=1150,freq=6.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.62258047 = fieldWeight in 1150, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1150)
      0.33333334 = coord(1/3)
    
    Abstract
    Voice search is becoming a popular mode for interacting with search engines. As a result, research has gone into building better voice transcription engines, interfaces, and search engines that better handle inherent verbosity of queries. However, when one considers its use by non- native speakers of English, another aspect that becomes important is the formulation of the query by users. In this paper, we present the results of a preliminary study that we conducted with non-native English speakers who formulate queries for given retrieval tasks. Our results show that the current search engines are sensitive in their rankings to the query formulation, and thus highlights the need for developing more robust ranking methods.
  15. Chieu, H.L.; Lee, Y.K.: Query based event extraction along a timeline (2004) 0.05
    0.04711231 = product of:
      0.14133692 = sum of:
        0.14133692 = weight(_text_:query in 4108) [ClassicSimilarity], result of:
          0.14133692 = score(doc=4108,freq=2.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.61619484 = fieldWeight in 4108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.09375 = fieldNorm(doc=4108)
      0.33333334 = coord(1/3)
    
  16. Vechtomova, O.; Karamuftuoglum, M.; Robertson, S.E.: On document relevance and lexical cohesion between query terms (2006) 0.05
    0.04711231 = product of:
      0.14133692 = sum of:
        0.14133692 = weight(_text_:query in 987) [ClassicSimilarity], result of:
          0.14133692 = score(doc=987,freq=8.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.61619484 = fieldWeight in 987, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.046875 = fieldNorm(doc=987)
      0.33333334 = coord(1/3)
    
    Abstract
    Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms' occurrences in a document is related to its relevance to the query. Lexical cohesion between distinct query terms in a document is estimated on the basis of the lexical-semantic relations (repetition, synonymy, hyponymy and sibling) that exist between there collocates - words that co-occur with them in the same windows of text. Experiments suggest significant differences between the lexical cohesion in relevant and non-relevant document sets exist. A document ranking method based on lexical cohesion shows some performance improvements.
  17. Owei, V.; Higa, K.: ¬A paradigm for natural language explanation of database queries : a semantic data model approach (1994) 0.04
    0.04441791 = product of:
      0.13325372 = sum of:
        0.13325372 = weight(_text_:query in 8189) [ClassicSimilarity], result of:
          0.13325372 = score(doc=8189,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.5809541 = fieldWeight in 8189, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0625 = fieldNorm(doc=8189)
      0.33333334 = coord(1/3)
    
    Abstract
    An interface that provides the user with automatic feedback in the form of an explanation of how the database management system interprets user specified queries. Proposes an approach that exploits the rich semantics of graphical semantic data models to construct a restricted natural language explanation of database queries that are specified in a very high level declarative form. These interpretations of the specified query represent the system's 'understanding' of the query, and are returned to the user for validation
  18. Pritchard-Schoch, T.: Natural language comes of age (1993) 0.04
    0.04441791 = product of:
      0.13325372 = sum of:
        0.13325372 = weight(_text_:query in 2570) [ClassicSimilarity], result of:
          0.13325372 = score(doc=2570,freq=4.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.5809541 = fieldWeight in 2570, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0625 = fieldNorm(doc=2570)
      0.33333334 = coord(1/3)
    
    Abstract
    Discusses natural languages and the natural language implementations of Westlaw's full-text legal documents, Westlaw Is Natural. Natural language is not aritificial intelligence but a hybrid of linguistics, mathematics and statistics. Provides 3 classes of retrieval models. Explains how Westlaw processes an English query. Assesses WIN. Covers WIN enhancements; the natural language features of Congressional Quarterly's Washington Alert using a document for a query; the personal librarian front end search software and Dowquest from Dow Jones news/retrieval. Conmsiders whether natural language encourages fuzzy thinking and whether Boolean logic will still be needed
  19. Ahlgren, P.; Kekäläinen, J.: Indexing strategies for Swedish full text retrieval under different user scenarios (2007) 0.04
    0.0438943 = product of:
      0.13168289 = sum of:
        0.13168289 = weight(_text_:query in 896) [ClassicSimilarity], result of:
          0.13168289 = score(doc=896,freq=10.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.5741056 = fieldWeight in 896, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=896)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper deals with Swedish full text retrieval and the problem of morphological variation of query terms in the document database. The effects of combination of indexing strategies with query terms on retrieval effectiveness were studied. Three of five tested combinations involved indexing strategies that used conflation, in the form of normalization. Further, two of these three combinations used indexing strategies that employed compound splitting. Normalization and compound splitting were performed by SWETWOL, a morphological analyzer for the Swedish language. A fourth combination attempted to group related terms by right hand truncation of query terms. The four combinations were compared to each other and to a baseline combination, where no attempt was made to counteract the problem of morphological variation of query terms in the document database. The five combinations were evaluated under six different user scenarios, where each scenario simulated a certain user type. The four alternative combinations outperformed the baseline, for each user scenario. The truncation combination had the best performance under each user scenario. The main conclusion of the paper is that normalization and right hand truncation (performed by a search expert) enhanced retrieval effectiveness in comparison to the baseline. The performance of the three combinations of indexing strategies with query terms based on normalization was not far below the performance of the truncation combination.
  20. Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012) 0.04
    0.0438943 = product of:
      0.13168289 = sum of:
        0.13168289 = weight(_text_:query in 2861) [ClassicSimilarity], result of:
          0.13168289 = score(doc=2861,freq=10.0), product of:
            0.22937049 = queryWeight, product of:
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.049352113 = queryNorm
            0.5741056 = fieldWeight in 2861, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.6476326 = idf(docFreq=1151, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
      0.33333334 = coord(1/3)
    
    Abstract
    Today's conventional search engines hardly do provide the essential content relevant to the user's search query. This is because the context and semantics of the request made by the user is not analyzed to the full extent. So here the need for a semantic web search arises. SWS is upcoming in the area of web search which combines Natural Language Processing and Artificial Intelligence. The objective of the work done here is to design, develop and implement a semantic search engine- SIEU(Semantic Information Extraction in University Domain) confined to the university domain. SIEU uses ontology as a knowledge base for the information retrieval process. It is not just a mere keyword search. It is one layer above what Google or any other search engines retrieve by analyzing just the keywords. Here the query is analyzed both syntactically and semantically. The developed system retrieves the web results more relevant to the user query through keyword expansion. The results obtained here will be accurate enough to satisfy the request made by the user. The level of accuracy will be enhanced since the query is analyzed semantically. The system will be of great use to the developers and researchers who work on web. The Google results are re-ranked and optimized for providing the relevant links. For ranking an algorithm has been applied which fetches more apt results for the user query.

Years

Languages

  • e 97
  • d 17
  • el 1
  • m 1
  • More… Less…

Types

  • a 98
  • el 10
  • m 7
  • s 4
  • p 2
  • x 2
  • b 1
  • d 1
  • r 1
  • More… Less…