Search (6445 results, page 2 of 323)

  1. Marega, R.; Pazienza, M.T.: CoDHIR: an information retrieval system based on semantic document representation (1994) 0.25
    0.24971274 = product of:
      0.33295032 = sum of:
        0.15266767 = weight(_text_:vector in 1062) [ClassicSimilarity], result of:
          0.15266767 = score(doc=1062,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4980213 = fieldWeight in 1062, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1062)
        0.14178926 = weight(_text_:space in 1062) [ClassicSimilarity], result of:
          0.14178926 = score(doc=1062,freq=4.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.5707601 = fieldWeight in 1062, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1062)
        0.038493384 = product of:
          0.07698677 = sum of:
            0.07698677 = weight(_text_:model in 1062) [ClassicSimilarity], result of:
              0.07698677 = score(doc=1062,freq=4.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.4205716 = fieldWeight in 1062, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1062)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Describes an information retrieval (IR) system implemented as part of the Content Driven Hypertext Information Retrieval (CoDHIR) project. Focuses on the use of semantic information that can be automatically acquired by applying natural language processing (NLP) techniques to texts. The information is represented using conceptual graphs. The problem of synonyms and homonyms is addressed by using a model based on the interpretation of conceptual graphs extracted from texts. The detection of the contextual roles of words allows an improvement in retrieval precision over traditional IR technologies. Ranking of documents, based on document relevance, is obtained by extending the vector space model into an oblique space and taking into account the relevance between different word pairs.
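    The oblique extension described above can be read as replacing the orthogonal inner product with one weighted by a term-term relatedness matrix, so that related terms contribute to the score even when they do not literally co-occur. A minimal sketch (the toy relatedness values and names are hypothetical; the paper's actual construction from conceptual graphs is more involved):
    ```python
    import numpy as np

    def oblique_score(q, d, R):
        """Similarity in a non-orthogonal (oblique) term space:
        sim(q, d) = q^T R d / (|q|_R * |d|_R), where R encodes pairwise
        term relatedness. R = I recovers the classical cosine."""
        num = q @ R @ d
        den = np.sqrt(q @ R @ q) * np.sqrt(d @ R @ d)
        return num / den if den else 0.0

    # Toy example: terms = [car, automobile, banana]; the first two related.
    R = np.array([[1.0, 0.8, 0.0],
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    q = np.array([1.0, 0.0, 0.0])   # query mentions "car"
    d = np.array([0.0, 1.0, 0.0])   # document mentions "automobile"
    print(oblique_score(q, d, R))   # 0.8, unlike the orthogonal cosine (0.0)
    ```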
  2. Bollmann-Sdorra, P.; Raghavan, V.V.: On the necessity of term dependence in a query space for weighted retrieval (1998) 0.25
    0.24828489 = product of:
      0.33104652 = sum of:
        0.10904834 = weight(_text_:vector in 2158) [ClassicSimilarity], result of:
          0.10904834 = score(doc=2158,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.3557295 = fieldWeight in 2158, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2158)
        0.20255609 = weight(_text_:space in 2158) [ClassicSimilarity], result of:
          0.20255609 = score(doc=2158,freq=16.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.8153715 = fieldWeight in 2158, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2158)
        0.019442094 = product of:
          0.03888419 = sum of:
            0.03888419 = weight(_text_:model in 2158) [ClassicSimilarity], result of:
              0.03888419 = score(doc=2158,freq=2.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.21242073 = fieldWeight in 2158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2158)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    In recent years, in the context of the vector space model, the view, held by many researchers, that documents, queries, terms, etc., are all elements of a common space has been challenged (Bollmann-Sdorra & Raghavan, 1993). In particular, it was noted that term independence has to be investigated in the context of user preferences, and it was shown through counterexamples that term independence can hold in the document space but not in the query space, and vice versa. In this article, we continue the investigation of query and document spaces with respect to the property of term independence. We prove, under realistic assumptions, that requiring term independence to hold in the query space is inconsistent with the goal of achieving better performance by means of weighted retrieval. The result that term independence in the query space is undesirable is obtained without making any assumption about whether or not the property of term independence holds in the document space. The results of this article reinforce our position that the properties of document and query spaces must be investigated separately, since the document and query spaces do not necessarily have the same properties.
  3. Wong, S.K.M.: On modelling information retrieval with probabilistic inference (1995) 0.24
    0.24012579 = product of:
      0.32016772 = sum of:
        0.17447734 = weight(_text_:vector in 1938) [ClassicSimilarity], result of:
          0.17447734 = score(doc=1938,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.5691672 = fieldWeight in 1938, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0625 = fieldNorm(doc=1938)
        0.11458302 = weight(_text_:space in 1938) [ClassicSimilarity], result of:
          0.11458302 = score(doc=1938,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.46124378 = fieldWeight in 1938, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0625 = fieldNorm(doc=1938)
        0.031107351 = product of:
          0.062214702 = sum of:
            0.062214702 = weight(_text_:model in 1938) [ClassicSimilarity], result of:
              0.062214702 = score(doc=1938,freq=2.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.33987316 = fieldWeight in 1938, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1938)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Examines the logical models of information retrieval in the context of probability theory and extends the application of these fundamental ideas to term weighting and relevance. Develops a unified framework for modelling the retrieval process with probabilistic inference, providing a common conceptual and mathematical basis for many retrieval models, such as the Boolean, fuzzy set, vector space, and conventional probabilistic models. Employs this framework to identify the underlying assumptions made by each model and analyzes the inherent relationships between them. Although the treatment is primarily theoretical, practical methods for estimating the required probabilities are illustrated with simple examples.
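    In this framework, retrieval scores estimate the probability that a document "implies" the query. As one concrete, simplified illustration of scoring by probabilistic inference (a generic smoothed estimate of P(q|d), not Wong's own formulation; the smoothing constant mu is an arbitrary illustrative choice):
    ```python
    import math
    from collections import Counter

    def log_p_q_given_d(query_terms, doc_terms, collection_terms, mu=2000):
        """Score a document by log P(q|d) with Dirichlet-smoothed term
        probabilities: one standard way to make the probabilistic-inference
        view of retrieval operational."""
        doc, coll = Counter(doc_terms), Counter(collection_terms)
        n_doc, n_coll = sum(doc.values()), sum(coll.values())
        score = 0.0
        for t in query_terms:
            p_coll = coll[t] / n_coll if n_coll else 0.0
            p = (doc[t] + mu * p_coll) / (n_doc + mu)
            score += math.log(p) if p > 0 else float("-inf")
        return score
    ```
    Documents are then ranked by this score; higher values mean the document more plausibly "infers" the query.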
  4. Silva, W.T. d.; Milidiu, R.L.: Belief function model for information retrieval (1993) 0.24
    0.23970023 = product of:
      0.3196003 = sum of:
        0.15266767 = weight(_text_:vector in 3740) [ClassicSimilarity], result of:
          0.15266767 = score(doc=3740,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4980213 = fieldWeight in 3740, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3740)
        0.100260146 = weight(_text_:space in 3740) [ClassicSimilarity], result of:
          0.100260146 = score(doc=3740,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.4035883 = fieldWeight in 3740, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3740)
        0.0666725 = product of:
          0.133345 = sum of:
            0.133345 = weight(_text_:model in 3740) [ClassicSimilarity], result of:
              0.133345 = score(doc=3740,freq=12.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.7284514 = fieldWeight in 3740, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3740)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Describes the belief function model for automatic indexing and ranking of documents with respect to a given user query. The model is based on a controlled vocabulary, such as a thesaurus, and on the term frequencies in each document. Document belief functions and a query belief function can be defined, and the agreement between a document belief function and the query belief function can be computed. Proposes that the set of documents be ranked according to their agreement with the given user query. Demonstrates that the Belief Function Model is wider in scope than the Standard Vector Space Model.
    Object
    Belief Function Model
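    In the simplest case, where every focal element is a single thesaurus term, a belief function reduces to a mass distribution over the controlled vocabulary, and the document-query agreement becomes the total mass carried by matching terms. A minimal sketch under that simplifying assumption (the paper's full model also allows non-singleton focal elements):
    ```python
    from collections import Counter

    def mass_function(terms, vocabulary):
        """Basic probability assignment over a controlled vocabulary,
        estimated from term frequencies (singleton focal elements only)."""
        counts = Counter(t for t in terms if t in vocabulary)
        total = sum(counts.values())
        return {t: c / total for t, c in counts.items()} if total else {}

    def agreement(m_doc, m_query):
        """Agreement between two mass functions: summed mass of pairs of
        focal elements that intersect (here, identical terms)."""
        return sum(m_doc.get(t, 0.0) * m for t, m in m_query.items())

    vocab = {"retrieval", "indexing", "ranking"}
    m_d = mass_function("retrieval indexing indexing ranking".split(), vocab)
    m_q = mass_function("retrieval ranking".split(), vocab)
    print(agreement(m_d, m_q))  # rank documents by this value
    ```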
  5. Cribbin, T.: Discovering latent topical structure by second-order similarity analysis (2011) 0.23
    0.23219806 = product of:
      0.3095974 = sum of:
        0.18887727 = weight(_text_:vector in 4470) [ClassicSimilarity], result of:
          0.18887727 = score(doc=4470,freq=6.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.6161416 = fieldWeight in 4470, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4470)
        0.101278044 = weight(_text_:space in 4470) [ClassicSimilarity], result of:
          0.101278044 = score(doc=4470,freq=4.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.40768576 = fieldWeight in 4470, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4470)
        0.019442094 = product of:
          0.03888419 = sum of:
            0.03888419 = weight(_text_:model in 4470) [ClassicSimilarity], result of:
              0.03888419 = score(doc=4470,freq=2.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.21242073 = fieldWeight in 4470, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4470)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Computing document similarity directly from a "bag of words" vector space model can be problematic because term independence causes the relationships between synonymous terms and the contextual influences that determine the sense of polysemous terms to be ignored. This study compares two methods that potentially address these problems by deriving the higher order relationships that lie latent within the original first-order space. The first is latent semantic analysis (LSA), a dimension reduction method that is a well-known means of addressing the vocabulary mismatch problem in information retrieval systems. The second is the lesser known yet conceptually simple approach of second-order similarity (SOS) analysis, whereby latent similarity is measured in terms of mutual first-order similarity. Nearest neighbour tests show that SOS analysis derives similarity models that are superior to both first-order and LSA-derived models at both coarse and fine levels of semantic granularity. SOS analysis has been criticized for its computational complexity. A second contribution is the novel application of vector truncation to reduce run-time by a constant factor. Speed-ups of 4 to 10 times are achievable without compromising the structural gains achieved by full-vector SOS analysis.
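    Second-order similarity has a compact formulation: build the first-order similarity matrix, then compare documents by the similarity of their rows, so two documents count as similar when they are similar to the same things. A minimal sketch, with the vector-truncation speed-up approximated by keeping only the k largest entries of each first-order row (the paper's exact truncation scheme may differ):
    ```python
    import numpy as np

    def second_order_similarity(X, k=None):
        """X: (n_docs, n_terms) tf-idf style matrix. Returns the (n_docs,
        n_docs) second-order similarity matrix: cosine similarity between
        rows of the first-order cosine-similarity matrix."""
        Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
        S1 = Xn @ Xn.T                       # first-order similarities
        if k is not None:                    # truncation (assumes k <= n_docs)
            thresh = np.sort(S1, axis=1)[:, -k][:, None]
            S1 = np.where(S1 >= thresh, S1, 0.0)
        Sn = S1 / np.maximum(np.linalg.norm(S1, axis=1, keepdims=True), 1e-12)
        return Sn @ Sn.T                     # second-order similarities
    ```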
  6. Mestrovic, A.; Cali, A.: An ontology-based approach to information retrieval (2017) 0.23
    0.23186485 = product of:
      0.30915314 = sum of:
        0.21809667 = weight(_text_:vector in 3489) [ClassicSimilarity], result of:
          0.21809667 = score(doc=3489,freq=8.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.711459 = fieldWeight in 3489, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3489)
        0.07161439 = weight(_text_:space in 3489) [ClassicSimilarity], result of:
          0.07161439 = score(doc=3489,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.28827736 = fieldWeight in 3489, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3489)
        0.019442094 = product of:
          0.03888419 = sum of:
            0.03888419 = weight(_text_:model in 3489) [ClassicSimilarity], result of:
              0.03888419 = score(doc=3489,freq=2.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.21242073 = fieldWeight in 3489, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3489)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    We define a general framework for ontology-based information retrieval (IR). In our approach, document and query expansion rely on a base taxonomy that is extracted from a lexical database or a Linked Data set (e.g., WordNet or Wiktionary). Each term from a document or query is modelled as a vector of base concepts from the base taxonomy. We define a set of mapping functions which map multiple ontological layers (dimensions) onto the base taxonomy. This way, each concept from the included ontologies can also be represented as a vector of base concepts from the base taxonomy. We propose a general weighting schema which is used for the vector space model. Our framework can therefore take into account various lexical and semantic relations between terms and concepts (e.g., synonymy, hierarchy, meronymy, antonymy, geo-proximity), which allows us to avoid certain vocabulary problems (e.g., synonymy, polysemy) as well as to reduce the vector size in IR tasks.
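    The core data structure is a mapping from terms (and from concepts in the attached ontologies) to vectors over a fixed set of base concepts; documents and queries are then sums of their terms' vectors, scored in the usual vector space way. A minimal sketch with hypothetical base concepts and toy weights:
    ```python
    import numpy as np

    BASE = ["entity", "place", "organism"]        # hypothetical base taxonomy

    # Each term is modelled as a vector of base concepts (toy values).
    TERM_VECTORS = {
        "bank":  np.array([0.7, 0.3, 0.0]),
        "river": np.array([0.1, 0.9, 0.0]),
        "plant": np.array([0.3, 0.0, 0.7]),
    }

    def concept_vector(terms):
        """Represent a document or query as the normalised sum of its
        terms' base-concept vectors."""
        v = sum((TERM_VECTORS[t] for t in terms if t in TERM_VECTORS),
                np.zeros(len(BASE)))
        n = np.linalg.norm(v)
        return v / n if n else v

    q, d = concept_vector(["river", "bank"]), concept_vector(["plant", "bank"])
    print(float(q @ d))   # cosine similarity in base-concept space
    ```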
  7. Akerele, O.; David, A.; Osofisan, A.: Using the concepts of Case Based Reasoning and Basic Categories for enhancing adaptation to the user's level of knowledge in Decision Support System (2014) 0.23
    0.22661653 = product of:
      0.30215538 = sum of:
        0.130858 = weight(_text_:vector in 1449) [ClassicSimilarity], result of:
          0.130858 = score(doc=1449,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4268754 = fieldWeight in 1449, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.046875 = fieldNorm(doc=1449)
        0.08593727 = weight(_text_:space in 1449) [ClassicSimilarity], result of:
          0.08593727 = score(doc=1449,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.34593284 = fieldWeight in 1449, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.046875 = fieldNorm(doc=1449)
        0.085360095 = sum of:
          0.046661027 = weight(_text_:model in 1449) [ClassicSimilarity], result of:
            0.046661027 = score(doc=1449,freq=2.0), product of:
              0.1830527 = queryWeight, product of:
                3.845226 = idf(docFreq=2569, maxDocs=44218)
                0.047605187 = queryNorm
              0.25490487 = fieldWeight in 1449, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.845226 = idf(docFreq=2569, maxDocs=44218)
                0.046875 = fieldNorm(doc=1449)
          0.03869907 = weight(_text_:22 in 1449) [ClassicSimilarity], result of:
            0.03869907 = score(doc=1449,freq=2.0), product of:
              0.16670525 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.047605187 = queryNorm
              0.23214069 = fieldWeight in 1449, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1449)
      0.75 = coord(3/4)
    
    Abstract
    In most search systems, the mapping of queries to documents employs techniques such as the vector space model, naïve Bayes, or Bayes' theorem to classify the resulting documents. In this research study, we propose using the concept of basic categories to represent the user's level of knowledge, based on the concepts employed during search activities, so that the system can propose results adapted to the observed level of knowledge. Our hypothesis is that this approach will enhance decision support systems for solving decisional problems in which information retrieval constitutes the backbone technical problem.
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  8. Kalczynski, P.J.; Chou, A.: Temporal Document Retrieval Model for business news archives (2005) 0.23
    0.2250543 = product of:
      0.3000724 = sum of:
        0.15266767 = weight(_text_:vector in 1030) [ClassicSimilarity], result of:
          0.15266767 = score(doc=1030,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4980213 = fieldWeight in 1030, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1030)
        0.100260146 = weight(_text_:space in 1030) [ClassicSimilarity], result of:
          0.100260146 = score(doc=1030,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.4035883 = fieldWeight in 1030, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1030)
        0.04714458 = product of:
          0.09428916 = sum of:
            0.09428916 = weight(_text_:model in 1030) [ClassicSimilarity], result of:
              0.09428916 = score(doc=1030,freq=6.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.51509297 = fieldWeight in 1030, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1030)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Temporal expressions occurring in business news, such as "last week" or "at the end of this month," carry important information about the time context of a news document and have proved useful for document retrieval. We found that about 10% of these expressions are difficult to project onto the calendar because of uncertainty about their bounds. This paper introduces a novel approach to representing temporal expressions. A user study is conducted to measure the degree of uncertainty for selected temporal expressions, and a method for representing that uncertainty based on fuzzy numbers is proposed. The classical Vector Space Model is extended to the Temporal Document Retrieval Model (TDRM), which incorporates the proposed fuzzy representations of temporal expressions.
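    Uncertainty about the bounds of an expression such as "at the end of this month" can be captured with a trapezoidal fuzzy number over calendar days. A minimal sketch (the membership parameters below are toy values; TDRM's actual parameters come from the user study):
    ```python
    def trapezoid(a, b, c, d):
        """Membership function of a trapezoidal fuzzy number: 0 outside
        [a, d], rising on [a, b], 1 on [b, c], falling on [c, d].
        Assumes a < b and c < d."""
        def mu(x):
            if x < a or x > d:
                return 0.0
            if b <= x <= c:
                return 1.0
            return (x - a) / (b - a) if x < b else (d - x) / (d - c)
        return mu

    # "End of this month" for a 30-day month: certainly days 25-28,
    # plausibly as early as day 22 or as late as day 30 (toy values).
    end_of_month = trapezoid(22, 25, 28, 30)
    print(end_of_month(24), end_of_month(26), end_of_month(29))
    ```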
  9. Terada, A.; Tokunaga, T.; Tanaka, H.: Automatic expansion of abbreviations by using context and character information (2004) 0.22
    0.22074673 = product of:
      0.29432896 = sum of:
        0.18506117 = weight(_text_:vector in 2560) [ClassicSimilarity], result of:
          0.18506117 = score(doc=2560,freq=4.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.603693 = fieldWeight in 2560, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.046875 = fieldNorm(doc=2560)
        0.08593727 = weight(_text_:space in 2560) [ClassicSimilarity], result of:
          0.08593727 = score(doc=2560,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.34593284 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.046875 = fieldNorm(doc=2560)
        0.023330513 = product of:
          0.046661027 = sum of:
            0.046661027 = weight(_text_:model in 2560) [ClassicSimilarity], result of:
              0.046661027 = score(doc=2560,freq=2.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.25490487 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2560)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Unknown words such as proper nouns, abbreviations, and acronyms are a major obstacle in text processing. Abbreviations, in particular, are difficult to read and process because they are often domain specific. In this paper, we propose a method for the automatic expansion of abbreviations using context and character information. In previous studies, dictionaries were used to search for abbreviation expansion candidates (candidate words for the original form of an abbreviation). We instead use a corpus from the same field that contains few abbreviations. We calculate the adequacy of each expansion candidate based on the similarity between the context of the target abbreviation and that of the candidate. The similarity is calculated using a vector space model in which each vector element consists of the words surrounding the target abbreviation and those surrounding its expansion candidate. Experiments using approximately 10,000 documents in the field of aviation showed that the accuracy of the proposed method is 10% higher than that of previously developed methods.
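    The selection step reduces to comparing bag-of-words context vectors. A minimal sketch of that comparison (the window size and unweighted counts are illustrative, not the paper's exact settings):
    ```python
    import math
    from collections import Counter

    def context_vector(tokens, target, window=3):
        """Bag of words appearing within `window` tokens of each
        occurrence of `target`."""
        ctx = Counter()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), i + window + 1
                ctx.update(t for t in tokens[lo:hi] if t != target)
        return ctx

    def cosine(a, b):
        num = sum(a[t] * b[t] for t in a if t in b)
        den = math.sqrt(sum(v * v for v in a.values())) * \
              math.sqrt(sum(v * v for v in b.values()))
        return num / den if den else 0.0

    def best_expansion(abbr_ctx, candidate_ctxs):
        """Pick the expansion whose corpus context best matches the
        abbreviation's context. candidate_ctxs: {candidate: Counter}."""
        return max(candidate_ctxs,
                   key=lambda c: cosine(abbr_ctx, candidate_ctxs[c]))
    ```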
  10. Lund, K.; Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence (1996) 0.22
    0.22074673 = product of:
      0.29432896 = sum of:
        0.18506117 = weight(_text_:vector in 1704) [ClassicSimilarity], result of:
          0.18506117 = score(doc=1704,freq=4.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.603693 = fieldWeight in 1704, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.046875 = fieldNorm(doc=1704)
        0.08593727 = weight(_text_:space in 1704) [ClassicSimilarity], result of:
          0.08593727 = score(doc=1704,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.34593284 = fieldWeight in 1704, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.046875 = fieldNorm(doc=1704)
        0.023330513 = product of:
          0.046661027 = sum of:
            0.046661027 = weight(_text_:model in 1704) [ClassicSimilarity], result of:
              0.046661027 = score(doc=1704,freq=2.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.25490487 = fieldWeight in 1704, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1704)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    A procedure is presented that processes a corpus of text and produces, for each word, a numeric vector containing information about the word's meaning. This procedure is applied to a large corpus of natural language text taken from Usenet, and the resulting vectors are examined to determine what information is contained within them. These vectors provide the coordinates in a high-dimensional space in which word relationships can be analyzed. Analyses of both vector similarity and multidimensional scaling demonstrate that significant semantic information is carried in the vectors. A comparison of vector similarity with human reaction times in a single-word priming experiment is presented. These vectors provide the basis for a representational model of semantic memory, the hyperspace analogue to language (HAL).
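    The construction behind HAL-style spaces is a sliding co-occurrence window: each word's vector records how often, and how closely, other words occur near it, with nearer neighbours weighted more heavily. A minimal sketch (HAL additionally distinguishes left and right contexts; this version collapses them into one symmetric count):
    ```python
    from collections import defaultdict

    def cooccurrence_vectors(tokens, window=5):
        """Weighted co-occurrence counts: a neighbour at distance d
        contributes weight (window - d + 1), so adjacent words count
        most, as in HAL-style models."""
        vecs = defaultdict(lambda: defaultdict(float))
        for i, w in enumerate(tokens):
            for j in range(max(0, i - window), i):
                weight = window - (i - j) + 1
                vecs[w][tokens[j]] += weight
                vecs[tokens[j]][w] += weight
        return vecs
    ```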
  11. Huang, Y.-L.: A theoretic and empirical research of cluster indexing for Mandarine Chinese full text document (1998) 0.22
    0.21856591 = product of:
      0.2914212 = sum of:
        0.15266767 = weight(_text_:vector in 513) [ClassicSimilarity], result of:
          0.15266767 = score(doc=513,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4980213 = fieldWeight in 513, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0546875 = fieldNorm(doc=513)
        0.100260146 = weight(_text_:space in 513) [ClassicSimilarity], result of:
          0.100260146 = score(doc=513,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.4035883 = fieldWeight in 513, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0546875 = fieldNorm(doc=513)
        0.038493384 = product of:
          0.07698677 = sum of:
            0.07698677 = weight(_text_:model in 513) [ClassicSimilarity], result of:
              0.07698677 = score(doc=513,freq=4.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.4205716 = fieldWeight in 513, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=513)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Since most popular commercial systems for full-text retrieval are designed around full-text scanning and a Boolean logic query mode, these systems rely on an oversimplified relationship between the indexing form and the content of a document. Reports the use of Singular Value Decomposition (SVD) to develop a Cluster Indexing Model (CIM) based on the Vector Space Model (VSM), in order to explore the indexing theory of cluster indexing for Chinese full-text documents. A series of experiments found that the indexing performance of the CIM is better than that of the traditional VSM, and almost equivalent in effectiveness to the authority control of index terms.
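    The SVD step is the same dimension reduction used in latent semantic indexing: factor the term-document matrix and keep the top singular triplets, so that documents are indexed by a few latent, cluster-like dimensions rather than raw terms. A minimal sketch (the CIM's specific use of the factors may differ):
    ```python
    import numpy as np

    def latent_doc_vectors(A, k=2):
        """A: (n_terms, n_docs) term-document matrix. Returns k-dimensional
        document representations from the truncated SVD A ~ U_k S_k V_k^T
        (assumes k <= min(A.shape))."""
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        return (np.diag(s[:k]) @ Vt[:k]).T   # one row per document
    ```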
  12. López-Pujalte, C.; Guerrero-Bote, V.P.; Moya-Anegón, F. de: Genetic algorithms in relevance feedback : a second test and new contributions (2003) 0.22
    0.21856591 = product of:
      0.2914212 = sum of:
        0.15266767 = weight(_text_:vector in 1076) [ClassicSimilarity], result of:
          0.15266767 = score(doc=1076,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4980213 = fieldWeight in 1076, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1076)
        0.100260146 = weight(_text_:space in 1076) [ClassicSimilarity], result of:
          0.100260146 = score(doc=1076,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.4035883 = fieldWeight in 1076, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1076)
        0.038493384 = product of:
          0.07698677 = sum of:
            0.07698677 = weight(_text_:model in 1076) [ClassicSimilarity], result of:
              0.07698677 = score(doc=1076,freq=4.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.4205716 = fieldWeight in 1076, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1076)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    The present work continues an earlier study which reviewed the literature on relevance feedback genetic techniques that follow the vector space model (the model most commonly used in this type of application) and implemented them so that they could be compared with each other, as well as with one of the best traditional methods of relevance feedback, the Ide dec-hi method. Here we carry out the comparisons on more test collections (Cranfield, CISI, Medline, and NPL), using the residual collection method for their evaluation, as is recommended for this type of technique. We also add some fitness functions of our own design.
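    The traditional baseline used for comparison, Ide dec-hi, is a one-line vector update: add all judged relevant documents to the query and subtract only the highest-ranked non-relevant one. A minimal sketch:
    ```python
    import numpy as np

    def ide_dec_hi(query, relevant, nonrelevant_ranked):
        """Ide dec-hi relevance feedback:
        q' = q + sum(relevant doc vectors) - top-ranked non-relevant doc."""
        q_new = query + np.sum(relevant, axis=0)
        if len(nonrelevant_ranked):
            q_new = q_new - nonrelevant_ranked[0]
        return q_new
    ```
    Ranking with the updated query then proceeds with the usual cosine scoring.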
  13. Savoy, J.; Ndarugendamwo, M.; Vrajitoru, D.: Report on the TREC-4 experiment : combining probabilistic and vector-space schemes (1996) 0.22
    0.21679527 = product of:
      0.43359053 = sum of:
        0.261716 = weight(_text_:vector in 7574) [ClassicSimilarity], result of:
          0.261716 = score(doc=7574,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.8537508 = fieldWeight in 7574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.09375 = fieldNorm(doc=7574)
        0.17187454 = weight(_text_:space in 7574) [ClassicSimilarity], result of:
          0.17187454 = score(doc=7574,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.6918657 = fieldWeight in 7574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.09375 = fieldNorm(doc=7574)
      0.5 = coord(2/4)
    
  14. Benoît, G.: Properties-based retrieval and user decision states : user control and behavior modeling (2004) 0.21
    0.21403947 = product of:
      0.28538597 = sum of:
        0.130858 = weight(_text_:vector in 2262) [ClassicSimilarity], result of:
          0.130858 = score(doc=2262,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4268754 = fieldWeight in 2262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.046875 = fieldNorm(doc=2262)
        0.12153365 = weight(_text_:space in 2262) [ClassicSimilarity], result of:
          0.12153365 = score(doc=2262,freq=4.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.48922288 = fieldWeight in 2262, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.046875 = fieldNorm(doc=2262)
        0.03299433 = product of:
          0.06598866 = sum of:
            0.06598866 = weight(_text_:model in 2262) [ClassicSimilarity], result of:
              0.06598866 = score(doc=2262,freq=4.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.36048993 = fieldWeight in 2262, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2262)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    As retrieval set size in information retrieval (IR) becomes larger, users may need greater interactive opportunities to determine for themselves the potential relevance of the resources offered by a given collection. A parts-of-document approach, coupled with an interactive graphic interface and control panel, permits end users to tailor the information seeking (IS) session. Applying the model described by the author in a previous paper in this journal, this paper explores two issues: whether a group of information seekers in the same research domain will want to use this type of IR interaction, and whether such interaction is more successful than relevance-ranked lists based on the general vector model. In addition, the paper proposes the use of gradient space as a means of capturing end users' cognitive states (decision-making points) during a parts-of-document-based IR session. It concludes that, for a group of biomedical researchers, a parts-of-document approach is preferred in certain IR situations, and that gradient space provides designers of systems with empirical evidence suited to systems analysis.
  15. Dolamic, L.; Savoy, J.: Indexing and searching strategies for the Russian language (2009) 0.21
    0.21224323 = product of:
      0.28299096 = sum of:
        0.15421765 = weight(_text_:vector in 3301) [ClassicSimilarity], result of:
          0.15421765 = score(doc=3301,freq=4.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.5030775 = fieldWeight in 3301, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3301)
        0.101278044 = weight(_text_:space in 3301) [ClassicSimilarity], result of:
          0.101278044 = score(doc=3301,freq=4.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.40768576 = fieldWeight in 3301, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3301)
        0.027495272 = product of:
          0.054990545 = sum of:
            0.054990545 = weight(_text_:model in 3301) [ClassicSimilarity], result of:
              0.054990545 = score(doc=3301,freq=4.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.30040827 = fieldWeight in 3301, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3301)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-grams). To evaluate the suggested stemming strategies we apply various probabilistic information retrieval (IR) models, including the Okapi model, the Divergence from Randomness (DFR) model, a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the MAP by more than 50%, and these differences are always significant. With an n-gram approach, performance differences are usually smaller than with an approach involving stemming. Finally, our light stemmer tends to perform best, although the performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.
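    The language-independent alternative evaluated above indexes overlapping character n-grams instead of stems, which sidesteps Russian morphology entirely. A minimal sketch (the padding and n = 4 are illustrative choices, not necessarily those used in the paper):
    ```python
    def char_ngrams(word, n=4):
        """Language-independent indexing units: overlapping character
        n-grams, padded so short words still yield at least one unit."""
        padded = f"_{word}_"
        return [padded[i:i + n] for i in range(max(1, len(padded) - n + 1))]

    print(char_ngrams("поиск"))  # n-grams index Russian without stemming
    ```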
  16. Fox, K.L.; Frieder, O.; Knepper, M.M.; Snowberg, E.J.: SENTINEL: a multiple engine information retrieval and visualization system (1999) 0.21
    0.21011007 = product of:
      0.28014675 = sum of:
        0.15266767 = weight(_text_:vector in 3547) [ClassicSimilarity], result of:
          0.15266767 = score(doc=3547,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4980213 = fieldWeight in 3547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3547)
        0.100260146 = weight(_text_:space in 3547) [ClassicSimilarity], result of:
          0.100260146 = score(doc=3547,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.4035883 = fieldWeight in 3547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3547)
        0.027218932 = product of:
          0.054437865 = sum of:
            0.054437865 = weight(_text_:model in 3547) [ClassicSimilarity], result of:
              0.054437865 = score(doc=3547,freq=2.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.29738903 = fieldWeight in 3547, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3547)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    We describe a prototype information retrieval system, SENTINEL, under development at Harris Corporation's Information Systems Division. SENTINEL is a fusion of multiple information retrieval technologies, integrating n-grams, a vector space model, and a neural network training rule. One of the primary advantages of SENTINEL is its three-dimensional visualization capability, which is based fully upon the mathematical representation of information within SENTINEL. This visualization capability provides users with an intuitive understanding, so that relevance and query refinement techniques can be better utilized, resulting in higher retrieval precision.
  17. Schutze, H.; Pederson, J.O.: A cooccurrence-based thesaurus and two applications to information retrieval (1997) 0.20
    0.20439655 = product of:
      0.4087931 = sum of:
        0.24674822 = weight(_text_:vector in 153) [ClassicSimilarity], result of:
          0.24674822 = score(doc=153,freq=4.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.804924 = fieldWeight in 153, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0625 = fieldNorm(doc=153)
        0.16204487 = weight(_text_:space in 153) [ClassicSimilarity], result of:
          0.16204487 = score(doc=153,freq=4.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.6522972 = fieldWeight in 153, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0625 = fieldNorm(doc=153)
      0.5 = coord(2/4)
    
    Abstract
    Presents a new method for computing a thesaurus from a text corpus. Each word is represented as a vector in a multi-dimensional space that captures cooccurrence information. Words are defined to be similar if they have similar cooccurrence patterns. 2 different methods for using these thesaurus vectors in information retrieval are shown to significantly improve performance over the Tipster reference corpus as compared to a vector space baseline
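    Once every word has a cooccurrence vector, a thesaurus entry is simply the word's nearest neighbours in that space, since words with similar cooccurrence patterns are defined to be similar. A minimal sketch, assuming a precomputed matrix of cooccurrence vectors (all names are hypothetical):
    ```python
    import numpy as np

    def thesaurus_entry(word, vocab, M, k=5):
        """vocab: list of words; M: (len(vocab), dims) matrix of
        cooccurrence vectors. Returns the k words whose cooccurrence
        patterns are most similar to `word` (by cosine)."""
        Mn = M / np.maximum(np.linalg.norm(M, axis=1, keepdims=True), 1e-12)
        i = vocab.index(word)
        sims = Mn @ Mn[i]
        order = np.argsort(-sims)
        return [vocab[j] for j in order if j != i][:k]
    ```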
  18. Kiela, D.; Clark, S.: Detecting compositionality of multi-word expressions using nearest neighbours in vector space models (2013) 0.20
    0.20439655 = product of:
      0.4087931 = sum of:
        0.24674822 = weight(_text_:vector in 1161) [ClassicSimilarity], result of:
          0.24674822 = score(doc=1161,freq=4.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.804924 = fieldWeight in 1161, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.0625 = fieldNorm(doc=1161)
        0.16204487 = weight(_text_:space in 1161) [ClassicSimilarity], result of:
          0.16204487 = score(doc=1161,freq=4.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.6522972 = fieldWeight in 1161, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.0625 = fieldNorm(doc=1161)
      0.5 = coord(2/4)
    
    Abstract
    We present a novel unsupervised approach to detecting the compositionality of multi-word expressions. We compute the compositionality of a phrase through substituting the constituent words with their "neighbours" in a semantic vector space and averaging over the distance between the original phrase and the substituted neighbour phrases. Several methods of obtaining neighbours are presented. The results are compared to existing supervised results and achieve state-of-the-art performance on a verb-object dataset of human compositionality ratings.
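    The method itself is compact: substitute each constituent word with its nearest neighbours, and average the similarity between the original phrase vector and the substituted variants; a low average suggests the expression is non-compositional. A minimal sketch assuming additive phrase composition and precomputed word vectors and neighbour lists (both simplifications of the paper's setup):
    ```python
    import numpy as np

    def cosine(u, v):
        den = np.linalg.norm(u) * np.linalg.norm(v)
        return float(u @ v) / den if den else 0.0

    def compositionality(phrase, vectors, neighbours, k=3):
        """phrase: list of words; vectors: word -> np.array; neighbours:
        word -> list of nearest-neighbour words. Returns the mean
        similarity between the phrase vector and neighbour-substituted
        variants; low values suggest non-compositionality."""
        orig = sum(vectors[w] for w in phrase)
        sims = []
        for i, w in enumerate(phrase):
            for nb in neighbours[w][:k]:
                variant = phrase[:i] + [nb] + phrase[i + 1:]
                sims.append(cosine(orig, sum(vectors[x] for x in variant)))
        return sum(sims) / len(sims) if sims else 0.0
    ```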
  19. Vallet, D.; Fernández, M.; Castells, P.: An ontology-based information retrieval model (2005) 0.20
    0.19759221 = product of:
      0.26345628 = sum of:
        0.130858 = weight(_text_:vector in 4708) [ClassicSimilarity], result of:
          0.130858 = score(doc=4708,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4268754 = fieldWeight in 4708, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.046875 = fieldNorm(doc=4708)
        0.08593727 = weight(_text_:space in 4708) [ClassicSimilarity], result of:
          0.08593727 = score(doc=4708,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.34593284 = fieldWeight in 4708, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.046875 = fieldNorm(doc=4708)
        0.046661027 = product of:
          0.09332205 = sum of:
            0.09332205 = weight(_text_:model in 4708) [ClassicSimilarity], result of:
              0.09332205 = score(doc=4708,freq=8.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.50980973 = fieldWeight in 4708, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4708)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Semantic search has been one of the motivations of the Semantic Web since it was envisioned. We propose a model for the exploitation of ontology-based knowledge bases (KBs) to improve search over large document repositories. Our approach includes an ontology-based scheme for the semi-automatic annotation of documents, and a retrieval system. The retrieval model is based on an adaptation of the classic vector space model, including an annotation weighting algorithm and a ranking algorithm. Semantic search is combined with keyword-based search to achieve tolerance to KB incompleteness. Our proposal is illustrated with sample experiments showing improvements with respect to keyword-based search, and providing ground for further research and discussion.
  20. Liu, X.; Turtle, H.: Real-time user interest modeling for real-time ranking (2013) 0.20
    0.19759221 = product of:
      0.26345628 = sum of:
        0.130858 = weight(_text_:vector in 1035) [ClassicSimilarity], result of:
          0.130858 = score(doc=1035,freq=2.0), product of:
            0.30654848 = queryWeight, product of:
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.047605187 = queryNorm
            0.4268754 = fieldWeight in 1035, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.439392 = idf(docFreq=191, maxDocs=44218)
              0.046875 = fieldNorm(doc=1035)
        0.08593727 = weight(_text_:space in 1035) [ClassicSimilarity], result of:
          0.08593727 = score(doc=1035,freq=2.0), product of:
            0.24842183 = queryWeight, product of:
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.047605187 = queryNorm
            0.34593284 = fieldWeight in 1035, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2183776 = idf(docFreq=650, maxDocs=44218)
              0.046875 = fieldNorm(doc=1035)
        0.046661027 = product of:
          0.09332205 = sum of:
            0.09332205 = weight(_text_:model in 1035) [ClassicSimilarity], result of:
              0.09332205 = score(doc=1035,freq=8.0), product of:
                0.1830527 = queryWeight, product of:
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.047605187 = queryNorm
                0.50980973 = fieldWeight in 1035, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.845226 = idf(docFreq=2569, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1035)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    User interest, a highly dynamic information need, is often ignored in most existing information retrieval systems. In this research, we present the results of experiments designed to evaluate the performance of a real-time interest model (RIM) that attempts to identify dynamic, changing query-level interests in social media outputs. Unlike most existing ranking methods, our ranking approach targets the probability that the user is interested in the content of a document, where that interest is subject to very dynamic change. We describe two formulations of the model (a real-time interest vector space model and a real-time interest language model) stemming from classical relevance ranking methods, and develop a novel methodology for evaluating the performance of RIM, using Amazon Mechanical Turk to collect (interest-based) relevance judgments on a daily basis. Our results show that the model usually, although not always, performs better than baseline results obtained from commercial web search engines. We identify factors that affect RIM performance and outline plans for future research.
