Search (35 results, page 2 of 2)

  • author_ss:"Robertson, S.E."
  1. MacFarlane, A.; McCann, J.A.; Robertson, S.E.: Parallel methods for the generation of partitioned inverted files (2005) 0.00
    0.0044962796 = product of:
      0.031473957 = sum of:
        0.0060537956 = weight(_text_:information in 651) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=651,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 651, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=651)
        0.025420163 = weight(_text_:retrieval in 651) [ClassicSimilarity], result of:
          0.025420163 = score(doc=651,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.2835858 = fieldWeight in 651, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=651)
      0.14285715 = coord(2/14)
    
    Abstract
    Purpose - The generation of inverted indexes is one of the most computationally intensive activities for information retrieval systems: indexing large multi-gigabyte text databases can take many hours or even days to complete. We examine the generation of partitioned inverted files in order to speed up the process of indexing. Two types of index partitions are investigated: TermId and DocId. Design/methodology/approach - We use standard measures used in parallel computing such as speedup and efficiency to examine the computing results and also the space costs of our trial indexing experiments. Findings - The results from runs on both partitioning methods are compared and contrasted, concluding that DocId is the more efficient method. Practical implications - The practical implications are that the DocId partitioning method would in most circumstances be used for distributing inverted file data in a parallel computer, particularly if indexing speed is the primary consideration. Originality/value - The paper is of value to database administrators who manage large-scale text collections, and who need to use parallel computing to implement their text retrieval services.
  2. Vechtomova, O.; Robertson, S.E.: A domain-independent approach to finding related entities (2012) 0.00
    0.0044962796 = product of:
      0.031473957 = sum of:
        0.0060537956 = weight(_text_:information in 2733) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=2733,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 2733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2733)
        0.025420163 = weight(_text_:retrieval in 2733) [ClassicSimilarity], result of:
          0.025420163 = score(doc=2733,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.2835858 = fieldWeight in 2733, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2733)
      0.14285715 = coord(2/14)
    
    Abstract
    We propose an approach to the retrieval of entities that have a specific relationship with the entity given in a query. Our research goal is to investigate whether the related entity finding problem can be addressed by combining a measure of the relatedness of candidate answer entities to the query with the likelihood that the candidate answer entity belongs to the target entity category specified in the query. An initial list of candidate entities, extracted from top-ranked documents retrieved for the query, is refined using a number of statistical and linguistic methods. The proposed method extracts the category of the target entity from the query, identifies instances of this category as seed entities, and computes similarity between candidate and seed entities. The evaluation was conducted on the Related Entity Finding task of the Entity Track of TREC 2010, as well as the QA list questions from TREC 2005 and 2006. Evaluation results demonstrate that the proposed methods are effective in finding related entities.
    Source
    Information processing and management. 48(2012) no.4, pp. 654-670
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  3. Robertson, S.E.: Indexing theory and retrieval effectiveness (1979) 0.00
    0.004279707 = product of:
      0.059915897 = sum of:
        0.059915897 = weight(_text_:retrieval in 5175) [ClassicSimilarity], result of:
          0.059915897 = score(doc=5175,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.6684181 = fieldWeight in 5175, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.15625 = fieldNorm(doc=5175)
      0.071428575 = coord(1/14)
    
  4. Robertson, S.E.; Walker, S.; Hancock-Beaulieu, M.M.: Large test collection experiments of an operational, interactive system : OKAPI at TREC (1995) 0.00
    0.004004761 = product of:
      0.028033325 = sum of:
        0.0070627616 = weight(_text_:information in 6964) [ClassicSimilarity], result of:
          0.0070627616 = score(doc=6964,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13576832 = fieldWeight in 6964, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6964)
        0.020970564 = weight(_text_:retrieval in 6964) [ClassicSimilarity], result of:
          0.020970564 = score(doc=6964,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.23394634 = fieldWeight in 6964, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6964)
      0.14285715 = coord(2/14)
    
    Source
    Information processing and management. 31(1995) no.3, pp. 345-360
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  5. Vechtomova, O.; Karamuftuoglu, M.; Robertson, S.E.: On document relevance and lexical cohesion between query terms (2006) 0.00
    0.003790876 = product of:
      0.02653613 = sum of:
        0.00856136 = weight(_text_:information in 987) [ClassicSimilarity], result of:
          0.00856136 = score(doc=987,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16457605 = fieldWeight in 987, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=987)
        0.01797477 = weight(_text_:retrieval in 987) [ClassicSimilarity], result of:
          0.01797477 = score(doc=987,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.20052543 = fieldWeight in 987, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=987)
      0.14285715 = coord(2/14)
    
    Abstract
    Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms' occurrences in a document is related to its relevance to the query. Lexical cohesion between distinct query terms in a document is estimated on the basis of the lexical-semantic relations (repetition, synonymy, hyponymy and sibling) that exist between their collocates, that is, words that co-occur with them in the same windows of text. Experiments suggest that significant differences exist between the lexical cohesion in relevant and non-relevant document sets. A document ranking method based on lexical cohesion shows some performance improvements.
    Source
    Information processing and management. 42(2006) no.5, pp. 1230-1247
  6. Robertson, S.E.; Walker, S.; Beaulieu, M.M.; Gatford, M.; Payne, A.: Okapi at TREC-4 (1996) 0.00
    0.0025678244 = product of:
      0.03594954 = sum of:
        0.03594954 = weight(_text_:retrieval in 7546) [ClassicSimilarity], result of:
          0.03594954 = score(doc=7546,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 7546, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=7546)
      0.071428575 = coord(1/14)
    
    Source
    The Fourth Text Retrieval Conference (TREC-4). Ed.: K. Harman
  7. Beaulieu, M.M.; Gatford, M.; Huang, X.; Robertson, S.E.; Walker, S.; Williams, P.: Okapi at TREC-5 (1997) 0.00
    0.0025678244 = product of:
      0.03594954 = sum of:
        0.03594954 = weight(_text_:retrieval in 3097) [ClassicSimilarity], result of:
          0.03594954 = score(doc=3097,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 3097, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=3097)
      0.071428575 = coord(1/14)
    
    Source
    The Fifth Text Retrieval Conference (TREC-5). Ed.: E.M. Voorhees and D.K. Harman
  8. Robertson, S.E.: On term selection for query expansion (1990) 0.00
    0.002420968 = product of:
      0.033893548 = sum of:
        0.033893548 = weight(_text_:retrieval in 2650) [ClassicSimilarity], result of:
          0.033893548 = score(doc=2650,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.37811437 = fieldWeight in 2650, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=2650)
      0.071428575 = coord(1/14)
    
    Abstract
    In the framework of a relevance feedback system, term values or term weights may be used to (a) select new terms for inclusion in a query, and/or (b) weight the terms for retrieval purposes once selected. It has sometimes been assumed that the same weighting formula should be used for both purposes. This paper sketches a quantitative argument which suggests that the two purposes require different weighting formulae.
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  9. Robertson, S.E.: On relevance weight estimation and query expansion (1986) 0.00
    0.0021398535 = product of:
      0.029957948 = sum of:
        0.029957948 = weight(_text_:retrieval in 3875) [ClassicSimilarity], result of:
          0.029957948 = score(doc=3875,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.33420905 = fieldWeight in 3875, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=3875)
      0.071428575 = coord(1/14)
    
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  10. Robertson, S.E.: The parametric description of retrieval tests : Part II: Overall measures (1969) 0.00
    0.002118347 = product of:
      0.029656855 = sum of:
        0.029656855 = weight(_text_:retrieval in 4156) [ClassicSimilarity], result of:
          0.029656855 = score(doc=4156,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.33085006 = fieldWeight in 4156, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4156)
      0.071428575 = coord(1/14)
    
    Abstract
    Two general requirements for overall measures of retrieval effectiveness are proposed: that a measure should be as far as possible independent of generality (interpreted to mean that it can be described in terms of recall and fallout), and that it should be able to measure the effectiveness of a performance curve (it should not be restricted to a simple 2x2 table). Several measures that have been proposed are examined with these conditions in mind. It turns out that most of the satisfactory ones are directly or indirectly related to Swets' measure A, the area under the recall-fallout curve. In particular, Brookes' measure S and Rocchio's normalized recall are versions of A.
  11. Robertson, S.E.: The parametric description of retrieval tests : Part I: The basic parameters (1969) 0.00
    0.0014978976 = product of:
      0.020970564 = sum of:
        0.020970564 = weight(_text_:retrieval in 4155) [ClassicSimilarity], result of:
          0.020970564 = score(doc=4155,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.23394634 = fieldWeight in 4155, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4155)
      0.071428575 = coord(1/14)
    
  12. Robertson, S.E.; Hancock-Beaulieu, M.M.: On the evaluation of IR systems (1992) 0.00
    0.0011531039 = product of:
      0.016143454 = sum of:
        0.016143454 = weight(_text_:information in 2619) [ClassicSimilarity], result of:
          0.016143454 = score(doc=2619,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3103276 = fieldWeight in 2619, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.125 = fieldNorm(doc=2619)
      0.071428575 = coord(1/14)
    
    Source
    Information processing and management. 28(1992) no.4, pp. 457-466
  13. Robertson, S.E.: The probabilistic character of relevance (1977) 0.00
    0.0011531039 = product of:
      0.016143454 = sum of:
        0.016143454 = weight(_text_:information in 7399) [ClassicSimilarity], result of:
          0.016143454 = score(doc=7399,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3103276 = fieldWeight in 7399, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.125 = fieldNorm(doc=7399)
      0.071428575 = coord(1/14)
    
    Source
    Information processing and management. 13(1977), pp. 247-251
  14. Bovey, J.D.; Robertson, S.E.: An algorithm for weighted searching on a Boolean system (1984) 0.00
    0.001008966 = product of:
      0.014125523 = sum of:
        0.014125523 = weight(_text_:information in 788) [ClassicSimilarity], result of:
          0.014125523 = score(doc=788,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.27153665 = fieldWeight in 788, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.109375 = fieldNorm(doc=788)
      0.071428575 = coord(1/14)
    
    Source
    Information technology: research and development. 3(1984) no.1, pp. 84-87
  15. Robertson, S.E.; Walker, S.; Beaulieu, M.: Experimentation as a way of life : Okapi at TREC (2000) 0.00
    8.64828E-4 = product of:
      0.012107591 = sum of:
        0.012107591 = weight(_text_:information in 6030) [ClassicSimilarity], result of:
          0.012107591 = score(doc=6030,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23274569 = fieldWeight in 6030, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6030)
      0.071428575 = coord(1/14)
    
    Source
    Information processing and management. 36(2000) no.1, pp. 95-108
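
Note on the score trees: each score breakdown in this listing has the same shape, so the arithmetic for result 1 (doc 651) can be retraced by hand. The sketch below is only an illustration, not the search engine's own code. It assumes that tf is the square root of the term frequency and that idf is 1 + ln(maxDocs / (docFreq + 1)), which is the form that reproduces the idf values printed above. The function names are hypothetical. The queryNorm and fieldNorm values are copied from the listing rather than derived, because they depend on the full query and on the indexed field length, neither of which is shown here.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency in the form that matches the listed values (assumed)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight, as in each weight(...) subtree."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm             # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the breakdown for doc 651 (result 1).
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.046875

information = term_score(2.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)
retrieval = term_score(4.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(2/14): two of the fourteen query clauses matched this document.
coord = 2 / 14
total = (information + retrieval) * coord

print(f"information={information:.10f} retrieval={retrieval:.10f} total={total:.10f}")
```

Up to floating-point rounding, the total should land close to the listed 0.0044962796. The same function applies to the other results by substituting their freq, fieldNorm, and coord values. Results with coord(1/14) matched only one query clause, which is why they rank lower even when their single term weight is larger.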