Search (3538 results, page 1 of 177)

  • Active filter: year_i:[1990 TO 2000}
  1. Losee, R.M.: Determining information retrieval and filtering performance without experimentation (1995) 0.31
    0.31449008 = sum of:
      0.00823978 = product of:
        0.03295912 = sum of:
          0.03295912 = weight(_text_:based in 3368) [ClassicSimilarity], result of:
            0.03295912 = score(doc=3368,freq=2.0), product of:
              0.14144066 = queryWeight, product of:
                3.0129938 = idf(docFreq=5906, maxDocs=44218)
                0.04694356 = queryNorm
              0.23302436 = fieldWeight in 3368, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.0129938 = idf(docFreq=5906, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3368)
        0.25 = coord(1/4)
      0.15808989 = weight(_text_:term in 3368) [ClassicSimilarity], result of:
        0.15808989 = score(doc=3368,freq=8.0), product of:
          0.21904005 = queryWeight, product of:
            4.66603 = idf(docFreq=1130, maxDocs=44218)
            0.04694356 = queryNorm
          0.72173965 = fieldWeight in 3368, product of:
            2.828427 = tf(freq=8.0), with freq of:
              8.0 = termFreq=8.0
            4.66603 = idf(docFreq=1130, maxDocs=44218)
            0.0546875 = fieldNorm(doc=3368)
      0.12589967 = weight(_text_:frequency in 3368) [ClassicSimilarity], result of:
        0.12589967 = score(doc=3368,freq=2.0), product of:
          0.27643865 = queryWeight, product of:
            5.888745 = idf(docFreq=332, maxDocs=44218)
            0.04694356 = queryNorm
          0.45543438 = fieldWeight in 3368, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.888745 = idf(docFreq=332, maxDocs=44218)
            0.0546875 = fieldNorm(doc=3368)
      0.022260714 = product of:
        0.04452143 = sum of:
          0.04452143 = weight(_text_:22 in 3368) [ClassicSimilarity], result of:
            0.04452143 = score(doc=3368,freq=2.0), product of:
              0.16438834 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04694356 = queryNorm
              0.2708308 = fieldWeight in 3368, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3368)
        0.5 = coord(1/2)
    
    Abstract
    The performance of an information retrieval or text and media filtering system may be determined through analytic methods as well as by traditional simulation or experimental methods. These analytic methods can provide precise statements about expected performance, and can thus determine which of 2 similarly performing systems is superior. For both a single query term and a multiple query term retrieval model, a model for comparing the performance of different probabilistic retrieval methods is developed. This method may be used to compute the average search length for a query, given only knowledge of database parameter values. Describes predictive models for inverse document frequency, binary independence, and relevance feedback based retrieval and filtering. Simulations illustrate how the single term model performs, and sample performance predictions are given for single term and multiple term problems.
    Date
    22. 2.1996 13:14:10
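    Note on the score breakdowns
    The indented trees under each result are Lucene "explain" output for ClassicSimilarity (TF-IDF with coordination and query normalization): tf(freq) = sqrt(freq), idf(docFreq, maxDocs) = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, and coord(m/n) scales a document's summed clause scores by the fraction of query clauses it matches. As a minimal sketch (the function name is mine), the snippet below recomputes the weight(_text_:term in 3368) clause of result 1 from the factors shown in the tree:

      import math

      def classic_similarity_clause(freq, doc_freq, max_docs, query_norm, field_norm):
          """Recompute one weight(...) clause of a ClassicSimilarity explain tree."""
          tf = math.sqrt(freq)                               # tf(freq) = sqrt(freq)
          idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # idf(docFreq, maxDocs)
          query_weight = idf * query_norm                    # queryWeight = idf * queryNorm
          field_weight = tf * idf * field_norm               # fieldWeight = tf * idf * fieldNorm
          return query_weight * field_weight                 # score = queryWeight * fieldWeight

      # weight(_text_:term in 3368): freq=8.0, docFreq=1130, maxDocs=44218,
      # queryNorm=0.04694356, fieldNorm=0.0546875  ->  ~0.15808989
      print(classic_similarity_clause(8.0, 1130, 44218, 0.04694356, 0.0546875))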
  2. Wong, S.K.M.; Yao, Y.Y.: ¬An information-theoretic measure of term specificity (1992) 0.30
    0.29912633 = product of:
      0.3988351 = sum of:
        0.011652809 = product of:
          0.046611235 = sum of:
            0.046611235 = weight(_text_:based in 4807) [ClassicSimilarity], result of:
              0.046611235 = score(doc=4807,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.3295462 = fieldWeight in 4807, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4807)
          0.25 = coord(1/4)
        0.20913327 = weight(_text_:term in 4807) [ClassicSimilarity], result of:
          0.20913327 = score(doc=4807,freq=14.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.9547718 = fieldWeight in 4807, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4807)
        0.17804901 = weight(_text_:frequency in 4807) [ClassicSimilarity], result of:
          0.17804901 = score(doc=4807,freq=4.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.6440815 = fieldWeight in 4807, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4807)
      0.75 = coord(3/4)
    
    Abstract
    The inverse document frequency (IDF) and signal-noise ratio (S/N) approaches are term weighting schemes based on term specificity. However, the existing justifications for these methods are still somewhat inconclusive and sometimes even based on incompatible assumptions. Introduces an information-theoretic measure of term specificity. Shows that the IDF weighting scheme can be derived from the proposed approach by assuming that the frequency of occurrence of each index term is uniform within the set of documents containing the term. The information-theoretic interpretation of term specificity also establishes the relationship between the IDF and S/N methods.
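    As a hedged illustration of the derivation the abstract outlines (the notation is mine, not the authors'): if a term t occurs in n_t of the N documents in the collection and each such document is taken as equally likely, the information carried by observing t is -log p(t) = -log(n_t / N) = log(N / n_t), which is the classical IDF weight.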
  3. Wong, W.Y.P.; Lee, D.L.: Implementation of partial document ranking using inverted files (1993) 0.23
    0.23035437 = product of:
      0.30713916 = sum of:
        0.013317495 = product of:
          0.05326998 = sum of:
            0.05326998 = weight(_text_:based in 6539) [ClassicSimilarity], result of:
              0.05326998 = score(doc=6539,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.37662423 = fieldWeight in 6539, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6539)
          0.25 = coord(1/4)
        0.09033708 = weight(_text_:term in 6539) [ClassicSimilarity], result of:
          0.09033708 = score(doc=6539,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.41242266 = fieldWeight in 6539, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0625 = fieldNorm(doc=6539)
        0.20348458 = weight(_text_:frequency in 6539) [ClassicSimilarity], result of:
          0.20348458 = score(doc=6539,freq=4.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.7360931 = fieldWeight in 6539, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0625 = fieldNorm(doc=6539)
      0.75 = coord(3/4)
    
    Abstract
    Examines the implementations of document ranking based on inverted files. Studies three heuristic methods for implementing the term frequency × inverse document frequency (tf×idf) weighting strategy. The basic idea of the heuristic methods is to process the query terms in an order such that as many top documents as possible can be identified without processing all of the query terms. The heuristics were evaluated and compared, and the results show improved performance. Two methods for estimating the retrieval accuracy were also studied. All experiments were based on four test collections made available with the SMART system.
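    The paper's three heuristics are not reproduced here; as a hedged sketch of the shared idea (score the most selective query terms first and stop once the unprocessed terms can no longer change which documents are on top), with an illustrative tf×idf score and invented data structures (postings maps a term to (doc, tf) pairs):

      import heapq
      from collections import defaultdict

      def ranked_search(query_terms, postings, idf, top_k=10):
          """Term-at-a-time tf-idf scoring with a simple early-termination test."""
          # Process the most selective (highest-idf) terms first.
          terms = sorted(query_terms, key=lambda t: idf.get(t, 0.0), reverse=True)
          scores = defaultdict(float)
          for i, term in enumerate(terms):
              for doc, tf in postings.get(term, []):
                  scores[doc] += tf * idf.get(term, 0.0)
              # Upper bound on what the unprocessed terms could still add to any document.
              bound = sum(idf.get(t, 0.0) * max((tf for _, tf in postings.get(t, [])), default=0.0)
                          for t in terms[i + 1:])
              top = heapq.nlargest(top_k + 1, scores.values())
              if len(top) > top_k and top[top_k - 1] > top[top_k] + bound:
                  break  # the identity of the top-k documents is already decided
          return heapq.nlargest(top_k, scores.items(), key=lambda kv: kv[1])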
  4. Paijmans, H.: Gravity wells of meaning : detecting information rich passages in scientific texts (1997) 0.23
    0.22742893 = product of:
      0.30323857 = sum of:
        0.009416891 = product of:
          0.037667565 = sum of:
            0.037667565 = weight(_text_:based in 7444) [ClassicSimilarity], result of:
              0.037667565 = score(doc=7444,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.26631355 = fieldWeight in 7444, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7444)
          0.25 = coord(1/4)
        0.09033708 = weight(_text_:term in 7444) [ClassicSimilarity], result of:
          0.09033708 = score(doc=7444,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.41242266 = fieldWeight in 7444, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0625 = fieldNorm(doc=7444)
        0.20348458 = weight(_text_:frequency in 7444) [ClassicSimilarity], result of:
          0.20348458 = score(doc=7444,freq=4.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.7360931 = fieldWeight in 7444, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0625 = fieldNorm(doc=7444)
      0.75 = coord(3/4)
    
    Abstract
    Presents research in which 4 term weighting schemes were used to detect information-rich passages in texts, and compares the results. Demonstrates that word categories and frequency-derived weights correlate closely, but that weighting according to the first-mention theory or the cue method shows no correlation with frequency-based weights.
  5. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.20
    0.20030972 = product of:
      0.40061945 = sum of:
        0.35609803 = weight(_text_:frequency in 3117) [ClassicSimilarity], result of:
          0.35609803 = score(doc=3117,freq=4.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            1.288163 = fieldWeight in 3117, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.109375 = fieldNorm(doc=3117)
        0.04452143 = product of:
          0.08904286 = sum of:
            0.08904286 = weight(_text_:22 in 3117) [ClassicSimilarity], result of:
              0.08904286 = score(doc=3117,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.5416616 = fieldWeight in 3117, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3117)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Relation between meaning, lexical productivity and frequency of use
    Date
    28. 2.1999 10:48:22
  6. Sun, Q.; Shaw, D.; Davis, C.H.: ¬A model for estimating the occurrence of same-frequency words and the boundary between high- and low-frequency words in texts (1999) 0.19
    0.19296938 = product of:
      0.38593876 = sum of:
        0.00823978 = product of:
          0.03295912 = sum of:
            0.03295912 = weight(_text_:based in 3063) [ClassicSimilarity], result of:
              0.03295912 = score(doc=3063,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23302436 = fieldWeight in 3063, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3063)
          0.25 = coord(1/4)
        0.377699 = weight(_text_:frequency in 3063) [ClassicSimilarity], result of:
          0.377699 = score(doc=3063,freq=18.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            1.3663031 = fieldWeight in 3063, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3063)
      0.5 = coord(2/4)
    
    Abstract
    A simpler model is proposed for estimating the frequency of any same-frequency words and identifying the boundary point between high-frequency words and low-frequency words in a text. The model, based on a 'maximum-ranking method', assigns ranks to the words and estimates word frequency by a formula. The boundary value between high-frequency and low-frequency words is obtained by taking the square root of the number of different words in the text. This straightforward model was used successfully with both English and Chinese texts
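    A minimal sketch of the boundary rule as stated in the abstract (the function name is mine):

      import math
      from collections import Counter

      def high_low_boundary(text_words):
          """Split a text's vocabulary at sqrt(number of different words),
          the boundary value stated in the abstract above."""
          boundary = math.sqrt(len(set(text_words)))
          counts = Counter(text_words)
          high_frequency = {w for w, c in counts.items() if c > boundary}
          return boundary, high_frequency

      # e.g. a text with 10,000 different words puts the boundary at frequency 100.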
  7. Lee, D.L.; Ren, L.: Document ranking on weight-partitioned signature files (1996) 0.19
    0.18807726 = product of:
      0.3761545 = sum of:
        0.15808989 = weight(_text_:term in 2417) [ClassicSimilarity], result of:
          0.15808989 = score(doc=2417,freq=8.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.72173965 = fieldWeight in 2417, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2417)
        0.21806462 = weight(_text_:frequency in 2417) [ClassicSimilarity], result of:
          0.21806462 = score(doc=2417,freq=6.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.78883547 = fieldWeight in 2417, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2417)
      0.5 = coord(2/4)
    
    Abstract
    Proposes the weight-partitioned signature file, a signature file organization for supporting document ranking. It uses multiple signature files, each corresponding to one term frequency, to represent terms with different term frequencies. Words with the same term frequency in a document are grouped together and hashed into the signature file corresponding to that term frequency. Investigates the effect of false drops on retrieval effectiveness. Analyses the performance of the weight-partitioned signature file under different search strategies and configurations. Obtains an optimal formula for storage allocation to minimise the effect of false drops on document ranks. Analytical results are supported by experiments on document collections.
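    A hedged sketch of the organization described (bit width, bits per word, and the hash are illustrative choices, not the paper's): each term frequency value gets its own superimposed-coding signature, so probing a term against each partition yields an estimated frequency, subject to the false drops the paper analyzes.

      import hashlib
      from collections import Counter

      SIG_BITS = 64
      BITS_PER_WORD = 3

      def word_signature(word):
          # Superimposed coding: set a few hash-derived bits for the word.
          h = int(hashlib.md5(word.encode()).hexdigest(), 16)
          sig = 0
          for i in range(BITS_PER_WORD):
              sig |= 1 << ((h >> (16 * i)) % SIG_BITS)
          return sig

      def weight_partitioned_signatures(doc_words):
          # One signature per term-frequency value; words sharing a tf share a signature.
          parts = {}
          for word, tf in Counter(doc_words).items():
              parts[tf] = parts.get(tf, 0) | word_signature(word)
          return parts

      def estimated_tf(parts, term):
          # Probe each per-frequency signature; matches may include false drops.
          sig = word_signature(term)
          return max((tf for tf, s in parts.items() if s & sig == sig), default=0)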
  8. Jacsó, P.: KnowledgeFinder PsycLIT (1998) 0.18
    0.18272948 = product of:
      0.2436393 = sum of:
        0.009416891 = product of:
          0.037667565 = sum of:
            0.037667565 = weight(_text_:based in 3587) [ClassicSimilarity], result of:
              0.037667565 = score(doc=3587,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.26631355 = fieldWeight in 3587, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3587)
          0.25 = coord(1/4)
        0.09033708 = weight(_text_:term in 3587) [ClassicSimilarity], result of:
          0.09033708 = score(doc=3587,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.41242266 = fieldWeight in 3587, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0625 = fieldNorm(doc=3587)
        0.14388533 = weight(_text_:frequency in 3587) [ClassicSimilarity], result of:
          0.14388533 = score(doc=3587,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.5204964 = fieldWeight in 3587, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0625 = fieldNorm(doc=3587)
      0.75 = coord(3/4)
    
    Abstract
    Reviews PsycLIT from the American Psychological Association on the KnowledgeFinder CD-ROM version from the Aries Systems Corporation. Focuses on its natural language search capability, which can handle informal queries. Searches can be optimised for precision and recall using a slide bar, and the number of results may be maximised. The search system works by multilayered, weighted scoring of each term in the records. It assigns scores based on the frequency and position of the terms in the records and the uniqueness of the terms in the entire database.
  9. Haiqi, Z.: ¬The literature of Qigong : publication patterns and subject headings (1997) 0.17
    0.17040399 = product of:
      0.22720532 = sum of:
        0.079044946 = weight(_text_:term in 862) [ClassicSimilarity], result of:
          0.079044946 = score(doc=862,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.36086982 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=862)
        0.12589967 = weight(_text_:frequency in 862) [ClassicSimilarity], result of:
          0.12589967 = score(doc=862,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.45543438 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0546875 = fieldNorm(doc=862)
        0.022260714 = product of:
          0.04452143 = sum of:
            0.04452143 = weight(_text_:22 in 862) [ClassicSimilarity], result of:
              0.04452143 = score(doc=862,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.2708308 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=862)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Reports results of a bibliometric study of the literature of Qigong: a relaxation technique used to teach patients to control their heart rate, blood pressure, temperature and other involuntary functions through controlled breathing. All articles indexed in the MEDLINE CD-ROM database between 1965 and 1995 were identified using the MeSH term 'breathing exercises'. The articles were analyzed for geographical and language distribution, and a ranking exercise enabled a core list of periodicals to be identified. In addition, the study shed light on the changing frequency of the MeSH terms and evaluated the research areas by measuring the information from the respective MeSH headings.
    Source
    International forum on information and documentation. 22(1997) no.3, S.38-44
  10. Park, Y.C.; Choi, K.-S.: Automatic thesaurus construction using Bayesian networks (1996) 0.17
    0.16562025 = product of:
      0.3312405 = sum of:
        0.12775593 = weight(_text_:term in 6581) [ClassicSimilarity], result of:
          0.12775593 = score(doc=6581,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.58325374 = fieldWeight in 6581, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0625 = fieldNorm(doc=6581)
        0.20348458 = weight(_text_:frequency in 6581) [ClassicSimilarity], result of:
          0.20348458 = score(doc=6581,freq=4.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.7360931 = fieldWeight in 6581, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0625 = fieldNorm(doc=6581)
      0.5 = coord(2/4)
    
    Abstract
    Automatic thesaurus construction is accomplished by extracting term relations mechanically. A popular method uses statistical analysis to discover the term relations. For low-frequency terms, however, the statistical information cannot be used reliably to decide the relationships between terms; this is referred to as the data sparseness problem. Many studies have shown that low-frequency terms are of most use in thesaurus construction. Characterizes the statistical behaviour of terms by using an inference network, and develops a formal approach to the data sparseness problem using a Bayesian network.
  11. Robertson, S.E.; Walker, S.; Hancock-Beaulieu, M.M.: Large test collection experiments of an operational, interactive system : OKAPI at TREC (1995) 0.16
    0.16244806 = product of:
      0.21659742 = sum of:
        0.011652809 = product of:
          0.046611235 = sum of:
            0.046611235 = weight(_text_:based in 6964) [ClassicSimilarity], result of:
              0.046611235 = score(doc=6964,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.3295462 = fieldWeight in 6964, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6964)
          0.25 = coord(1/4)
        0.079044946 = weight(_text_:term in 6964) [ClassicSimilarity], result of:
          0.079044946 = score(doc=6964,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.36086982 = fieldWeight in 6964, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6964)
        0.12589967 = weight(_text_:frequency in 6964) [ClassicSimilarity], result of:
          0.12589967 = score(doc=6964,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.45543438 = fieldWeight in 6964, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6964)
      0.75 = coord(3/4)
    
    Abstract
    The Okapi system has been used in a series of experiments on the TREC collections, investigating probabilistic methods, relevance feedback, query expansion, and interaction issues. Some new probabilistic models have been developed, resulting in simple weighting functions that take account of document length and within-document and within-query term frequency. All have been shown to be beneficial when based on large quantities of relevance data, as in the routing task. Interaction issues are much more difficult to evaluate in the TREC framework, and no benefits have yet been demonstrated from feedback based on small numbers of 'relevant' items identified by intermediary searchers.
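    The weighting functions referred to are the Okapi BM series; as a hedged sketch, the widely cited BM25 form combines document length with within-document and within-query term frequency (the parameter defaults below are conventional values, not taken from the paper):

      import math

      def bm25_weight(tf, qtf, df, n_docs, doc_len, avg_doc_len,
                      k1=1.2, b=0.75, k3=8.0):
          """One term's BM25 contribution: idf x doc-tf saturation x query-tf saturation."""
          idf = math.log((n_docs - df + 0.5) / (df + 0.5))
          doc_part = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
          query_part = qtf * (k3 + 1) / (qtf + k3)
          return idf * doc_part * query_part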
  12. Boynton, J.: Identifying systematic reviews in MEDLINE : developing an objective approach to search strategy design (1998) 0.14
    0.1392412 = product of:
      0.18565494 = sum of:
        0.009988121 = product of:
          0.039952483 = sum of:
            0.039952483 = weight(_text_:based in 2660) [ClassicSimilarity], result of:
              0.039952483 = score(doc=2660,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28246817 = fieldWeight in 2660, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2660)
          0.25 = coord(1/4)
        0.06775281 = weight(_text_:term in 2660) [ClassicSimilarity], result of:
          0.06775281 = score(doc=2660,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.309317 = fieldWeight in 2660, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=2660)
        0.107914 = weight(_text_:frequency in 2660) [ClassicSimilarity], result of:
          0.107914 = score(doc=2660,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.39037234 = fieldWeight in 2660, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.046875 = fieldNorm(doc=2660)
      0.75 = coord(3/4)
    
    Abstract
    Systematic reviews are becoming increasingly important for health care professionals seeking to provide evidence-based health care. In the past, systematic reviews have been difficult to identify among the mass of literature labelled 'reviews'. Reports results of a study to design search strategies based on a more objective approach to strategy construction. MEDLINE was chosen as the database, and word frequencies in the titles, abstracts and subject keywords of a collection of systematic reviews of effective health care interventions were analyzed to derive a highly sensitive search strategy. 'Sensitivity' was used in preference to the usual term 'recall' as one of the measures (in addition to the usual 'precision'). The proposed strategy was found to offer 98% sensitivity in retrieving systematic reviews, while retaining a low but acceptable level of precision (20%). Reports results using other strategies with other levels of sensitivity and precision. Concludes that it is possible to use frequency analysis to generate highly sensitive strategies for retrieving systematic reviews.
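    For reference, the two measures reported, restated as a minimal sketch ('sensitivity' is the paper's preferred name for what is usually called recall):

      def sensitivity(relevant_retrieved, relevant_total):
          # Fraction of all systematic reviews in the database that the strategy finds (= recall).
          return relevant_retrieved / relevant_total

      def precision(relevant_retrieved, retrieved_total):
          # Fraction of the retrieved records that are systematic reviews.
          return relevant_retrieved / retrieved_total

      # The reported operating point: sensitivity 0.98 at precision 0.20.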
  13. Smith, M.P.; Pollitt, S.A.: ¬A comparison of ranking formulae and their ranks (1995) 0.13
    0.13140477 = product of:
      0.26280954 = sum of:
        0.13690987 = weight(_text_:term in 5802) [ClassicSimilarity], result of:
          0.13690987 = score(doc=5802,freq=6.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.62504494 = fieldWeight in 5802, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5802)
        0.12589967 = weight(_text_:frequency in 5802) [ClassicSimilarity], result of:
          0.12589967 = score(doc=5802,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.45543438 = fieldWeight in 5802, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5802)
      0.5 = coord(2/4)
    
    Abstract
    Reports a study to compare the rankings produced by several well known probabilistic formulae. Values for the variables used in these formulae (collection frequency for a query term, number of relevant documents retrieved, and number of relevant documents indexed by the query term) were derived using a random number generator; the number of documents in the collection was fixed at 500,000. This produced ranked bands for each formula using document term characteristics rather than actual documents. These rankings were compared with one another using the Spearman rho rank correlation coefficient to determine how closely the algorithms rank documents. There is little difference in the rankings produced by the Expected Mutual Information Measure (EMIM) and the simpler F4.5 weighting scheme.
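    A minimal sketch of the comparison statistic (the standard Spearman formula, assuming no tied ranks; the inputs stand for the rank each formula assigns to the same items):

      def spearman_rho(rank_a, rank_b):
          """Spearman rank correlation for two rankings of the same items (no ties)."""
          n = len(rank_a)
          d_sq = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
          return 1 - 6 * d_sq / (n * (n ** 2 - 1))

      # rho = 1.0 for identical rankings, -1.0 for exactly reversed ones.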
  14. Larson, R.R.: ¬The decline of subject searching : long-term trends and patterns of index use in an online catalog (1991) 0.13
    0.12733267 = product of:
      0.25466534 = sum of:
        0.06775281 = weight(_text_:term in 1104) [ClassicSimilarity], result of:
          0.06775281 = score(doc=1104,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.309317 = fieldWeight in 1104, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=1104)
        0.18691254 = weight(_text_:frequency in 1104) [ClassicSimilarity], result of:
          0.18691254 = score(doc=1104,freq=6.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.6761447 = fieldWeight in 1104, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.046875 = fieldNorm(doc=1104)
      0.5 = coord(2/4)
    
    Abstract
    Search index usage in a large university online catalog system over a six-year period (representing about 15.3 million searches) was investigated using transaction monitor data. Mathematical models of trends and patterns in the data were developed and tested using regression techniques. The results of the analyses show a consistent decline in the frequency of subject index use by online catalog users, with a corresponding increase in the frequency of title keyword searching. Significant annual patterns in index usage were also identified. Analysis of the transaction data, and related previous studies of online catalog users, suggest a number of factors contributing to the decline in subject search frequency. Chief among these factors are user difficulties in formulating subject queries with LCSH, leading to search failure, and the problem of "information overload" as database size increases. This article presents the models and results of the transaction log analysis, discusses the underlying problems with subject searching contributing to the observed decline, and reviews some proposed improvements to online catalog systems to aid in overcoming these problems.
  15. Hudnut, S.K.: Finding answers by the numbers : statistical analysis of online search results (1993) 0.12
    0.1242152 = product of:
      0.2484304 = sum of:
        0.09581695 = weight(_text_:term in 555) [ClassicSimilarity], result of:
          0.09581695 = score(doc=555,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.4374403 = fieldWeight in 555, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=555)
        0.15261345 = weight(_text_:frequency in 555) [ClassicSimilarity], result of:
          0.15261345 = score(doc=555,freq=4.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.55206984 = fieldWeight in 555, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.046875 = fieldNorm(doc=555)
      0.5 = coord(2/4)
    
    Abstract
    Online searchers today no longer limit themselves to locating references to articles. More and more, they are called upon to locate specific answers to questions such as: Who is my chief competitor for this technology? Who is publishing the most on this subject? What is the geographic distribution of this product? These questions demand answers, not necessarily from record content, but from statistical analysis of the terms in a set of records. Most online services now provide a tool for statistical analysis, such as GET on Orbit, ZOOM on ESA/IRS and RANK/RANK FILES on Dialog. With these commands, users can analyze term frequency to extrapolate very precise answers to a wide range of questions. This paper discusses the many uses of term frequency analysis and how it can be applied to areas of competitive intelligence, market analysis, bibliometric analysis and improvement of search results. The applications are illustrated by examples from Dialog.
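    A hedged sketch of what such ranking commands do with a retrieved set: count the values of one field across all records and list them by frequency (the record layout and field names are invented for illustration):

      from collections import Counter

      def rank_field(records, field):
          """Frequency-rank the values of one field across a set of retrieved records."""
          counts = Counter(value for record in records
                                 for value in record.get(field, []))
          return counts.most_common()

      # e.g. rank_field(results, "assignee") suggests chief competitors in the set;
      # rank_field(results, "country") shows the geographic distribution of a topic.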
  16. Vossen, G.A.: Strategic knowledge acquisition (1996) 0.12
    0.121236674 = product of:
      0.1616489 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 915) [ClassicSimilarity], result of:
              0.028250674 = score(doc=915,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 915, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=915)
          0.25 = coord(1/4)
        0.13550562 = weight(_text_:term in 915) [ClassicSimilarity], result of:
          0.13550562 = score(doc=915,freq=8.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.618634 = fieldWeight in 915, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=915)
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 915) [ClassicSimilarity], result of:
              0.038161222 = score(doc=915,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 915, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=915)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    In the competitive equation for the future, economies become knowledge-based. Therefore, in Knowledge-Intensive Firms (KIFs) the strategic management of knowledge becomes increasingly important. In this paper three important conditions for efficient and effective knowledge acquisition are identified: co-ordination, communication and long-term contracts. Research by the author showed that co-ordination is a relatively important condition for small and medium-sized industrial KIFs. For larger national and multinational industrial KIFs, communication and long-term contracts are the relatively important conditions. Because of the lack of time for co-ordination and communication, a small or medium-sized KIF should welcome an external knowledge broker as an intermediary. Because knowledge is more than R&D, a larger industrial KIF should adopt an approach to strategic knowledge management with an internal knowledge broker, who is responsible for co-ordination, communication and establishing long-term contracts. Furthermore, a Strategic Knowledge Network is an option within KIFs and between KIFs and partners for effective and efficient co-ordination, communication and long-term contracts.
    Source
    Knowledge management: organization, competence and methodology. Proceedings of the Fourth International ISMICK Symposium, 21-22 October 1996, Netherlands. Ed.: J.F. Schreinemakers
  17. Smet, E. de: Evaluation of a computerised community information system through transaction analysis and user survey (1995) 0.12
    0.11730012 = product of:
      0.15640016 = sum of:
        0.00823978 = product of:
          0.03295912 = sum of:
            0.03295912 = weight(_text_:based in 2304) [ClassicSimilarity], result of:
              0.03295912 = score(doc=2304,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23302436 = fieldWeight in 2304, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2304)
          0.25 = coord(1/4)
        0.12589967 = weight(_text_:frequency in 2304) [ClassicSimilarity], result of:
          0.12589967 = score(doc=2304,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.45543438 = fieldWeight in 2304, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2304)
        0.022260714 = product of:
          0.04452143 = sum of:
            0.04452143 = weight(_text_:22 in 2304) [ClassicSimilarity], result of:
              0.04452143 = score(doc=2304,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.2708308 = fieldWeight in 2304, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2304)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Reports on the results of a transaction analysis and user survey evaluating a pilot system for computerized community information in a public library, based on the GDIS system (Gemeenschaps Informatie Documentair System). The non-hierarchical and global approach to the integrated database proved to be useful for novice users. Of the many parameters examined, only frequency of use correlates with retrieval success. The online questionnaire proved to be worthwhile although restricted in scope. The logbook transaction analysis yielded a rich amount of useful management information for the systems managers. The user survey yielded a rich set of data on which to perform statistical analyses according to social science practice, from which some interesting relations could be detected.
    Date
    23.10.1995 19:22:11
  18. Chan, L.M.; Vizine-Goetz, D.: Feasibility of a computer-generated subject validation file based on frequency of occurrence of assigned LC Subject Headings (1995) 0.11
    0.11497667 = product of:
      0.22995333 = sum of:
        0.014125337 = product of:
          0.056501348 = sum of:
            0.056501348 = weight(_text_:based in 6816) [ClassicSimilarity], result of:
              0.056501348 = score(doc=6816,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.39947033 = fieldWeight in 6816, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6816)
          0.25 = coord(1/4)
        0.215828 = weight(_text_:frequency in 6816) [ClassicSimilarity], result of:
          0.215828 = score(doc=6816,freq=2.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.7807447 = fieldWeight in 6816, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.09375 = fieldNorm(doc=6816)
      0.5 = coord(2/4)
    
  19. Keen, E.M.: Designing and testing an interactive ranked retrieval system for professional searchers (1994) 0.11
    0.11248532 = product of:
      0.22497064 = sum of:
        0.09779277 = weight(_text_:term in 1066) [ClassicSimilarity], result of:
          0.09779277 = score(doc=1066,freq=6.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.44646066 = fieldWeight in 1066, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1066)
        0.12717786 = weight(_text_:frequency in 1066) [ClassicSimilarity], result of:
          0.12717786 = score(doc=1066,freq=4.0), product of:
            0.27643865 = queryWeight, product of:
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.04694356 = queryNorm
            0.46005818 = fieldWeight in 1066, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.888745 = idf(docFreq=332, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1066)
      0.5 = coord(2/4)
    
    Abstract
    Reports 3 explorations of ranked system design. 2 tests used a 'cystic fibrosis' test collection with 100 queries. Experiment 1 compared a Boolean with a ranked interactive system, using a subject-qualified trained searcher and reporting recall and precision results. Experiment 2 compared 15 different ranked-match algorithms in a batch mode using 2 test collections, and included some new proximate-pair and term weighting approaches. Experiment 3 is a design plan for an interactive ranked prototype offering mid-search algorithm choices plus other manual search devices (such as obligatory and unwanted terms), as influenced by think-aloud comments from experiment 1. Concludes that, in Boolean versus ranked searching using inverse collection frequency, the searcher inspected more records on the ranked system than on the Boolean one and so achieved higher recall but lower precision; however, the presentation order of the relevant records was, on average, very similar in both systems. Concludes also that: query reformulation was quite strongly practised in ranked searching but does not appear to have been effective; the proximate term-pair weighting methods in experiment 2 enhanced precision on both test collections when used with inverse collection frequency (ICF) weighting; and the design plan for an interactive prototype adds to a selection of match algorithms other devices, such as obligatory and unwanted term marking, evidence for this being found in the think-aloud comments.
  20. Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.11
    0.108049594 = product of:
      0.14406613 = sum of:
        0.008323434 = product of:
          0.033293735 = sum of:
            0.033293735 = weight(_text_:based in 4152) [ClassicSimilarity], result of:
              0.033293735 = score(doc=4152,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23539014 = fieldWeight in 4152, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4152)
          0.25 = coord(1/4)
        0.07984746 = weight(_text_:term in 4152) [ClassicSimilarity], result of:
          0.07984746 = score(doc=4152,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.3645336 = fieldWeight in 4152, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4152)
        0.055895224 = product of:
          0.11179045 = sum of:
            0.11179045 = weight(_text_:assessment in 4152) [ClassicSimilarity], result of:
              0.11179045 = score(doc=4152,freq=4.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.43132967 = fieldWeight in 4152, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4152)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining the relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary, and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary - the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 75% of the uniqueness words - was compared and uniqueness factors were developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subsets of terms produced by the exact and centroid discrimination values were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid) methods. An end-user evaluation showed that the generated terms were largely distinct and had values of word precision consistent with the search precision.
