Search (337 results, page 1 of 17)

  • theme_ss:"Retrievalalgorithmen"
  1. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.11
    0.10531742 = product of:
      0.21063484 = sum of:
        0.047231287 = weight(_text_:retrieval in 5108) [ClassicSimilarity], result of:
          0.047231287 = score(doc=5108,freq=4.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.37811437 = fieldWeight in 5108, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=5108)
        0.008925388 = weight(_text_:of in 5108) [ClassicSimilarity], result of:
          0.008925388 = score(doc=5108,freq=2.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.13821793 = fieldWeight in 5108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=5108)
        0.008828212 = product of:
          0.017656423 = sum of:
            0.017656423 = weight(_text_:on in 5108) [ClassicSimilarity], result of:
              0.017656423 = score(doc=5108,freq=2.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.19440265 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
        0.14564995 = sum of:
          0.10089115 = weight(_text_:computers in 5108) [ClassicSimilarity], result of:
            0.10089115 = score(doc=5108,freq=2.0), product of:
              0.21710795 = queryWeight, product of:
                5.257537 = idf(docFreq=625, maxDocs=44218)
                0.041294612 = queryNorm
              0.464705 = fieldWeight in 5108, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.257537 = idf(docFreq=625, maxDocs=44218)
                0.0625 = fieldNorm(doc=5108)
          0.0447588 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
            0.0447588 = score(doc=5108,freq=2.0), product of:
              0.1446067 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.041294612 = queryNorm
              0.30952093 = fieldWeight in 5108, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=5108)
      0.5 = coord(4/8)
    
    Abstract
    In this paper, methods for both speeding up passage processing and examining more passages using parallel computers are explored. The number of passages processed is varied in order to examine the effect on retrieval effectiveness and efficiency. The particular algorithm applied has previously been used to good effect in Okapi experiments at TREC. This algorithm and the mechanism for applying parallel computing to speed up processing are described.
    Date
    20. 1.2007 18:30:22
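
    Note on the score breakdown: the per-term figures in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative only and are not part of any library. The input values are copied from the "retrieval" clause of result 1 (doc=5108).

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # Assumed ClassicSimilarity tf: square root of the raw term frequency
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as laid out in the explain tree
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight

# Values taken from the "retrieval" clause for doc=5108
score = term_score(freq=4.0, doc_freq=5836, max_docs=44218,
                   query_norm=0.041294612, field_norm=0.0625)
print(f"{score:.6f}")
```

    Lucene computes these values in 32-bit floats, so a double-precision recomputation may differ from the listed figure in the last digits.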
  2. Faloutsos, C.: Signature files (1992) 0.09
    0.08671036 = product of:
      0.13873658 = sum of:
        0.057846278 = weight(_text_:retrieval in 3499) [ClassicSimilarity], result of:
          0.057846278 = score(doc=3499,freq=6.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.46309367 = fieldWeight in 3499, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.03422346 = weight(_text_:use in 3499) [ClassicSimilarity], result of:
          0.03422346 = score(doc=3499,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.27065295 = fieldWeight in 3499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.0154592255 = weight(_text_:of in 3499) [ClassicSimilarity], result of:
          0.0154592255 = score(doc=3499,freq=6.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.23940048 = fieldWeight in 3499, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.008828212 = product of:
          0.017656423 = sum of:
            0.017656423 = weight(_text_:on in 3499) [ClassicSimilarity], result of:
              0.017656423 = score(doc=3499,freq=2.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.19440265 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.5 = coord(1/2)
        0.0223794 = product of:
          0.0447588 = sum of:
            0.0447588 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
              0.0447588 = score(doc=3499,freq=2.0), product of:
                0.1446067 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041294612 = queryNorm
                0.30952093 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.5 = coord(1/2)
      0.625 = coord(5/8)
    
    Abstract
    Presents a survey and discussion of signature-based text retrieval methods. Describes the main idea behind the signature approach and its advantages over other text retrieval methods, provides a classification of the signature methods that have appeared in the literature, describes the main representatives of each class together with their relative advantages and drawbacks, and gives a list of applications as well as commercial or university prototypes that use the signature approach.
    Date
    7. 5.1999 15:22:48
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
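
    Note on the score breakdown: the "product of / sum of / coord" nesting in the explain tree above can be checked with simple arithmetic. The sketch below assumes the usual Lucene behaviour for this kind of Boolean query, where the scores of the matching clauses are summed and multiplied by coord(matched/total). The helper name `combine` is hypothetical. The clause scores are copied from result 2 (doc=3499).

```python
def combine(clause_scores, total_clauses):
    # Sum the scores of the matching clauses, then scale by the
    # coordination factor: matched clauses / total clauses.
    coord = len(clause_scores) / total_clauses
    return sum(clause_scores) * coord

# Nested sub-queries for doc=3499: one of two clauses matched, so coord(1/2)
inner_on = combine([0.017656423], 2)
inner_22 = combine([0.0447588], 2)

# Top level: five of eight clauses matched, so coord(5/8)
total = combine([0.057846278, 0.03422346, 0.0154592255, inner_on, inner_22], 8)
print(f"{total:.6f}")
```

    Under these assumptions the result agrees with the 0.08671036 listed at the top of the tree, which the result line rounds to 0.09.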
  3. Smeaton, A.F.; Rijsbergen, C.J. van: The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.09
    0.085731864 = product of:
      0.17146373 = sum of:
        0.10123099 = weight(_text_:retrieval in 2134) [ClassicSimilarity], result of:
          0.10123099 = score(doc=2134,freq=6.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.8104139 = fieldWeight in 2134, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
        0.015619429 = weight(_text_:of in 2134) [ClassicSimilarity], result of:
          0.015619429 = score(doc=2134,freq=2.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.24188137 = fieldWeight in 2134, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
        0.01544937 = product of:
          0.03089874 = sum of:
            0.03089874 = weight(_text_:on in 2134) [ClassicSimilarity], result of:
              0.03089874 = score(doc=2134,freq=2.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.34020463 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
        0.039163947 = product of:
          0.078327894 = sum of:
            0.078327894 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.078327894 = score(doc=2134,freq=2.0), product of:
                0.1446067 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041294612 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Date
    30. 3.2001 13:32:22
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  4. Nakkouzi, Z.S.; Eastman, C.M.: Query formulation for handling negation in information retrieval systems (1990) 0.07
    0.069475316 = product of:
      0.13895063 = sum of:
        0.047231287 = weight(_text_:retrieval in 3531) [ClassicSimilarity], result of:
          0.047231287 = score(doc=3531,freq=4.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.37811437 = fieldWeight in 3531, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3531)
        0.05927678 = weight(_text_:use in 3531) [ClassicSimilarity], result of:
          0.05927678 = score(doc=3531,freq=6.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.4687847 = fieldWeight in 3531, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0625 = fieldNorm(doc=3531)
        0.023614356 = weight(_text_:of in 3531) [ClassicSimilarity], result of:
          0.023614356 = score(doc=3531,freq=14.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.36569026 = fieldWeight in 3531, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3531)
        0.008828212 = product of:
          0.017656423 = sum of:
            0.017656423 = weight(_text_:on in 3531) [ClassicSimilarity], result of:
              0.017656423 = score(doc=3531,freq=2.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.19440265 = fieldWeight in 3531, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3531)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    Queries containing negation are widely recognised as presenting problems for both users and systems. In information retrieval systems such problems usually manifest themselves in the use of the NOT operator. Describes an algorithm to transform Boolean queries with negated terms into queries without negation; the transformation process is based on the use of a hierarchical thesaurus. Examines a set of user requests submitted to the Thomas Cooper Library at the University of South Carolina to determine the pattern and frequency of use of negation.
    Source
    Journal of the American Society for Information Science. 41(1990) no.3, S.171-182
  5. Stanfill, C.: Parallel information retrieval algorithms (1992) 0.07
    0.06853892 = product of:
      0.13707784 = sum of:
        0.057846278 = weight(_text_:retrieval in 3515) [ClassicSimilarity], result of:
          0.057846278 = score(doc=3515,freq=6.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.46309367 = fieldWeight in 3515, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3515)
        0.019957775 = weight(_text_:of in 3515) [ClassicSimilarity], result of:
          0.019957775 = score(doc=3515,freq=10.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.3090647 = fieldWeight in 3515, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3515)
        0.008828212 = product of:
          0.017656423 = sum of:
            0.017656423 = weight(_text_:on in 3515) [ClassicSimilarity], result of:
              0.017656423 = score(doc=3515,freq=2.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.19440265 = fieldWeight in 3515, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3515)
          0.5 = coord(1/2)
        0.050445575 = product of:
          0.10089115 = sum of:
            0.10089115 = weight(_text_:computers in 3515) [ClassicSimilarity], result of:
              0.10089115 = score(doc=3515,freq=2.0), product of:
                0.21710795 = queryWeight, product of:
                  5.257537 = idf(docFreq=625, maxDocs=44218)
                  0.041294612 = queryNorm
                0.464705 = fieldWeight in 3515, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.257537 = idf(docFreq=625, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3515)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    Data parallel computers, such as the Connection Machine CM-2, can provide interactive access to text databases containing tens, hundreds or even thousands of gigabytes of data. Starts by presenting a brief overview of data parallel computing, a performance model of the CM-2, and a model of the workload involved in searching text databases. Discusses various algorithms used in information retrieval and gives performance estimates based on the data and processing models presented.
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
  6. Beaulieu, M.; Jones, S.: Interactive searching and interface issues in the Okapi best match probabilistic retrieval system (1998) 0.07
    0.06801399 = product of:
      0.13602798 = sum of:
        0.06534432 = weight(_text_:retrieval in 430) [ClassicSimilarity], result of:
          0.06534432 = score(doc=430,freq=10.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.5231199 = fieldWeight in 430, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=430)
        0.015619429 = weight(_text_:of in 430) [ClassicSimilarity], result of:
          0.015619429 = score(doc=430,freq=8.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.24188137 = fieldWeight in 430, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=430)
        0.010924355 = product of:
          0.02184871 = sum of:
            0.02184871 = weight(_text_:on in 430) [ClassicSimilarity], result of:
              0.02184871 = score(doc=430,freq=4.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.24056101 = fieldWeight in 430, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=430)
          0.5 = coord(1/2)
        0.044139877 = product of:
          0.088279754 = sum of:
            0.088279754 = weight(_text_:computers in 430) [ClassicSimilarity], result of:
              0.088279754 = score(doc=430,freq=2.0), product of:
                0.21710795 = queryWeight, product of:
                  5.257537 = idf(docFreq=625, maxDocs=44218)
                  0.041294612 = queryNorm
                0.40661687 = fieldWeight in 430, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.257537 = idf(docFreq=625, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=430)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    Explores interface design issues raised by the development and evaluation of Okapi, a highly interactive information retrieval system based on a probabilistic retrieval model with relevance feedback. It uses term frequency weighting functions to display retrieved items in a best match ranked order; it can also find additional items similar to those marked as relevant by the searcher. Compares the effectiveness of automatic and interactive query expansion in different user interface environments. Focuses on the nature of interaction in information retrieval and the interrelationship between functional visibility, the user's cognitive loading and the balance of control between user and system.
    Source
    Interacting with computers. 10(1998) no.3, S.237-248
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  7. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.07
    0.067500316 = product of:
      0.18000084 = sum of:
        0.066795126 = weight(_text_:retrieval in 402) [ClassicSimilarity], result of:
          0.066795126 = score(doc=402,freq=2.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.5347345 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.06844692 = weight(_text_:use in 402) [ClassicSimilarity], result of:
          0.06844692 = score(doc=402,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.5413059 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.0447588 = product of:
          0.0895176 = sum of:
            0.0895176 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.0895176 = score(doc=402,freq=2.0), product of:
                0.1446067 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041294612 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.375 = coord(3/8)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  8. Losada, D.E.; Barreiro, A.: Embedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.07
    0.067203455 = product of:
      0.13440691 = sum of:
        0.057846278 = weight(_text_:retrieval in 1422) [ClassicSimilarity], result of:
          0.057846278 = score(doc=1422,freq=6.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.46309367 = fieldWeight in 1422, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
        0.03422346 = weight(_text_:use in 1422) [ClassicSimilarity], result of:
          0.03422346 = score(doc=1422,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.27065295 = fieldWeight in 1422, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
        0.019957775 = weight(_text_:of in 1422) [ClassicSimilarity], result of:
          0.019957775 = score(doc=1422,freq=10.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.3090647 = fieldWeight in 1422, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
        0.0223794 = product of:
          0.0447588 = sum of:
            0.0447588 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
              0.0447588 = score(doc=1422,freq=2.0), product of:
                0.1446067 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041294612 = queryNorm
                0.30952093 = fieldWeight in 1422, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations along with the use of such classical notions are promising characteristics for IR systems. The approach proposed here has been efficiently implemented and experiments against test collections are presented.
    Date
    22. 3.2003 19:27:23
    Footnote
    Contribution to a special issue: Mathematical, logical, and formal methods in information retrieval
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.4, S.285-301
  9. Chang, R.: Keyword searching and indexing (1993) 0.07
    0.0653445 = product of:
      0.130689 = sum of:
        0.033397563 = weight(_text_:retrieval in 7223) [ClassicSimilarity], result of:
          0.033397563 = score(doc=7223,freq=2.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.26736724 = fieldWeight in 7223, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=7223)
        0.03422346 = weight(_text_:use in 7223) [ClassicSimilarity], result of:
          0.03422346 = score(doc=7223,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.27065295 = fieldWeight in 7223, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0625 = fieldNorm(doc=7223)
        0.012622404 = weight(_text_:of in 7223) [ClassicSimilarity], result of:
          0.012622404 = score(doc=7223,freq=4.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.19546966 = fieldWeight in 7223, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=7223)
        0.050445575 = product of:
          0.10089115 = sum of:
            0.10089115 = weight(_text_:computers in 7223) [ClassicSimilarity], result of:
              0.10089115 = score(doc=7223,freq=2.0), product of:
                0.21710795 = queryWeight, product of:
                  5.257537 = idf(docFreq=625, maxDocs=44218)
                  0.041294612 = queryNorm
                0.464705 = fieldWeight in 7223, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.257537 = idf(docFreq=625, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7223)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    Explains how a computer indexing system works. Reviews fundamentals of how data are stored and retrieved by computers. Describes B-Tree and B+-Tree indexing structures. Gives basic keyword searching techniques that the user must apply to make use of the indexing programs. The demand for keyword retrieval is increasing and librarians should expect to see the keyword-indexing feature become commonly available.
  10. Dang, E.K.F.; Luk, R.W.P.; Allan, J.: Beyond bag-of-words : bigram-enhanced context-dependent term weights (2014) 0.06
    0.06140661 = product of:
      0.12281322 = sum of:
        0.046674512 = weight(_text_:retrieval in 1283) [ClassicSimilarity], result of:
          0.046674512 = score(doc=1283,freq=10.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.37365708 = fieldWeight in 1283, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1283)
        0.037047986 = weight(_text_:use in 1283) [ClassicSimilarity], result of:
          0.037047986 = score(doc=1283,freq=6.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.29299045 = fieldWeight in 1283, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1283)
        0.02675291 = weight(_text_:of in 1283) [ClassicSimilarity], result of:
          0.02675291 = score(doc=1283,freq=46.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.41429368 = fieldWeight in 1283, product of:
              6.78233 = tf(freq=46.0), with freq of:
                46.0 = termFreq=46.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1283)
        0.012337802 = product of:
          0.024675604 = sum of:
            0.024675604 = weight(_text_:on in 1283) [ClassicSimilarity], result of:
              0.024675604 = score(doc=1283,freq=10.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.271686 = fieldWeight in 1283, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1283)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    While term independence is a widely held assumption in most of the established information retrieval approaches, it is clearly not true and various works in the past have investigated a relaxation of the assumption. One approach is to use n-grams in document representation instead of unigrams. However, the majority of early works on n-grams obtained only modest performance improvement. On the other hand, the use of information based on supporting terms or "contexts" of queries has been found to be promising. In particular, recent studies showed that using new context-dependent term weights improved the performance of relevance feedback (RF) retrieval compared with using traditional bag-of-words BM25 term weights. Calculation of the new term weights requires an estimation of the local probability of relevance of each query term occurrence. In previous studies, the estimation of this probability was based on unigrams that occur in the neighborhood of a query term. We explore an integration of the n-gram and context approaches by computing context-dependent term weights based on a mixture of unigrams and bigrams. Extensive experiments are performed using the title queries of the Text Retrieval Conference (TREC)-6, TREC-7, TREC-8, and TREC-2005 collections, for RF with relevance judgment of either the top 10 or top 20 documents of an initial retrieval. We identify some crucial elements needed in the use of bigrams in our methods, such as proper inverse document frequency (IDF) weighting of the bigrams and noise reduction by pruning bigrams with large document frequency values. We show that enhancing context-dependent term weights with bigrams is effective in further improving retrieval performance.
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.6, S.1134-1148
  11. Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.06
    0.0613705 = product of:
      0.122741 = sum of:
        0.06135524 = weight(_text_:retrieval in 1451) [ClassicSimilarity], result of:
          0.06135524 = score(doc=1451,freq=12.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.49118498 = fieldWeight in 1451, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
        0.025667597 = weight(_text_:use in 1451) [ClassicSimilarity], result of:
          0.025667597 = score(doc=1451,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.20298971 = fieldWeight in 1451, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
        0.018933605 = weight(_text_:of in 1451) [ClassicSimilarity], result of:
          0.018933605 = score(doc=1451,freq=16.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.2932045 = fieldWeight in 1451, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
        0.016784549 = product of:
          0.033569098 = sum of:
            0.033569098 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
              0.033569098 = score(doc=1451,freq=2.0), product of:
                0.1446067 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041294612 = queryNorm
                0.23214069 = fieldWeight in 1451, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1451)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
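    Note on the score breakdown
    The explain tree above follows the usual Lucene ClassicSimilarity arithmetic: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and each term contributes queryWeight * fieldWeight before the coord factors are applied. The following is a minimal sketch that recomputes the figures for doc 1451 from the values printed above; the function names are illustrative and not part of the search system.

```python
import math

MAX_DOCS = 44218          # maxDocs, read from the explain output
QUERY_NORM = 0.041294612  # queryNorm, read from the explain output


def term_score(freq, doc_freq, field_norm):
    """Score of one term: queryWeight * fieldWeight (ClassicSimilarity)."""
    tf = math.sqrt(freq)                             # tf(freq)
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))  # idf(docFreq, maxDocs)
    query_weight = idf * QUERY_NORM
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


def score_doc_1451():
    """Recombine the four matching clauses shown for doc 1451."""
    norm = 0.046875                                  # fieldNorm(doc=1451)
    retrieval = term_score(12.0, 5836, norm)
    use = term_score(2.0, 5623, norm)
    of = term_score(16.0, 25162, norm)
    date_22 = 0.5 * term_score(2.0, 3622, norm)      # inner coord(1/2)
    return 0.5 * (retrieval + use + of + date_22)    # outer coord(4/8)


if __name__ == "__main__":
    print(score_doc_1451())
```

    Small differences in the last digits are to be expected, since Lucene computes these values in single precision.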
    
    Abstract
    Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
    Date
    22. 3.2003 19:27:36
    Footnote
    Introduction to the contributions of a special issue: Mathematical, logical, and formal methods in information retrieval
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.4, S.281-284
  12. Cross-language information retrieval (1998) 0.06
    0.0610175 = product of:
      0.122035 = sum of:
        0.04174695 = weight(_text_:retrieval in 6299) [ClassicSimilarity], result of:
          0.04174695 = score(doc=6299,freq=32.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.33420905 = fieldWeight in 6299, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.015124777 = weight(_text_:use in 6299) [ClassicSimilarity], result of:
          0.015124777 = score(doc=6299,freq=4.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.11961284 = fieldWeight in 6299, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.015778005 = weight(_text_:of in 6299) [ClassicSimilarity], result of:
          0.015778005 = score(doc=6299,freq=64.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.24433708 = fieldWeight in 6299, product of:
              8.0 = tf(freq=64.0), with freq of:
                64.0 = termFreq=64.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.04938527 = sum of:
          0.013515383 = weight(_text_:on in 6299) [ClassicSimilarity], result of:
            0.013515383 = score(doc=6299,freq=12.0), product of:
              0.090823986 = queryWeight, product of:
                2.199415 = idf(docFreq=13325, maxDocs=44218)
                0.041294612 = queryNorm
              0.14880852 = fieldWeight in 6299, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                2.199415 = idf(docFreq=13325, maxDocs=44218)
                0.01953125 = fieldNorm(doc=6299)
          0.035869885 = weight(_text_:line in 6299) [ClassicSimilarity], result of:
            0.035869885 = score(doc=6299,freq=2.0), product of:
              0.23157367 = queryWeight, product of:
                5.6078424 = idf(docFreq=440, maxDocs=44218)
                0.041294612 = queryNorm
              0.15489621 = fieldWeight in 6299, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6078424 = idf(docFreq=440, maxDocs=44218)
                0.01953125 = fieldNorm(doc=6299)
      0.5 = coord(4/8)
    
    Content
    Contains the contributions: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. and W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. et al.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. and C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. et al.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. et al.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. et al.: Building a Large Multilingual Test Collection from Comparable News Documents; OARD, D.W. and B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness
    Footnote
    Reviewed in: Machine translation review: 1999, no.10, S.26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davis (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts. 
Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
    Christian Fluhr et al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real-time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process. 
Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
    The retrieved output from a query including the phrase 'big rockets' may be, for instance, a sentence containing 'giant rocket' which is semantically ranked above 'military rocket'. David Hull (Xerox Research Centre, Grenoble) describes an implementation of a weighted Boolean model for Spanish-English CLIR. Users construct Boolean-type queries, weighting each term in the query, which is then translated by an on-line dictionary before being applied to the database. Comparisons with the performance of unweighted free-form queries ('vector space' models) proved encouraging. Two contributions consider the evaluation of CLIR systems. In order to by-pass the time-consuming and expensive process of assembling a standard collection of documents and of user queries against which the performance of a CLIR system is manually assessed, Páraic Sheridan et al (ETH Zurich) propose a method based on retrieving 'seed documents'. This involves identifying a unique document in a database (the 'seed document') and, for a number of queries, measuring how fast it is retrieved. The authors have also assembled a large database of multilingual news documents for testing purposes. By storing the (fairly short) documents in a structured form tagged with descriptor codes (e.g. for topic, country and area), the test suite is easily expanded while remaining consistent for the purposes of testing. Douglas Oard and Bonnie Dorr (University of Maryland) describe an evaluation methodology which appears to apply LSI techniques in order to filter and rank incoming documents designed for testing CLIR systems. The volume provides the reader with an excellent overview of several projects in CLIR. It is well supported with references and is intended as a secondary text for researchers and practitioners. It highlights the need for a good, general tutorial introduction to the field."
    Series
    The Kluwer International series on information retrieval
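    Note on recall and precision
    The review quoted in the footnote measures a query's success by recall (the share of relevant documents that are found) and precision (the share of found documents that are relevant). A minimal sketch of the two measures, using made-up document identifiers and a hypothetical function name:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers returned by the search engine
    relevant:  identifiers judged relevant for the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Illustrative identifiers only; they do not refer to real documents.
    print(precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"]))
```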
  13. Computational information retrieval (2001) 0.06
    0.05918767 = product of:
      0.11837534 = sum of:
        0.06627123 = weight(_text_:retrieval in 4167) [ClassicSimilarity], result of:
          0.06627123 = score(doc=4167,freq=14.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.5305404 = fieldWeight in 4167, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4167)
        0.025667597 = weight(_text_:use in 4167) [ClassicSimilarity], result of:
          0.025667597 = score(doc=4167,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.20298971 = fieldWeight in 4167, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.046875 = fieldNorm(doc=4167)
        0.014968331 = weight(_text_:of in 4167) [ClassicSimilarity], result of:
          0.014968331 = score(doc=4167,freq=10.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.23179851 = fieldWeight in 4167, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=4167)
        0.011468184 = product of:
          0.022936368 = sum of:
            0.022936368 = weight(_text_:on in 4167) [ClassicSimilarity], result of:
              0.022936368 = score(doc=4167,freq=6.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.25253648 = fieldWeight in 4167, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4167)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    This volume contains selected papers that focus on the use of linear algebra, computational statistics, and computer science in the development of algorithms and software systems for text retrieval. Experts in information modeling and retrieval share their perspectives on the design of scalable but precise text retrieval systems, revealing many of the challenges and obstacles that mathematical and statistical models must overcome to be viable for automated text processing. This very useful proceedings is an excellent companion for courses in information retrieval, applied linear algebra, and applied statistics. Computational Information Retrieval provides background material on vector space models for text retrieval that applied mathematicians, statisticians, and computer scientists may not be familiar with. For graduate students in these areas, several research questions in information modeling are exposed. In addition, several case studies concerning the efficacy of the popular Latent Semantic Analysis (or Indexing) approach are provided.
  14. Robertson, A.M.; Willett, P.: Use of genetic algorithms in information retrieval (1995) 0.06
    0.0560728 = product of:
      0.1121456 = sum of:
        0.047231287 = weight(_text_:retrieval in 2418) [ClassicSimilarity], result of:
          0.047231287 = score(doc=2418,freq=4.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.37811437 = fieldWeight in 2418, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=2418)
        0.03422346 = weight(_text_:use in 2418) [ClassicSimilarity], result of:
          0.03422346 = score(doc=2418,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.27065295 = fieldWeight in 2418, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0625 = fieldNorm(doc=2418)
        0.021862645 = weight(_text_:of in 2418) [ClassicSimilarity], result of:
          0.021862645 = score(doc=2418,freq=12.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.33856338 = fieldWeight in 2418, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=2418)
        0.008828212 = product of:
          0.017656423 = sum of:
            0.017656423 = weight(_text_:on in 2418) [ClassicSimilarity], result of:
              0.017656423 = score(doc=2418,freq=2.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.19440265 = fieldWeight in 2418, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2418)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    Reviews the basic techniques involving genetic algorithms and their application to two problems in information retrieval: the generation of equifrequent groups of index terms; and the identification of optimal query and term weights. The algorithm developed for the generation of equifrequent groupings proved to be effective in operation, achieving results comparable with those obtained using a good deterministic algorithm. The algorithm developed for the identification of optimal query and term weighting involves a fitness function that is based on full relevance information.
  15. Efthimiadis, E.N.: Interactive query expansion : a user-based evaluation in a relevance feedback environment (2000) 0.06
    0.05606333 = product of:
      0.14950222 = sum of:
        0.033397563 = weight(_text_:retrieval in 5701) [ClassicSimilarity], result of:
          0.033397563 = score(doc=5701,freq=8.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.26736724 = fieldWeight in 5701, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=5701)
        0.01728394 = weight(_text_:of in 5701) [ClassicSimilarity], result of:
          0.01728394 = score(doc=5701,freq=30.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.26765788 = fieldWeight in 5701, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=5701)
        0.098820716 = sum of:
          0.017656423 = weight(_text_:on in 5701) [ClassicSimilarity], result of:
            0.017656423 = score(doc=5701,freq=8.0), product of:
              0.090823986 = queryWeight, product of:
                2.199415 = idf(docFreq=13325, maxDocs=44218)
                0.041294612 = queryNorm
              0.19440265 = fieldWeight in 5701, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                2.199415 = idf(docFreq=13325, maxDocs=44218)
                0.03125 = fieldNorm(doc=5701)
          0.08116429 = weight(_text_:line in 5701) [ClassicSimilarity], result of:
            0.08116429 = score(doc=5701,freq=4.0), product of:
              0.23157367 = queryWeight, product of:
                5.6078424 = idf(docFreq=440, maxDocs=44218)
                0.041294612 = queryNorm
              0.35049015 = fieldWeight in 5701, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.6078424 = idf(docFreq=440, maxDocs=44218)
                0.03125 = fieldNorm(doc=5701)
      0.375 = coord(3/8)
    
    Abstract
    A user-centered investigation of interactive query expansion within the context of a relevance feedback system is presented in this article. Data were collected from 25 searches using the INSPEC database. The data collection mechanisms included questionnaires, transaction logs, and relevance evaluations. The results discuss issues that relate to query expansion, retrieval effectiveness, the correspondence of the on-line-to-off-line relevance judgments, and the selection of terms for query expansion by users (interactive query expansion). The main conclusions drawn from the results of the study are that: (1) one-third of the terms presented to users in a list of candidate terms for query expansion was identified by the users as potentially useful for query expansion. (2) These terms were mainly judged as either variant expressions (synonyms) or alternative (related) terms to the initial query terms. However, a substantial portion of the selected terms were identified as representing new ideas. (3) The relationships identified between the five best terms selected by the users for query expansion and the initial query terms were that: (a) 34% of the query expansion terms have no relationship or other type of correspondence with a query term; (b) 66% of the remaining query expansion terms have a relationship to the query terms. These relationships were: narrower term (46%), broader term (3%), related term (17%). (4) The results provide evidence for the effectiveness of interactive query expansion. The initial search produced on average three highly relevant documents; the query expansion search produced on average nine further highly relevant documents. The conclusions highlight the need for more research on: interactive query expansion, the comparative evaluation of automatic vs. 
interactive query expansion, the study of weighted Web-based or Web-accessible retrieval systems in operational environments, and for user studies in searching ranked retrieval systems in general.
    Source
    Journal of the American Society for Information Science. 51(2000) no.11, S.989-1003
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  16. Perry, R.; Willett, P.: ¬A review of the use of inverted files for best match searching in information retrieval systems (1983) 0.05
    0.05452141 = product of:
      0.14539044 = sum of:
        0.058445733 = weight(_text_:retrieval in 2701) [ClassicSimilarity], result of:
          0.058445733 = score(doc=2701,freq=2.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.46789268 = fieldWeight in 2701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=2701)
        0.059891056 = weight(_text_:use in 2701) [ClassicSimilarity], result of:
          0.059891056 = score(doc=2701,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.47364265 = fieldWeight in 2701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.109375 = fieldNorm(doc=2701)
        0.027053645 = weight(_text_:of in 2701) [ClassicSimilarity], result of:
          0.027053645 = score(doc=2701,freq=6.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.41895083 = fieldWeight in 2701, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.109375 = fieldNorm(doc=2701)
      0.375 = coord(3/8)
    
    Source
    Journal of information science. 6(1983), S.59-66
  17. Otterbacher, J.; Erkan, G.; Radev, D.R.: Biased LexRank : passage retrieval using random walks with question-based priors (2009) 0.05
    0.053733695 = product of:
      0.10746739 = sum of:
        0.050615493 = weight(_text_:retrieval in 2450) [ClassicSimilarity], result of:
          0.050615493 = score(doc=2450,freq=6.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.40520695 = fieldWeight in 2450, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2450)
        0.029945528 = weight(_text_:use in 2450) [ClassicSimilarity], result of:
          0.029945528 = score(doc=2450,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.23682132 = fieldWeight in 2450, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2450)
        0.013526822 = weight(_text_:of in 2450) [ClassicSimilarity], result of:
          0.013526822 = score(doc=2450,freq=6.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.20947541 = fieldWeight in 2450, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2450)
        0.013379549 = product of:
          0.026759097 = sum of:
            0.026759097 = weight(_text_:on in 2450) [ClassicSimilarity], result of:
              0.026759097 = score(doc=2450,freq=6.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.29462588 = fieldWeight in 2450, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2450)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user's natural language question. We then perform a random walk on the lexical similarity graph in order to recursively retrieve additional passages that are similar to other relevant passages. We present results on several benchmarks that show the applicability of our work to question answering and topic-focused text summarization.
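    Note on the random walk
    The abstract describes a random walk over a passage similarity graph that is biased towards passages relevant to the question. The following is a rough sketch of that general idea, not the authors' implementation: the similarity matrix, the relevance scores, the `bias` value and the function name are all assumptions made for illustration.

```python
def biased_random_walk(similarity, relevance, bias=0.15, iterations=100):
    """Power iteration for a query-biased random walk over passages.

    similarity: square matrix (list of lists) of pairwise passage similarity
    relevance:  non-negative relevance of each passage to the question
    bias:       probability of jumping to a passage according to relevance
    """
    n = len(relevance)
    total = sum(relevance)
    if total <= 0:
        raise ValueError("at least one passage needs positive relevance")
    prior = [r / total for r in relevance]
    # Column sums normalise the outgoing similarity mass of each passage.
    col_sums = [sum(similarity[z][v] for z in range(n)) for v in range(n)]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for u in range(n):
            walk = sum(
                similarity[u][v] / col_sums[v] * scores[v]
                for v in range(n)
                if col_sums[v] > 0
            )
            updated.append(bias * prior[u] + (1.0 - bias) * walk)
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy graph: passage 0 matches the question, passage 2 is only
    # connected to it indirectly through passage 1.
    sim = [[1.0, 0.5, 0.0],
           [0.5, 1.0, 0.2],
           [0.0, 0.2, 1.0]]
    print(biased_random_walk(sim, relevance=[1.0, 0.0, 0.0]))
```

    In this toy graph, passage 1 receives a noticeable score although its own relevance is zero, because it is similar to the relevant passage 0.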
  18. Sparck Jones, K.: ¬A statistical interpretation of term specificity and its application in retrieval (2004) 0.05
    0.05337083 = product of:
      0.10674166 = sum of:
        0.041327372 = weight(_text_:retrieval in 4420) [ClassicSimilarity], result of:
          0.041327372 = score(doc=4420,freq=4.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.33085006 = fieldWeight in 4420, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4420)
        0.029945528 = weight(_text_:use in 4420) [ClassicSimilarity], result of:
          0.029945528 = score(doc=4420,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.23682132 = fieldWeight in 4420, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4420)
        0.022089208 = weight(_text_:of in 4420) [ClassicSimilarity], result of:
          0.022089208 = score(doc=4420,freq=16.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.34207192 = fieldWeight in 4420, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4420)
        0.013379549 = product of:
          0.026759097 = sum of:
            0.026759097 = weight(_text_:on in 4420) [ClassicSimilarity], result of:
              0.026759097 = score(doc=4420,freq=6.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.29462588 = fieldWeight in 4420, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4420)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
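    The explanation tree above can be reproduced by hand. The sketch below assumes the usual ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the figures listed. The queryNorm, fieldNorm, and coord values are copied from the listing rather than derived, and the function and variable names are illustrative.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.041294612  # copied from the explanation; depends on the full query
FIELD_NORM = 0.0546875    # fieldNorm(doc=4420), copied from the explanation


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm=FIELD_NORM, query_norm=QUERY_NORM):
    """Score of one term clause: queryWeight * fieldWeight."""
    query_weight = idf(doc_freq) * query_norm
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


clauses = [
    term_score(4.0, 5836),          # retrieval
    term_score(2.0, 5623),          # use
    term_score(16.0, 25162),        # of
    term_score(6.0, 13325) * 0.5,   # on, inside a nested clause with coord(1/2)
]

# Four of the eight top-level query clauses matched, hence coord(4/8).
score = sum(clauses) * (4 / 8)
print(score)
```

    The result should agree with the listed 0.05337083 to about six decimal places. Small differences are expected because Lucene computes these values in single precision.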
    
    Abstract
    The exhaustivity of document descriptions and the specificity of index terms are usually regarded as independent. It is suggested that specificity should be interpreted statistically, as a function of term use rather than of term meaning. The effects on retrieval of variations in term specificity are examined, experiments with three test collections showing, in particular, that frequently-occurring terms are required for good overall performance. It is argued that terms should be weighted according to collection frequency, so that matches on less frequent, more specific, terms are of greater value than matches on frequent terms. Results for the test collections show that considerable improvements in performance are obtained with this very simple procedure.
    Source
    Journal of documentation. 60(2004) no.5, S.493-502
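    The weighting idea in this abstract can be shown on a toy collection. The sketch below is an illustration only: it uses the common form log(N / n) for the collection frequency weight, which is not necessarily the exact function in the paper, and the documents and query are invented.

```python
import math

# Hypothetical four-document collection, each document a set of index terms.
docs = {
    "d1": {"information", "retrieval", "system"},
    "d2": {"information", "system", "design"},
    "d3": {"information", "retrieval", "clustering"},
    "d4": {"information", "system", "evaluation"},
}
query = {"system", "clustering"}


def collection_weight(term):
    """Weight a term by how rare it is in the collection: log(N / n)."""
    n = sum(1 for terms in docs.values() if term in terms)
    return math.log(len(docs) / n) if n else 0.0


# Unweighted matching counts shared terms; weighted matching sums term weights.
unweighted = {name: len(terms & query) for name, terms in docs.items()}
weighted = {
    name: sum(collection_weight(t) for t in terms & query)
    for name, terms in docs.items()
}

ranking = sorted(weighted, key=weighted.get, reverse=True)
print(unweighted)
print(ranking)
```

    Every document matches exactly one query term, so simple match counting cannot separate them. With collection frequency weights, d3 ranks first because its match is on the rarer, more specific term "clustering".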
  19. Lalmas, M.: XML retrieval (2009) 0.05
    0.05263522 = product of:
      0.10527044 = sum of:
        0.059039105 = weight(_text_:retrieval in 4998) [ClassicSimilarity], result of:
          0.059039105 = score(doc=4998,freq=16.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.47264296 = fieldWeight in 4998, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4998)
        0.021389665 = weight(_text_:use in 4998) [ClassicSimilarity], result of:
          0.021389665 = score(doc=4998,freq=2.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.1691581 = fieldWeight in 4998, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4998)
        0.019324033 = weight(_text_:of in 4998) [ClassicSimilarity], result of:
          0.019324033 = score(doc=4998,freq=24.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.2992506 = fieldWeight in 4998, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4998)
        0.0055176322 = product of:
          0.0110352645 = sum of:
            0.0110352645 = weight(_text_:on in 4998) [ClassicSimilarity], result of:
              0.0110352645 = score(doc=4998,freq=2.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.121501654 = fieldWeight in 4998, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4998)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    Documents usually have both content and structure. The content refers to the text of the document, whereas the structure refers to how a document is logically organized. An increasingly common way to encode the structure is through the use of a mark-up language. Nowadays, the most widely used mark-up language for representing structure is the eXtensible Mark-up Language (XML). XML can be used to provide focused access to documents, i.e. returning XML elements, such as sections and paragraphs, instead of whole documents in response to a query. Such focused strategies are of particular benefit for information repositories containing long documents, or documents covering a wide variety of topics, where users are directed to the most relevant content within a document. The increased adoption of XML to represent document structure requires the development of tools to effectively access documents marked up in XML. This book provides a detailed description of query languages, indexing strategies, ranking algorithms, and presentation scenarios developed to access XML documents. Major advances in XML retrieval were seen from 2002 as a result of INEX, the Initiative for Evaluation of XML Retrieval. INEX, also described in this book, provided test sets for evaluating XML retrieval effectiveness. Many of the developments and results described in this book were investigated within INEX.
    Content
    Table of Contents: Introduction / Basic XML Concepts / Historical Perspectives / Query Languages / Indexing Strategies / Ranking Strategies / Presentation Strategies / Evaluating XML Retrieval Effectiveness / Conclusions
    LCSH
    Information retrieval
    Series
    Synthesis lectures on information concepts, retrieval & services; 7
    Subject
    Information retrieval
  20. Burgin, R.: ¬The retrieval effectiveness of 5 clustering algorithms as a function of indexing exhaustivity (1995) 0.05
    0.05191477 = product of:
      0.10382954 = sum of:
        0.059039105 = weight(_text_:retrieval in 3365) [ClassicSimilarity], result of:
          0.059039105 = score(doc=3365,freq=16.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.47264296 = fieldWeight in 3365, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
        0.023000197 = weight(_text_:of in 3365) [ClassicSimilarity], result of:
          0.023000197 = score(doc=3365,freq=34.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.35617945 = fieldWeight in 3365, product of:
              5.8309517 = tf(freq=34.0), with freq of:
                34.0 = termFreq=34.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
        0.007803111 = product of:
          0.015606222 = sum of:
            0.015606222 = weight(_text_:on in 3365) [ClassicSimilarity], result of:
              0.015606222 = score(doc=3365,freq=4.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.1718293 = fieldWeight in 3365, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3365)
          0.5 = coord(1/2)
        0.013987125 = product of:
          0.02797425 = sum of:
            0.02797425 = weight(_text_:22 in 3365) [ClassicSimilarity], result of:
              0.02797425 = score(doc=3365,freq=2.0), product of:
                0.1446067 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.041294612 = queryNorm
                0.19345059 = fieldWeight in 3365, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3365)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    The retrieval effectiveness of 5 hierarchical clustering methods (single link, complete link, group average, Ward's method, and weighted average) is examined as a function of indexing exhaustivity with 4 test collections (CR, Cranfield, Medlars, and Time). Evaluations of retrieval effectiveness, based on 3 measures of optimal retrieval performance, confirm earlier findings that the performance of a retrieval system based on single link clustering varies as a function of indexing exhaustivity but fail to find similar patterns for other clustering methods. The data also confirm earlier findings regarding the poor performance of single link clustering in a retrieval environment. The poor performance of single link clustering appears to derive from that method's tendency to produce a small number of large, ill-defined document clusters. By contrast, the data examined here found the retrieval performance of the other clustering methods to be generally comparable. The data presented also provide an opportunity to examine the theoretical limits of cluster-based retrieval and to compare these theoretical limits to the effectiveness of operational implementations. Performance standards of the 4 document collections examined were found to vary widely, and the effectiveness of operational implementations was found to be in the range defined as unacceptable. Further improvements in search strategies and document representations warrant investigation.
    Date
    22. 2.1996 11:20:06
    Source
    Journal of the American Society for Information Science. 46(1995) no.8, S.562-572

Types

  • a 309
  • m 12
  • el 8
  • s 5
  • r 4
  • x 3
  • p 2
  • d 1