Document (#35928)

Author
Klein, S.T.
Title
On the use of negation in Boolean IR queries.
Source
Information processing and management. 45(2009) no.2, pp.298-311
Year
2009
Abstract
The negation operator, in various forms in which it appears in Information Retrieval queries, is investigated. The applications include negated terms in Boolean queries, more specifically in the presence of metrical constraints, but also negated characters used in the definition of extended keywords by means of regular expressions. Exact definitions are suggested and their usefulness is shown on several examples. Finally, some implementation issues are discussed, in particular as to the order in which the terms of long queries, with or without negated keywords, should be processed, and efficient heuristics for choosing a good order are suggested.
Theme
Retrieval algorithms
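
The abstract mentions choosing the order in which the terms of a query with negated keywords are processed. The record does not give the paper's heuristics, so the sketch below only illustrates one common textbook approach and should not be read as the author's method: intersect the posting sets of the required terms from shortest to longest, then subtract the posting sets of the negated terms. The toy index, the function name `evaluate`, and the document numbers are all hypothetical, and metrical (proximity) constraints are left out.

```python
def evaluate(index, required, negated):
    """Evaluate 'all required terms AND NOT any negated term' on a toy index.

    index maps each term to the set of document ids containing it
    (an assumption made for this illustration only).
    """
    if not required:
        raise ValueError("at least one non-negated term is needed")

    # Start with the rarest required term so intermediate results stay small.
    ordered = sorted(required, key=lambda term: len(index.get(term, ())))
    result = set(index.get(ordered[0], ()))

    for term in ordered[1:]:
        if not result:  # nothing left to narrow down
            break
        result.intersection_update(index.get(term, ()))

    # Negated terms only remove documents, so they are applied last.
    for term in negated:
        if not result:
            break
        result.difference_update(index.get(term, ()))

    return result


toy_index = {
    "boolean": {1, 2, 3, 5, 8},
    "negation": {2, 3, 8},
    "xml": {3},
}
hits = evaluate(toy_index, required=["boolean", "negation"], negated=["xml"])
print(sorted(hits))
```

Processing the shortest posting set first is a generic cost-saving choice; whether it is the best order in the presence of negation or metrical constraints is exactly the kind of question the paper is said to address.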

Similar documents (author)

  1. Klein, W.: Organisation des Wissens durch Sprache : Konsequenzen für die maschinelle Sprachanalyse (1977) 4.96
    4.9598045 = sum of:
      4.9598045 = weight(author_txt:klein in 1748) [ClassicSimilarity], result of:
        4.9598045 = fieldWeight in 1748, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.935687 = idf(docFreq=42, maxDocs=44218)
          0.625 = fieldNorm(doc=1748)
    
  2. Klein, H.: GENIOS jetzt mit Thesaurus-Suche (1993) 4.96
    4.9598045 = sum of:
      4.9598045 = weight(author_txt:klein in 7537) [ClassicSimilarity], result of:
        4.9598045 = fieldWeight in 7537, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.935687 = idf(docFreq=42, maxDocs=44218)
          0.625 = fieldNorm(doc=7537)
    
  3. Klein, R.D.: The problem of cataloguing world literature using the Nippon Decimal Classification (1994) 4.96
    4.9598045 = sum of:
      4.9598045 = weight(author_txt:klein in 867) [ClassicSimilarity], result of:
        4.9598045 = fieldWeight in 867, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.935687 = idf(docFreq=42, maxDocs=44218)
          0.625 = fieldNorm(doc=867)
    
  4. Klein, G.M.: Is there a standard default keyword operator? : a bibliometric analysis of processing options chosen by libraries to execute keyword searches in online public access catalogs (1994) 4.96
    4.9598045 = sum of:
      4.9598045 = weight(author_txt:klein in 2200) [ClassicSimilarity], result of:
        4.9598045 = fieldWeight in 2200, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.935687 = idf(docFreq=42, maxDocs=44218)
          0.625 = fieldNorm(doc=2200)
    
  5. Klein, J.T.: Interdisciplinary needs : the current context (1996) 4.96
    4.9598045 = sum of:
      4.9598045 = weight(author_txt:klein in 7176) [ClassicSimilarity], result of:
        4.9598045 = fieldWeight in 7176, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.935687 = idf(docFreq=42, maxDocs=44218)
          0.625 = fieldNorm(doc=7176)
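
The author scores above can be reproduced by hand. The sketch below assumes the usual ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes freq, docFreq, maxDocs and fieldNorm directly from the listing. The function names are illustrative, and the last digits may differ slightly because the listing appears to use single-precision arithmetic.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as shown in the explain listing."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the author_txt:klein entries above.
author_score = field_weight(freq=1.0, doc_freq=42, max_docs=44218, field_norm=0.625)
print(round(author_score, 4))
```

Because every author match has the same freq, docFreq and fieldNorm, all five entries receive the same score.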
    

Similar documents (content)

  1. Nakkouzi, Z.S.; Eastman, C.M.: Query formulation for handling negation in information retrieval systems (1990) 0.38
    0.38307276 = sum of:
      0.38307276 = product of:
        1.5961366 = sum of:
          0.029765345 = weight(abstract_txt:terms in 3531) [ClassicSimilarity], result of:
            0.029765345 = score(doc=3531,freq=1.0), product of:
              0.07851323 = queryWeight, product of:
                1.3911831 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013956023 = queryNorm
              0.37911248 = fieldWeight in 3531, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=3531)
          0.12550825 = weight(abstract_txt:operator in 3531) [ClassicSimilarity], result of:
            0.12550825 = score(doc=3531,freq=1.0), product of:
              0.16264488 = queryWeight, product of:
                1.4158528 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.013956023 = queryNorm
              0.77167046 = fieldWeight in 3531, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.09375 = fieldNorm(doc=3531)
          0.10989849 = weight(abstract_txt:boolean in 3531) [ClassicSimilarity], result of:
            0.10989849 = score(doc=3531,freq=1.0), product of:
              0.1875556 = queryWeight, product of:
                2.1501954 = boost
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.013956023 = queryNorm
              0.58595157 = fieldWeight in 3531, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.09375 = fieldNorm(doc=3531)
          0.5227207 = weight(abstract_txt:negation in 3531) [ClassicSimilarity], result of:
            0.5227207 = score(doc=3531,freq=3.0), product of:
              0.36779708 = queryWeight, product of:
                3.0110435 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.013956023 = queryNorm
              1.4212204 = fieldWeight in 3531, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.09375 = fieldNorm(doc=3531)
          0.20763618 = weight(abstract_txt:queries in 3531) [ClassicSimilarity], result of:
            0.20763618 = score(doc=3531,freq=3.0), product of:
              0.25040355 = queryWeight, product of:
                3.513566 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.013956023 = queryNorm
              0.82920617 = fieldWeight in 3531, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.09375 = fieldNorm(doc=3531)
          0.6006077 = weight(abstract_txt:negated in 3531) [ClassicSimilarity], result of:
            0.6006077 = score(doc=3531,freq=1.0), product of:
              0.6661314 = queryWeight, product of:
                4.9629335 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.013956023 = queryNorm
              0.9016355 = fieldWeight in 3531, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.09375 = fieldNorm(doc=3531)
        0.24 = coord(6/25)
    
  2. McQuire, A.R.; Eastman, C.M.: The ambiguity of negation in natural language queries to information retrieval systems (1998) 0.19
    0.18599539 = sum of:
      0.18599539 = product of:
        1.1624712 = sum of:
          0.0074457936 = weight(abstract_txt:which in 1147) [ClassicSimilarity], result of:
            0.0074457936 = score(doc=1147,freq=1.0), product of:
              0.040844824 = queryWeight, product of:
                1.0034169 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.013956023 = queryNorm
              0.18229467 = fieldWeight in 1147, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0625 = fieldNorm(doc=1147)
          0.34848046 = weight(abstract_txt:negation in 1147) [ClassicSimilarity], result of:
            0.34848046 = score(doc=1147,freq=3.0), product of:
              0.36779708 = queryWeight, product of:
                3.0110435 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.013956023 = queryNorm
              0.94748026 = fieldWeight in 1147, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0625 = fieldNorm(doc=1147)
          0.11302283 = weight(abstract_txt:queries in 1147) [ClassicSimilarity], result of:
            0.11302283 = score(doc=1147,freq=2.0), product of:
              0.25040355 = queryWeight, product of:
                3.513566 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.013956023 = queryNorm
              0.4513627 = fieldWeight in 1147, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.0625 = fieldNorm(doc=1147)
          0.6935221 = weight(abstract_txt:negated in 1147) [ClassicSimilarity], result of:
            0.6935221 = score(doc=1147,freq=3.0), product of:
              0.6661314 = queryWeight, product of:
                4.9629335 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.013956023 = queryNorm
              1.0411191 = fieldWeight in 1147, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.0625 = fieldNorm(doc=1147)
        0.16 = coord(4/25)
    
  3. Klein, S.T.: Processing queries with metrical constraints in XML-based IR systems (2008) 0.15
    0.15291215 = sum of:
      0.15291215 = product of:
        0.54611486 = sum of:
          0.044219926 = weight(abstract_txt:investigated in 1342) [ClassicSimilarity], result of:
            0.044219926 = score(doc=1342,freq=1.0), product of:
              0.081134245 = queryWeight, product of:
                5.813565 = idf(docFreq=358, maxDocs=44218)
                0.013956023 = queryNorm
              0.5450217 = fieldWeight in 1342, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.813565 = idf(docFreq=358, maxDocs=44218)
                0.09375 = fieldNorm(doc=1342)
          0.01116869 = weight(abstract_txt:which in 1342) [ClassicSimilarity], result of:
            0.01116869 = score(doc=1342,freq=1.0), product of:
              0.040844824 = queryWeight, product of:
                1.0034169 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.013956023 = queryNorm
              0.273442 = fieldWeight in 1342, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.09375 = fieldNorm(doc=1342)
          0.0500487 = weight(abstract_txt:usefulness in 1342) [ClassicSimilarity], result of:
            0.0500487 = score(doc=1342,freq=1.0), product of:
              0.088115856 = queryWeight, product of:
                1.0421373 = boost
                6.0585327 = idf(docFreq=280, maxDocs=44218)
                0.013956023 = queryNorm
              0.56798744 = fieldWeight in 1342, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0585327 = idf(docFreq=280, maxDocs=44218)
                0.09375 = fieldNorm(doc=1342)
          0.07004625 = weight(abstract_txt:constraints in 1342) [ClassicSimilarity], result of:
            0.07004625 = score(doc=1342,freq=1.0), product of:
              0.1102509 = queryWeight, product of:
                1.1657058 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.013956023 = queryNorm
              0.63533497 = fieldWeight in 1342, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.09375 = fieldNorm(doc=1342)
          0.042094555 = weight(abstract_txt:terms in 1342) [ClassicSimilarity], result of:
            0.042094555 = score(doc=1342,freq=2.0), product of:
              0.07851323 = queryWeight, product of:
                1.3911831 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013956023 = queryNorm
              0.53614604 = fieldWeight in 1342, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=1342)
          0.20865794 = weight(abstract_txt:metrical in 1342) [ClassicSimilarity], result of:
            0.20865794 = score(doc=1342,freq=1.0), product of:
              0.22825246 = queryWeight, product of:
                1.6772803 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.013956023 = queryNorm
              0.9141542 = fieldWeight in 1342, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.09375 = fieldNorm(doc=1342)
          0.119878806 = weight(abstract_txt:queries in 1342) [ClassicSimilarity], result of:
            0.119878806 = score(doc=1342,freq=1.0), product of:
              0.25040355 = queryWeight, product of:
                3.513566 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.013956023 = queryNorm
              0.47874242 = fieldWeight in 1342, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.09375 = fieldNorm(doc=1342)
        0.28 = coord(7/25)
    
  4. Kim, Y.W.; Kim, J.H.: A model of knowledge based information retrieval with hierarchical concept graph (1990) 0.15
    0.14727472 = sum of:
      0.14727472 = product of:
        0.7363736 = sum of:
          0.009307242 = weight(abstract_txt:which in 3909) [ClassicSimilarity], result of:
            0.009307242 = score(doc=3909,freq=1.0), product of:
              0.040844824 = queryWeight, product of:
                1.0034169 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.013956023 = queryNorm
              0.22786833 = fieldWeight in 3909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.078125 = fieldNorm(doc=3909)
          0.035078797 = weight(abstract_txt:terms in 3909) [ClassicSimilarity], result of:
            0.035078797 = score(doc=3909,freq=2.0), product of:
              0.07851323 = queryWeight, product of:
                1.3911831 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013956023 = queryNorm
              0.44678837 = fieldWeight in 3909, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=3909)
          0.091582075 = weight(abstract_txt:boolean in 3909) [ClassicSimilarity], result of:
            0.091582075 = score(doc=3909,freq=1.0), product of:
              0.1875556 = queryWeight, product of:
                2.1501954 = boost
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.013956023 = queryNorm
              0.48829293 = fieldWeight in 3909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.078125 = fieldNorm(doc=3909)
          0.09989901 = weight(abstract_txt:queries in 3909) [ClassicSimilarity], result of:
            0.09989901 = score(doc=3909,freq=1.0), product of:
              0.25040355 = queryWeight, product of:
                3.513566 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.013956023 = queryNorm
              0.39895204 = fieldWeight in 3909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.078125 = fieldNorm(doc=3909)
          0.50050646 = weight(abstract_txt:negated in 3909) [ClassicSimilarity], result of:
            0.50050646 = score(doc=3909,freq=1.0), product of:
              0.6661314 = queryWeight, product of:
                4.9629335 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.013956023 = queryNorm
              0.751363 = fieldWeight in 3909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.078125 = fieldNorm(doc=3909)
        0.2 = coord(5/25)
    
  5. Spink, A.; Wolfram, D.; Jansen, B.J.; Saracevic, T.: Searching the Web : the public and their queries (2001) 0.10
    0.10157062 = sum of:
      0.10157062 = product of:
        0.42321092 = sum of:
          0.030880036 = weight(abstract_txt:appears in 6980) [ClassicSimilarity], result of:
            0.030880036 = score(doc=6980,freq=1.0), product of:
              0.10137497 = queryWeight, product of:
                1.1177979 = boost
                6.49839 = idf(docFreq=180, maxDocs=44218)
                0.013956023 = queryNorm
              0.30461204 = fieldWeight in 6980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.49839 = idf(docFreq=180, maxDocs=44218)
                0.046875 = fieldNorm(doc=6980)
          0.03327867 = weight(abstract_txt:terms in 6980) [ClassicSimilarity], result of:
            0.03327867 = score(doc=6980,freq=5.0), product of:
              0.07851323 = queryWeight, product of:
                1.3911831 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.013956023 = queryNorm
              0.42386067 = fieldWeight in 6980, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.046875 = fieldNorm(doc=6980)
          0.062754124 = weight(abstract_txt:operator in 6980) [ClassicSimilarity], result of:
            0.062754124 = score(doc=6980,freq=1.0), product of:
              0.16264488 = queryWeight, product of:
                1.4158528 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.013956023 = queryNorm
              0.38583523 = fieldWeight in 6980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.046875 = fieldNorm(doc=6980)
          0.019791588 = weight(abstract_txt:order in 6980) [ClassicSimilarity], result of:
            0.019791588 = score(doc=6980,freq=1.0), product of:
              0.09494585 = queryWeight, product of:
                1.5298572 = boost
                4.446962 = idf(docFreq=1407, maxDocs=44218)
                0.013956023 = queryNorm
              0.20845133 = fieldWeight in 6980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.446962 = idf(docFreq=1407, maxDocs=44218)
                0.046875 = fieldNorm(doc=6980)
          0.077709965 = weight(abstract_txt:boolean in 6980) [ClassicSimilarity], result of:
            0.077709965 = score(doc=6980,freq=2.0), product of:
              0.1875556 = queryWeight, product of:
                2.1501954 = boost
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.013956023 = queryNorm
              0.4143303 = fieldWeight in 6980, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.046875 = fieldNorm(doc=6980)
          0.19879653 = weight(abstract_txt:queries in 6980) [ClassicSimilarity], result of:
            0.19879653 = score(doc=6980,freq=11.0), product of:
              0.25040355 = queryWeight, product of:
                3.513566 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.013956023 = queryNorm
              0.79390454 = fieldWeight in 6980, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.046875 = fieldNorm(doc=6980)
        0.24 = coord(6/25)
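
The content scores follow the same pattern with two extra factors. As a rough check, the sketch below recomputes the score of the first content match (doc 3531), assuming each term contributes queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = sqrt(freq) * idf * fieldNorm, and that the sum is scaled by coord (6 of 25 query terms matched). The boosts, frequencies, docFreq values, queryNorm and fieldNorm are copied from the listing; the helper names are illustrative only.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.013956023  # copied from the listing


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """Contribution of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (freq, docFreq, boost) for the six terms matched in doc 3531.
terms_3531 = [
    (1.0, 2106, 1.3911831),  # terms
    (1.0, 31, 1.4158528),    # operator
    (1.0, 231, 2.1501954),   # boolean
    (3.0, 18, 3.0110435),    # negation
    (3.0, 727, 3.513566),    # queries
    (1.0, 7, 4.9629335),     # negated
]

raw_sum = sum(term_score(f, df, b, field_norm=0.09375) for f, df, b in terms_3531)
score_3531 = raw_sum * (6 / 25)  # coord(6/25)
print(round(score_3531, 3))
```

The rare terms "negated" and "negation" dominate the sum because their idf enters both queryWeight and fieldWeight.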