Document (#37093)

Author
Biskri, I.
Rompré, L.
Title
Using association rules for query reformulation
Source
Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
Imprint
Hershey, PA : IGI Publishing
Year
2012
Pages
pp. 291-303
Abstract
In this paper the authors present research on combining two data mining methods: text classification and maximal association rules. Text classification has long been a focus of interest for many researchers. However, its results take the form of lists of words (classes) that people often do not know what to do with. The use of maximal association rules offers a number of advantages: (1) the detection of dependencies and correlations between the relevant units of information (words) of different classes, and (2) the extraction of hidden, often relevant, knowledge from a large volume of data. The authors show how this combination can improve the process of information retrieval.
Footnote
See: http://www.igi-global.com/book/next-generation-search-engines/64430
Theme
Data Mining
Retrieval algorithms

Similar documents (content)

  1. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.16
    0.15777646 = sum of:
      0.15777646 = product of:
        0.49305144 = sum of:
          0.0437479 = weight(abstract_txt:mining in 2765) [ClassicSimilarity], result of:
            0.0437479 = score(doc=2765,freq=1.0), product of:
              0.12953949 = queryWeight, product of:
                1.0985291 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019095177 = queryNorm
              0.33771864 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.013797497 = weight(abstract_txt:data in 2765) [ClassicSimilarity], result of:
            0.013797497 = score(doc=2765,freq=1.0), product of:
              0.07562074 = queryWeight, product of:
                1.1869869 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019095177 = queryNorm
              0.18245652 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.15346436 = weight(abstract_txt:detection in 2765) [ClassicSimilarity], result of:
            0.15346436 = score(doc=2765,freq=7.0), product of:
              0.15633987 = queryWeight, product of:
                1.2068279 = boost
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.019095177 = queryNorm
              0.9816073 = fieldWeight in 2765, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.13140059 = weight(abstract_txt:hidden in 2765) [ClassicSimilarity], result of:
            0.13140059 = score(doc=2765,freq=4.0), product of:
              0.16988 = queryWeight, product of:
                1.2580028 = boost
                7.071914 = idf(docFreq=101, maxDocs=44218)
                0.019095177 = queryNorm
              0.7734906 = fieldWeight in 2765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.071914 = idf(docFreq=101, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.023636464 = weight(abstract_txt:classification in 2765) [ClassicSimilarity], result of:
            0.023636464 = score(doc=2765,freq=1.0), product of:
              0.108266905 = queryWeight, product of:
                1.4202778 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.019095177 = queryNorm
              0.21831661 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.049136795 = weight(abstract_txt:text in 2765) [ClassicSimilarity], result of:
            0.049136795 = score(doc=2765,freq=4.0), product of:
              0.11109434 = queryWeight, product of:
                1.4387039 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.019095177 = queryNorm
              0.4422979 = fieldWeight in 2765, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.037007608 = weight(abstract_txt:relevant in 2765) [ClassicSimilarity], result of:
            0.037007608 = score(doc=2765,freq=1.0), product of:
              0.1459827 = queryWeight, product of:
                1.6492107 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.019095177 = queryNorm
              0.2535068 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
          0.040860277 = weight(abstract_txt:often in 2765) [ClassicSimilarity], result of:
            0.040860277 = score(doc=2765,freq=1.0), product of:
              0.15594624 = queryWeight, product of:
                1.7045624 = boost
                4.791134 = idf(docFreq=997, maxDocs=44218)
                0.019095177 = queryNorm
              0.26201513 = fieldWeight in 2765, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.791134 = idf(docFreq=997, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2765)
        0.32 = coord(8/25)
    
  2. Vlachidis, A.; Tudhope, D.: A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.16
    0.15692045 = sum of:
      0.15692045 = product of:
        0.56043017 = sum of:
          0.100780755 = weight(abstract_txt:extraction in 2895) [ClassicSimilarity], result of:
            0.100780755 = score(doc=2895,freq=4.0), product of:
              0.13021705 = queryWeight, product of:
                1.1013982 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.019095177 = queryNorm
              0.77394444 = fieldWeight in 2895, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.09374874 = weight(abstract_txt:detection in 2895) [ClassicSimilarity], result of:
            0.09374874 = score(doc=2895,freq=2.0), product of:
              0.15633987 = queryWeight, product of:
                1.2068279 = boost
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.019095177 = queryNorm
              0.59964705 = fieldWeight in 2895, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.028078169 = weight(abstract_txt:text in 2895) [ClassicSimilarity], result of:
            0.028078169 = score(doc=2895,freq=1.0), product of:
              0.11109434 = queryWeight, product of:
                1.4387039 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.019095177 = queryNorm
              0.25274166 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.042294405 = weight(abstract_txt:relevant in 2895) [ClassicSimilarity], result of:
            0.042294405 = score(doc=2895,freq=1.0), product of:
              0.1459827 = queryWeight, product of:
                1.6492107 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.019095177 = queryNorm
              0.28972206 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.08224963 = weight(abstract_txt:combination in 2895) [ClassicSimilarity], result of:
            0.08224963 = score(doc=2895,freq=1.0), product of:
              0.22744098 = queryWeight, product of:
                2.0585425 = boost
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.019095177 = queryNorm
              0.36163065 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.08391016 = weight(abstract_txt:classes in 2895) [ClassicSimilarity], result of:
            0.08391016 = score(doc=2895,freq=1.0), product of:
              0.23049197 = queryWeight, product of:
                2.0723035 = boost
                5.8247695 = idf(docFreq=354, maxDocs=44218)
                0.019095177 = queryNorm
              0.3640481 = fieldWeight in 2895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8247695 = idf(docFreq=354, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
          0.12936834 = weight(abstract_txt:rules in 2895) [ClassicSimilarity], result of:
            0.12936834 = score(doc=2895,freq=2.0), product of:
              0.27948073 = queryWeight, product of:
                2.7947762 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.019095177 = queryNorm
              0.46288824 = fieldWeight in 2895, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.0625 = fieldNorm(doc=2895)
        0.28 = coord(7/25)
    
  3. Principles of data mining and knowledge discovery (1998) 0.15
    0.15433611 = sum of:
      0.15433611 = product of:
        0.6430671 = sum of:
          0.084265165 = weight(abstract_txt:volume in 3822) [ClassicSimilarity], result of:
            0.084265165 = score(doc=3822,freq=1.0), product of:
              0.12633085 = queryWeight, product of:
                1.0848387 = boost
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.019095177 = queryNorm
              0.66701967 = fieldWeight in 3822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.109375 = fieldNorm(doc=3822)
          0.123737745 = weight(abstract_txt:mining in 3822) [ClassicSimilarity], result of:
            0.123737745 = score(doc=3822,freq=2.0), product of:
              0.12953949 = queryWeight, product of:
                1.0985291 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019095177 = queryNorm
              0.95521253 = fieldWeight in 3822, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.109375 = fieldNorm(doc=3822)
          0.027594995 = weight(abstract_txt:data in 3822) [ClassicSimilarity], result of:
            0.027594995 = score(doc=3822,freq=1.0), product of:
              0.07562074 = queryWeight, product of:
                1.1869869 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019095177 = queryNorm
              0.36491305 = fieldWeight in 3822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.109375 = fieldNorm(doc=3822)
          0.049136795 = weight(abstract_txt:text in 3822) [ClassicSimilarity], result of:
            0.049136795 = score(doc=3822,freq=1.0), product of:
              0.11109434 = queryWeight, product of:
                1.4387039 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.019095177 = queryNorm
              0.4422979 = fieldWeight in 3822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.109375 = fieldNorm(doc=3822)
          0.16008516 = weight(abstract_txt:rules in 3822) [ClassicSimilarity], result of:
            0.16008516 = score(doc=3822,freq=1.0), product of:
              0.27948073 = queryWeight, product of:
                2.7947762 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.019095177 = queryNorm
              0.572795 = fieldWeight in 3822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.109375 = fieldNorm(doc=3822)
          0.19824722 = weight(abstract_txt:association in 3822) [ClassicSimilarity], result of:
            0.19824722 = score(doc=3822,freq=1.0), product of:
              0.32229674 = queryWeight, product of:
                3.0012283 = boost
                5.6238427 = idf(docFreq=433, maxDocs=44218)
                0.019095177 = queryNorm
              0.6151078 = fieldWeight in 3822, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6238427 = idf(docFreq=433, maxDocs=44218)
                0.109375 = fieldNorm(doc=3822)
        0.24 = coord(6/25)
    
  4. Cui, H.; Heidorn, P.B.: The reusability of induced knowledge for the automatic semantic markup of taxonomic descriptions (2007) 0.15
    0.15093617 = sum of:
      0.15093617 = product of:
        0.6289007 = sum of:
          0.015768569 = weight(abstract_txt:data in 84) [ClassicSimilarity], result of:
            0.015768569 = score(doc=84,freq=1.0), product of:
              0.07562074 = queryWeight, product of:
                1.1869869 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019095177 = queryNorm
              0.20852174 = fieldWeight in 84, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.16743457 = weight(abstract_txt:induced in 84) [ClassicSimilarity], result of:
            0.16743457 = score(doc=84,freq=2.0), product of:
              0.23013863 = queryWeight, product of:
                1.4642162 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.019095177 = queryNorm
              0.7275379 = fieldWeight in 84, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.073871635 = weight(abstract_txt:authors in 84) [ClassicSimilarity], result of:
            0.073871635 = score(doc=84,freq=3.0), product of:
              0.14679936 = queryWeight, product of:
                1.6538173 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.019095177 = queryNorm
              0.50321496 = fieldWeight in 84, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.08224963 = weight(abstract_txt:combination in 84) [ClassicSimilarity], result of:
            0.08224963 = score(doc=84,freq=1.0), product of:
              0.22744098 = queryWeight, product of:
                2.0585425 = boost
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.019095177 = queryNorm
              0.36163065 = fieldWeight in 84, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.12936834 = weight(abstract_txt:rules in 84) [ClassicSimilarity], result of:
            0.12936834 = score(doc=84,freq=2.0), product of:
              0.27948073 = queryWeight, product of:
                2.7947762 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.019095177 = queryNorm
              0.46288824 = fieldWeight in 84, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
          0.16020796 = weight(abstract_txt:association in 84) [ClassicSimilarity], result of:
            0.16020796 = score(doc=84,freq=2.0), product of:
              0.32229674 = queryWeight, product of:
                3.0012283 = boost
                5.6238427 = idf(docFreq=433, maxDocs=44218)
                0.019095177 = queryNorm
              0.49708214 = fieldWeight in 84, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6238427 = idf(docFreq=433, maxDocs=44218)
                0.0625 = fieldNorm(doc=84)
        0.24 = coord(6/25)
    
  5. Dumais, S.T.: Latent semantic analysis (2003) 0.14
    0.14191595 = sum of:
      0.14191595 = product of:
        0.39421093 = sum of:
          0.024998799 = weight(abstract_txt:mining in 2462) [ClassicSimilarity], result of:
            0.024998799 = score(doc=2462,freq=1.0), product of:
              0.12953949 = queryWeight, product of:
                1.0985291 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019095177 = queryNorm
              0.19298208 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.024444524 = weight(abstract_txt:will in 2462) [ClassicSimilarity], result of:
            0.024444524 = score(doc=2462,freq=4.0), product of:
              0.10129014 = queryWeight, product of:
                1.3737543 = boost
                3.8613079 = idf(docFreq=2528, maxDocs=44218)
                0.019095177 = queryNorm
              0.24133174 = fieldWeight in 2462, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.8613079 = idf(docFreq=2528, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.034388594 = weight(abstract_txt:text in 2462) [ClassicSimilarity], result of:
            0.034388594 = score(doc=2462,freq=6.0), product of:
              0.11109434 = queryWeight, product of:
                1.4387039 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.019095177 = queryNorm
              0.30954406 = fieldWeight in 2462, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.03662803 = weight(abstract_txt:relevant in 2462) [ClassicSimilarity], result of:
            0.03662803 = score(doc=2462,freq=3.0), product of:
              0.1459827 = queryWeight, product of:
                1.6492107 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.019095177 = queryNorm
              0.25090665 = fieldWeight in 2462, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.030157967 = weight(abstract_txt:authors in 2462) [ClassicSimilarity], result of:
            0.030157967 = score(doc=2462,freq=2.0), product of:
              0.14679936 = queryWeight, product of:
                1.6538173 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.019095177 = queryNorm
              0.20543665 = fieldWeight in 2462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.033020087 = weight(abstract_txt:often in 2462) [ClassicSimilarity], result of:
            0.033020087 = score(doc=2462,freq=2.0), product of:
              0.15594624 = queryWeight, product of:
                1.7045624 = boost
                4.791134 = idf(docFreq=997, maxDocs=44218)
                0.019095177 = queryNorm
              0.2117402 = fieldWeight in 2462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.791134 = idf(docFreq=997, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.11280604 = weight(abstract_txt:words in 2462) [ClassicSimilarity], result of:
            0.11280604 = score(doc=2462,freq=12.0), product of:
              0.1946677 = queryWeight, product of:
                1.9044625 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.019095177 = queryNorm
              0.57948 = fieldWeight in 2462, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.041124813 = weight(abstract_txt:combination in 2462) [ClassicSimilarity], result of:
            0.041124813 = score(doc=2462,freq=1.0), product of:
              0.22744098 = queryWeight, product of:
                2.0585425 = boost
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.019095177 = queryNorm
              0.18081532 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.056642067 = weight(abstract_txt:association in 2462) [ClassicSimilarity], result of:
            0.056642067 = score(doc=2462,freq=1.0), product of:
              0.32229674 = queryWeight, product of:
                3.0012283 = boost
                5.6238427 = idf(docFreq=433, maxDocs=44218)
                0.019095177 = queryNorm
              0.17574508 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6238427 = idf(docFreq=433, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
        0.36 = coord(9/25)
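
How the similarity scores above are composed

The score trees above follow one pattern: each matching term contributes queryWeight * fieldWeight, the term contributions are summed, and the sum is multiplied by coord (matched terms / query terms). The Python sketch below recomputes two term scores and the final score for document 2765 from the numbers in the listing. It is an illustration, not the catalog's own code: the function names are hypothetical, and the tf and idf formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) are assumptions chosen because they reproduce the listed values.

```python
import math

MAX_DOCS = 44218          # maxDocs, as shown in the listing
QUERY_NORM = 0.019095177  # queryNorm, as shown in the listing


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed formula; it matches e.g. idf(docFreq=249) = 6.1754265.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    # Assumed formula; it matches e.g. tf(freq=7.0) = 2.6457512.
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, following the tree layout above.
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = tf(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


def document_score(term_scores: list[float], matched: int, total: int) -> float:
    # Final score = sum of term scores * coord(matched/total).
    return sum(term_scores) * (matched / total)


# Two terms of document 2765 (fieldNorm = 0.0546875), recomputed from inputs.
mining = term_score(freq=1.0, doc_freq=249, boost=1.0985291, field_norm=0.0546875)
detection = term_score(freq=7.0, doc_freq=135, boost=1.2068279, field_norm=0.0546875)

# All eight term scores listed for document 2765, copied from the listing.
listed_scores_2765 = [
    0.0437479, 0.013797497, 0.15346436, 0.13140059,
    0.023636464, 0.049136795, 0.037007608, 0.040860277,
]
total_2765 = document_score(listed_scores_2765, matched=8, total=25)

print(f"mining:    {mining:.7f}")
print(f"detection: {detection:.7f}")
print(f"document:  {total_2765:.7f}")
```

The listing shows single-precision values, so results recomputed in double precision may differ from it in the last digit or two.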