Document (#34687)

Author
Stenmark, D.
Title
Identifying clusters of user behavior in intranet search engine log files
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.14, pp.2232-2243
Year
2008
Abstract
When studying how ordinary Web users interact with Web search engines, researchers tend to either treat the users as a homogeneous group or group them according to search experience. Neither approach is sufficient, we argue, to capture the variety in behavior that is known to exist among searchers. By applying an automatic clustering technique based on self-organizing maps to search engine log files from a corporate intranet, we show that users can be usefully separated into distinguishable segments based on their actual search behavior. Based on these segments, future tools for information seeking and retrieval can be targeted to specific segments rather than just made to fit the average user. The exact number of clusters, and to some extent their characteristics, can be expected to vary between intranets, but our results indicate that some more generic groups may exist. In our study, a large group of users appeared to be fact seekers who would benefit from higher precision, a smaller group of users were more holistically oriented and would likely benefit from higher recall, and a third category of users seemed to constitute the knowledgeable users. These three groups may raise different design implications for search-tool developers.

Similar documents (content)

  1. Zhang, Y.; Broussard, R.; Ke, W.; Gong, X.: Evaluation of a scatter/gather interface for supporting distinct health information search tasks (2014) 0.17
    0.17361878 = sum of:
      0.17361878 = product of:
        0.62006706 = sum of:
          0.015372263 = weight(abstract_txt:based in 1261) [ClassicSimilarity], result of:
            0.015372263 = score(doc=1261,freq=1.0), product of:
              0.07715238 = queryWeight, product of:
                1.252387 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.019324234 = queryNorm
              0.19924548 = fieldWeight in 1261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=1261)
          0.059330847 = weight(abstract_txt:groups in 1261) [ClassicSimilarity], result of:
            0.059330847 = score(doc=1261,freq=2.0), product of:
              0.1316247 = queryWeight, product of:
                1.3356324 = boost
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.019324234 = queryNorm
              0.4507577 = fieldWeight in 1261, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.0625 = fieldNorm(doc=1261)
          0.08747525 = weight(abstract_txt:clusters in 1261) [ClassicSimilarity], result of:
            0.08747525 = score(doc=1261,freq=1.0), product of:
              0.2148245 = queryWeight, product of:
                1.7063187 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.019324234 = queryNorm
              0.407194 = fieldWeight in 1261, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=1261)
          0.09485085 = weight(abstract_txt:behavior in 1261) [ClassicSimilarity], result of:
            0.09485085 = score(doc=1261,freq=2.0), product of:
              0.20600367 = queryWeight, product of:
                2.046451 = boost
                5.2092032 = idf(docFreq=656, maxDocs=44218)
                0.019324234 = queryNorm
              0.46043286 = fieldWeight in 1261, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.2092032 = idf(docFreq=656, maxDocs=44218)
                0.0625 = fieldNorm(doc=1261)
          0.12752546 = weight(abstract_txt:group in 1261) [ClassicSimilarity], result of:
            0.12752546 = score(doc=1261,freq=3.0), product of:
              0.24128364 = queryWeight, product of:
                2.5573914 = boost
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.019324234 = queryNorm
              0.5285293 = fieldWeight in 1261, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.0625 = fieldNorm(doc=1261)
          0.12289698 = weight(abstract_txt:search in 1261) [ClassicSimilarity], result of:
            0.12289698 = score(doc=1261,freq=7.0), product of:
              0.20317124 = queryWeight, product of:
                2.8741539 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.019324234 = queryNorm
              0.60489357 = fieldWeight in 1261, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=1261)
          0.1126154 = weight(abstract_txt:users in 1261) [ClassicSimilarity], result of:
            0.1126154 = score(doc=1261,freq=5.0), product of:
              0.22573118 = queryWeight, product of:
                3.2722619 = boost
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.019324234 = queryNorm
              0.49889165 = fieldWeight in 1261, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.0625 = fieldNorm(doc=1261)
        0.28 = coord(7/25)
    
  2. Chen, H.-M.; Cooper, M.D.: Stochastic modeling of usage patterns in a Web-based information system (2002) 0.16
    0.15647127 = sum of:
      0.15647127 = product of:
        0.48897272 = sum of:
          0.010625453 = weight(abstract_txt:from in 577) [ClassicSimilarity], result of:
            0.010625453 = score(doc=577,freq=2.0), product of:
              0.057992462 = queryWeight, product of:
                1.0857996 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.019324234 = queryNorm
              0.18322127 = fieldWeight in 577, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.046875 = fieldNorm(doc=577)
          0.0694226 = weight(abstract_txt:knowledgeable in 577) [ClassicSimilarity], result of:
            0.0694226 = score(doc=577,freq=1.0), product of:
              0.17705578 = queryWeight, product of:
                1.0953636 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.019324234 = queryNorm
              0.39209452 = fieldWeight in 577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.046875 = fieldNorm(doc=577)
          0.023058396 = weight(abstract_txt:based in 577) [ClassicSimilarity], result of:
            0.023058396 = score(doc=577,freq=4.0), product of:
              0.07715238 = queryWeight, product of:
                1.252387 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.019324234 = queryNorm
              0.29886824 = fieldWeight in 577, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.046875 = fieldNorm(doc=577)
          0.07035773 = weight(abstract_txt:groups in 577) [ClassicSimilarity], result of:
            0.07035773 = score(doc=577,freq=5.0), product of:
              0.1316247 = queryWeight, product of:
                1.3356324 = boost
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.019324234 = queryNorm
              0.5345329 = fieldWeight in 577, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.046875 = fieldNorm(doc=577)
          0.050302263 = weight(abstract_txt:behavior in 577) [ClassicSimilarity], result of:
            0.050302263 = score(doc=577,freq=1.0), product of:
              0.20600367 = queryWeight, product of:
                2.046451 = boost
                5.2092032 = idf(docFreq=656, maxDocs=44218)
                0.019324234 = queryNorm
              0.2441814 = fieldWeight in 577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2092032 = idf(docFreq=656, maxDocs=44218)
                0.046875 = fieldNorm(doc=577)
          0.1352612 = weight(abstract_txt:group in 577) [ClassicSimilarity], result of:
            0.1352612 = score(doc=577,freq=6.0), product of:
              0.24128364 = queryWeight, product of:
                2.5573914 = boost
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.019324234 = queryNorm
              0.56058997 = fieldWeight in 577, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.046875 = fieldNorm(doc=577)
          0.092172734 = weight(abstract_txt:search in 577) [ClassicSimilarity], result of:
            0.092172734 = score(doc=577,freq=7.0), product of:
              0.20317124 = queryWeight, product of:
                2.8741539 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.019324234 = queryNorm
              0.45367017 = fieldWeight in 577, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.046875 = fieldNorm(doc=577)
          0.03777235 = weight(abstract_txt:users in 577) [ClassicSimilarity], result of:
            0.03777235 = score(doc=577,freq=1.0), product of:
              0.22573118 = queryWeight, product of:
                3.2722619 = boost
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.019324234 = queryNorm
              0.16733333 = fieldWeight in 577, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.046875 = fieldNorm(doc=577)
        0.32 = coord(8/25)
    
  3. Chen, H.-M.; Cooper, M.D.: Using clustering techniques to detect usage patterns in a Web-based information system (2001) 0.14
    0.13926409 = sum of:
      0.13926409 = product of:
        0.43520027 = sum of:
          0.010017772 = weight(abstract_txt:from in 6526) [ClassicSimilarity], result of:
            0.010017772 = score(doc=6526,freq=1.0), product of:
              0.057992462 = queryWeight, product of:
                1.0857996 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.019324234 = queryNorm
              0.17274266 = fieldWeight in 6526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=6526)
          0.092563465 = weight(abstract_txt:knowledgeable in 6526) [ClassicSimilarity], result of:
            0.092563465 = score(doc=6526,freq=1.0), product of:
              0.17705578 = queryWeight, product of:
                1.0953636 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.019324234 = queryNorm
              0.5227927 = fieldWeight in 6526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.0625 = fieldNorm(doc=6526)
          0.015372263 = weight(abstract_txt:based in 6526) [ClassicSimilarity], result of:
            0.015372263 = score(doc=6526,freq=1.0), product of:
              0.07715238 = queryWeight, product of:
                1.252387 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.019324234 = queryNorm
              0.19924548 = fieldWeight in 6526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=6526)
          0.059330847 = weight(abstract_txt:groups in 6526) [ClassicSimilarity], result of:
            0.059330847 = score(doc=6526,freq=2.0), product of:
              0.1316247 = queryWeight, product of:
                1.3356324 = boost
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.019324234 = queryNorm
              0.4507577 = fieldWeight in 6526, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.0625 = fieldNorm(doc=6526)
          0.08747525 = weight(abstract_txt:clusters in 6526) [ClassicSimilarity], result of:
            0.08747525 = score(doc=6526,freq=1.0), product of:
              0.2148245 = queryWeight, product of:
                1.7063187 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.019324234 = queryNorm
              0.407194 = fieldWeight in 6526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=6526)
          0.07362686 = weight(abstract_txt:group in 6526) [ClassicSimilarity], result of:
            0.07362686 = score(doc=6526,freq=1.0), product of:
              0.24128364 = queryWeight, product of:
                2.5573914 = boost
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.019324234 = queryNorm
              0.30514652 = fieldWeight in 6526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.0625 = fieldNorm(doc=6526)
          0.04645069 = weight(abstract_txt:search in 6526) [ClassicSimilarity], result of:
            0.04645069 = score(doc=6526,freq=1.0), product of:
              0.20317124 = queryWeight, product of:
                2.8741539 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.019324234 = queryNorm
              0.22862828 = fieldWeight in 6526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=6526)
          0.05036314 = weight(abstract_txt:users in 6526) [ClassicSimilarity], result of:
            0.05036314 = score(doc=6526,freq=1.0), product of:
              0.22573118 = queryWeight, product of:
                3.2722619 = boost
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.019324234 = queryNorm
              0.22311112 = fieldWeight in 6526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.0625 = fieldNorm(doc=6526)
        0.32 = coord(8/25)
    
  4. Shneiderman, B.; Byrd, D.; Croft, W.B.: Clarifying search : a user-interface framework for text searches (1997) 0.13
    0.12560445 = sum of:
      0.12560445 = product of:
        0.5233519 = sum of:
          0.017531103 = weight(abstract_txt:from in 1258) [ClassicSimilarity], result of:
            0.017531103 = score(doc=1258,freq=1.0), product of:
              0.057992462 = queryWeight, product of:
                1.0857996 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.019324234 = queryNorm
              0.30229968 = fieldWeight in 1258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.109375 = fieldNorm(doc=1258)
          0.08636036 = weight(abstract_txt:higher in 1258) [ClassicSimilarity], result of:
            0.08636036 = score(doc=1258,freq=1.0), product of:
              0.14667112 = queryWeight, product of:
                1.4099073 = boost
                5.3833394 = idf(docFreq=551, maxDocs=44218)
                0.019324234 = queryNorm
              0.58880275 = fieldWeight in 1258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3833394 = idf(docFreq=551, maxDocs=44218)
                0.109375 = fieldNorm(doc=1258)
          0.121189244 = weight(abstract_txt:benefit in 1258) [ClassicSimilarity], result of:
            0.121189244 = score(doc=1258,freq=1.0), product of:
              0.18384185 = queryWeight, product of:
                1.578485 = boost
                6.027006 = idf(docFreq=289, maxDocs=44218)
                0.019324234 = queryNorm
              0.65920377 = fieldWeight in 1258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.027006 = idf(docFreq=289, maxDocs=44218)
                0.109375 = fieldNorm(doc=1258)
          0.12884702 = weight(abstract_txt:group in 1258) [ClassicSimilarity], result of:
            0.12884702 = score(doc=1258,freq=1.0), product of:
              0.24128364 = queryWeight, product of:
                2.5573914 = boost
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.019324234 = queryNorm
              0.5340064 = fieldWeight in 1258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.109375 = fieldNorm(doc=1258)
          0.08128871 = weight(abstract_txt:search in 1258) [ClassicSimilarity], result of:
            0.08128871 = score(doc=1258,freq=1.0), product of:
              0.20317124 = queryWeight, product of:
                2.8741539 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.019324234 = queryNorm
              0.4000995 = fieldWeight in 1258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.109375 = fieldNorm(doc=1258)
          0.08813549 = weight(abstract_txt:users in 1258) [ClassicSimilarity], result of:
            0.08813549 = score(doc=1258,freq=1.0), product of:
              0.22573118 = queryWeight, product of:
                3.2722619 = boost
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.019324234 = queryNorm
              0.39044446 = fieldWeight in 1258, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.569778 = idf(docFreq=3384, maxDocs=44218)
                0.109375 = fieldNorm(doc=1258)
        0.24 = coord(6/25)
    
  5. Hyldegård, J.: Beyond the search process : exploring group members' information behavior in context (2009) 0.12
    0.12480331 = sum of:
      0.12480331 = product of:
        0.5200138 = sum of:
          0.01416727 = weight(abstract_txt:from in 2458) [ClassicSimilarity], result of:
            0.01416727 = score(doc=2458,freq=2.0), product of:
              0.057992462 = queryWeight, product of:
                1.0857996 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.019324234 = queryNorm
              0.24429502 = fieldWeight in 2458, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=2458)
          0.021739662 = weight(abstract_txt:based in 2458) [ClassicSimilarity], result of:
            0.021739662 = score(doc=2458,freq=2.0), product of:
              0.07715238 = queryWeight, product of:
                1.252387 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.019324234 = queryNorm
              0.28177565 = fieldWeight in 2458, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=2458)
          0.059330847 = weight(abstract_txt:groups in 2458) [ClassicSimilarity], result of:
            0.059330847 = score(doc=2458,freq=2.0), product of:
              0.1316247 = queryWeight, product of:
                1.3356324 = boost
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.019324234 = queryNorm
              0.4507577 = fieldWeight in 2458, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.0997415 = idf(docFreq=732, maxDocs=44218)
                0.0625 = fieldNorm(doc=2458)
          0.16428651 = weight(abstract_txt:behavior in 2458) [ClassicSimilarity], result of:
            0.16428651 = score(doc=2458,freq=6.0), product of:
              0.20600367 = queryWeight, product of:
                2.046451 = boost
                5.2092032 = idf(docFreq=656, maxDocs=44218)
                0.019324234 = queryNorm
              0.79749316 = fieldWeight in 2458, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.2092032 = idf(docFreq=656, maxDocs=44218)
                0.0625 = fieldNorm(doc=2458)
          0.19479835 = weight(abstract_txt:group in 2458) [ClassicSimilarity], result of:
            0.19479835 = score(doc=2458,freq=7.0), product of:
              0.24128364 = queryWeight, product of:
                2.5573914 = boost
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.019324234 = queryNorm
              0.80734175 = fieldWeight in 2458, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.8823442 = idf(docFreq=910, maxDocs=44218)
                0.0625 = fieldNorm(doc=2458)
          0.065691195 = weight(abstract_txt:search in 2458) [ClassicSimilarity], result of:
            0.065691195 = score(doc=2458,freq=2.0), product of:
              0.20317124 = queryWeight, product of:
                2.8741539 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.019324234 = queryNorm
              0.3233292 = fieldWeight in 2458, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=2458)
        0.24 = coord(6/25)
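
The score trees above all follow the same arithmetic. The sketch below recomputes the score for result 1 (doc 1261) from the numbers printed in its tree. Each term score is queryWeight (boost * idf * queryNorm) multiplied by fieldWeight (tf * idf * fieldNorm), where tf is the square root of the term frequency, and the sum over matching terms is scaled by coord (matched terms / query terms). The function and variable names are illustrative and not part of Lucene. The idf helper is an assumption: the form 1 + ln(maxDocs / (docFreq + 1)) appears consistent with the idf values listed, but the record itself does not state the formula.

```python
import math

# Constants copied from the explain tree for result 1 (doc 1261).
QUERY_NORM = 0.019324234
FIELD_NORM_1261 = 0.0625
MAX_DOCS = 44218

# (term, boost, idf, term frequency) as printed in the tree for doc 1261.
TERMS_1261 = [
    ("based",    1.252387,  3.1879277, 1.0),
    ("groups",   1.3356324, 5.0997415, 2.0),
    ("clusters", 1.7063187, 6.515104,  1.0),
    ("behavior", 2.046451,  5.2092032, 2.0),
    ("group",    2.5573914, 4.8823442, 3.0),
    ("search",   2.8741539, 3.6580524, 7.0),
    ("users",    3.2722619, 3.569778,  5.0),
]


def classic_idf(doc_freq, max_docs):
    """Assumed idf form; it appears to match the values shown in the trees."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, idf, freq, field_norm, query_norm):
    """Score of one term: queryWeight * fieldWeight, as laid out in the tree."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(terms, field_norm, query_norm, matched, total):
    """Sum of term scores, scaled by coord = matched / total query terms."""
    raw = sum(term_score(boost, idf, freq, field_norm, query_norm)
              for _, boost, idf, freq in terms)
    return raw * (matched / total)


if __name__ == "__main__":
    score = document_score(TERMS_1261, FIELD_NORM_1261, QUERY_NORM, 7, 25)
    print(f"doc 1261: {score:.8f}")
```

Run as a script, this should print a value close to the 0.17361878 reported for result 1. Small differences in the last digits are expected, because the tree values look like single-precision floats while Python computes in double precision.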