Document (#34688)

Author
Stenmark, D.
Title
Identifying clusters of user behavior in intranet search engine log files
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.14, p.2232-2243
Year
2008
Abstract
When studying how ordinary Web users interact with Web search engines, researchers tend to either treat the users as a homogeneous group or group them according to search experience. Neither approach is sufficient, we argue, to capture the variety in behavior that is known to exist among searchers. By applying an automatic clustering technique based on self-organizing maps to search engine log files from a corporate intranet, we show that users can be usefully separated into distinguishable segments based on their actual search behavior. Based on these segments, future tools for information seeking and retrieval can be targeted to specific segments rather than just made to fit the average user. The exact number of clusters, and to some extent their characteristics, can be expected to vary between intranets, but our results indicate that some more generic groups may exist. In our study, a large group of users appeared to be fact seekers who would benefit from higher precision, a smaller group of users were more holistically oriented and would likely benefit from higher recall, and a third category of users seemed to constitute the knowledgeable users. These three groups may raise different design implications for search-tool developers.

Similar documents (content)

  1. Zhang, Y.; Broussard, R.; Ke, W.; Gong, X.: Evaluation of a scatter/gather interface for supporting distinct health information search tasks (2014) 0.17
    0.17474277 = sum of:
      0.17474277 = product of:
        0.6240813 = sum of:
          0.015627177 = weight(abstract_txt:based in 3262) [ClassicSimilarity], result of:
            0.015627177 = score(doc=3262,freq=1.0), product of:
              0.077920385 = queryWeight, product of:
                1.2574592 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.019311134 = queryNorm
              0.20055313 = fieldWeight in 3262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.0625 = fieldNorm(doc=3262)
          0.0595136 = weight(abstract_txt:groups in 3262) [ClassicSimilarity], result of:
            0.0595136 = score(doc=3262,freq=2.0), product of:
              0.1317552 = queryWeight, product of:
                1.3350779 = boost
                5.1103826 = idf(docFreq=700, maxDocs=42740)
                0.019311134 = queryNorm
              0.45169827 = fieldWeight in 3262, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1103826 = idf(docFreq=700, maxDocs=42740)
                0.0625 = fieldNorm(doc=3262)
          0.08767992 = weight(abstract_txt:clusters in 3262) [ClassicSimilarity], result of:
            0.08767992 = score(doc=3262,freq=1.0), product of:
              0.21493167 = queryWeight, product of:
                1.7051905 = boost
                6.527092 = idf(docFreq=169, maxDocs=42740)
                0.019311134 = queryNorm
              0.40794325 = fieldWeight in 3262, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.527092 = idf(docFreq=169, maxDocs=42740)
                0.0625 = fieldNorm(doc=3262)
          0.098802775 = weight(abstract_txt:behavior in 3262) [ClassicSimilarity], result of:
            0.098802775 = score(doc=3262,freq=2.0), product of:
              0.21146257 = queryWeight, product of:
                2.0715008 = boost
                5.286164 = idf(docFreq=587, maxDocs=42740)
                0.019311134 = queryNorm
              0.46723527 = fieldWeight in 3262, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.286164 = idf(docFreq=587, maxDocs=42740)
                0.0625 = fieldNorm(doc=3262)
          0.1279745 = weight(abstract_txt:group in 3262) [ClassicSimilarity], result of:
            0.1279745 = score(doc=3262,freq=3.0), product of:
              0.24159364 = queryWeight, product of:
                2.556704 = boost
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.019311134 = queryNorm
              0.5297097 = fieldWeight in 3262, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.0625 = fieldNorm(doc=3262)
          0.121853456 = weight(abstract_txt:search in 3262) [ClassicSimilarity], result of:
            0.121853456 = score(doc=3262,freq=7.0), product of:
              0.20180564 = queryWeight, product of:
                2.8618705 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.019311134 = queryNorm
              0.6038159 = fieldWeight in 3262, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.0625 = fieldNorm(doc=3262)
          0.1126299 = weight(abstract_txt:users in 3262) [ClassicSimilarity], result of:
            0.1126299 = score(doc=3262,freq=5.0), product of:
              0.22551154 = queryWeight, product of:
                3.267692 = boost
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.019311134 = queryNorm
              0.49944183 = fieldWeight in 3262, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.0625 = fieldNorm(doc=3262)
        0.28 = coord(7/25)
    
  2. Chen, H.-M.; Cooper, M.D.: Stochastic modeling of usage patterns in a Web-based information system (2002) 0.16
    0.15697917 = sum of:
      0.15697917 = product of:
        0.4905599 = sum of:
          0.06836196 = weight(abstract_txt:knowledgeable in 1578) [ClassicSimilarity], result of:
            0.06836196 = score(doc=1578,freq=1.0), product of:
              0.17506224 = queryWeight, product of:
                1.088188 = boost
                8.330686 = idf(docFreq=27, maxDocs=42740)
                0.019311134 = queryNorm
              0.3905009 = fieldWeight in 1578, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.330686 = idf(docFreq=27, maxDocs=42740)
                0.046875 = fieldNorm(doc=1578)
          0.010879886 = weight(abstract_txt:from in 1578) [ClassicSimilarity], result of:
            0.010879886 = score(doc=1578,freq=2.0), product of:
              0.058852214 = queryWeight, product of:
                1.0928228 = boost
                2.7887225 = idf(docFreq=7144, maxDocs=42740)
                0.019311134 = queryNorm
              0.18486792 = fieldWeight in 1578, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7887225 = idf(docFreq=7144, maxDocs=42740)
                0.046875 = fieldNorm(doc=1578)
          0.023440767 = weight(abstract_txt:based in 1578) [ClassicSimilarity], result of:
            0.023440767 = score(doc=1578,freq=4.0), product of:
              0.077920385 = queryWeight, product of:
                1.2574592 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.019311134 = queryNorm
              0.3008297 = fieldWeight in 1578, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.046875 = fieldNorm(doc=1578)
          0.07057445 = weight(abstract_txt:groups in 1578) [ClassicSimilarity], result of:
            0.07057445 = score(doc=1578,freq=5.0), product of:
              0.1317552 = queryWeight, product of:
                1.3350779 = boost
                5.1103826 = idf(docFreq=700, maxDocs=42740)
                0.019311134 = queryNorm
              0.5356483 = fieldWeight in 1578, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.1103826 = idf(docFreq=700, maxDocs=42740)
                0.046875 = fieldNorm(doc=1578)
          0.052398086 = weight(abstract_txt:behavior in 1578) [ClassicSimilarity], result of:
            0.052398086 = score(doc=1578,freq=1.0), product of:
              0.21146257 = queryWeight, product of:
                2.0715008 = boost
                5.286164 = idf(docFreq=587, maxDocs=42740)
                0.019311134 = queryNorm
              0.24778894 = fieldWeight in 1578, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.286164 = idf(docFreq=587, maxDocs=42740)
                0.046875 = fieldNorm(doc=1578)
          0.13573746 = weight(abstract_txt:group in 1578) [ClassicSimilarity], result of:
            0.13573746 = score(doc=1578,freq=6.0), product of:
              0.24159364 = queryWeight, product of:
                2.556704 = boost
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.019311134 = queryNorm
              0.561842 = fieldWeight in 1578, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.046875 = fieldNorm(doc=1578)
          0.09139009 = weight(abstract_txt:search in 1578) [ClassicSimilarity], result of:
            0.09139009 = score(doc=1578,freq=7.0), product of:
              0.20180564 = queryWeight, product of:
                2.8618705 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.019311134 = queryNorm
              0.45286193 = fieldWeight in 1578, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.046875 = fieldNorm(doc=1578)
          0.03777721 = weight(abstract_txt:users in 1578) [ClassicSimilarity], result of:
            0.03777721 = score(doc=1578,freq=1.0), product of:
              0.22551154 = queryWeight, product of:
                3.267692 = boost
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.019311134 = queryNorm
              0.16751787 = fieldWeight in 1578, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.046875 = fieldNorm(doc=1578)
        0.32 = coord(8/25)
    
  3. Chen, H.-M.; Cooper, M.D.: Using clustering techniques to detect usage patterns in a Web-based information system (2001) 0.14
    0.13905267 = sum of:
      0.13905267 = product of:
        0.43453962 = sum of:
          0.09114928 = weight(abstract_txt:knowledgeable in 527) [ClassicSimilarity], result of:
            0.09114928 = score(doc=527,freq=1.0), product of:
              0.17506224 = queryWeight, product of:
                1.088188 = boost
                8.330686 = idf(docFreq=27, maxDocs=42740)
                0.019311134 = queryNorm
              0.52066785 = fieldWeight in 527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.330686 = idf(docFreq=27, maxDocs=42740)
                0.0625 = fieldNorm(doc=527)
          0.010257656 = weight(abstract_txt:from in 527) [ClassicSimilarity], result of:
            0.010257656 = score(doc=527,freq=1.0), product of:
              0.058852214 = queryWeight, product of:
                1.0928228 = boost
                2.7887225 = idf(docFreq=7144, maxDocs=42740)
                0.019311134 = queryNorm
              0.17429516 = fieldWeight in 527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7887225 = idf(docFreq=7144, maxDocs=42740)
                0.0625 = fieldNorm(doc=527)
          0.015627177 = weight(abstract_txt:based in 527) [ClassicSimilarity], result of:
            0.015627177 = score(doc=527,freq=1.0), product of:
              0.077920385 = queryWeight, product of:
                1.2574592 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.019311134 = queryNorm
              0.20055313 = fieldWeight in 527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.0625 = fieldNorm(doc=527)
          0.0595136 = weight(abstract_txt:groups in 527) [ClassicSimilarity], result of:
            0.0595136 = score(doc=527,freq=2.0), product of:
              0.1317552 = queryWeight, product of:
                1.3350779 = boost
                5.1103826 = idf(docFreq=700, maxDocs=42740)
                0.019311134 = queryNorm
              0.45169827 = fieldWeight in 527, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1103826 = idf(docFreq=700, maxDocs=42740)
                0.0625 = fieldNorm(doc=527)
          0.08767992 = weight(abstract_txt:clusters in 527) [ClassicSimilarity], result of:
            0.08767992 = score(doc=527,freq=1.0), product of:
              0.21493167 = queryWeight, product of:
                1.7051905 = boost
                6.527092 = idf(docFreq=169, maxDocs=42740)
                0.019311134 = queryNorm
              0.40794325 = fieldWeight in 527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.527092 = idf(docFreq=169, maxDocs=42740)
                0.0625 = fieldNorm(doc=527)
          0.07388611 = weight(abstract_txt:group in 527) [ClassicSimilarity], result of:
            0.07388611 = score(doc=527,freq=1.0), product of:
              0.24159364 = queryWeight, product of:
                2.556704 = boost
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.019311134 = queryNorm
              0.30582803 = fieldWeight in 527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.0625 = fieldNorm(doc=527)
          0.046056278 = weight(abstract_txt:search in 527) [ClassicSimilarity], result of:
            0.046056278 = score(doc=527,freq=1.0), product of:
              0.20180564 = queryWeight, product of:
                2.8618705 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.019311134 = queryNorm
              0.22822097 = fieldWeight in 527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.0625 = fieldNorm(doc=527)
          0.05036962 = weight(abstract_txt:users in 527) [ClassicSimilarity], result of:
            0.05036962 = score(doc=527,freq=1.0), product of:
              0.22551154 = queryWeight, product of:
                3.267692 = boost
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.019311134 = queryNorm
              0.22335717 = fieldWeight in 527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.0625 = fieldNorm(doc=527)
        0.32 = coord(8/25)
    
  4. Hyldegård, J.: Beyond the search process : exploring group members' information behavior in context (2009) 0.13
    0.12668866 = sum of:
      0.12668866 = product of:
        0.5278694 = sum of:
          0.014506516 = weight(abstract_txt:from in 4459) [ClassicSimilarity], result of:
            0.014506516 = score(doc=4459,freq=2.0), product of:
              0.058852214 = queryWeight, product of:
                1.0928228 = boost
                2.7887225 = idf(docFreq=7144, maxDocs=42740)
                0.019311134 = queryNorm
              0.24649057 = fieldWeight in 4459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7887225 = idf(docFreq=7144, maxDocs=42740)
                0.0625 = fieldNorm(doc=4459)
          0.022100165 = weight(abstract_txt:based in 4459) [ClassicSimilarity], result of:
            0.022100165 = score(doc=4459,freq=2.0), product of:
              0.077920385 = queryWeight, product of:
                1.2574592 = boost
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.019311134 = queryNorm
              0.28362495 = fieldWeight in 4459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2088501 = idf(docFreq=4693, maxDocs=42740)
                0.0625 = fieldNorm(doc=4459)
          0.0595136 = weight(abstract_txt:groups in 4459) [ClassicSimilarity], result of:
            0.0595136 = score(doc=4459,freq=2.0), product of:
              0.1317552 = queryWeight, product of:
                1.3350779 = boost
                5.1103826 = idf(docFreq=700, maxDocs=42740)
                0.019311134 = queryNorm
              0.45169827 = fieldWeight in 4459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1103826 = idf(docFreq=700, maxDocs=42740)
                0.0625 = fieldNorm(doc=4459)
          0.17113143 = weight(abstract_txt:behavior in 4459) [ClassicSimilarity], result of:
            0.17113143 = score(doc=4459,freq=6.0), product of:
              0.21146257 = queryWeight, product of:
                2.0715008 = boost
                5.286164 = idf(docFreq=587, maxDocs=42740)
                0.019311134 = queryNorm
              0.80927527 = fieldWeight in 4459, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.286164 = idf(docFreq=587, maxDocs=42740)
                0.0625 = fieldNorm(doc=4459)
          0.19548427 = weight(abstract_txt:group in 4459) [ClassicSimilarity], result of:
            0.19548427 = score(doc=4459,freq=7.0), product of:
              0.24159364 = queryWeight, product of:
                2.556704 = boost
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.019311134 = queryNorm
              0.8091449 = fieldWeight in 4459, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.0625 = fieldNorm(doc=4459)
          0.065133415 = weight(abstract_txt:search in 4459) [ClassicSimilarity], result of:
            0.065133415 = score(doc=4459,freq=2.0), product of:
              0.20180564 = queryWeight, product of:
                2.8618705 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.019311134 = queryNorm
              0.3227532 = fieldWeight in 4459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.0625 = fieldNorm(doc=4459)
        0.24 = coord(6/25)
    
  5. Shneiderman, B.; Byrd, D.; Croft, W.B.: Clarifying search : a user-interface framework for text searches (1997) 0.13
    0.12650909 = sum of:
      0.12650909 = product of:
        0.5271212 = sum of:
          0.017950898 = weight(abstract_txt:from in 3259) [ClassicSimilarity], result of:
            0.017950898 = score(doc=3259,freq=1.0), product of:
              0.058852214 = queryWeight, product of:
                1.0928228 = boost
                2.7887225 = idf(docFreq=7144, maxDocs=42740)
                0.019311134 = queryNorm
              0.30501652 = fieldWeight in 3259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7887225 = idf(docFreq=7144, maxDocs=42740)
                0.109375 = fieldNorm(doc=3259)
          0.08894603 = weight(abstract_txt:higher in 3259) [ClassicSimilarity], result of:
            0.08894603 = score(doc=3259,freq=1.0), product of:
              0.14942594 = queryWeight, product of:
                1.4217908 = boost
                5.4423003 = idf(docFreq=502, maxDocs=42740)
                0.019311134 = queryNorm
              0.5952516 = fieldWeight in 3259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4423003 = idf(docFreq=502, maxDocs=42740)
                0.109375 = fieldNorm(doc=3259)
          0.12217826 = weight(abstract_txt:benefit in 3259) [ClassicSimilarity], result of:
            0.12217826 = score(doc=3259,freq=1.0), product of:
              0.184645 = queryWeight, product of:
                1.580489 = boost
                6.0497622 = idf(docFreq=273, maxDocs=42740)
                0.019311134 = queryNorm
              0.66169274 = fieldWeight in 3259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0497622 = idf(docFreq=273, maxDocs=42740)
                0.109375 = fieldNorm(doc=3259)
          0.12930068 = weight(abstract_txt:group in 3259) [ClassicSimilarity], result of:
            0.12930068 = score(doc=3259,freq=1.0), product of:
              0.24159364 = queryWeight, product of:
                2.556704 = boost
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.019311134 = queryNorm
              0.53519905 = fieldWeight in 3259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8932486 = idf(docFreq=870, maxDocs=42740)
                0.109375 = fieldNorm(doc=3259)
          0.08059849 = weight(abstract_txt:search in 3259) [ClassicSimilarity], result of:
            0.08059849 = score(doc=3259,freq=1.0), product of:
              0.20180564 = queryWeight, product of:
                2.8618705 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.019311134 = queryNorm
              0.3993867 = fieldWeight in 3259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.109375 = fieldNorm(doc=3259)
          0.08814683 = weight(abstract_txt:users in 3259) [ClassicSimilarity], result of:
            0.08814683 = score(doc=3259,freq=1.0), product of:
              0.22551154 = queryWeight, product of:
                3.267692 = boost
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.019311134 = queryNorm
              0.39087504 = fieldWeight in 3259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5737147 = idf(docFreq=3258, maxDocs=42740)
                0.109375 = fieldNorm(doc=3259)
        0.24 = coord(6/25)
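
Note on the score trees above: the nested values appear to follow the classic TF-IDF arithmetic that the labels suggest, that is tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, with the per-term products summed and then scaled by coord. The following Python sketch recomputes the total for the first entry (doc 3262) from the freq, docFreq, boost, queryNorm, fieldNorm and coord figures printed in the listing. The function names are illustrative and not part of any library, and the idf form is an assumption chosen because it reproduces the listed idf values. Results should agree with the listing only up to floating-point rounding, since the listing looks like single-precision output.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf form; it reproduces the idf values printed in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as the explain tree lays it out.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


# Constants copied from the listing for doc 3262.
QUERY_NORM = 0.019311134
MAX_DOCS = 42740
FIELD_NORM = 0.0625

# (term, freq, docFreq, boost) for each matching abstract term in doc 3262.
TERMS_3262 = [
    ("based", 1.0, 4693, 1.2574592),
    ("groups", 2.0, 700, 1.3350779),
    ("clusters", 1.0, 169, 1.7051905),
    ("behavior", 2.0, 587, 2.0715008),
    ("group", 3.0, 870, 2.556704),
    ("search", 7.0, 3014, 2.8618705),
    ("users", 5.0, 3258, 3.267692),
]

# Sum of the per-term scores, then scaled by coord (7 of 25 query terms matched).
partial_sum = sum(
    term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for _term, freq, doc_freq, boost in TERMS_3262
)
coord = 7 / 25
total = partial_sum * coord

print(f"sum of term scores: {partial_sum:.6f}")
print(f"total after coord:  {total:.6f}")
```

The same function can be pointed at any other entry by swapping in that entry's freq, docFreq, boost, fieldNorm and coord values.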