Search (3 results, page 1 of 1)

  • author_ss:"Chen, H.-M."
  • theme_ss:"OPAC"
  1. Chen, H.-M.; Cooper, M.D.: Using clustering techniques to detect usage patterns in a Web-based information system (2001) 0.00
    0.0029745363 = product of:
      0.011898145 = sum of:
        0.011898145 = weight(_text_:information in 6526) [ClassicSimilarity], result of:
          0.011898145 = score(doc=6526,freq=8.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.19395474 = fieldWeight in 6526, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6526)
      0.25 = coord(1/4)
    
    Abstract
    Different users of a Web-based information system will have different goals and different ways of performing their work. This article explores the possibility that we can automatically detect usage patterns without demographic information about the individuals. First, a set of 47 variables was defined that can be used to characterize a user session. The values of these variables were computed for approximately 257,000 sessions. Second, principal component analysis was employed to reduce the dimensions of the original data set. Third, a two-stage, hybrid clustering method was proposed to categorize sessions into groups. Finally, an external criteria-based test of cluster validity was performed to verify the validity of the resulting usage groups (clusters). The proposed methodology was demonstrated and tested for validity using two independent samples of user sessions drawn from the transaction logs of the University of California's MELVYL® on-line library catalog system (www.melvyl.ucop.edu). The results indicate that there were six distinct categories of use in the MELVYL system: knowledgeable and sophisticated use, unsophisticated use, highly interactive use with good search performance, known-item searching, help-intensive searching, and relatively unsuccessful use. Their characteristics were interpreted and compared qualitatively. The analysis shows that each group had distinct patterns of use of the system, which justifies the methodology employed in this study.
    Source
    Journal of the American Society for Information Science and Technology. 52(2001) no.11, S.888-904
  2. Chen, H.-M.; Cooper, M.D.: Stochastic modeling of usage patterns in a Web-based information system (2002) 0.00
    0.002379629 = product of:
      0.009518516 = sum of:
        0.009518516 = weight(_text_:information in 577) [ClassicSimilarity], result of:
          0.009518516 = score(doc=577,freq=8.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.1551638 = fieldWeight in 577, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=577)
      0.25 = coord(1/4)
    
    Abstract
    Users move from one state (or task) to another in an information system's labyrinth as they try to accomplish their work, and the amount of time they spend in each state varies. This article uses continuous-time stochastic models, mainly based on semi-Markov chains, to derive user state transition patterns (both in rates and in probabilities) in a Web-based information system. The methodology was demonstrated with 126,925 search sessions drawn from the transaction logs of the University of California's MELVYL® library catalog system (www.melvyl.ucop.edu). First, user sessions were categorized into six groups based on their similar use of the system. Second, by using a three-layer hierarchical taxonomy of the system Web pages, user sessions in each usage group were transformed into a sequence of states. All the usage groups but one have third-order sequential dependency in state transitions. The sole exception has fourth-order sequential dependency. The transition rates as well as transition probabilities of the semi-Markov model provide a background for interpreting user behavior probabilistically, at various levels of detail. Finally, the differences in derived usage patterns between usage groups were tested statistically. The test results showed that different groups have distinct patterns of system use. Knowledge of the extent of sequential dependency is beneficial because it allows one to predict a user's next move in a search space based on the past moves that have been made. It can also be used to help customize the design of the user interface to the system to facilitate interaction. The group CL6 labeled "knowledgeable and sophisticated usage" and the group CL7 labeled "unsophisticated usage" both had third-order sequential dependency and had the same most-frequently occurring search pattern: screen display, record display, screen display, and record display. The group CL8 called "highly interactive use with good search results" had fourth-order sequential dependency, and its most frequently occurring pattern was the same as CL6 and CL7 with one more screen display action added. The group CL13, called "known-item searching," had third-order sequential dependency, and its most frequently occurring pattern was index access, search with retrievals, screen display, and record display. Group CL14 called "help intensive searching," and CL18 called "relatively unsuccessful" both had third-order sequential dependency, and for both groups the most frequently occurring pattern was index access, search without retrievals, index access, and again, search without retrievals.
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.7, S.536-548
  3. Cooper, M.D.; Chen, H.-M.: Predicting the relevance of a library catalog search (2001) 0.00
    0.0016826519 = product of:
      0.0067306077 = sum of:
        0.0067306077 = weight(_text_:information in 6519) [ClassicSimilarity], result of:
          0.0067306077 = score(doc=6519,freq=4.0), product of:
            0.06134496 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.034944877 = queryNorm
            0.10971737 = fieldWeight in 6519, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=6519)
      0.25 = coord(1/4)
    
    Abstract
    Relevance has been a difficult concept to define, let alone measure. In this paper, a simple operational definition of relevance is proposed for a Web-based library catalog: whether or not during a search session the user saves, prints, mails, or downloads a citation. If one of those actions is performed, the session is considered relevant to the user. An analysis is presented illustrating the advantages and disadvantages of this definition. With this definition and good transaction logging, it is possible to ascertain the relevance of a session. This was done for 905,970 sessions conducted with the University of California's Melvyl online catalog. Next, a methodology was developed to try to predict the relevance of a session. A number of variables were defined that characterize a session, none of which used any demographic information about the user. The values of the variables were computed for the sessions. Principal components analysis was used to extract a new set of variables out of the original set. A stratified random sampling technique was used to form ten strata such that each new stratum of 90,570 sessions contained the same proportion of relevant to nonrelevant sessions. Logistic regression was used to ascertain the regression coefficients for nine of the ten strata. Then, the coefficients were used to predict the relevance of the sessions in the missing stratum. Overall, 17.85% of the sessions were determined to be relevant. The predicted number of relevant sessions for all ten strata was 11%, a 6.85% difference. The authors believe that the methodology can be further refined and the prediction improved. This methodology could also have significant application in improving user searching and also in predicting electronic commerce buying decisions without the use of personal demographic data.
    Source
    Journal of the American Society for Information Science and Technology. 52(2001) no.10, S.813-827
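
Note on the score breakdowns: the explain trees in the three records above follow Lucene's ClassicSimilarity (TF-IDF) formula, where tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the final score is queryWeight * fieldWeight * coord. The following is a minimal sketch that recomputes the listed scores from the printed inputs. The function and variable names are illustrative only, and queryNorm is copied from the listing as an assumption, since it depends on the full query, which is not shown on this page.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    """Recompute a single-term ClassicSimilarity score from explain-tree inputs."""
    tf = math.sqrt(freq)                               # tf(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = tf * idf * field_norm               # fieldWeight
    return query_weight * field_weight * coord         # weight * coord(1/4)


# Values copied from the listing; query_norm is taken as given (assumption).
QUERY_NORM = 0.034944877
DOC_FREQ = 20772
MAX_DOCS = 44218
COORD = 0.25  # coord(1/4): one of four query clauses matched

# Per-document inputs: term frequency of "information" and fieldNorm.
records = {
    6526: {"freq": 8.0, "field_norm": 0.0390625},
    577: {"freq": 8.0, "field_norm": 0.03125},
    6519: {"freq": 4.0, "field_norm": 0.03125},
}

scores = {
    doc_id: classic_similarity_score(
        freq=r["freq"],
        doc_freq=DOC_FREQ,
        max_docs=MAX_DOCS,
        query_norm=QUERY_NORM,
        field_norm=r["field_norm"],
        coord=COORD,
    )
    for doc_id, r in records.items()
}

for doc_id, score in scores.items():
    print(f"doc={doc_id} score={score:.10f}")
```

Up to floating-point rounding (Lucene computes in single precision), the recomputed values should match the top-level numbers shown for documents 6526, 577, and 6519. The sketch also shows why the first record outranks the second despite the same term frequency: only fieldNorm differs, and fieldNorm is larger for shorter fields.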