Search (32 results, page 1 of 2)

Zhang, Y.; Jansen, B.J.; Spink, A.: Identification of factors predicting clickthrough in Web searching using neural network analysis (2009) 0.05

```
0.050362572 = product of:
  0.07554386 = sum of:
    0.009060195 = weight(_text_:in in 2742) [ClassicSimilarity], result of:
      0.009060195 = score(doc=2742,freq=4.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.12752387 = fieldWeight in 2742, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2742)
    0.06648366 = sum of:
      0.024024425 = weight(_text_:science in 2742) [ClassicSimilarity], result of:
        0.024024425 = score(doc=2742,freq=2.0), product of:
          0.1375819 = queryWeight, product of:
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.052230705 = queryNorm
          0.17461908 = fieldWeight in 2742, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.046875 = fieldNorm(doc=2742)
      0.042459235 = weight(_text_:22 in 2742) [ClassicSimilarity], result of:
        0.042459235 = score(doc=2742,freq=2.0), product of:
          0.18290302 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052230705 = queryNorm
          0.23214069 = fieldWeight in 2742, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2742)
  0.6666667 = coord(2/3)
```

Abstract: In this research, we aim to identify factors that significantly affect the clickthrough of Web searchers. Our underlying goal is to determine more efficient methods to optimize the clickthrough rate. We devise a clickthrough metric for measuring customer satisfaction of search engine results using the number of links visited, number of queries a user submits, and rank of clicked links. We use a neural network to detect the significant influence of searching characteristics on future user clickthrough. Our results show that high occurrences of query reformulation, lengthy searching duration, longer query length, and the higher ranking of prior clicked links correlate positively with future clickthrough. We provide recommendations for leveraging these findings for improving the performance of search engine retrieval and result ranking, along with implications for search engine marketing.
Date: 22. 3.2009 17:49:11
Source: Journal of the American Society for Information Science and Technology. 60(2009) no.3, S.557-570
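
The score breakdown for document 2742 above can be reproduced by hand. The following is a minimal sketch, not the catalog's own code: the function name `classic_term_score` is hypothetical, and the formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) are the classic Lucene tf-idf definitions, which match the values shown in the listing. All numeric inputs are copied from the breakdown.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one weight(...) node of a ClassicSimilarity explain tree.

    Hypothetical helper for illustration; inputs are read off the listing.
    """
    tf = math.sqrt(freq)                               # tf(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = tf * idf * field_norm               # fieldWeight
    return query_weight * field_weight


# Values taken from the explain output for doc 2742.
QUERY_NORM = 0.052230705
MAX_DOCS = 44218
FIELD_NORM = 0.046875

in_score = classic_term_score(4.0, 30841, MAX_DOCS, QUERY_NORM, FIELD_NORM)
science_score = classic_term_score(2.0, 8627, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_22_score = classic_term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Two of three top-level clauses matched, hence coord(2/3).
total = (in_score + (science_score + term_22_score)) * (2.0 / 3.0)
print(total)
```

The printed value should agree with the listed 0.050362572 up to small rounding differences, since Lucene computes with 32-bit floats while Python uses 64-bit floats.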

Jansen, B.J.; Spink, A.; Pedersen, J.: ¬A temporal comparison of AltaVista Web searching (2005) 0.02
```
0.017558426 = product of:
  0.026337638 = sum of:
    0.014325427 = weight(_text_:in in 3454) [ClassicSimilarity], result of:
      0.014325427 = score(doc=3454,freq=10.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.20163295 = fieldWeight in 3454, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3454)
    0.012012213 = product of:
      0.024024425 = sum of:
        0.024024425 = weight(_text_:science in 3454) [ClassicSimilarity], result of:
          0.024024425 = score(doc=3454,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.17461908 = fieldWeight in 3454, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=3454)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Major Web search engines, such as AltaVista, are essential tools in the quest to locate online information. This article reports research that used transaction log analysis to examine the characteristics and changes in AltaVista Web searching that occurred from 1998 to 2002. The research questions we examined are (1) What are the changes in AltaVista Web searching from 1998 to 2002? (2) What are the current characteristics of AltaVista searching, including the duration and frequency of search sessions? (3) What changes in the information needs of AltaVista users occurred between 1998 and 2002? The results of our research show (1) a move toward more interactivity with increases in session and query length, (2) with 70% of session durations at 5 minutes or less, the frequency of interaction is increasing, but it is happening very quickly, and (3) a broadening range of Web searchers' information needs, with the most frequent terms accounting for less than 1% of total term usage. We discuss the implications of these findings for the development of Web search engines.

Source

Journal of the American Society for Information Science and Technology. 56(2005) no.6, S.559-570
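
The breakdown for document 3454 above contains two coordination factors: an inner `coord(1/2)` and an outer `coord(2/3)`. The sketch below shows how they combine, using the per-term scores exactly as listed. The helper name `coord` is hypothetical, and the reading of the fractions (matched clauses over total clauses at each level of a nested Boolean query) is an interpretation of the explain output, not something the listing states.

```python
def coord(matched, total):
    """Coordination factor: the share of query clauses a document matched."""
    return matched / total


# Per-term scores copied from the explain output for doc 3454.
in_score = 0.014325427
science_score = 0.024024425

# Inner group: only one of its two clauses ("science") matched this document.
inner = science_score * coord(1, 2)

# Outer query: two of three top-level clauses matched.
outer = (in_score + inner) * coord(2, 3)

print(inner)
print(outer)
```

Document 2742 matches both inner clauses ("science" and "22"), so its inner sum carries no `coord(1/2)` penalty, which is one reason it ranks well above the other records.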
Spink, A.; Wolfram, D.; Jansen, B.J.; Saracevic, T.: Searching the Web : the public and their queries (2001) 0.02
```
0.015605008 = product of:
  0.023407511 = sum of:
    0.01539937 = weight(_text_:in in 6980) [ClassicSimilarity], result of:
      0.01539937 = score(doc=6980,freq=26.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.2167489 = fieldWeight in 6980, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=6980)
    0.008008142 = product of:
      0.016016284 = sum of:
        0.016016284 = weight(_text_:science in 6980) [ClassicSimilarity], result of:
          0.016016284 = score(doc=6980,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.11641272 = fieldWeight in 6980, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.03125 = fieldNorm(doc=6980)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

In previous articles, we reported the state of Web searching in 1997 (Jansen, Spink, & Saracevic, 2000) and in 1999 (Spink, Wolfram, Jansen, & Saracevic, 2001). Such snapshot studies and statistics on Web use appear regularly (OCLC, 1999), but provide little information about Web searching trends. In this article, we compare and contrast results from our two previous studies of Excite queries' data sets, each containing over 1 million queries submitted by over 200,000 Excite users collected on 16 September 1997 and 20 December 1999. We examine how public Web searching changed during that 2-year time period. As Table 1 shows, the overall structure of Web queries in some areas did not change, while in others we see change from 1997 to 1999. Our comparison shows how Web searching changed incrementally and also dramatically. We see some moves toward greater simplicity, including shorter queries (i.e., fewer terms) and shorter sessions (i.e., fewer queries per user), with little modification (addition or deletion) of terms in subsequent queries. The trend toward shorter queries suggests that Web information content should target specific terms in order to reach Web users. Another trend was to view fewer pages of results per query. Most Excite users examined only one page of results per query, since an Excite results page contains ten ranked Web sites. Were users satisfied with the results and did not need to view more pages? It appears that the public continues to have a low tolerance of wading through retrieved sites. This decline in interactivity levels is a disturbing finding for the future of Web searching. Queries that included Boolean operators were in the minority, but the percentage increased between the two time periods. Most Boolean use involved the AND operator with many mistakes. The use of relevance feedback almost doubled from 1997 to 1999, but overall use was still small.
An unusually large number of terms were used with low frequency, such as personal names, spelling errors, non-English words, and Web-specific terms, such as URLs. Web query vocabulary contains more words than found in large English texts in general. The public language of Web queries has its own and unique characteristics. How did Web searching topics change from 1997 to 1999? We classified a random sample of 2,414 queries from 1997 and 2,539 queries from 1999 into 11 categories (Table 2). From 1997 to 1999, Web searching shifted from entertainment, recreation and sex, and pornography, preferences to e-commerce-related topics under commerce, travel, employment, and economy. This shift coincided with changes in information distribution on the publicly indexed Web.

Source

Journal of the American Society for Information Science and Technology. 52(2001) no.3, S.226-234
Jansen, B.J.; Rieh, S.Y.: ¬The seventeen theoretical constructs of information searching and information retrieval (2010) 0.02
```
0.015405759 = product of:
  0.023108639 = sum of:
    0.011096427 = weight(_text_:in in 3690) [ClassicSimilarity], result of:
      0.011096427 = score(doc=3690,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1561842 = fieldWeight in 3690, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3690)
    0.012012213 = product of:
      0.024024425 = sum of:
        0.024024425 = weight(_text_:science in 3690) [ClassicSimilarity], result of:
          0.024024425 = score(doc=3690,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.17461908 = fieldWeight in 3690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=3690)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

In this article, we identify, compare, and contrast theoretical constructs for the fields of information searching and information retrieval to emphasize the uniqueness of and synergy between the fields. Theoretical constructs are the foundational elements that underpin a field's core theories, models, assumptions, methodologies, and evaluation metrics. We provide a framework to compare and contrast the theoretical constructs in the fields of information searching and information retrieval using intellectual perspective and theoretical orientation. The intellectual perspectives are information searching, information retrieval, and cross-cutting; and the theoretical orientations are information, people, and technology. Using this framework, we identify 17 significant constructs in these fields contrasting the differences and comparing the similarities. We discuss the impact of the interplay among these constructs for moving research forward within both fields. Although there is tension between the fields due to contradictory constructs, an examination shows a trend toward convergence. We discuss the implications for future research within the information searching and information retrieval fields.

Source

Journal of the American Society for Information Science and Technology. 61(2010) no.8, S.1517-1534
Jansen, B.J.; Zhang, M.; Schultz, C.D.: Brand and its effect on user perception of search engine performance (2009) 0.02
```
0.015391628 = product of:
  0.023087442 = sum of:
    0.013077264 = weight(_text_:in in 2948) [ClassicSimilarity], result of:
      0.013077264 = score(doc=2948,freq=12.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.18406484 = fieldWeight in 2948, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2948)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 2948) [ClassicSimilarity], result of:
          0.020020355 = score(doc=2948,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 2948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2948)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

In this research we investigate the effect of search engine brand on the evaluation of searching performance. Our research is motivated by the large amount of search traffic directed to a handful of Web search engines, even though many have similar interfaces and performance. We conducted a laboratory experiment with 32 participants using a 4² factorial design confounded in four blocks to measure the effect of four search engine brands (Google, MSN, Yahoo!, and a locally developed search engine) while controlling for the quality and presentation of search engine results. We found brand indeed played a role in the searching process. Brand effect varied in different domains. Users seemed to place a high degree of trust in major search engine brands; however, they were more engaged in the searching process when using lesser-known search engines. It appears that branding affects overall Web search at four stages: (a) search engine selection, (b) search engine results page evaluation, (c) individual link evaluation, and (d) evaluation of the landing page. We discuss the implications for search engine marketing and the design of empirical studies measuring search engine performance.

Source

Journal of the American Society for Information Science and Technology. 60(2009) no.8, S.1572-1595
Jansen, B.J.; Zhang, M.; Sobel, K.; Chowdury, A.: Twitter power : tweets as electronic word of mouth (2009) 0.01
```
0.014632022 = product of:
  0.021948032 = sum of:
    0.011937855 = weight(_text_:in in 3157) [ClassicSimilarity], result of:
      0.011937855 = score(doc=3157,freq=10.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.16802745 = fieldWeight in 3157, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3157)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 3157) [ClassicSimilarity], result of:
          0.020020355 = score(doc=3157,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 3157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3157)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

In this paper we report research results investigating microblogging as a form of electronic word-of-mouth for sharing consumer opinions concerning brands. We analyzed more than 150,000 microblog postings containing branding comments, sentiments, and opinions. We investigated the overall structure of these microblog postings, the types of expressions, and the movement in positive or negative sentiment. We compared automated methods of classifying sentiment in these microblogs with manual coding. Using a case study approach, we analyzed the range, frequency, timing, and content of tweets in a corporate account. Our research findings show that 19% of microblogs contain mention of a brand. Of the branding microblogs, nearly 20% contained some expression of brand sentiments. Of these, more than 50% were positive and 33% were critical of the company or product. Our comparison of automated and manual coding showed no significant differences between the two approaches. In analyzing microblogs for structure and composition, the linguistic structure of tweets approximates the linguistic patterns of natural language expressions. We find that microblogging is an online tool for customer word of mouth communications and discuss the implications for corporations using microblogging as part of their overall marketing strategy.

Source

Journal of the American Society for Information Science and Technology. 60(2009) no.11, S.2169-2188

Jansen, B.J.; Pooch, U.: ¬A review of Web searching studies and a framework for future research (2001) 0.01

```
0.014325685 = product of:
  0.021488527 = sum of:
    0.0074742786 = weight(_text_:in in 5186) [ClassicSimilarity], result of:
      0.0074742786 = score(doc=5186,freq=2.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.10520181 = fieldWeight in 5186, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5186)
    0.014014249 = product of:
      0.028028497 = sum of:
        0.028028497 = weight(_text_:science in 5186) [ClassicSimilarity], result of:
          0.028028497 = score(doc=5186,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.20372227 = fieldWeight in 5186, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5186)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```

Abstract: Jansen and Pooch review three major search engine studies and compare them to three traditional search system studies and three OPAC search studies, to determine if user search characteristics differ. The Web search engine studies indicate that most searchers use two two-term queries per session, use no Boolean operators, and look only at the top ten items returned, while reporting the location of relevant information. In traditional search systems we find seven to 16 queries of six to nine terms, while about ten documents per session were viewed. The OPAC studies indicated two to five queries per session of two or fewer terms, with Boolean search about 1% and fewer than 50 documents viewed.
Source: Journal of the American Society for Information Science and Technology. 52(2001) no.3, S.235-246

Koshman, S.; Spink, A.; Jansen, B.J.: Web searching on the Vivisimo search engine (2006) 0.01
```
0.013791813 = product of:
  0.020687718 = sum of:
    0.010677542 = weight(_text_:in in 216) [ClassicSimilarity], result of:
      0.010677542 = score(doc=216,freq=8.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.15028831 = fieldWeight in 216, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=216)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 216) [ClassicSimilarity], result of:
          0.020020355 = score(doc=216,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 216, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=216)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The application of clustering to Web search engine technology is a novel approach that offers structure to the information deluge often faced by Web searchers. Clustering methods have been well studied in research labs; however, real user searching with clustering systems in operational Web environments is not well understood. This article reports on results from a transaction log analysis of Vivisimo.com, which is a Web meta-search engine that dynamically clusters users' search results. A transaction log analysis was conducted on 2 weeks' worth of data collected from March 28 to April 4 and April 25 to May 2, 2004, representing 100% of site traffic during these periods and 2,029,734 queries overall. The results show that the highest percentage of queries contained two terms. The highest percentage of search sessions contained one query and was less than 1 minute in duration. Almost half of user interactions with clusters consisted of displaying a cluster's result set, and a small percentage of interactions showed cluster tree expansion. Findings show that 11.1% of search sessions were multitasking searches, and there is a broad variety of search topics in multitasking search sessions. Other searching interactions and statistics on repeat users of the search engine are reported. These results provide insights into search characteristics with a cluster-based Web search engine and extend research into Web searching trends.

Source

Journal of the American Society for Information Science and Technology. 57(2006) no.14, S.1875-1887
Jansen, B.J.; Spink, A.; Blakely, C.; Koshman, S.: Defining a session on Web search engines (2007) 0.01
```
0.013791813 = product of:
  0.020687718 = sum of:
    0.010677542 = weight(_text_:in in 285) [ClassicSimilarity], result of:
      0.010677542 = score(doc=285,freq=8.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.15028831 = fieldWeight in 285, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=285)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 285) [ClassicSimilarity], result of:
          0.020020355 = score(doc=285,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 285, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=285)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Detecting query reformulations within a session by a Web searcher is an important area of research for designing more helpful searching systems and targeting content to particular users. Methods explored by other researchers include both qualitative (i.e., the use of human judges to manually analyze query patterns on usually small samples) and nondeterministic algorithms, typically using large amounts of training data to predict query modification during sessions. In this article, we explore three alternative methods for detection of session boundaries. All three methods are computationally straightforward and therefore easily implemented for detection of session changes. We examine 2,465,145 interactions from 534,507 users of Dogpile.com on May 6, 2005. We compare session analysis using (a) Internet Protocol address and cookie; (b) Internet Protocol address, cookie, and a temporal limit on intrasession interactions; and (c) Internet Protocol address, cookie, and query reformulation patterns. Overall, our analysis shows that defining sessions by query reformulation along with Internet Protocol address and cookie provides the best measure, resulting in an 82% increase in the count of sessions. Regardless of the method used, the mean session length was fewer than three queries, and the mean session duration was less than 30 min. Searchers most often modified their query by changing query terms (nearly 23% of all query modifications) rather than adding or deleting terms. Implications are that for measuring searching traffic, unique sessions may be a better indicator than the common metric of unique visitors. This research also sheds light on the more complex aspects of Web searching involving query modifications and may lead to advances in searching tools.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.6, S.862-871
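
The abstract above compares session definitions based on (a) Internet Protocol address and cookie and (b) the same pair plus a temporal limit. A minimal sketch of those two counting rules follows. It is not the authors' implementation: the `Interaction` record, the `count_sessions` function, the sample log, and the 30-minute limit are all hypothetical, since the abstract does not state the threshold used. Method (c), based on query reformulation patterns, is not covered here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Interaction:
    """One log record; field names are assumptions for this sketch."""
    ip: str
    cookie: str
    timestamp: datetime
    query: str


def count_sessions(interactions, max_gap: Optional[timedelta] = None) -> int:
    """Count sessions keyed by (IP address, cookie).

    With max_gap=None this is rule (a). With a max_gap it is rule (b):
    a new session starts whenever the pause since the same user's
    previous interaction exceeds the limit.
    """
    last_seen = {}
    sessions = 0
    for item in sorted(interactions, key=lambda i: i.timestamp):
        key = (item.ip, item.cookie)
        previous = last_seen.get(key)
        timed_out = (
            previous is not None
            and max_gap is not None
            and item.timestamp - previous > max_gap
        )
        if previous is None or timed_out:
            sessions += 1
        last_seen[key] = item.timestamp
    return sessions


# Made-up sample log for illustration only.
log = [
    Interaction("10.0.0.1", "c1", datetime(2005, 5, 6, 9, 0), "cheap flights"),
    Interaction("10.0.0.1", "c1", datetime(2005, 5, 6, 9, 5), "cheap flights paris"),
    Interaction("10.0.0.1", "c1", datetime(2005, 5, 6, 14, 0), "weather"),
    Interaction("10.0.0.2", "c2", datetime(2005, 5, 6, 9, 1), "news"),
]

sessions_a = count_sessions(log)
sessions_b = count_sessions(log, max_gap=timedelta(minutes=30))
print(sessions_a, sessions_b)
```

On this sample, adding the temporal limit splits the first user's afternoon query into its own session, so rule (b) reports more sessions than rule (a).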
Liu, Z.; Jansen, B.J.: ASK: A taxonomy of accuracy, social, and knowledge information seeking posts in social question and answering (2017) 0.01
```
0.013791813 = product of:
  0.020687718 = sum of:
    0.010677542 = weight(_text_:in in 3345) [ClassicSimilarity], result of:
      0.010677542 = score(doc=3345,freq=8.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.15028831 = fieldWeight in 3345, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3345)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 3345) [ClassicSimilarity], result of:
          0.020020355 = score(doc=3345,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 3345, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3345)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Many people turn to their social networks to find information through the practice of question and answering. We believe it is necessary to use different answering strategies based on the type of questions to accommodate the different information needs. In this research, we propose the ASK taxonomy that categorizes questions posted on social networking sites into three types according to the nature of the questioner's inquiry of accuracy, social, or knowledge. To automatically decide which answering strategy to use, we develop a predictive model based on ASK question types using question features from the perspectives of lexical, topical, contextual, and syntactic as well as answer features. By applying the classifier on an annotated data set, we present a comprehensive analysis to compare questions in terms of their word usage, topical interests, temporal and spatial restrictions, syntactic structure, and response characteristics. Our research results show that the three types of questions exhibited different characteristics in the way they are asked. Our automatic classification algorithm achieves an 83% correct labeling result, showing the value of the ASK taxonomy for the design of social question and answering systems.

Source

Journal of the Association for Information Science and Technology. 68(2017) no.2, S.333-347
Jansen, B.J.; McNeese, M.D.: Evaluating the Effectiveness of and Patterns of Interactions With Automated Searching Assistance (2005) 0.01
```
0.012838133 = product of:
  0.019257199 = sum of:
    0.009247023 = weight(_text_:in in 4815) [ClassicSimilarity], result of:
      0.009247023 = score(doc=4815,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1301535 = fieldWeight in 4815, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4815)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 4815) [ClassicSimilarity], result of:
          0.020020355 = score(doc=4815,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 4815, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4815)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

We report quantitative and qualitative results of an empirical evaluation to determine whether automated assistance improves searching performance and when searchers desire system intervention in the search process. Forty participants interacted with two fully functional information retrieval systems in a counterbalanced, within-participant study. The systems were identical in all respects except that one offered automated assistance and the other did not. The study used a client-side automated assistance application, an approximately 500,000-document Text REtrieval Conference content collection, and six topics. Results indicate that automated assistance can improve searching performance. However, the improvement is less dramatic than one might expect, with an approximately 20% performance increase, as measured by the number of user-selected relevant documents. Concerning patterns of interaction, we identified 1,879 occurrences of searcher-system interactions and classified them into 9 major categories and 27 subcategories or states. Results indicate that there are predictable patterns of times when searchers desire and implement searching assistance. The most common three-state pattern is Execute Query-View Results: With Scrolling-View Assistance. Searchers appear receptive to automated assistance; there is a 71% implementation rate. There does not seem to be a correlation between the use of assistance and previous searching performance. We discuss the implications for the design of information retrieval systems and future research directions.

Source

Journal of the American Society for Information Science and Technology. 56(2005) no.14, S.1480-1503
Jansen, B.J.; Resnick, M.: ¬An examination of searcher's perceptions of nonsponsored and sponsored links during ecommerce Web searching (2006) 0.01
```
0.012838133 = product of:
  0.019257199 = sum of:
    0.009247023 = weight(_text_:in in 221) [ClassicSimilarity], result of:
      0.009247023 = score(doc=221,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1301535 = fieldWeight in 221, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=221)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 221) [ClassicSimilarity], result of:
          0.020020355 = score(doc=221,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 221, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=221)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
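The explain tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library, and `QUERY_NORM` is copied from the tree because it depends on the full query, which is not shown here.

```python
import math

# Values copied from the explain tree for doc 221.
QUERY_NORM = 0.052230705   # queryNorm (taken as given; depends on the whole query)
MAX_DOCS = 44218           # maxDocs
FIELD_NORM = 0.0390625     # fieldNorm(doc=221)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as used by ClassicSimilarity: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int) -> float:
    """score = queryWeight * fieldWeight for a single term."""
    idf = classic_idf(doc_freq, MAX_DOCS)
    query_weight = idf * QUERY_NORM                    # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight


# Term "in": freq=6, docFreq=30841. Term "science": freq=2, docFreq=8627.
in_score = term_score(6.0, 30841)
science_score = term_score(2.0, 8627)

# The inner clause matched 1 of 2 terms, the outer query 2 of 3 clauses.
inner = science_score * (1 / 2)        # coord(1/2)
total = (in_score + inner) * (2 / 3)   # coord(2/3)

print(f"in={in_score:.9f} science={science_score:.9f} total={total:.9f}")
```

Lucene computes these values in single precision, so the last digits may differ slightly from the tree; the result should agree with the reported 0.012838133 to about six decimal places.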
Abstract

In this article, we report results of an investigation into the effect of sponsored links on ecommerce information seeking on the Web. In this research, 56 participants each engaged in six ecommerce Web searching tasks. We extracted these tasks from the transaction log of a Web search engine, so they represent actual ecommerce searching information needs. Using 60 organic and 30 sponsored Web links, the quality of the Web search engine results was controlled by switching nonsponsored and sponsored links on half of the tasks for each participant. This allowed for investigating the bias toward sponsored links while controlling for quality of content. The study also investigated the relationship between searching self-efficacy, searching experience, types of ecommerce information needs, and the order of links on the viewing of sponsored links. Data included 2,453 interactions with links from result pages and 961 utterances evaluating these links. The results of the study indicate that there is a strong preference for nonsponsored links, with searchers viewing these results first more than 82% of the time. Searching self-efficacy and experience do not increase the likelihood of viewing sponsored links, and the order of the result listing does not appear to affect searcher evaluation of sponsored links. The implications for sponsored links as a long-term business model are discussed.

Source

Journal of the American Society for Information Science and Technology. 57(2006) no.14, S.1949-1961
Jansen, B.J.; Spink, A.; Koshman, S.: Web searcher interaction with the Dogpile.com metasearch engine (2007) 0.01
```
0.012838133 = product of:
  0.019257199 = sum of:
    0.009247023 = weight(_text_:in in 270) [ClassicSimilarity], result of:
      0.009247023 = score(doc=270,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1301535 = fieldWeight in 270, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=270)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 270) [ClassicSimilarity], result of:
          0.020020355 = score(doc=270,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 270, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=270)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Metasearch engines are an intuitive method for improving the performance of Web search by increasing coverage, returning large numbers of results with a focus on relevance, and presenting alternative views of information needs. However, the use of metasearch engines in an operational environment is not well understood. In this study, we investigate the usage of Dogpile.com, a major Web metasearch engine, with the aim of discovering how Web searchers interact with metasearch engines. We report results examining 2,465,145 interactions from 534,507 users of Dogpile.com on May 6, 2005 and compare these results with findings from other Web searching studies. We collect data on geographical location of searchers, use of system feedback, content selection, sessions, queries, and term usage. Findings show that Dogpile.com searchers are mainly from the USA (84% of searchers), use about 3 terms per query (mean = 2.85), implement system feedback moderately (8.4% of users), and generally (56% of users) spend less than one minute interacting with the Web search engine. Overall, metasearchers seem to have higher degrees of interaction than searchers on non-metasearch engines, but their sessions are for a shorter period of time. These aspects of metasearching may be what define the differences from other forms of Web searching. We discuss the implications of our findings in relation to metasearch for Web searchers, search engines, and content providers.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.5, S.744-755
Jansen, B.J.; Liu, Z.; Simon, Z.: The effect of ad rank on the performance of keyword advertising campaigns (2013) 0.01
```
0.012838133 = product of:
  0.019257199 = sum of:
    0.009247023 = weight(_text_:in in 1095) [ClassicSimilarity], result of:
      0.009247023 = score(doc=1095,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1301535 = fieldWeight in 1095, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1095)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 1095) [ClassicSimilarity], result of:
          0.020020355 = score(doc=1095,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 1095, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1095)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The goal of this research is to evaluate the effect of ad rank on the performance of keyword advertising campaigns. We examined a large-scale data file comprising nearly 7,000,000 records spanning 33 consecutive months of a major US retailer's search engine marketing campaign. The theoretical foundation is the serial position effect, which explains searcher behavior when interacting with ranked ad listings. We control for temporal effects and use one-way analysis of variance (ANOVA) with Tamhane's T2 tests to examine the effect of ad rank on critical keyword advertising metrics, including clicks, cost-per-click, sales revenue, orders, items sold, and advertising return on investment. Our findings show significant ad rank effect on most of those metrics, although less effect on conversion rates. A primacy effect was found on both clicks and sales, indicating a general compelling performance of top-ranked ads listed on the first results page. Conversion rates, on the other hand, follow a relatively stable distribution except for the top 2 ads, which had significantly higher conversion rates. However, examining conversion potential (the effect of both clicks and conversion rate), we show that ad rank has a significant effect on the performance of keyword advertising campaigns. Conversion potential is a more accurate measure of the impact of an ad's position. In fact, the first ad position generates about 80% of the total profits, after controlling for advertising costs. In addition to providing theoretical grounding, the research results reported in this paper are beneficial to companies using search engine marketing as they strive to design more effective advertising campaigns.

Source

Journal of the American Society for Information Science and Technology. 64(2013) no.10, S.2115-2132
Coughlin, D.M.; Campbell, M.C.; Jansen, B.J.: A web analytics approach for appraising electronic resources in academic libraries (2016) 0.01
```
0.012838133 = product of:
  0.019257199 = sum of:
    0.009247023 = weight(_text_:in in 2770) [ClassicSimilarity], result of:
      0.009247023 = score(doc=2770,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1301535 = fieldWeight in 2770, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2770)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 2770) [ClassicSimilarity], result of:
          0.020020355 = score(doc=2770,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 2770, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2770)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

University libraries provide access to thousands of journals and spend millions of dollars annually on electronic resources. With several commercial entities providing these electronic resources, the result can be silo systems and processes to evaluate cost and usage of these resources, making it difficult to provide meaningful analytics. In this research, we examine a subset of journals from a large research library using a web analytics approach with the goal of developing a framework for the analysis of library subscriptions. This foundational approach is implemented by comparing the impact to the cost, titles, and usage for the subset of journals and by assessing the funding area. Overall, the results highlight the benefit of a web analytics evaluation framework for university libraries and the impact of classifying titles based on the funding area. Furthermore, they show the statistical difference in both use and cost among the various funding areas when ranked by cost, eliminating the outliers of heavily used and highly expensive journals. Future work includes refining this model for a larger scale analysis tying metrics to library organizational objectives and for the creation of an online application to automate this analysis.

Source

Journal of the Association for Information Science and Technology. 67(2016) no.3, S.518-534
Coughlin, D.M.; Jansen, B.J.: Modeling journal bibliometrics to predict downloads and inform purchase decisions at university research libraries (2016) 0.01
```
0.012838133 = product of:
  0.019257199 = sum of:
    0.009247023 = weight(_text_:in in 3094) [ClassicSimilarity], result of:
      0.009247023 = score(doc=3094,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1301535 = fieldWeight in 3094, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3094)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 3094) [ClassicSimilarity], result of:
          0.020020355 = score(doc=3094,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 3094, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3094)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

University libraries provide access to thousands of online journals and other content, spending millions of dollars annually on these electronic resources. Providing access to these online resources is costly, and it is difficult both to analyze the value of this content to the institution and to discern those journals that comparatively provide more value. In this research, we examine 1,510 journals from a large research university library, representing more than 40% of the university's annual subscription cost for electronic resources at the time of the study. We utilize a web analytics approach for the creation of a linear regression model to predict usage among these journals. We categorize metrics into two classes: global (journal focused) and local (institution dependent). Using 275 journals for our training set, our analysis shows that a combination of global and local metrics creates the strongest model for predicting full-text downloads. Our linear regression model has an accuracy of more than 80% in predicting downloads for the 1,235 journals in our test set. The implications of the findings are that university libraries that use local metrics have better insight into the value of a journal and therefore more efficient cost content management.

Source

Journal of the Association for Information Science and Technology. 67(2016) no.9, S.2263-2273
Jansen, B.J.; Booth, D.L.; Spink, A.: Patterns of query reformulation during Web searching (2009) 0.01
```
0.012279158 = product of:
  0.018418737 = sum of:
    0.0064065247 = weight(_text_:in in 2936) [ClassicSimilarity], result of:
      0.0064065247 = score(doc=2936,freq=2.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.09017298 = fieldWeight in 2936, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2936)
    0.012012213 = product of:
      0.024024425 = sum of:
        0.024024425 = weight(_text_:science in 2936) [ClassicSimilarity], result of:
          0.024024425 = score(doc=2936,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.17461908 = fieldWeight in 2936, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=2936)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Query reformulation is a key user behavior during Web search. Our research goal is to develop predictive models of query reformulation during Web searching. This article reports results from a study in which we automatically classified the query-reformulation patterns for 964,780 Web searching sessions, composed of 1,523,072 queries, to predict the next query reformulation. We employed an n-gram modeling approach to describe the probability of users transitioning from one query-reformulation state to another to predict their next state. We developed first-, second-, third-, and fourth-order models and evaluated each model for accuracy of prediction, coverage of the dataset, and complexity of the possible pattern set. The results show that Reformulation and Assistance account for approximately 45% of all query reformulations; furthermore, the results demonstrate that the first- and second-order models provide the best predictability, between 28 and 40% overall and higher than 70% for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance.

Source

Journal of the American Society for Information Science and Technology. 60(2009) no.7, S.1358-1371
Ortiz-Cordova, A.; Jansen, B.J.: Classifying web search queries to identify high revenue generating customers (2012) 0.01
```
0.012279158 = product of:
  0.018418737 = sum of:
    0.0064065247 = weight(_text_:in in 279) [ClassicSimilarity], result of:
      0.0064065247 = score(doc=279,freq=2.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.09017298 = fieldWeight in 279, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=279)
    0.012012213 = product of:
      0.024024425 = sum of:
        0.024024425 = weight(_text_:science in 279) [ClassicSimilarity], result of:
          0.024024425 = score(doc=279,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.17461908 = fieldWeight in 279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=279)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Traffic from search engines is important for most online businesses, with the majority of visitors to many websites being referred by search engines. Therefore, an understanding of this search engine traffic is critical to the success of these websites. Understanding search engine traffic means understanding the underlying intent of the query terms and the corresponding user behaviors of searchers submitting keywords. In this research, using 712,643 query keywords from a popular Spanish music website relying on contextual advertising as its business model, we use a k-means clustering algorithm to categorize the referral keywords with similar characteristics of onsite customer behavior, including attributes such as clickthrough rate and revenue. We identified 6 clusters of consumer keywords. Clusters range from a large number of users who are low impact to a small number of high impact users. We demonstrate how online businesses can leverage this segmentation clustering approach to provide a more tailored consumer experience. Implications are that businesses can effectively segment customers to develop better business models to increase advertising conversion rates.

Source

Journal of the American Society for Information Science and Technology. 63(2012) no.7, S.1426-1441
Tjondronegoro, D.; Spink, A.; Jansen, B.J.: A study and comparison of multimedia Web searching : 1997-2006 (2009) 0.01
```
0.011706892 = product of:
  0.017560339 = sum of:
    0.007550162 = weight(_text_:in in 3090) [ClassicSimilarity], result of:
      0.007550162 = score(doc=3090,freq=4.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.10626988 = fieldWeight in 3090, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3090)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 3090) [ClassicSimilarity], result of:
          0.020020355 = score(doc=3090,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 3090, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3090)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Searching for multimedia is an important activity for users of Web search engines. Studying users' interactions with Web search engine multimedia buttons, including image, audio, and video, is important for the development of multimedia Web search systems. This article provides results from a Web log analysis study of multimedia Web searching by Dogpile users in 2006. The study analyzes the (a) duration, size, and structure of Web search queries and sessions; (b) user demographics; (c) most popular multimedia Web searching terms; and (d) use of advanced Web search techniques including Boolean and natural language. The current study findings are compared with results from previous multimedia Web searching studies. The key findings are: (a) Since 1997, image search has consistently been the dominant media type searched, followed by audio and video; (b) multimedia search duration is still short (>50% of searching episodes are <1 min), using few search terms; (c) many multimedia searches are for information about people, especially in audio search; and (d) multimedia search has begun to shift from entertainment to other categories such as medical, sports, and technology (based on the most repeated terms). Implications for design of Web multimedia search engines are discussed.

Source

Journal of the American Society for Information Science and Technology. 60(2009) no.9, S.1756-1768
Spink, A.; Jansen, B.J.: Web searching : public searching of the Web (2004) 0.01
```
0.009995343 = product of:
  0.014993015 = sum of:
    0.009987926 = weight(_text_:in in 1443) [ClassicSimilarity], result of:
      0.009987926 = score(doc=1443,freq=28.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.14058185 = fieldWeight in 1443, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.01953125 = fieldNorm(doc=1443)
    0.0050050886 = product of:
      0.010010177 = sum of:
        0.010010177 = weight(_text_:science in 1443) [ClassicSimilarity], result of:
          0.010010177 = score(doc=1443,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.07275795 = fieldWeight in 1443, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1443)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Footnote

Reviewed in: Information - Wissenschaft und Praxis 56(2004) H.1, S.61-62 (D. Lewandowski): "The authors of this volume have made a good name for themselves in recent years through their numerous publications on the behavior of search engine users. The book now published summarizes these scattered articles and places their results in the context of a more comprehensive research approach. Spink and Jansen use search engine query logs to analyze usage behavior. In these logs, the server records information about the requests made to it. Data that can be obtained from these files include the queries submitted, the address of the computer from which a query was sent, and the documents selected from the result lists. The clear advantage of log file analysis is that large amounts of data can be collected without great personnel effort. The data of a large number of anonymous users can be analyzed without the data collection influencing user behavior. This is particularly important for search engines because, unlike most other professional information retrieval systems, they are used not only in a professional context but also (and above all) privately. Surveys and laboratory studies distort the picture of usage behavior because users misjudge their own query behavior or do not want to name the topics of their queries. Queries aimed at medical or pornographic content come to mind here in particular.
Log file analysis also has its problems, however: not all of the desired data are contained in the log files (all information about the individual user is missing), no qualitative information such as the reason for a search is recorded, and for technical reasons the log files are partly incomplete. From these advantages and disadvantages, the authors conclude that log files are well suited to evaluating user behavior, but that the evaluation should take into account the results of studies that use other methods.
The commercial search engines AltaVista, Excite, and All the Web made large data sets available to the authors. Each of the files analyzed comprised all queries submitted to the respective search engine on a particular day. The data were collected between 1997 and 2002; however, data are not available from all search engines for all years, so some of the differences observed in user behavior can probably be attributed to the different user groups of the individual search engines. In one case the user groups are even explicitly separated by search engine, so that the behavior of the European users of All the Web is compared with that of US users. The log files are analyzed at several levels: the search terms entered, the complete queries, the search sessions, and the number of result pages viewed. As for search terms, it is particularly interesting that the range of information needs has grown markedly over the years. Although 20 percent of all search terms entered are used regularly, ten percent occurred only once. The topical interests of search engine users have also changed in recent years. Whereas in the early years many queries came from the two subject areas of sex and technology, these are now declining. Queries relating to e-commerce, by contrast, are increasing. Non-English terms as well as numbers and acronyms have also increased. The popularity of search terms is also seasonal and is influenced by current news. At the query level, the frequently documented fact that queries in Web search engines are extremely short is confirmed again. Depending on the search engine, the average query contains between 2.3 and 2.9 terms.
This is consistent with other studies on the topic. Query length has risen slightly in recent years; larger jumps toward longer queries are not to be expected, however. The same applies to the use of operators: they appear in only about every tenth query, with phrase search used most often. That search engine users must still largely be regarded as beginners is also shown by the fact that they actually view only three or four documents from the result list per query.
With regard to information needs, a further peculiarity arises from the fact that search engines are not used for just one type of query. A "specialty" of search engines is answering navigation-oriented queries, for example for the homepage of a company. Here no set of documents or factual information is requested; rather, a navigation aid is wanted. Such queries continue to increase. The study of search sessions yields results on how queries for an information need are formulated and processed. The vast majority of sessions last less than 15 minutes (including viewing the documents!), with about five documents viewed. The number of result pages viewed has decreased over time; this could be because search engines have succeeded over time in answering queries better, so that usable results are more often found on the first result page. Overall, the picture is confirmed here too of the not very advanced search engine user who, after entering an unspecific query, expects fast and good results. The second part of the book is devoted to some of the topics popular with search engine users and analyzes user behavior in such searches. The search terms and queries entered are examined. The areas are e-commerce, medical topics, sex, and multimedia. E-commerce queries are generally longer than general queries. They are modified less often, and fewer documents are viewed per query. Some generic expressions such as "shopping" are used very frequently. The share of e-commerce queries is high, and the authors see a need to create or improve special search functions for finding company homepages and products.
Only between three and nine percent of queries relate to medical topics, and the share of these queries tends to decrease. The share of queries for sexual content, at between three and just under 17 percent, is also likely to be lower than generally assumed.

Series

Information science and knowledge management; 6
