Search (13 results, page 1 of 1)

Spink, A.; Wolfram, D.; Jansen, B.J.; Saracevic, T.: Searching the Web : the public and their queries (2001) 0.03
```
0.025276989 = product of:
  0.050553977 = sum of:
    0.050553977 = product of:
      0.101107955 = sum of:
        0.101107955 = weight(_text_:web in 6980) [ClassicSimilarity], result of:
          0.101107955 = score(doc=6980,freq=34.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.59466785 = fieldWeight in 6980, product of:
              5.8309517 = tf(freq=34.0), with freq of:
                34.0 = termFreq=34.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=6980)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
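The score tree above can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`, which is the idf form that matches the numbers shown on this page. The function names are illustrative and are not part of any Lucene API; `queryNorm`, `fieldNorm`, and the two `coord` factors are copied from the explain output rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it reproduces idf=3.2635105 for docFreq=4597, maxDocs=44218.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # queryWeight line
    field_weight = classic_tf(freq) * idf * field_norm  # fieldWeight line
    score = query_weight * field_weight                 # weight(_text_:web ...)
    for coord in coords:                                # the two coord(1/2) factors
        score *= coord
    return score


# Values taken from the explain tree for doc 6980.
score = explain_score(
    freq=34.0,
    doc_freq=4597,
    max_docs=44218,
    query_norm=0.052098576,
    field_norm=0.03125,
)
print(score)  # should be close to the 0.025276989 reported above
```

The same function should apply to the other results on this page by substituting each entry's `freq`, `docFreq`, and `fieldNorm`; small differences in the last digits are expected because the engine works in single precision.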
Abstract

In previous articles, we reported the state of Web searching in 1997 (Jansen, Spink, & Saracevic, 2000) and in 1999 (Spink, Wolfram, Jansen, & Saracevic, 2001). Such snapshot studies and statistics on Web use appear regularly (OCLC, 1999), but provide little information about Web searching trends. In this article, we compare and contrast results from our two previous studies of Excite queries' data sets, each containing over 1 million queries submitted by over 200,000 Excite users collected on 16 September 1997 and 20 December 1999. We examine how public Web searching changed during that 2-year time period. As Table 1 shows, the overall structure of Web queries in some areas did not change, while in others we see change from 1997 to 1999. Our comparison shows how Web searching changed incrementally and also dramatically. We see some moves toward greater simplicity, including shorter queries (i.e., fewer terms) and shorter sessions (i.e., fewer queries per user), with little modification (addition or deletion) of terms in subsequent queries. The trend toward shorter queries suggests that Web information content should target specific terms in order to reach Web users. Another trend was to view fewer pages of results per query. Most Excite users examined only one page of results per query, since an Excite results page contains ten ranked Web sites. Were users satisfied with the results and did not need to view more pages? It appears that the public continues to have a low tolerance of wading through retrieved sites. This decline in interactivity levels is a disturbing finding for the future of Web searching. Queries that included Boolean operators were in the minority, but the percentage increased between the two time periods. Most Boolean use involved the AND operator with many mistakes. The use of relevance feedback almost doubled from 1997 to 1999, but overall use was still small.
An unusually large number of terms were used with low frequency, such as personal names, spelling errors, non-English words, and Web-specific terms, such as URLs. Web query vocabulary contains more words than found in large English texts in general. The public language of Web queries has its own unique characteristics. How did Web searching topics change from 1997 to 1999? We classified a random sample of 2,414 queries from 1997 and 2,539 queries from 1999 into 11 categories (Table 2). From 1997 to 1999, Web searching shifted away from entertainment and recreation and from sex and pornography, toward e-commerce-related topics under commerce, travel, employment, and economy. This shift coincided with changes in information distribution on the publicly indexed Web.
Ajiferuke, I.; Wolfram, D.: Analysis of Web page image tag distribution characteristics (2005) 0.02
```
0.022525156 = product of:
  0.04505031 = sum of:
    0.04505031 = product of:
      0.09010062 = sum of:
        0.09010062 = weight(_text_:web in 1059) [ClassicSimilarity], result of:
          0.09010062 = score(doc=1059,freq=12.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.5299281 = fieldWeight in 1059, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=1059)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The authors investigate the frequency distribution of the use of image tags in Web pages. Using data sampled from top level Web pages across five top level domains and from sample pages within individual websites, the authors model observed patterns in the frequency of image tag usage by fitting collected data distributions to different theoretical models used in informetrics. Models tested include the modified power law (MPL), Mandelbrot (MDB), generalized Waring (GW), generalized inverse Gaussian-Poisson (GIGP), and generalized negative binomial (GNB) distributions. The GIGP provided the best fit for data sets for top level pages across the top level domains tested. The poor fits of the models to the observed data distributions from specific websites were due to the multimodal nature of the observed data sets. Mixtures of the tested models for the data sets provided better fits. The ability to effectively model Web page attributes, such as the distribution of the number of image tags used per page, is needed for accurate simulation models of Web page content, and makes it possible to estimate the number of requests needed to display the complete content of Web pages.
Wolfram, D.: Search characteristics in different types of Web-based IR environments : are they the same? (2008) 0.02
```
0.021456998 = product of:
  0.042913996 = sum of:
    0.042913996 = product of:
      0.08582799 = sum of:
        0.08582799 = weight(_text_:web in 2093) [ClassicSimilarity], result of:
          0.08582799 = score(doc=2093,freq=8.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.50479853 = fieldWeight in 2093, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2093)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Transaction logs from four different Web-based information retrieval environments (bibliographic databank, OPAC, search engine, specialized search system) were analyzed for empirical regularities in search characteristics to determine whether users engage in different behaviors in different Web-based search environments. Descriptive statistics and relative frequency distributions related to term usage, query formulation, and session duration were tabulated. The analysis revealed that there are differences in these characteristics. Users were more likely to engage in extensive searching using the OPAC and specialized search system. Surprisingly, the bibliographic databank search environment resulted in the most parsimonious searching, more similar to a general search engine. Although on the surface Web-based search facilities may appear similar, users do engage in different search behaviors.

Wolfram, D.; Spink, A.; Jansen, B.J.; Saracevic, T.: Vox populi : the public searching of the Web (2001) 0.02
```
0.01839171 = product of:
  0.03678342 = sum of:
    0.03678342 = product of:
      0.07356684 = sum of:
        0.07356684 = weight(_text_:web in 6949) [ClassicSimilarity], result of:
          0.07356684 = score(doc=6949,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.43268442 = fieldWeight in 6949, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.09375 = fieldNorm(doc=6949)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Wolfram, D.; Wang, P.; Zhang, J.: Identifying Web search session patterns using cluster analysis : a comparison of three search environments (2009) 0.02
```
0.015927691 = product of:
  0.031855382 = sum of:
    0.031855382 = product of:
      0.063710764 = sum of:
        0.063710764 = weight(_text_:web in 2796) [ClassicSimilarity], result of:
          0.063710764 = score(doc=2796,freq=6.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.37471575 = fieldWeight in 2796, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2796)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Session characteristics taken from large transaction logs of three Web search environments (academic Web site, public search engine, consumer health information portal) were modeled using cluster analysis to determine if coherent session groups emerged for each environment and whether the types of session groups are similar across the three environments. The analysis revealed three distinct clusters of session behaviors common to each environment: hit and run sessions on focused topics, relatively brief sessions on popular topics, and sustained sessions using obscure terms with greater query modification. The findings also revealed shifts in session characteristics over time for one of the datasets, away from hit and run sessions toward more popular search topics. A better understanding of session characteristics can help system designers to develop more responsive systems to support search features that cater to identifiable groups of searchers based on their search behaviors. For example, the system may identify struggling searchers based on session behaviors that match those identified in the current study to provide context sensitive help.
Wolfram, D.; Xie, H.I.: Traditional IR for web users : a context for general audience digital libraries (2002) 0.02
```
0.015326426 = product of:
  0.030652853 = sum of:
    0.030652853 = product of:
      0.061305705 = sum of:
        0.061305705 = weight(_text_:web in 2589) [ClassicSimilarity], result of:
          0.061305705 = score(doc=2589,freq=8.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.36057037 = fieldWeight in 2589, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2589)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The emergence of general audience digital libraries (GADLs) defines a context that represents a hybrid of both "traditional" IR, using primarily bibliographic resources provided by database vendors, and "popular" IR, exemplified by public search systems available on the World Wide Web. Findings of a study investigating end-user searching and response to a GADL are reported. Data collected from a Web-based end-user survey and data logs of resource usage for a Web-based GADL were analyzed for user characteristics, patterns of access and use, and user feedback. Cross-tabulations using respondent demographics revealed several key differences in how the system was used and valued by users of different age groups. Older users valued the service more than younger users and engaged in different searching and viewing behaviors. The GADL more closely resembles traditional retrieval systems in terms of content and purpose of use, but is more similar to popular IR systems in terms of user behavior and accessibility. A model that defines the dual context of the GADL environment is derived from the data analysis and existing IR models in general and other specific contexts. The authors demonstrate the distinguishing characteristics of this IR context, and discuss implications for the development and evaluation of future GADLs to accommodate a variety of user needs and expectations.

Dimitroff, A.; Wolfram, D.: Searcher response in a hypertext-based bibliographic information retrieval system (1995) 0.01
```
0.014117276 = product of:
  0.028234553 = sum of:
    0.028234553 = product of:
      0.056469105 = sum of:
        0.056469105 = weight(_text_:22 in 187) [ClassicSimilarity], result of:
          0.056469105 = score(doc=187,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.30952093 = fieldWeight in 187, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=187)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Source: Journal of the American Society for Information Science. 46(1995) no.1, S.22-29

Xie, H.I.; Wolfram, D.: State digital library usability contributing organizational factors (2002) 0.01
```
0.013273074 = product of:
  0.026546149 = sum of:
    0.026546149 = product of:
      0.053092297 = sum of:
        0.053092297 = weight(_text_:web in 5221) [ClassicSimilarity], result of:
          0.053092297 = score(doc=5221,freq=6.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.3122631 = fieldWeight in 5221, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5221)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In this issue Xie and Wolfram study the Wisconsin state digital library BadgerLink to determine the organizational factors that lead to different use requirements and the degree to which these are met, as well as the impact on physical libraries. To this end, usage data from EBSCOhost and ProQuest logs for BadgerLink were analyzed, and a survey of 313 Wisconsin libraries of all types (76% response rate) was analyzed along with 81 responses to a voluntary web survey of end users. Heaviest users were K-12 schools and institutions of higher education. Heaviest use sites were the two largest state universities and the state's largest public library. Small libraries were infrequent users. Web survey respondents were mature working professionals. Sixty percent searched for specific information, but 46% reported browsing in subject areas. Libraries with dedicated Internet access reported more frequent usage than those with dial-up connection. Those who accessed from libraries reported more frequent use than those at work or at home. Libraries that trained end users reported more use, but the majority of the web survey respondents reported themselves as self-taught. Logs confirm reported subject interests. Three surrogates were requested for every full text document but full text availability is reported as the reason for use by 30% of users. Availability has led to the cancellation of subscriptions in many libraries that are important promoters of the service. A model will need to include interactions based upon the influence of each involved participant on the others. It will also need to include the extension of the activities of one participant to other participant organizations and the communication among these organizations.
Wang, F.; Wolfram, D.: Assessment of journal similarity based on citing discipline analysis (2015) 0.01
```
0.01083742 = product of:
  0.02167484 = sum of:
    0.02167484 = product of:
      0.04334968 = sum of:
        0.04334968 = weight(_text_:web in 1849) [ClassicSimilarity], result of:
          0.04334968 = score(doc=1849,freq=4.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.25496176 = fieldWeight in 1849, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1849)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This study compares the range of disciplines of citing journal articles to determine how closely related journals assigned to the same Web of Science research area are. The frequency distribution of disciplines by citing articles provides a signature for a cited journal that permits it to be compared with other journals using similarity comparison techniques. As an initial exploration, citing discipline data for 40 high-impact-factor journals assigned to the "information science and library science" category of the Web of Science were compared across 5 time periods. Similarity relationships were determined using multidimensional scaling and hierarchical cluster analysis to compare the outcomes produced by the proposed citing discipline and established cocitation methods. The maps and clustering outcomes reveal that a number of journals in allied areas of the information science and library science category may not be very closely related to each other or may not be appropriately situated in the category studied. The citing discipline similarity data resulted in similar outcomes with the cocitation data but with some notable differences. Because the citing discipline method relies on a citing perspective different from cocitations, it may provide a complementary way to compare journal similarity that is less labor intensive than cocitation analysis.

Ajiferuke, I.; Lu, K.; Wolfram, D.: A comparison of citer and citation-based measure outcomes for multiple disciplines (2010) 0.01
```
0.010587957 = product of:
  0.021175914 = sum of:
    0.021175914 = product of:
      0.042351827 = sum of:
        0.042351827 = weight(_text_:22 in 4000) [ClassicSimilarity], result of:
          0.042351827 = score(doc=4000,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.23214069 = fieldWeight in 4000, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=4000)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 28. 9.2010 12:54:22

Wolfram, D.: The power to influence : an informetric analysis of the works of Hope Olson (2016) 0.01
```
0.009195855 = product of:
  0.01839171 = sum of:
    0.01839171 = product of:
      0.03678342 = sum of:
        0.03678342 = weight(_text_:web in 3170) [ClassicSimilarity], result of:
          0.03678342 = score(doc=3170,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.21634221 = fieldWeight in 3170, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3170)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This paper examines the influence of the works of Hope A. Olson by conducting an ego-centric informetric analysis of her published works. Publication and citation data were collected from Google Scholar and the Thomson Reuters Web of Science. Classic informetrics techniques were applied to the datasets including co-authorship analysis, citer analysis, citation and co-citation analysis and text-based analysis. Co-citation and text-based data were analyzed and visualized using VOSviewer and CiteSpace, respectively. The analysis of her citation identity reveals how Dr. Olson situates her own research within the knowledge landscape while the analysis of her citation image reveals how others have situated her work in relation to the authors with whom she has been co-cited. This reflection of Dr. Olson's research contributions reveals the influence of her scholarship not only on knowledge organization but other areas of library and information science and allied disciplines.

Castanha, R.C.G.; Wolfram, D.: The domain of knowledge organization : a bibliometric analysis of prolific authors and their intellectual space (2018) 0.01
```
0.008823298 = product of:
  0.017646596 = sum of:
    0.017646596 = product of:
      0.03529319 = sum of:
        0.03529319 = weight(_text_:22 in 4150) [ClassicSimilarity], result of:
          0.03529319 = score(doc=4150,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.19345059 = fieldWeight in 4150, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4150)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Source: Knowledge organization. 45(2018) no.1, S.13-22

Zhang, J.; Wolfram, D.; Wang, P.; Hong, Y.; Gillis, R.: Visualization of health-subject analysis based on query term co-occurrences (2008) 0.01
```
0.007663213 = product of:
  0.015326426 = sum of:
    0.015326426 = product of:
      0.030652853 = sum of:
        0.030652853 = weight(_text_:web in 2376) [ClassicSimilarity], result of:
          0.030652853 = score(doc=2376,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.18028519 = fieldWeight in 2376, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2376)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

A multidimensional-scaling approach is used to analyze frequently used medical-topic terms in queries submitted to a Web-based consumer health information system. Based on a year-long transaction log file, five medical focus keywords (stomach, hip, stroke, depression, and cholesterol) and their co-occurring query terms are analyzed. An overlap-coefficient similarity measure and a conversion measure are used to calculate the proximity of terms to one another based on their co-occurrences in queries. The impact of the dimensionality of the visual configuration, the cutoff point of term co-occurrence for inclusion in the analysis, and the Minkowski metric power k on the stress value are discussed. A visual clustering of groups of terms based on the proximity within each focus-keyword group is also conducted. Term distributions within each visual configuration are characterized and are compared with formal medical vocabulary. This investigation reveals that there are significant differences between consumer health query-term usage and more formal medical terminology used by medical professionals when describing the same medical subject. Future directions are discussed.
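The overlap-coefficient measure mentioned in this abstract can be illustrated with a small sketch. It assumes the standard definition, |A ∩ B| / min(|A|, |B|), applied to the sets of queries in which each term occurs; the abstract does not give the authors' exact formulation, and the toy queries and function names below are invented purely for illustration.

```python
from collections import defaultdict


def term_postings(queries):
    # Map each term to the set of query indices in which it appears.
    postings = defaultdict(set)
    for qid, query in enumerate(queries):
        for term in query.lower().split():
            postings[term].add(qid)
    return postings


def overlap_coefficient(a, b):
    # Standard overlap coefficient of two sets; 0.0 if either set is empty.
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Hypothetical queries, not taken from the study's transaction log.
queries = ["stomach pain", "stomach ulcer", "hip pain", "ulcer treatment"]
postings = term_postings(queries)

print(overlap_coefficient(postings["stomach"], postings["ulcer"]))  # share 1 of 2 queries each
print(overlap_coefficient(postings["hip"], postings["pain"]))       # every "hip" query also has "pain"
```

A matrix of such pairwise values, converted to distances, is the kind of input a multidimensional-scaling routine would take.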
