Search (54 results, page 1 of 3)

  • language_ss:"e"
  • theme_ss:"Literaturübersicht" (literature review)
  • type_ss:"a"
  1. Bar-Ilan, J.: The use of Web search engines in information science research (2003) 0.18
    0.18138695 = product of:
      0.24184927 = sum of:
        0.11577009 = weight(_text_:web in 4271) [ClassicSimilarity], result of:
          0.11577009 = score(doc=4271,freq=22.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.717526 = fieldWeight in 4271, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4271)
        0.07918424 = weight(_text_:search in 4271) [ClassicSimilarity], result of:
          0.07918424 = score(doc=4271,freq=8.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.460814 = fieldWeight in 4271, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4271)
        0.04689494 = product of:
          0.09378988 = sum of:
            0.09378988 = weight(_text_:engine in 4271) [ClassicSimilarity], result of:
              0.09378988 = score(doc=4271,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.35462496 = fieldWeight in 4271, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4271)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    The World Wide Web was created in 1989, but it has already become a major information channel and source, influencing our everyday lives, commercial transactions, and scientific communication, to mention just a few areas. The seventeenth-century philosopher Descartes proclaimed, "I think, therefore I am" (cogito, ergo sum). Today the Web is such an integral part of our lives that we could rephrase Descartes' statement as "I have a Web presence, therefore I am." Because many people, companies, and organizations take this notion seriously, in addition to more substantial reasons for publishing information on the Web, the number of Web pages is in the billions and growing constantly. However, it is not sufficient to have a Web presence; tools that enable users to locate Web pages are needed as well. The major tools for discovering and locating information on the Web are search engines. This review discusses the use of Web search engines in information science research. Before going into detail, we should define the terms "information science," "Web search engine," and "use" in the context of this review.
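The score breakdown above is Lucene's ClassicSimilarity (TF-IDF) explanation: each matching term contributes queryWeight * fieldWeight, with queryWeight = idf * queryNorm and fieldWeight = sqrt(tf) * idf * fieldNorm, and coord factors down-weight documents that match only some query clauses. A minimal sketch reproducing the arithmetic of result 1 (all constants are copied from the explain output; only the helper function is ours):

```python
import math

def term_score(tf, idf, query_norm, field_norm):
    """One term's contribution in Lucene ClassicSimilarity:
    queryWeight (idf * queryNorm) times fieldWeight (sqrt(tf) * idf * fieldNorm)."""
    query_weight = idf * query_norm
    field_weight = math.sqrt(tf) * idf * field_norm
    return query_weight * field_weight

QUERY_NORM = 0.049439456          # queryNorm from the explain output above
FIELD_NORM = 0.046875             # fieldNorm of doc 4271

web    = term_score(22, 3.2635105, QUERY_NORM, FIELD_NORM)
search = term_score(8,  3.475677,  QUERY_NORM, FIELD_NORM)
engine = term_score(2,  5.349498,  QUERY_NORM, FIELD_NORM)

# "engine" sits in a nested query that matched 1 of 2 clauses: coord(1/2)
inner = web + search + engine * 0.5
# the top-level query matched 3 of 4 clauses: coord(3/4)
print(round(inner * 0.75, 8))     # ~0.18138695, the score reported for result 1
```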
  2. Oppenheim, C.; Morris, A.; McKnight, C.: The evaluation of WWW search engines (2000) 0.13
    0.12774871 = product of:
      0.17033161 = sum of:
        0.03490599 = weight(_text_:web in 4546) [ClassicSimilarity], result of:
          0.03490599 = score(doc=4546,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.21634221 = fieldWeight in 4546, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4546)
        0.08853068 = weight(_text_:search in 4546) [ClassicSimilarity], result of:
          0.08853068 = score(doc=4546,freq=10.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.51520574 = fieldWeight in 4546, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4546)
        0.04689494 = product of:
          0.09378988 = sum of:
            0.09378988 = weight(_text_:engine in 4546) [ClassicSimilarity], result of:
              0.09378988 = score(doc=4546,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.35462496 = fieldWeight in 4546, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4546)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    The literature of the evaluation of Internet search engines is reviewed. Although there have been many studies, there has been little consistency in the way such studies have been carried out. This problem is exacerbated by the fact that recall is virtually impossible to calculate in the fast-changing Internet environment, and therefore the traditional Cranfield type of evaluation is not usually possible. A variety of alternative evaluation methods has been suggested to overcome this difficulty. The authors recommend that a standardised set of tools be developed for the evaluation of web search engines so that, in future, comparisons can be made between search engines more effectively, and that variations in performance of any given search engine over time can be tracked. The paper itself does not provide such a standard set of tools, but it investigates the issues and makes preliminary recommendations on the types of tools needed.
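Since recall requires knowing every relevant document, which the abstract notes is virtually impossible on the Web, evaluations of this kind usually fall back on precision at a fixed cut-off. A minimal sketch of such a comparison; the result lists and relevance judgments are hypothetical:

```python
def precision_at_k(results, relevant, k=10):
    """Fraction of the top-k results judged relevant - a Web-feasible
    substitute for recall-based, Cranfield-style measures."""
    top_k = results[:k]
    return sum(1 for doc in top_k if doc in relevant) / len(top_k)

# Hypothetical relevance judgments and ranked result lists for one query
relevant = {"d1", "d4", "d7", "d9"}
engine_a = ["d1", "d2", "d4", "d3", "d7"]
engine_b = ["d5", "d1", "d6", "d8", "d9"]

print(precision_at_k(engine_a, relevant, k=5))   # 0.6
print(precision_at_k(engine_b, relevant, k=5))   # 0.4
```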
  3. Rasmussen, E.M.: Indexing and retrieval for the Web (2002) 0.12
    0.12008574 = product of:
      0.16011432 = sum of:
        0.07618698 = weight(_text_:web in 4285) [ClassicSimilarity], result of:
          0.07618698 = score(doc=4285,freq=28.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.47219574 = fieldWeight in 4285, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
        0.056571957 = weight(_text_:search in 4285) [ClassicSimilarity], result of:
          0.056571957 = score(doc=4285,freq=12.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.32922143 = fieldWeight in 4285, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
        0.027355384 = product of:
          0.05471077 = sum of:
            0.05471077 = weight(_text_:engine in 4285) [ClassicSimilarity], result of:
              0.05471077 = score(doc=4285,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.20686457 = fieldWeight in 4285, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4285)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    The introduction and growth of the World Wide Web (WWW, or Web) have resulted in a profound change in the way individuals and organizations access information. In terms of volume, nature, and accessibility, the characteristics of electronic information are significantly different from those of even five or six years ago. Control of, and access to, this flood of information rely heavily on automated techniques for indexing and retrieval. According to Gudivada, Raghavan, Grosky, and Kasanagottu (1997, p. 58), "The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential." Almost 93 percent of those surveyed consider the Web an "indispensable" Internet technology, second only to e-mail (Graphics, Visualization & Usability Center, 1998). Although there are other ways of locating information on the Web (browsing or following directory structures), 85 percent of users identify Web pages by means of a search engine (Graphics, Visualization & Usability Center, 1998). A more recent study conducted by the Stanford Institute for the Quantitative Study of Society confirms the finding that searching for information is second only to e-mail as an Internet activity (Nie & Erbring, 2000, online). In fact, Nie and Erbring conclude, "... the Internet today is a giant public library with a decidedly commercial tilt. The most widespread use of the Internet today is as an information search utility for products, travel, hobbies, and general information. Virtually all users interviewed responded that they engaged in one or more of these information gathering activities."
    Techniques for automated indexing and information retrieval (IR) have been developed, tested, and refined over the past 40 years, and are well documented (see, for example, Agosti & Smeaton, 1996; Baeza-Yates & Ribeiro-Neto, 1999a; Frakes & Baeza-Yates, 1992; Korfhage, 1997; Salton, 1989; Witten, Moffat, & Bell, 1999). With the introduction of the Web, and the capability to index and retrieve via search engines, these techniques have been extended to a new environment. They have been adopted, altered, and in some cases extended to include new methods. "In short, search engines are indispensable for searching the Web, they employ a variety of relatively advanced IR techniques, and there are some peculiar aspects of search engines that make searching the Web different than more conventional information retrieval" (Gordon & Pathak, 1999, p. 145). The environment for information retrieval on the World Wide Web differs from that of "conventional" information retrieval in a number of fundamental ways. The collection is very large and changes continuously, with pages being added, deleted, and altered. Wide variability between the size, structure, focus, quality, and usefulness of documents makes Web documents much more heterogeneous than a typical electronic document collection. The wide variety of document types includes images, video, audio, and scripts, as well as many different document languages. Duplication of documents and sites is common. Documents are interconnected through networks of hyperlinks. Because of the size and dynamic nature of the Web, preprocessing all documents requires considerable resources and is often not feasible, certainly not on the frequent basis required to ensure currency. Query length is usually much shorter than in other environments (only a few words), and user behavior differs from that in other environments. These differences make the Web a novel environment for information retrieval (Baeza-Yates & Ribeiro-Neto, 1999b; Bharat & Henzinger, 1998; Huang, 2000).
  4. Shue, J.-S.; Wu, S.: GAIS computer science bibliographies search (1997) 0.08
    0.08405279 = product of:
      0.16810559 = sum of:
        0.105578996 = weight(_text_:search in 953) [ClassicSimilarity], result of:
          0.105578996 = score(doc=953,freq=8.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.6144187 = fieldWeight in 953, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=953)
        0.06252659 = product of:
          0.12505318 = sum of:
            0.12505318 = weight(_text_:engine in 953) [ClassicSimilarity], result of:
              0.12505318 = score(doc=953,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.47283328 = fieldWeight in 953, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.0625 = fieldNorm(doc=953)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    GAIS computer science bibliographies search is a WWW service providing a searchable interface on bibliographies related to computer science. It holds about 400,000 references, mirrored from the Informatics for Engineering and Science Department of the University of Karlsruhe, and allows full-text searching through the search engine GAIS (Global Area Intelligent Search). Discusses its design and architecture.
  5. Thelwall, M.; Vaughan, L.; Björneborn, L.: Webometrics (2004) 0.06
    0.060129207 = product of:
      0.12025841 = sum of:
        0.08726498 = weight(_text_:web in 4279) [ClassicSimilarity], result of:
          0.08726498 = score(doc=4279,freq=18.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.5408555 = fieldWeight in 4279, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
        0.032993436 = weight(_text_:search in 4279) [ClassicSimilarity], result of:
          0.032993436 = score(doc=4279,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.19200584 = fieldWeight in 4279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
      0.5 = coord(2/4)
    
    Abstract
    Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
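One response mentioned above is to demonstrate that Web data correlate significantly with non-Web data. A minimal sketch of such a check using SciPy's Spearman rank correlation (rank-based, hence robust to the heavily skewed distributions typical of link counts); all figures are invented:

```python
from scipy.stats import spearmanr

# Hypothetical inlink counts for ten sites (Web data) and a matching
# offline indicator for the same sites, e.g. a research output measure
inlinks = [120, 340, 56, 980, 210, 45, 670, 88, 150, 400]
offline = [3.1, 5.2, 2.5, 9.4, 4.0, 1.2, 7.7, 1.8, 3.6, 6.1]

rho, p = spearmanr(inlinks, offline)
print(f"Spearman rho = {rho:.2f}, p = {p:.4f}")
# A significant positive rho is evidence that the link counts
# are not wholly random, per the argument above.
```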
  6. Siqueira, J.; Martins, D.L.: Workflow models for aggregating cultural heritage data on the web : a systematic literature review (2022) 0.05
    0.049141705 = product of:
      0.09828341 = sum of:
        0.041137107 = weight(_text_:web in 464) [ClassicSimilarity], result of:
          0.041137107 = score(doc=464,freq=4.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.25496176 = fieldWeight in 464, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=464)
        0.057146307 = weight(_text_:search in 464) [ClassicSimilarity], result of:
          0.057146307 = score(doc=464,freq=6.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.33256388 = fieldWeight in 464, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=464)
      0.5 = coord(2/4)
    
    Abstract
    In recent years, different cultural institutions have made efforts to spread culture through the construction of a unique search interface that integrates their digital objects and facilitates data retrieval for lay users. However, integrating cultural data is not a trivial task; therefore, this work performs a systematic literature review on data aggregation workflows, in order to answer five questions: What are the projects? What are the planned steps? Which technologies are used? Are the steps performed manually, automatically, or semi-automatically? Which perform semantic search? The searches were carried out in three databases: Networked Digital Library of Theses and Dissertations, Scopus, and Web of Science. In Q01, 12 projects were selected. In Q02, 9 stages were identified: Harvesting, Ingestion, Mapping, Indexing, Storing, Monitoring, Enriching, Displaying, and Publishing LOD. In Q03, 19 different technologies were found. In Q04, we identified that most of the solutions are semi-automatic and, in Q05, that most of them perform a semantic search. The analysis of the workflows showed that there is no consensus regarding the stages, their nomenclature, or the technologies used, and that the discussions are often superficial; it nevertheless allowed us to identify the main steps for implementing the aggregation of cultural data.
  7. Legg, C.: Ontologies on the Semantic Web (2007) 0.05
    0.046107057 = product of:
      0.092214115 = sum of:
        0.06581937 = weight(_text_:web in 1979) [ClassicSimilarity], result of:
          0.06581937 = score(doc=1979,freq=16.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.4079388 = fieldWeight in 1979, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=1979)
        0.026394749 = weight(_text_:search in 1979) [ClassicSimilarity], result of:
          0.026394749 = score(doc=1979,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.15360467 = fieldWeight in 1979, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=1979)
      0.5 = coord(2/4)
    
    Abstract
    As an informational technology, the World Wide Web has enjoyed spectacular success. In just ten years it has transformed the way information is produced, stored, and shared in arenas as diverse as shopping, family photo albums, and high-level academic research. The "Semantic Web" is touted by its developers as equally revolutionary, although it has not yet achieved anything like the Web's exponential uptake. It seeks to transcend a current limitation of the Web - that it largely requires indexing to be accomplished merely on specific character strings. Thus, a person searching for information about "turkey" (the bird) receives from current search engines many irrelevant pages about "Turkey" (the country) and nothing about the Spanish "pavo" even if he or she is a Spanish-speaker able to understand such pages. The Semantic Web vision is to develop technology to facilitate retrieval of information via meanings, not just spellings. For this to be possible, most commentators believe, Semantic Web applications will have to draw on some kind of shared, structured, machine-readable conceptual scheme. Thus, there has been a convergence between the Semantic Web research community and an older tradition with roots in classical Artificial Intelligence (AI) research (sometimes referred to as "knowledge representation") whose goal is to develop a formal ontology. A formal ontology is a machine-readable theory of the most fundamental concepts or "categories" required in order to understand information pertaining to any knowledge domain. A review of the attempts that have been made to realize this goal provides an opportunity to reflect in interestingly concrete ways on various research questions such as the following: - How explicit a machine-understandable theory of meaning is it possible or practical to construct? - How universal a machine-understandable theory of meaning is it possible or practical to construct? - How much (and what kind of) inference support is required to realize a machine-understandable theory of meaning? - What is it for a theory of meaning to be machine-understandable anyway?
    Theme
    Semantic Web
  8. Thiele, H.: The Dublin Core and Warwick framework : a review of the literature, March 1995 - September 1997 (1998) 0.04
    0.04324353 = product of:
      0.08648706 = sum of:
        0.03959212 = weight(_text_:search in 1254) [ClassicSimilarity], result of:
          0.03959212 = score(doc=1254,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.230407 = fieldWeight in 1254, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=1254)
        0.04689494 = product of:
          0.09378988 = sum of:
            0.09378988 = weight(_text_:engine in 1254) [ClassicSimilarity], result of:
              0.09378988 = score(doc=1254,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.35462496 = fieldWeight in 1254, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1254)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The purpose of this essay is to identify and explore the dynamics of the literature associated with the Dublin Core Workshop Series. The essay opens by identifying the problems that the Dublin Core Workshop Series is addressing, the status of the Internet at the time of the first workshop, and the contributions each workshop has made to the ongoing discussion. The body of the essay describes the characteristics of the literature, highlights key documents, and identifies the major researchers. The essay closes with an evaluation of the literary trends and considerations of future research directions. The essay concludes that a shift from a descriptive emphasis to a more empirical form of literature is about to take place. Future research questions are identified in the areas of satisfying searcher needs, the impact of surrogate descriptions on search engine performance, and the effectiveness of surrogate descriptions in authenticating Internet resources.
  9. Weiss, A.K.; Carstens, T.V.: The year's work in cataloging, 1999 (2001) 0.04
    0.040518112 = product of:
      0.081036225 = sum of:
        0.05759195 = weight(_text_:web in 6084) [ClassicSimilarity], result of:
          0.05759195 = score(doc=6084,freq=4.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.35694647 = fieldWeight in 6084, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6084)
        0.023444273 = product of:
          0.046888545 = sum of:
            0.046888545 = weight(_text_:22 in 6084) [ClassicSimilarity], result of:
              0.046888545 = score(doc=6084,freq=2.0), product of:
                0.17312855 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049439456 = queryNorm
                0.2708308 = fieldWeight in 6084, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6084)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The challenge of cataloging Web sites and electronic resources was the most important issue facing the cataloging world in the last year. This article reviews attempts to analyze and revise the cataloging code in view of the new electronic environment. The difficulties of applying traditional library cataloging standards to Web resources have led some to favor metadata as the best means of providing access to these materials. The appropriate education and training for library cataloging personnel remains crucial during this transitional period. Articles on user understanding of Library of Congress subject headings and on cataloging practice are also reviewed.
    Date
    10. 9.2000 17:38:22
  10. Julien, C.-A.; Leide, J.E.; Bouthillier, F.: Controlled user evaluations of information visualization interfaces for text retrieval : literature review and meta-analysis (2008) 0.04
    0.037249055 = product of:
      0.07449811 = sum of:
        0.03490599 = weight(_text_:web in 1718) [ClassicSimilarity], result of:
          0.03490599 = score(doc=1718,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.21634221 = fieldWeight in 1718, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=1718)
        0.03959212 = weight(_text_:search in 1718) [ClassicSimilarity], result of:
          0.03959212 = score(doc=1718,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.230407 = fieldWeight in 1718, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=1718)
      0.5 = coord(2/4)
    
    Abstract
    This review describes experimental designs (users, search tasks, measures, etc.) used by 31 controlled user studies of information visualization (IV) tools for textual information retrieval (IR) and a meta-analysis of the reported statistical effects. Comparable experimental designs allow research designers to compare their results with other reports, and support the development of experimentally verified design guidelines concerning which IV techniques are better suited to which types of IR tasks. The studies generally use a within-subject design with 15 or more undergraduate students performing tasks ranging from browsing to known-item search on sets of at least 1,000 full-text articles or Web pages on topics of general interest/news. Results of the meta-analysis (N = 8) showed no significant effects of the IV tool as compared with a text-only equivalent, but the set showed great variability, suggesting an inadequate basis of comparison. Experimental design recommendations are provided which would support comparison of existing IV tools for IR usability testing.
  11. Chambers, S.; Myall, C.: Cataloging and classification : review of the literature 2007-8 (2010) 0.03
    0.032083966 = product of:
      0.06416793 = sum of:
        0.04072366 = weight(_text_:web in 4309) [ClassicSimilarity], result of:
          0.04072366 = score(doc=4309,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.25239927 = fieldWeight in 4309, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4309)
        0.023444273 = product of:
          0.046888545 = sum of:
            0.046888545 = weight(_text_:22 in 4309) [ClassicSimilarity], result of:
              0.046888545 = score(doc=4309,freq=2.0), product of:
                0.17312855 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049439456 = queryNorm
                0.2708308 = fieldWeight in 4309, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4309)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    This paper surveys library literature on cataloging and classification published in 2007-8, indicating its extent and range in terms of types of literature, major subject areas, and themes. The paper reviews pertinent literature in the following areas: the future of bibliographic control, general cataloging standards and texts, Functional Requirements for Bibliographic Records (FRBR), cataloging varied resources, metadata and cataloging in the Web world, classification and subject access, questions of diversity and diverse perspectives, additional reports of practice and research, catalogers' education and careers, keeping current through columns and blogs, and cataloging history.
    Date
    10. 9.2000 17:38:22
  12. Chen, H.; Chau, M.: Web mining : machine learning for Web applications (2003) 0.03
    0.02759561 = product of:
      0.11038244 = sum of:
        0.11038244 = weight(_text_:web in 4242) [ClassicSimilarity], result of:
          0.11038244 = score(doc=4242,freq=20.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.6841342 = fieldWeight in 4242, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4242)
      0.25 = coord(1/4)
    
    Abstract
    With more than two billion pages created by millions of Web page authors and organizations, the World Wide Web is a tremendously rich knowledge base. The knowledge comes not only from the content of the pages themselves, but also from the unique characteristics of the Web, such as its hyperlink structure and its diversity of content and languages. Analysis of these characteristics often reveals interesting patterns and new knowledge. Such knowledge can be used to improve users' efficiency and effectiveness in searching for information on the Web, and also for applications unrelated to the Web, such as support for decision making or business management. The Web's size and its unstructured and dynamic content, as well as its multilingual nature, make the extraction of useful knowledge a challenging research problem. Furthermore, the Web generates a large amount of data in other formats that contain valuable information. For example, Web server logs' information about user access patterns can be used for information personalization or improving Web page design.
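As a sketch of the log-mining idea mentioned above, the following tallies page accesses from Web server log lines in the standard common log format; the log lines themselves are invented:

```python
import re
from collections import Counter

# Invented log lines in the Apache common log format
LOG = """\
203.0.113.7 - - [10/Oct/2003:13:55:36 +0200] "GET /papers/web-mining.html HTTP/1.1" 200 5123
203.0.113.7 - - [10/Oct/2003:13:57:01 +0200] "GET /papers/lsa.html HTTP/1.1" 200 4711
198.51.100.2 - - [10/Oct/2003:14:01:12 +0200] "GET /papers/web-mining.html HTTP/1.1" 200 5123
"""

pattern = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+)')

hits = Counter()
for line in LOG.splitlines():
    m = pattern.match(line)
    if m:
        host, timestamp, method, path, status, size = m.groups()
        hits[path] += 1            # a crude per-page access-pattern signal

print(hits.most_common())          # [('/papers/web-mining.html', 2), ('/papers/lsa.html', 1)]
```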
  13. Dumais, S.T.: Latent semantic analysis (2003) 0.03
    0.02587039 = product of:
      0.05174078 = sum of:
        0.017452994 = weight(_text_:web in 2462) [ClassicSimilarity], result of:
          0.017452994 = score(doc=2462,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.108171105 = fieldWeight in 2462, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0234375 = fieldNorm(doc=2462)
        0.034287788 = weight(_text_:search in 2462) [ClassicSimilarity], result of:
          0.034287788 = score(doc=2462,freq=6.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.19953834 = fieldWeight in 2462, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0234375 = fieldNorm(doc=2462)
      0.5 = coord(2/4)
    
    Abstract
    Latent Semantic Analysis (LSA) was first introduced in Dumais, Furnas, Landauer, and Deerwester (1988) and Deerwester, Dumais, Furnas, Landauer, and Harshman (1990) as a technique for improving information retrieval. The key insight in LSA was to reduce the dimensionality of the information retrieval problem. Most approaches to retrieving information depend on a lexical match between words in the user's query and those in documents. Indeed, this lexical matching is the way that the popular Web and enterprise search engines work. Such systems are, however, far from ideal. We are all aware of the tremendous amount of irrelevant information that is retrieved when searching. We also fail to find much of the existing relevant material. LSA was designed to address these retrieval problems, using dimension reduction techniques. Fundamental characteristics of human word usage underlie these retrieval failures. People use a wide variety of words to describe the same object or concept (synonymy). Furnas, Landauer, Gomez, and Dumais (1987) showed that people generate the same keyword to describe well-known objects only 20 percent of the time. Poor agreement was also observed in studies of inter-indexer consistency (e.g., Chan, 1989; Tarr & Borko, 1974), in the generation of search terms (e.g., Fidel, 1985; Bates, 1986), and in the generation of hypertext links (Furner, Ellis, & Willett, 1999). Because searchers and authors often use different words, relevant materials are missed. Someone looking for documents on "human-computer interaction" will not find articles that use only the phrase "man-machine studies" or "human factors." People also use the same word to refer to different things (polysemy). Words like "saturn," "jaguar," or "chip" have several different meanings. A short query like "saturn" will thus return many irrelevant documents. The query "Saturn car" will return fewer irrelevant items, but it will miss some documents that use only the terms "Saturn automobile." In searching, there is a constant tension between being overly specific and missing relevant information, and being more general and returning irrelevant information.
    A number of approaches have been developed in information retrieval to address the problems caused by the variability in word usage. Stemming is a popular technique used to normalize some kinds of surface-level variability by converting words to their morphological root. For example, the words "retrieve," "retrieval," "retrieved," and "retrieving" would all be converted to their root form, "retrieve." The root form is used for both document and query processing. Stemming sometimes helps retrieval, although not much (Harman, 1991; Hull, 1996). And it does not address cases where related words are not morphologically related (e.g., physician and doctor). Controlled vocabularies have also been used to limit variability by requiring that query and index terms belong to a pre-defined set of terms. Documents are indexed by a specified or authorized list of subject headings or index terms, called the controlled vocabulary. Library of Congress Subject Headings, Medical Subject Headings, Association for Computing Machinery (ACM) keywords, and Yellow Pages headings are examples of controlled vocabularies. If searchers can find the right controlled vocabulary terms, they do not have to think of all the morphologically related or synonymous terms that authors might have used. However, assigning controlled vocabulary terms in a consistent and thorough manner is a time-consuming and usually manual process. A good deal of research has been published about the effectiveness of controlled vocabulary indexing compared to full text indexing (e.g., Bates, 1998; Lancaster, 1986; Svenonius, 1986). The combination of both full text and controlled vocabularies is often better than either alone, although the size of the advantage is variable (Lancaster, 1986; Markey, Atherton, & Newton, 1982; Srinivasan, 1996). Richer thesauri have also been used to provide synonyms, generalizations, and specializations of users' search terms (see Srinivasan, 1992, for a review). Controlled vocabularies and thesaurus entries can be generated either manually or by the automatic analysis of large collections of texts.
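The dimension reduction at the heart of LSA can be sketched in a few lines: build a term-document matrix, truncate its singular value decomposition to k dimensions, and compare documents in the reduced space, where synonyms that share contexts end up close together. A toy illustration with invented counts:

```python
import numpy as np

# Toy term-document counts (rows = terms, columns = documents); all invented.
# "physician" and "doctor" never co-occur, but both co-occur with "hospital".
#             d0  d1  d2  d3
X = np.array([[2., 0., 1., 0.],   # physician
              [0., 2., 1., 0.],   # doctor
              [1., 1., 2., 0.],   # hospital
              [0., 0., 0., 3.],   # car
              [0., 0., 0., 2.]])  # engine

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                                  # LSA's reduction step: keep k latent dimensions
docs = (np.diag(s[:k]) @ Vt[:k]).T     # document vectors in the latent space

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(X[:, 0], X[:, 1]))    # 0.2  - raw count vectors barely overlap
print(cosine(docs[0], docs[1]))    # ~1.0 - the latent space conflates the synonyms
```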
  14. Large, A.: Children, teenagers, and the Web (2004) 0.02
    0.023270661 = product of:
      0.093082644 = sum of:
        0.093082644 = weight(_text_:web in 4154) [ClassicSimilarity], result of:
          0.093082644 = score(doc=4154,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.5769126 = fieldWeight in 4154, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.125 = fieldNorm(doc=4154)
      0.25 = coord(1/4)
    
  15. Yang, K.: Information retrieval on the Web (2004) 0.02
    0.023270661 = product of:
      0.093082644 = sum of:
        0.093082644 = weight(_text_:web in 4278) [ClassicSimilarity], result of:
          0.093082644 = score(doc=4278,freq=32.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.5769126 = fieldWeight in 4278, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=4278)
      0.25 = coord(1/4)
    
    Abstract
    How do we find information on the Web? Although information on the Web is distributed and decentralized, the Web can be viewed as a single, virtual document collection. In that regard, the fundamental questions and approaches of traditional information retrieval (IR) research (e.g., term weighting, query expansion) are likely to be relevant in Web document retrieval. Findings from traditional IR research, however, may not always be applicable in a Web setting. The Web document collection - massive in size and diverse in content, format, purpose, and quality - challenges the validity of previous research findings that are based on relatively small and homogeneous test collections. Moreover, some traditional IR approaches, although applicable in theory, may be impossible or impractical to implement in a Web setting. For instance, the size, distribution, and dynamic nature of Web information make it extremely difficult to construct a complete and up-to-date data representation of the kind required for a model IR system. To further complicate matters, information seeking on the Web is diverse in character and unpredictable in nature. Web searchers come from all walks of life and are motivated by many kinds of information needs. The wide range of experience, knowledge, motivation, and purpose means that searchers can express diverse types of information needs in a wide variety of ways with differing criteria for satisfying those needs. Conventional evaluation measures, such as precision and recall, may no longer be appropriate for Web IR, where a representative test collection is all but impossible to construct. Finding information on the Web creates many new challenges for, and exacerbates some old problems in, IR research. At the same time, the Web is rich in new types of information not present in most IR test collections. Hyperlinks, usage statistics, document markup tags, and collections of topic hierarchies such as Yahoo! (http://www.yahoo.com) present an opportunity to leverage Web-specific document characteristics in novel ways that go beyond the term-based retrieval framework of traditional IR. Consequently, researchers in Web IR have reexamined the findings from traditional IR research.
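Hyperlinks are the clearest example of such a Web-specific signal. One well-known way to exploit them, PageRank (named here purely as an illustration, not something this abstract commits to), scores a page by the stationary probability that a random surfer lands on it. A minimal power-iteration sketch over an invented four-page graph:

```python
import numpy as np

# Invented link graph: links[i] lists the pages that page i links to
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
n, d = 4, 0.85                     # page count and damping factor

# Column-stochastic transition matrix: M[j, i] = 1/outdegree(i) if i links to j
M = np.zeros((n, n))
for i, outs in links.items():
    for j in outs:
        M[j, i] = 1.0 / len(outs)

r = np.full(n, 1.0 / n)            # start from the uniform distribution
for _ in range(50):                # power iteration toward the stationary vector
    r = (1 - d) / n + d * M @ r

print(np.round(r, 3))              # pages 2 and 0, with the most inlinks, rank highest
```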
  16. Twidale, M.B.; Nichols, D.M.: Computer supported cooperative work in information search and retrieval (1999) 0.02
    0.023095407 = product of:
      0.09238163 = sum of:
        0.09238163 = weight(_text_:search in 4691) [ClassicSimilarity], result of:
          0.09238163 = score(doc=4691,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.5376164 = fieldWeight in 4691, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.109375 = fieldNorm(doc=4691)
      0.25 = coord(1/4)
    
  17. Zhu, B.; Chen, H.: Information visualization (2004) 0.02
    0.021728618 = product of:
      0.043457236 = sum of:
        0.02036183 = weight(_text_:web in 4276) [ClassicSimilarity], result of:
          0.02036183 = score(doc=4276,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.12619963 = fieldWeight in 4276, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4276)
        0.023095407 = weight(_text_:search in 4276) [ClassicSimilarity], result of:
          0.023095407 = score(doc=4276,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.1344041 = fieldWeight in 4276, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4276)
      0.5 = coord(2/4)
    
    Abstract
    Advanced technology has resulted in the generation of about one million terabytes of information every year. Ninety-nine percent of this is available in digital format (Keim, 2001). More information will be generated in the next three years than was created during all of previous human history (Keim, 2001). Collecting information is no longer a problem, but extracting value from information collections has become progressively more difficult. Various search engines have been developed to make it easier to locate information of interest, but these work well only for a person who has a specific goal and who understands what and how information is stored. This usually is not the case. Visualization was commonly thought of in terms of representing human mental processes (MacEachren, 1991; Miller, 1984). The concept is now associated with the amplification of these mental processes (Card, Mackinlay, & Shneiderman, 1999). Human eyes can process visual cues rapidly, whereas advanced information analysis techniques transform the computer into a powerful means of managing digitized information. Visualization offers a link between these two potent systems, the human eye and the computer (Gershon, Eick, & Card, 1998), helping to identify patterns and to extract insights from large amounts of information. The identification of patterns is important because it may lead to a scientific discovery, an interpretation of clues to solve a crime, the prediction of catastrophic weather, a successful financial investment, or a better understanding of human behavior in a computer-mediated environment. Visualization technology shows considerable promise for increasing the value of large-scale collections of information, as evidenced by several commercial applications of TreeMap (e.g., http://www.smartmoney.com) and Hyperbolic tree (e.g., http://www.inxight.com) to visualize large-scale hierarchical structures. Although the proliferation of visualization technologies dates from the 1990s, when sophisticated hardware and software made increasingly faster generation of graphical objects possible, the role of visual aids in facilitating the construction of mental images has a long history. Visualization has been used to communicate ideas, to monitor trends implicit in data, and to explore large volumes of data for hypothesis generation. Imagine traveling to a strange place without a map, having to memorize physical and chemical properties of an element without Mendeleyev's periodic table, trying to understand the stock market without statistical diagrams, or browsing a collection of documents without interactive visual aids. A collection of information can lose its value simply because of the effort required for exhaustive exploration. Such frustrations can be overcome by visualization.
    Visualization can be classified as scientific visualization, software visualization, or information visualization. Although the data differ, the underlying techniques have much in common. They use the same elements (visual cues) and follow the same rules of combining visual cues to deliver patterns. They all involve understanding human perception (Encarnacao, Foley, Bryson, & Feiner, 1994) and require domain knowledge (Tufte, 1990). Because most decisions are based on unstructured information, such as text documents, Web pages, or e-mail messages, this chapter focuses on the visualization of unstructured textual documents. The chapter reviews information visualization techniques developed over the last decade and examines how they have been applied in different domains. The first section provides the background by describing visualization history and giving overviews of scientific, software, and information visualization as well as the perceptual aspects of visualization. The next section assesses important visualization techniques that convert abstract information into visual objects and facilitate navigation through displays on a computer screen. It also explores information analysis algorithms that can be applied to identify or extract salient visualizable structures from collections of information. Information visualization systems that integrate different types of technologies to address problems in different domains are then surveyed, and we move on to a survey and critique of visualization system evaluation studies. The chapter concludes with a summary and identification of future research directions.
  18. Chowdhury, G.G.: The Internet and information retrieval research : a brief review (1999) 0.02
    0.018663906 = product of:
      0.07465562 = sum of:
        0.07465562 = weight(_text_:search in 3424) [ClassicSimilarity], result of:
          0.07465562 = score(doc=3424,freq=4.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.43445963 = fieldWeight in 3424, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=3424)
      0.25 = coord(1/4)
    
    Abstract
    The Internet and related information services attract increasing interest from information retrieval researchers. A survey of recent publications shows that frequent topics are the effectiveness of search engines, information validation and quality, user studies, design of user interfaces, data structures and metadata, classification and vocabulary-based aids, and indexing and search agents. Current research in these areas is briefly discussed. The changing balance between CD-ROM sources and traditional online searching is quite important and is noted.
  19. Haythornthwaite, C.; Hagar, C.: The social worlds of the Web (2004) 0.02
    0.01626087 = product of:
      0.06504348 = sum of:
        0.06504348 = weight(_text_:web in 4282) [ClassicSimilarity], result of:
          0.06504348 = score(doc=4282,freq=10.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.40312994 = fieldWeight in 4282, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4282)
      0.25 = coord(1/4)
    
    Abstract
    We know this Web world. We live in it, particularly those of us in developed countries. Even if we do not go online daily, we live with it: our culture is imprinted with online activity and vocabulary: e-mailing colleagues, surfing the Web, posting Web pages, blogging, gender-bending in cyberspace, texting and instant messaging friends, engaging in ecommerce, entering an online chat room, or morphing in an online world. We use it to conduct business, find information, talk with friends and colleagues. We know it is something separate, yet we incorporate it into our daily lives. We identify with it, bringing to it behaviors and expectations we hold for the world in general. We approach it as explorers and entrepreneurs, ready to move into unknown opportunities and territory; creators and engineers, eager to build new structures; utopians for whom "the world of the Web" represents the chance to start again and "get it right" this time; utilitarians, ready to get what we can out of the new structures; and dystopians, for whom this is just more evidence that there is no way to "get it right." The word "world" has many connotations. The Oxford English Dictionary (http://dictionary.oed.com) gives 27 definitions for the noun "world" including: - The sphere within which one's interests are bound up or one's activities find scope; (one's) sphere of action or thought; the "realm" within which one moves or lives. - A group or system of things or beings associated by common characteristics (denoted by a qualifying word or phrase), or considered as constituting a unity. - Human society considered in relation to its activities, difficulties, temptations, and the like; hence, contextually, the ways, practices, or customs of the people among whom one lives; the occupations and interests of society at large.
  20. Marsh, S.; Dibben, M.R.: The role of trust in information science and technology (2002) 0.02
    0.015114739 = product of:
      0.060458954 = sum of:
        0.060458954 = weight(_text_:web in 4289) [ClassicSimilarity], result of:
          0.060458954 = score(doc=4289,freq=6.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.37471575 = fieldWeight in 4289, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4289)
      0.25 = coord(1/4)
    
    Abstract
    This chapter discusses the notion of trust as it relates to information science and technology, specifically user interfaces, autonomous agents, and information systems. We first present an in-depth discussion of the concept of trust in and of itself, moving on to applications and considerations of trust in relation to information technologies. We consider trust from a "soft" perspective: thus, although security concepts such as cryptography, virus protection, authentication, and so forth reinforce (or damage) the feelings of trust we may have in a system, they are not themselves constitutive of "trust." We discuss information technology from a human-centric viewpoint, where trust is a less well-structured but much more powerful phenomenon. With the proliferation of electronic commerce (e-commerce) and the World Wide Web (WWW, or Web), much has been made of the ability of individuals to explore the vast quantities of information available to them, to purchase goods (as diverse as vacations and cars) online, and to publish information on their personal Web sites.
