Search (48 results, page 1 of 3)

  • theme_ss:"Literaturübersicht"
  1. Oppenheim, C.; Morris, A.; McKnight, C.: ¬The evaluation of WWW search engines (2000) 0.12
    0.11736986 = product of:
      0.17605479 = sum of:
        0.09002314 = weight(_text_:search in 4546) [ClassicSimilarity], result of:
          0.09002314 = score(doc=4546,freq=10.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.51520574 = fieldWeight in 4546, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4546)
        0.08603165 = product of:
          0.1720633 = sum of:
            0.1720633 = weight(_text_:engines in 4546) [ClassicSimilarity], result of:
              0.1720633 = score(doc=4546,freq=8.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.67362815 = fieldWeight in 4546, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4546)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The literature of the evaluation of Internet search engines is reviewed. Although there have been many studies, there has been little consistency in the way such studies have been carried out. This problem is exacerbated by the fact that recall is virtually impossible to calculate in the fast-changing Internet environment, and therefore the traditional Cranfield type of evaluation is not usually possible. A variety of alternative evaluation methods has been suggested to overcome this difficulty. The authors recommend that a standardised set of tools be developed for the evaluation of web search engines so that, in future, comparisons can be made between search engines more effectively, and that variations in the performance of any given search engine over time can be tracked. The paper itself does not provide such a standard set of tools, but it investigates the issues and makes preliminary recommendations about the types of tools needed.
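    The indented tree above is Lucene's ClassicSimilarity explanation for this hit. As a minimal sketch (assuming Lucene's classic TF-IDF formulas, with the constants read directly off the tree), the 0.12 score can be reproduced in a few lines of Python:

```python
import math

# Lucene ClassicSimilarity, as displayed in the explanation trees on this page:
#   tf(freq)          = sqrt(freq)
#   idf(df, maxDocs)  = 1 + ln(maxDocs / (df + 1))
#   queryWeight       = idf * queryNorm
#   fieldWeight       = tf * idf * fieldNorm
#   termScore         = queryWeight * fieldWeight
#   docScore          = coord * sum(termScores)

def tf(freq: float) -> float:
    return math.sqrt(freq)

def idf(doc_freq: int, max_docs: int) -> float:
    return 1.0 + math.log(max_docs / (doc_freq + 1))

QUERY_NORM = 0.05027291  # read from the tree above
FIELD_NORM = 0.046875    # per-document field length norm, also from the tree

def term_score(freq: float, doc_freq: int, max_docs: int = 44218) -> float:
    i = idf(doc_freq, max_docs)
    query_weight = i * QUERY_NORM
    field_weight = tf(freq) * i * FIELD_NORM
    return query_weight * field_weight

# Result 1: "search" (freq=10, docFreq=3718) plus "engines" (freq=8, docFreq=746),
# the latter scaled by coord(1/2), and the sum scaled by coord(2/3).
search_w = term_score(freq=10.0, doc_freq=3718)       # ~0.09002314
engines_w = term_score(freq=8.0, doc_freq=746) * 0.5  # ~0.08603165
print((search_w + engines_w) * (2.0 / 3.0))           # ~0.11736986
```

    Substituting the freq, docFreq, and fieldNorm values from any other tree on this page reproduces that entry's score as well.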
  2. Bar-Ilan, J.: ¬The use of Web search engines in information science research (2003) 0.10
    0.10334983 = product of:
      0.15502474 = sum of:
        0.08051914 = weight(_text_:search in 4271) [ClassicSimilarity], result of:
          0.08051914 = score(doc=4271,freq=8.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.460814 = fieldWeight in 4271, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4271)
        0.074505605 = product of:
          0.14901121 = sum of:
            0.14901121 = weight(_text_:engines in 4271) [ClassicSimilarity], result of:
              0.14901121 = score(doc=4271,freq=6.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.58337915 = fieldWeight in 4271, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4271)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The World Wide Web was created in 1989, but it has already become a major information channel and source, influencing our everyday lives, commercial transactions, and scientific communication, to mention just a few areas. The seventeenth-century philosopher Descartes proclaimed, "I think, therefore I am" (cogito, ergo sum). Today the Web is such an integral part of our lives that we could rephrase Descartes' statement as "I have a Web presence, therefore I am." Because many people, companies, and organizations take this notion seriously, in addition to more substantial reasons for publishing information on the Web, the number of Web pages is in the billions and growing constantly. However, it is not sufficient to have a Web presence; tools that enable users to locate Web pages are needed as well. The major tools for discovering and locating information on the Web are search engines. This review discusses the use of Web search engines in information science research. Before going into detail, we should define the terms "information science," "Web search engine," and "use" in the context of this review.
  3. Chowdhury, G.G.: ¬The Internet and information retrieval research : a brief review (1999) 0.09
    0.088845745 = product of:
      0.13326861 = sum of:
        0.075914174 = weight(_text_:search in 3424) [ClassicSimilarity], result of:
          0.075914174 = score(doc=3424,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.43445963 = fieldWeight in 3424, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=3424)
        0.057354435 = product of:
          0.11470887 = sum of:
            0.11470887 = weight(_text_:engines in 3424) [ClassicSimilarity], result of:
              0.11470887 = score(doc=3424,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.44908544 = fieldWeight in 3424, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3424)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The Internet and related information services attract increasing interest from information retrieval researchers. A survey of recent publications shows that frequent topics are the effectiveness of search engines, information validation and quality, user studies, design of user interfaces, data structures and metadata, classification and vocabulary based aids, and indexing and search agents. Current research in these areas is briefly discussed. The changing balance between CD-ROM sources and traditional online searching is also noted as an important development.
  4. Rasmussen, E.M.: Indexing and retrieval for the Web (2002) 0.07
    0.06732484 = product of:
      0.100987256 = sum of:
        0.05752565 = weight(_text_:search in 4285) [ClassicSimilarity], result of:
          0.05752565 = score(doc=4285,freq=12.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.32922143 = fieldWeight in 4285, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
        0.043461602 = product of:
          0.086923204 = sum of:
            0.086923204 = weight(_text_:engines in 4285) [ClassicSimilarity], result of:
              0.086923204 = score(doc=4285,freq=6.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.34030452 = fieldWeight in 4285, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4285)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The introduction and growth of the World Wide Web (WWW, or Web) have resulted in a profound change in the way individuals and organizations access information. In terms of volume, nature, and accessibility, the characteristics of electronic information are significantly different from those of even five or six years ago. Control of, and access to, this flood of information rely heavily on automated techniques for indexing and retrieval. According to Gudivada, Raghavan, Grosky, and Kasanagottu (1997, p. 58), "The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential." Almost 93 percent of those surveyed consider the Web an "indispensable" Internet technology, second only to e-mail (Graphics, Visualization & Usability Center, 1998). Although there are other ways of locating information on the Web (browsing or following directory structures), 85 percent of users identify Web pages by means of a search engine (Graphics, Visualization & Usability Center, 1998). A more recent study conducted by the Stanford Institute for the Quantitative Study of Society confirms the finding that searching for information is second only to e-mail as an Internet activity (Nie & Erbring, 2000, online). In fact, Nie and Erbring conclude, "... the Internet today is a giant public library with a decidedly commercial tilt. The most widespread use of the Internet today is as an information search utility for products, travel, hobbies, and general information. Virtually all users interviewed responded that they engaged in one or more of these information gathering activities."
    Techniques for automated indexing and information retrieval (IR) have been developed, tested, and refined over the past 40 years, and are well documented (see, for example, Agosti & Smeaton, 1996; Baeza-Yates & Ribeiro-Neto, 1999a; Frakes & Baeza-Yates, 1992; Korfhage, 1997; Salton, 1989; Witten, Moffat, & Bell, 1999). With the introduction of the Web, and the capability to index and retrieve via search engines, these techniques have been extended to a new environment. They have been adopted, altered, and in some cases extended to include new methods. "In short, search engines are indispensable for searching the Web, they employ a variety of relatively advanced IR techniques, and there are some peculiar aspects of search engines that make searching the Web different than more conventional information retrieval" (Gordon & Pathak, 1999, p. 145). The environment for information retrieval on the World Wide Web differs from that of "conventional" information retrieval in a number of fundamental ways. The collection is very large and changes continuously, with pages being added, deleted, and altered. Wide variability between the size, structure, focus, quality, and usefulness of documents makes Web documents much more heterogeneous than a typical electronic document collection. The wide variety of document types includes images, video, audio, and scripts, as well as many different document languages. Duplication of documents and sites is common. Documents are interconnected through networks of hyperlinks. Because of the size and dynamic nature of the Web, preprocessing all documents requires considerable resources and is often not feasible, certainly not on the frequent basis required to ensure currency. Query length is usually much shorter than in other environments (only a few words), and user behavior differs from that in other environments. These differences make the Web a novel environment for information retrieval (Baeza-Yates & Ribeiro-Neto, 1999b; Bharat & Henzinger, 1998; Huang, 2000).
  5. Harter, S.P.; Hert, C.A.: Evaluation of information retrieval systems : approaches, issues, and methods (1997) 0.06
    0.06476976 = product of:
      0.09715463 = sum of:
        0.0469695 = weight(_text_:search in 2264) [ClassicSimilarity], result of:
          0.0469695 = score(doc=2264,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.2688082 = fieldWeight in 2264, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2264)
        0.05018513 = product of:
          0.10037026 = sum of:
            0.10037026 = weight(_text_:engines in 2264) [ClassicSimilarity], result of:
              0.10037026 = score(doc=2264,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.39294976 = fieldWeight in 2264, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2264)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    State-of-the-art review of information retrieval systems, defined as systems retrieving documents as opposed to numerical data. Explains the classic Cranfield studies that have served as a standard for retrieval testing since the 1960s and discusses the Cranfield model and its relevance-based measures of retrieval effectiveness. Details some of the problems with the Cranfield instruments and issues of validity and reliability, generalizability, usefulness and basic concepts. Discusses the evaluation of Internet search engines in light of the Cranfield model, noting the very real differences between batch systems (Cranfield) and interactive systems (Internet). Because the Internet collection is not fixed, it is impossible to determine recall as a measure of retrieval effectiveness. Considers future directions in evaluating information retrieval systems.
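    A toy illustration of the Cranfield measures discussed here, with hypothetical document IDs: precision is computable from the system's output alone, while recall divides by the size of the complete relevant set, which is exactly what cannot be enumerated for the open Web.

```python
# Cranfield-style effectiveness measures on an invented result list.
retrieved = ["d1", "d2", "d3", "d4", "d5"]  # ranked system output
relevant = {"d2", "d4", "d7", "d9"}         # complete relevance judgments

hits = [d for d in retrieved if d in relevant]

precision = len(hits) / len(retrieved)  # 2/5 = 0.40, needs only the output
recall = len(hits) / len(relevant)      # 2/4 = 0.50, needs ALL relevant docs

print(f"precision={precision:.2f} recall={recall:.2f}")
```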
  6. Woodward, J.: Cataloging and classifying information resources on the Internet (1996) 0.06
    0.05551693 = product of:
      0.08327539 = sum of:
        0.04025957 = weight(_text_:search in 7397) [ClassicSimilarity], result of:
          0.04025957 = score(doc=7397,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.230407 = fieldWeight in 7397, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=7397)
        0.043015826 = product of:
          0.08603165 = sum of:
            0.08603165 = weight(_text_:engines in 7397) [ClassicSimilarity], result of:
              0.08603165 = score(doc=7397,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.33681408 = fieldWeight in 7397, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7397)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    State-of-the-art review exploring the problem of bibliographic citations to resources that exist only in electronic form, where the cited items may no longer be locatable at the URL indicated. Notes that the Internet is currently in a state of near chaos in terms of access and organization, while searching, usually performed with word-based search engines, is generally not adequate for the needs of most users. Reviews strategies used by librarians for cataloguing and classifying information resources on the Internet. Techniques used include automatic classification projects and classified subject trees, like the BUBL Subject Tree, CyberDewey, and the WWW Virtual Library. Considers OPAC-like library catalogues such as the UK's CATRIONA Project and OCLC's InterCat. Explores retrieval tools used with concept analysis and other non-traditional proposals, which include some library expertise, usually the use of one of the major library classifications. Pays particular attention to the UDC.
  7. Cho, H.; Pham, M.T.N.; Leonard, K.N.; Urban, A.C.: ¬A systematic literature review on image information needs and behaviors (2022) 0.05
    0.05490443 = product of:
      0.08235665 = sum of:
        0.053679425 = weight(_text_:search in 606) [ClassicSimilarity], result of:
          0.053679425 = score(doc=606,freq=8.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.30720934 = fieldWeight in 606, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=606)
        0.028677218 = product of:
          0.057354435 = sum of:
            0.057354435 = weight(_text_:engines in 606) [ClassicSimilarity], result of:
              0.057354435 = score(doc=606,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.22454272 = fieldWeight in 606, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.03125 = fieldNorm(doc=606)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Purpose: With ready access to search engines and social media platforms, the way people find image information has evolved and diversified in the past two decades. The purpose of this paper is to provide an overview of the literature on image information needs and behaviors.
    Design/methodology/approach: Following an eight-step procedure for conducting systematic literature reviews, the paper presents an analysis of peer-reviewed work on image information needs and behaviors, with publications ranging from the years 1997 to 2019.
    Findings: Application of the inclusion criteria led to 69 peer-reviewed works. These works were synthesized according to the following categories: research methods, users targeted, image types, identified needs, search behaviors and search obstacles. The reviewed studies show that people seek and use images for multiple reasons, including entertainment, illustration, aesthetic appreciation, knowledge construction, engagement, inspiration and social interactions. The reviewed studies also report that common strategies for image searches include keyword searches with short queries, browsing, specialization and reformulation. Observed trends suggest common deployment of query analysis, survey questionnaires and undergraduate participant pools to research image information needs and behavior.
    Originality/value: At this point, after more than two decades of image information needs research, a holistic systematic review of the literature was long overdue. The way users find image information has evolved and diversified due to technological developments in image retrieval. By synthesizing this burgeoning field into specific foci, this systematic literature review provides a foundation for future empirical investigation. With this foundation set, the paper then pinpoints key research gaps to investigate, particularly the influence of user expertise, a need for more diverse population samples, a dearth of qualitative data, new search features and information and visual literacies instruction.
  8. Thelwall, M.; Vaughan, L.; Björneborn, L.: Webometrics (2004) 0.05
    0.04626411 = product of:
      0.06939616 = sum of:
        0.03354964 = weight(_text_:search in 4279) [ClassicSimilarity], result of:
          0.03354964 = score(doc=4279,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.19200584 = fieldWeight in 4279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
        0.03584652 = product of:
          0.07169304 = sum of:
            0.07169304 = weight(_text_:engines in 4279) [ClassicSimilarity], result of:
              0.07169304 = score(doc=4279,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.2806784 = fieldWeight in 4279, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4279)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
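    As a sketch of the validation move described above (all counts invented), one can correlate inlink counts harvested from a commercial search engine with a non-Web indicator such as citation counts; a significant positive correlation is the standard argument that the messy Web data are not wholly random.

```python
from scipy.stats import spearmanr

# Hypothetical counts for ten academic websites.
inlinks = [1200, 950, 400, 3100, 220, 180, 2600, 760, 90, 1500]
citations = [340, 280, 150, 900, 80, 60, 770, 210, 40, 420]

rho, p_value = spearmanr(inlinks, citations)  # rank-based, robust to skewed counts
print(f"Spearman rho={rho:.2f}, p={p_value:.4f}")
```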
  9. Dumais, S.T.: Latent semantic analysis (2003) 0.04
    0.037582483 = product of:
      0.056373723 = sum of:
        0.03486581 = weight(_text_:search in 2462) [ClassicSimilarity], result of:
          0.03486581 = score(doc=2462,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.19953834 = fieldWeight in 2462, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0234375 = fieldNorm(doc=2462)
        0.021507913 = product of:
          0.043015826 = sum of:
            0.043015826 = weight(_text_:engines in 2462) [ClassicSimilarity], result of:
              0.043015826 = score(doc=2462,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.16840704 = fieldWeight in 2462, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=2462)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Latent Semantic Analysis (LSA) was first introduced in Dumais, Furnas, Landauer, and Deerwester (1988) and Deerwester, Dumais, Furnas, Landauer, and Harshman (1990) as a technique for improving information retrieval. The key insight in LSA was to reduce the dimensionality of the information retrieval problem. Most approaches to retrieving information depend on a lexical match between words in the user's query and those in documents. Indeed, this lexical matching is the way that the popular Web and enterprise search engines work. Such systems are, however, far from ideal. We are all aware of the tremendous amount of irrelevant information that is retrieved when searching. We also fail to find much of the existing relevant material. LSA was designed to address these retrieval problems, using dimension reduction techniques. Fundamental characteristics of human word usage underlie these retrieval failures. People use a wide variety of words to describe the same object or concept (synonymy). Furnas, Landauer, Gomez, and Dumais (1987) showed that people generate the same keyword to describe well-known objects only 20 percent of the time. Poor agreement was also observed in studies of inter-indexer consistency (e.g., Chan, 1989; Tarr & Borko, 1974), in the generation of search terms (e.g., Fidel, 1985; Bates, 1986), and in the generation of hypertext links (Furner, Ellis, & Willett, 1999). Because searchers and authors often use different words, relevant materials are missed. Someone looking for documents on "human-computer interaction" will not find articles that use only the phrase "man-machine studies" or "human factors." People also use the same word to refer to different things (polysemy). Words like "saturn," "jaguar," or "chip" have several different meanings. A short query like "saturn" will thus return many irrelevant documents. The query "Saturn car" will return fewer irrelevant items, but it will miss some documents that use only the terms "Saturn automobile." In searching, there is a constant tension between being overly specific and missing relevant information, and being more general and returning irrelevant information.
    A number of approaches have been developed in information retrieval to address the problems caused by the variability in word usage. Stemming is a popular technique used to normalize some kinds of surface-level variability by converting words to their morphological root. For example, the words "retrieve," "retrieval," "retrieved," and "retrieving" would all be converted to their root form, "retrieve." The root form is used for both document and query processing. Stemming sometimes helps retrieval, although not much (Harman, 1991; Hull, 1996). And it does not address cases where related words are not morphologically related (e.g., physician and doctor). Controlled vocabularies have also been used to limit variability by requiring that query and index terms belong to a pre-defined set of terms. Documents are indexed by a specified or authorized list of subject headings or index terms, called the controlled vocabulary. Library of Congress Subject Headings, Medical Subject Headings, Association for Computing Machinery (ACM) keywords, and Yellow Pages headings are examples of controlled vocabularies. If searchers can find the right controlled vocabulary terms, they do not have to think of all the morphologically related or synonymous terms that authors might have used. However, assigning controlled vocabulary terms in a consistent and thorough manner is a time-consuming and usually manual process. A good deal of research has been published about the effectiveness of controlled vocabulary indexing compared to full text indexing (e.g., Bates, 1998; Lancaster, 1986; Svenonius, 1986). The combination of both full text and controlled vocabularies is often better than either alone, although the size of the advantage is variable (Lancaster, 1986; Markey, Atherton, & Newton, 1982; Srinivasan, 1996). Richer thesauri have also been used to provide synonyms, generalizations, and specializations of users' search terms (see Srinivasan, 1992, for a review). Controlled vocabularies and thesaurus entries can be generated either manually or by the automatic analysis of large collections of texts.
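    A minimal LSA sketch (term-document counts invented): a truncated SVD projects terms into a low-dimensional latent space, where "car" and "automobile", which never co-occur in the same document, nevertheless end up close because they share contexts.

```python
import numpy as np

terms = ["car", "automobile", "engine", "turkey", "bird"]
# Rows are terms, columns are four toy documents.
A = np.array([
    [2, 0, 1, 0],  # car
    [0, 2, 1, 0],  # automobile
    [1, 1, 2, 0],  # engine
    [0, 0, 0, 3],  # turkey
    [0, 0, 0, 2],  # bird
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                         # keep only the top-k latent dimensions
term_vecs = U[:, :k] * s[:k]  # term coordinates in the latent space

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

i, j = terms.index("car"), terms.index("automobile")
print(f"cos(car, automobile) = {cos(term_vecs[i], term_vecs[j]):.2f}")  # ~1.00
```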
  10. Legg, C.: Ontologies on the Semantic Web (2007) 0.04
    0.037011288 = product of:
      0.05551693 = sum of:
        0.026839713 = weight(_text_:search in 1979) [ClassicSimilarity], result of:
          0.026839713 = score(doc=1979,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.15360467 = fieldWeight in 1979, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=1979)
        0.028677218 = product of:
          0.057354435 = sum of:
            0.057354435 = weight(_text_:engines in 1979) [ClassicSimilarity], result of:
              0.057354435 = score(doc=1979,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.22454272 = fieldWeight in 1979, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1979)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    As an informational technology, the World Wide Web has enjoyed spectacular success. In just ten years it has transformed the way information is produced, stored, and shared in arenas as diverse as shopping, family photo albums, and high-level academic research. The "Semantic Web" is touted by its developers as equally revolutionary, although it has not yet achieved anything like the Web's exponential uptake. It seeks to transcend a current limitation of the Web - that it largely requires indexing to be accomplished merely on specific character strings. Thus, a person searching for information about "turkey" (the bird) receives from current search engines many irrelevant pages about "Turkey" (the country) and nothing about the Spanish "pavo" even if he or she is a Spanish-speaker able to understand such pages. The Semantic Web vision is to develop technology to facilitate retrieval of information via meanings, not just spellings. For this to be possible, most commentators believe, Semantic Web applications will have to draw on some kind of shared, structured, machine-readable conceptual scheme. Thus, there has been a convergence between the Semantic Web research community and an older tradition with roots in classical Artificial Intelligence (AI) research (sometimes referred to as "knowledge representation") whose goal is to develop a formal ontology. A formal ontology is a machine-readable theory of the most fundamental concepts or "categories" required in order to understand information pertaining to any knowledge domain. A review of the attempts that have been made to realize this goal provides an opportunity to reflect in interestingly concrete ways on various research questions such as the following:
    - How explicit a machine-understandable theory of meaning is it possible or practical to construct?
    - How universal a machine-understandable theory of meaning is it possible or practical to construct?
    - How much (and what kind of) inference support is required to realize a machine-understandable theory of meaning?
    - What is it for a theory of meaning to be machine-understandable anyway?
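    A small sketch of the "retrieval via meanings" idea using rdflib (the namespace and terms are invented): the ambiguous surface strings are attached to two differently typed resources, so a query by type finds the bird, including its Spanish label, and never the country.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/")
g = Graph()

g.add((EX.turkey_bird, RDF.type, EX.Bird))
g.add((EX.turkey_bird, RDFS.label, Literal("turkey", lang="en")))
g.add((EX.turkey_bird, RDFS.label, Literal("pavo", lang="es")))  # cross-lingual
g.add((EX.turkey_country, RDF.type, EX.Country))
g.add((EX.turkey_country, RDFS.label, Literal("Turkey", lang="en")))

# Ask for birds, not strings: the country never appears in the results.
for subject in g.subjects(RDF.type, EX.Bird):
    for label in g.objects(subject, RDFS.label):
        print(label)  # "turkey"@en and "pavo"@es
```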
  11. Shue, J.-S.; Wu, S.: GAIS computer science bibliographies search (1997) 0.04
    0.035786286 = product of:
      0.10735885 = sum of:
        0.10735885 = weight(_text_:search in 953) [ClassicSimilarity], result of:
          0.10735885 = score(doc=953,freq=8.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.6144187 = fieldWeight in 953, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=953)
      0.33333334 = coord(1/3)
    
    Abstract
    GAIS computer science bibliographies search is a WWW service providing a searchable interface on bibliographies related to computer science. It holds about 400,000 references, mirrored from the Informatics for Engineering and Science Department of the University of Karlsruhe, and allows full-text searching through the search engine GAIS (Global Area Intelligent Search). Discusses its design and architecture.
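    A minimal sketch of the full-text search idea behind such a service (titles invented; GAIS itself is far more elaborate): an inverted index maps each term to the references containing it, and a conjunctive query intersects the posting sets.

```python
from collections import defaultdict

refs = {
    1: "GAIS computer science bibliographies search",
    2: "Indexing and retrieval for the Web",
    3: "The evaluation of WWW search engines",
}

# Build the inverted index: term -> set of reference IDs.
index = defaultdict(set)
for ref_id, title in refs.items():
    for term in title.lower().split():
        index[term].add(ref_id)

def full_text_search(query: str) -> set:
    postings = [index[t] for t in query.lower().split()]
    return set.intersection(*postings) if postings else set()

print(full_text_search("search engines"))  # {3}
```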
  12. Zhu, B.; Chen, H.: Information visualization (2004) 0.03
    0.03238488 = product of:
      0.048577316 = sum of:
        0.02348475 = weight(_text_:search in 4276) [ClassicSimilarity], result of:
          0.02348475 = score(doc=4276,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.1344041 = fieldWeight in 4276, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4276)
        0.025092565 = product of:
          0.05018513 = sum of:
            0.05018513 = weight(_text_:engines in 4276) [ClassicSimilarity], result of:
              0.05018513 = score(doc=4276,freq=2.0), product of:
                0.25542772 = queryWeight, product of:
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.05027291 = queryNorm
                0.19647488 = fieldWeight in 4276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.080822 = idf(docFreq=746, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4276)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Advanced technology has resulted in the generation of about one million terabytes of information every year. Ninety-nine percent of this is available in digital format (Keim, 2001). More information will be generated in the next three years than was created during all of previous human history (Keim, 2001). Collecting information is no longer a problem, but extracting value from information collections has become progressively more difficult. Various search engines have been developed to make it easier to locate information of interest, but these work well only for a person who has a specific goal and who understands what and how information is stored. This usually is not the case. Visualization was commonly thought of in terms of representing human mental processes (MacEachren, 1991; Miller, 1984). The concept is now associated with the amplification of these mental processes (Card, Mackinlay, & Shneiderman, 1999). Human eyes can process visual cues rapidly, whereas advanced information analysis techniques transform the computer into a powerful means of managing digitized information. Visualization offers a link between these two potent systems, the human eye and the computer (Gershon, Eick, & Card, 1998), helping to identify patterns and to extract insights from large amounts of information. The identification of patterns is important because it may lead to a scientific discovery, an interpretation of clues to solve a crime, the prediction of catastrophic weather, a successful financial investment, or a better understanding of human behavior in a computer-mediated environment. Visualization technology shows considerable promise for increasing the value of large-scale collections of information, as evidenced by several commercial applications of TreeMap (e.g., http://www.smartmoney.com) and Hyperbolic tree (e.g., http://www.inxight.com) to visualize large-scale hierarchical structures. Although the proliferation of visualization technologies dates from the 1990s, when sophisticated hardware and software made increasingly faster generation of graphical objects possible, the role of visual aids in facilitating the construction of mental images has a long history. Visualization has been used to communicate ideas, to monitor trends implicit in data, and to explore large volumes of data for hypothesis generation. Imagine traveling to a strange place without a map, having to memorize physical and chemical properties of an element without Mendeleyev's periodic table, trying to understand the stock market without statistical diagrams, or browsing a collection of documents without interactive visual aids. A collection of information can lose its value simply because of the effort required for exhaustive exploration. Such frustrations can be overcome by visualization.
  13. Twidale, M.B.; Nichols, D.M.: Computer supported cooperative work in information search and retrieval (1999) 0.03
    0.031313002 = product of:
      0.093939 = sum of:
        0.093939 = weight(_text_:search in 4691) [ClassicSimilarity], result of:
          0.093939 = score(doc=4691,freq=2.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.5376164 = fieldWeight in 4691, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.109375 = fieldNorm(doc=4691)
      0.33333334 = coord(1/3)
    
  14. Siqueira, J.; Martins, D.L.: Workflow models for aggregating cultural heritage data on the web : a systematic literature review (2022) 0.02
    0.019369897 = product of:
      0.058109686 = sum of:
        0.058109686 = weight(_text_:search in 464) [ClassicSimilarity], result of:
          0.058109686 = score(doc=464,freq=6.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.33256388 = fieldWeight in 464, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=464)
      0.33333334 = coord(1/3)
    
    Abstract
    In recent years, different cultural institutions have made efforts to spread culture through the construction of a unique search interface that integrates their digital objects and facilitates data retrieval for lay users. However, integrating cultural data is not a trivial task; therefore, this work performs a systematic literature review on data aggregation workflows, in order to answer five questions: What are the projects? What are the planned steps? Which technologies are used? Are the steps performed manually, automatically, or semi-automatically? Which perform semantic search? The searches were carried out in three databases: Networked Digital Library of Theses and Dissertations, Scopus and Web of Science. In Q01, 12 projects were selected. In Q02, 9 stages were identified: Harvesting, Ingestion, Mapping, Indexing, Storing, Monitoring, Enriching, Displaying, and Publishing LOD. In Q03, 19 different technologies were found. In Q04, we identified that most of the solutions are semi-automatic and, in Q05, that most of them perform a semantic search. The analysis of the workflows showed that there is no consensus regarding the stages, their nomenclatures, and technologies, and that the discussions are often superficial; it did, however, allow us to identify the main steps for implementing the aggregation of cultural data.
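    A skeletal sketch of the stages the review identifies (every function here is a placeholder standing in for real harvesting, storage, and publishing components, such as OAI-PMH harvesters or a triple store), showing how the nine steps chain together:

```python
# Placeholder implementations of the stages named in the review.
def harvest(sources):       return [rec for src in sources for rec in src]
def ingest(records):        return records            # validate and stage
def map_to_model(records):  return [{"title": r} for r in records]
def index_records(records): return {r["title"]: r for r in records}
def store(data):            return data               # persist
def monitor(data):          return data               # quality checks
def enrich(data):           return data               # e.g. link to vocabularies
def display(data):          return sorted(data)       # search interface
def publish_lod(data):      return data               # expose as Linked Open Data

data = harvest([["Mona Lisa"], ["Guernica"]])
for stage in (ingest, map_to_model, index_records, store, monitor, enrich):
    data = stage(data)
print(display(publish_lod(data)))  # ['Guernica', 'Mona Lisa']
```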
  15. Vakkari, P.: Task-based information searching (2002) 0.02
    0.018978544 = product of:
      0.056935627 = sum of:
        0.056935627 = weight(_text_:search in 4288) [ClassicSimilarity], result of:
          0.056935627 = score(doc=4288,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 4288, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4288)
      0.33333334 = coord(1/3)
    
    Abstract
    The rationale for using information systems is to find information that helps us in our daily activities, be they tasks or interests. Systems are expected to support us in searching for and identifying useful information. Although the activities and tasks performed by humans generate information needs and searching, they have attracted little attention in studies of information searching. Such studies have concentrated on search tasks rather than the activities that trigger them. It is obvious that our understanding of information searching is only partial, if we are not able to connect aspects of searching to the related task. The expected contribution of information to the task is reflected in relevance assessments of the information items found, and in the search tactics and use of the system in general. Taking the task into account seems to be a necessary condition for understanding and explaining information searching, and, by extension, for effective systems design.
  16. Yu, N.: Readings & Web resources for faceted classification 0.02
    0.018978544 = product of:
      0.056935627 = sum of:
        0.056935627 = weight(_text_:search in 4394) [ClassicSimilarity], result of:
          0.056935627 = score(doc=4394,freq=4.0), product of:
            0.1747324 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.05027291 = queryNorm
            0.3258447 = fieldWeight in 4394, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=4394)
      0.33333334 = coord(1/3)
    
    Abstract
    The term "facet" has been used in various places, while in most cases it is just a buzz word to replace what is indeed "aspect" or "category". The references below either define and explain the original concept of facet or provide guidelines for building 'real' faceted search/browse. I was interested in faceted classification because it seems to be a natural and efficient way for organizing and browsing Web collections. However, to automatically generate facets and their isolates is extremely difficult since it involves concept extraction and concept grouping, both of which are difficult problems by themselves. And it is almost impossible to achieve mutually exclusive and jointly exhaustive 'true' facets without human judgment. Nowadays, faceted search/browse widely exists, implicitly or explicitly, on a majority of retail websites due to the multi-aspects nature of the data. However, it is still rarely seen on any digital library sites. (I could be wrong since I haven't kept myself updated with this field for a while.)
  17. Enser, P.G.B.: Visual image retrieval (2008) 0.02
    0.01816343 = product of:
      0.054490287 = sum of:
        0.054490287 = product of:
          0.10898057 = sum of:
            0.10898057 = weight(_text_:22 in 3281) [ClassicSimilarity], result of:
              0.10898057 = score(doc=3281,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.61904186 = fieldWeight in 3281, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=3281)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 1.2012 13:01:26
  18. Morris, S.A.: Mapping research specialties (2008) 0.02
    0.01816343 = product of:
      0.054490287 = sum of:
        0.054490287 = product of:
          0.10898057 = sum of:
            0.10898057 = weight(_text_:22 in 3962) [ClassicSimilarity], result of:
              0.10898057 = score(doc=3962,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.61904186 = fieldWeight in 3962, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=3962)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    13. 7.2008 9:30:22
  19. Fallis, D.: Social epistemology and information science (2006) 0.02
    0.01816343 = product of:
      0.054490287 = sum of:
        0.054490287 = product of:
          0.10898057 = sum of:
            0.10898057 = weight(_text_:22 in 4368) [ClassicSimilarity], result of:
              0.10898057 = score(doc=4368,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.61904186 = fieldWeight in 4368, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=4368)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    13. 7.2008 19:22:28
  20. Nicolaisen, J.: Citation analysis (2007) 0.02
    0.01816343 = product of:
      0.054490287 = sum of:
        0.054490287 = product of:
          0.10898057 = sum of:
            0.10898057 = weight(_text_:22 in 6091) [ClassicSimilarity], result of:
              0.10898057 = score(doc=6091,freq=2.0), product of:
                0.17604718 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05027291 = queryNorm
                0.61904186 = fieldWeight in 6091, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=6091)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    13. 7.2008 19:53:22

Types

  • a 45
  • b 11
  • el 3
  • r 1