Search (69 results, page 1 of 4)

  • theme_ss:"Literaturübersicht"
  1. Chen, H.; Chau, M.: Web mining : machine learning for Web applications (2003) 0.09
    0.08848925 = product of:
      0.1769785 = sum of:
        0.038619664 = weight(_text_:wide in 4242) [ClassicSimilarity], result of:
          0.038619664 = score(doc=4242,freq=2.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.29372054 = fieldWeight in 4242, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4242)
        0.06625557 = weight(_text_:web in 4242) [ClassicSimilarity], result of:
          0.06625557 = score(doc=4242,freq=20.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.6841342 = fieldWeight in 4242, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4242)
        0.027816659 = weight(_text_:data in 4242) [ClassicSimilarity], result of:
          0.027816659 = score(doc=4242,freq=4.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.29644224 = fieldWeight in 4242, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4242)
        0.044286616 = product of:
          0.08857323 = sum of:
            0.08857323 = weight(_text_:mining in 4242) [ClassicSimilarity], result of:
              0.08857323 = score(doc=4242,freq=4.0), product of:
                0.16744171 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.029675366 = queryNorm
                0.5289795 = fieldWeight in 4242, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4242)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
    
    Abstract
    With more than two billion pages created by millions of Web page authors and organizations, the World Wide Web is a tremendously rich knowledge base. The knowledge comes not only from the content of the pages themselves, but also from the unique characteristics of the Web, such as its hyperlink structure and its diversity of content and languages. Analysis of these characteristics often reveals interesting patterns and new knowledge. Such knowledge can be used to improve users' efficiency and effectiveness in searching for information on the Web, and also for applications unrelated to the Web, such as support for decision making or business management. The Web's size and its unstructured and dynamic content, as well as its multilingual nature, make the extraction of useful knowledge a challenging research problem. Furthermore, the Web generates a large amount of data in other formats that contain valuable information. For example, information about user access patterns in Web server logs can be used for information personalization or for improving Web page design.
    Theme
    Data Mining
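    The relevance figures in this list are Lucene explain trees (ClassicSimilarity): each matching term contributes queryWeight (idf × queryNorm) times fieldWeight (tf × idf × fieldNorm), the contributions are summed, and the sum is multiplied by the coordination factor, e.g. 0.1769785 × coord(4/8) = 0.08848925 for the entry above. A minimal sketch re-deriving the "wide" contribution, with the constants copied from that explain tree:

    ```python
    # Re-derive the weight(_text_:wide in 4242) line from entry 1's explain tree.
    # ClassicSimilarity per-term score = queryWeight * fieldWeight
    #                                  = (idf * queryNorm) * (sqrt(tf) * idf * fieldNorm)
    import math

    idf = 4.4307585          # idf(docFreq=1430, maxDocs=44218)
    query_norm = 0.029675366
    field_norm = 0.046875    # fieldNorm(doc=4242)
    freq = 2.0               # termFreq of "wide" in doc 4242

    query_weight = idf * query_norm                     # 0.13148437
    field_weight = math.sqrt(freq) * idf * field_norm   # 0.29372054
    print(query_weight * field_weight)                  # ~0.038619664
    ```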
  2. Yang, K.: Information retrieval on the Web (2004) 0.04
    0.039523296 = product of:
      0.10539546 = sum of:
        0.03641097 = weight(_text_:wide in 4278) [ClassicSimilarity], result of:
          0.03641097 = score(doc=4278,freq=4.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.2769224 = fieldWeight in 4278, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.03125 = fieldNorm(doc=4278)
        0.0558716 = weight(_text_:web in 4278) [ClassicSimilarity], result of:
          0.0558716 = score(doc=4278,freq=32.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.5769126 = fieldWeight in 4278, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=4278)
        0.013112898 = weight(_text_:data in 4278) [ClassicSimilarity], result of:
          0.013112898 = score(doc=4278,freq=2.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.1397442 = fieldWeight in 4278, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=4278)
      0.375 = coord(3/8)
    
    Abstract
    How do we find information on the Web? Although information on the Web is distributed and decentralized, the Web can be viewed as a single, virtual document collection. In that regard, the fundamental questions and approaches of traditional information retrieval (IR) research (e.g., term weighting, query expansion) are likely to be relevant in Web document retrieval. Findings from traditional IR research, however, may not always be applicable in a Web setting. The Web document collection - massive in size and diverse in content, format, purpose, and quality - challenges the validity of previous research findings that are based on relatively small and homogeneous test collections. Moreover, some traditional IR approaches, although applicable in theory, may be impossible or impractical to implement in a Web setting. For instance, the size, distribution, and dynamic nature of Web information make it extremely difficult to construct a complete and up-to-date data representation of the kind required for a model IR system. To further complicate matters, information seeking on the Web is diverse in character and unpredictable in nature. Web searchers come from all walks of life and are motivated by many kinds of information needs. The wide range of experience, knowledge, motivation, and purpose means that searchers can express diverse types of information needs in a wide variety of ways with differing criteria for satisfying those needs. Conventional evaluation measures, such as precision and recall, may no longer be appropriate for Web IR, where a representative test collection is all but impossible to construct. Finding information on the Web creates many new challenges for, and exacerbates some old problems in, IR research. At the same time, the Web is rich in new types of information not present in most IR test collections. Hyperlinks, usage statistics, document markup tags, and collections of topic hierarchies such as Yahoo! (http://www.yahoo.com) present an opportunity to leverage Web-specific document characteristics in novel ways that go beyond the term-based retrieval framework of traditional IR. Consequently, researchers in Web IR have reexamined the findings from traditional IR research.
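    The abstract above notes that precision and recall presuppose a complete set of relevance judgments, which a Web-scale collection cannot supply. A minimal sketch of the two measures for a single query; the document identifiers are invented:

    ```python
    # Precision = share of retrieved documents that are relevant;
    # recall    = share of relevant documents that were retrieved.
    retrieved = {"d1", "d2", "d3", "d5"}
    relevant = {"d2", "d3", "d4", "d6", "d7"}

    hits = retrieved & relevant
    precision = len(hits) / len(retrieved)   # 2 / 4 = 0.5
    recall = len(hits) / len(relevant)       # 2 / 5 = 0.4
    print(precision, recall)
    ```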
  3. Blake, C.: Text mining (2011) 0.04
    0.037307642 = product of:
      0.14923057 = sum of:
        0.045895144 = weight(_text_:data in 1599) [ClassicSimilarity], result of:
          0.045895144 = score(doc=1599,freq=2.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.48910472 = fieldWeight in 1599, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.109375 = fieldNorm(doc=1599)
        0.10333543 = product of:
          0.20667087 = sum of:
            0.20667087 = weight(_text_:mining in 1599) [ClassicSimilarity], result of:
              0.20667087 = score(doc=1599,freq=4.0), product of:
                0.16744171 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.029675366 = queryNorm
                1.2342855 = fieldWeight in 1599, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1599)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Theme
    Data Mining
  4. Benoit, G.: Data mining (2002) 0.03
    0.0346215 = product of:
      0.138486 = sum of:
        0.055633318 = weight(_text_:data in 4296) [ClassicSimilarity], result of:
          0.055633318 = score(doc=4296,freq=16.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.5928845 = fieldWeight in 4296, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4296)
        0.08285268 = product of:
          0.16570535 = sum of:
            0.16570535 = weight(_text_:mining in 4296) [ClassicSimilarity], result of:
              0.16570535 = score(doc=4296,freq=14.0), product of:
                0.16744171 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.029675366 = queryNorm
                0.9896301 = fieldWeight in 4296, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4296)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Data mining (DM) is a multistaged process of extracting previously unanticipated knowledge from large databases, and applying the results to decision making. Data mining tools detect patterns from the data and infer associations and rules from them. The extracted information may then be applied to prediction or classification models by identifying relations within the data records or between databases. Those patterns and rules can then guide decision making and forecast the effects of those decisions. However, this definition may be applied equally to "knowledge discovery in databases" (KDD). Indeed, in the recent literature of DM and KDD, a source of confusion has emerged, making it difficult to determine the exact parameters of both. KDD is sometimes viewed as the broader discipline, of which data mining is merely a component, specifically pattern extraction, evaluation, and cleansing methods (Raghavan, Deogun, & Sever, 1998, p. 397). Thurasingham (1999, p. 2) remarked that "knowledge discovery," "pattern discovery," "data dredging," "information extraction," and "knowledge mining" are all employed as synonyms for DM. Trybula, in his ARIST chapter on text mining, observed that the "existing work [in KDD] is confusing because the terminology is inconsistent and poorly defined."
    Theme
    Data Mining
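    A minimal sketch of the "associations and rules" mentioned in the abstract above: support and confidence for one candidate rule over a toy transaction set; the transactions and the rule are invented for illustration:

    ```python
    # Rule: bread -> milk. Support measures how often the whole itemset occurs;
    # confidence is the conditional frequency of the consequent given the antecedent.
    transactions = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk", "butter"},
    ]

    antecedent, consequent = {"bread"}, {"milk"}
    n = len(transactions)
    support = sum((antecedent | consequent) <= t for t in transactions) / n   # 2/4 = 0.5
    confidence = support * n / sum(antecedent <= t for t in transactions)     # 2/3
    print(support, confidence)
    ```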
  5. Trybula, W.J.: Data mining and knowledge discovery (1997) 0.03
    0.03449368 = product of:
      0.13797472 = sum of:
        0.06490554 = weight(_text_:data in 2300) [ClassicSimilarity], result of:
          0.06490554 = score(doc=2300,freq=16.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.69169855 = fieldWeight in 2300, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2300)
        0.073069185 = product of:
          0.14613837 = sum of:
            0.14613837 = weight(_text_:mining in 2300) [ClassicSimilarity], result of:
              0.14613837 = score(doc=2300,freq=8.0), product of:
                0.16744171 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.029675366 = queryNorm
                0.8727716 = fieldWeight in 2300, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2300)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    State of the art review of the recently developed concepts of data mining (defined as the automated process of evaluating data and finding relationships) and knowledge discovery (defined as the automated process of extracting information, especially unpredicted relationships or previously unknown patterns among the data) with particular reference to numerical data. Includes: the knowledge acquisition process; data mining; evaluation methods; and knowledge discovery. Concludes that existing work in the field is confusing because the terminology is inconsistent and poorly defined. Although methods are available for analyzing and cleaning databases, better coordinated efforts should be directed toward providing users with improved means of structuring search mechanisms to explore the data for relationships.
    Theme
    Data Mining
  6. Bath, P.A.: Data mining in health and medical information (2003) 0.03
    0.031192835 = product of:
      0.12477134 = sum of:
        0.052451592 = weight(_text_:data in 4263) [ClassicSimilarity], result of:
          0.052451592 = score(doc=4263,freq=8.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.5589768 = fieldWeight in 4263, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=4263)
        0.072319746 = product of:
          0.14463949 = sum of:
            0.14463949 = weight(_text_:mining in 4263) [ClassicSimilarity], result of:
              0.14463949 = score(doc=4263,freq=6.0), product of:
                0.16744171 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.029675366 = queryNorm
                0.86381996 = fieldWeight in 4263, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4263)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Data mining (DM) is part of a process by which information can be extracted from data or databases and used to inform decision making in a variety of contexts (Benoit, 2002; Michalski, Bratka & Kubat, 1997). DM includes a range of tools and methods for extracting information; their use in the commercial sector for knowledge extraction and discovery has been one of the main driving forces in their development (Adriaans & Zantinge, 1996; Benoit, 2002). DM has been developed and applied in numerous areas. This review describes its use in analyzing health and medical information.
    Theme
    Data Mining
  7. Bar-Ilan, J.: ¬The use of Web search engines in information science research (2003) 0.03
    0.027027272 = product of:
      0.10810909 = sum of:
        0.038619664 = weight(_text_:wide in 4271) [ClassicSimilarity], result of:
          0.038619664 = score(doc=4271,freq=2.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.29372054 = fieldWeight in 4271, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4271)
        0.06948943 = weight(_text_:web in 4271) [ClassicSimilarity], result of:
          0.06948943 = score(doc=4271,freq=22.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.717526 = fieldWeight in 4271, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4271)
      0.25 = coord(2/8)
    
    Abstract
    The World Wide Web was created in 1989, but it has already become a major information channel and source, influencing our everyday lives, commercial transactions, and scientific communication, to mention just a few areas. The seventeenth-century philosopher Descartes proclaimed, "I think, therefore I am" (cogito, ergo sum). Today the Web is such an integral part of our lives that we could rephrase Descartes' statement as "I have a Web presence, therefore I am." Because many people, companies, and organizations take this notion seriously, in addition to more substantial reasons for publishing information on the Web, the number of Web pages is in the billions and growing constantly. However, it is not sufficient to have a Web presence; tools that enable users to locate Web pages are needed as well. The major tools for discovering and locating information on the Web are search engines. This review discusses the use of Web search engines in information science research. Before going into detail, we should define the terms "information science," "Web search engine," and "use" in the context of this review.
  8. Thelwall, M.; Vaughan, L.; Björneborn, L.: Webometrics (2004) 0.02
    0.02468518 = product of:
      0.09874072 = sum of:
        0.052379623 = weight(_text_:web in 4279) [ClassicSimilarity], result of:
          0.052379623 = score(doc=4279,freq=18.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.5408555 = fieldWeight in 4279, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
        0.046361096 = weight(_text_:data in 4279) [ClassicSimilarity], result of:
          0.046361096 = score(doc=4279,freq=16.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.49407038 = fieldWeight in 4279, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4279)
      0.25 = coord(2/8)
    
    Abstract
    Webometrics, the quantitative study of Web-related phenomena, emerged from the realization that methods originally designed for bibliometric analysis of scientific journal article citation patterns could be applied to the Web, with commercial search engines providing the raw data. Almind and Ingwersen (1997) defined the field and gave it its name. Other pioneers included Rodriguez Gairin (1997) and Aguillo (1998). Larson (1996) undertook exploratory link structure analysis, as did Rousseau (1997). Webometrics encompasses research from fields beyond information science such as communication studies, statistical physics, and computer science. In this review we concentrate on link analysis, but also cover other aspects of webometrics, including Web log file analysis. One theme that runs through this chapter is the messiness of Web data and the need for data cleansing heuristics. The uncontrolled Web creates numerous problems in the interpretation of results, for instance, from the automatic creation or replication of links. The loose connection between top-level domain specifications (e.g., com, edu, and org) and their actual content is also a frustrating problem. For example, many .com sites contain noncommercial content, although com is ostensibly the main commercial top-level domain. Indeed, a skeptical researcher could claim that obstacles of this kind are so great that all Web analyses lack value. As will be seen, one response to this view, a view shared by critics of evaluative bibliometrics, is to demonstrate that Web data correlate significantly with some non-Web data in order to prove that the Web data are not wholly random. A practical response has been to develop increasingly sophisticated data cleansing techniques and multiple data analysis methods.
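    A minimal sketch of the validation strategy described above: correlating a Web measure (inlink counts) with a non-Web measure (citation counts) for the same set of sites; the numbers are invented:

    ```python
    import math

    inlinks = [120, 45, 300, 10, 80]     # hypothetical inlink counts per site
    citations = [95, 30, 260, 15, 70]    # hypothetical citation counts for the same sites

    def pearson(x, y):
        # Plain Pearson correlation coefficient.
        mx, my = sum(x) / len(x), sum(y) / len(y)
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = math.sqrt(sum((a - mx) ** 2 for a in x))
        sy = math.sqrt(sum((b - my) ** 2 for b in y))
        return cov / (sx * sy)

    print(round(pearson(inlinks, citations), 3))  # close to 1 for these toy numbers, i.e. not random
    ```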
  9. Rasmussen, E.M.: Indexing and retrieval for the Web (2002) 0.02
    0.022696622 = product of:
      0.09078649 = sum of:
        0.045056276 = weight(_text_:wide in 4285) [ClassicSimilarity], result of:
          0.045056276 = score(doc=4285,freq=8.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.342674 = fieldWeight in 4285, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
        0.045730207 = weight(_text_:web in 4285) [ClassicSimilarity], result of:
          0.045730207 = score(doc=4285,freq=28.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.47219574 = fieldWeight in 4285, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
      0.25 = coord(2/8)
    
    Abstract
    The introduction and growth of the World Wide Web (WWW, or Web) have resulted in a profound change in the way individuals and organizations access information. In terms of volume, nature, and accessibility, the characteristics of electronic information are significantly different from those of even five or six years ago. Control of, and access to, this flood of information rely heavily on automated techniques for indexing and retrieval. According to Gudivada, Raghavan, Grosky, and Kasanagottu (1997, p. 58), "The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential." Almost 93 percent of those surveyed consider the Web an "indispensable" Internet technology, second only to e-mail (Graphic, Visualization & Usability Center, 1998). Although there are other ways of locating information on the Web (browsing or following directory structures), 85 percent of users identify Web pages by means of a search engine (Graphic, Visualization & Usability Center, 1998). A more recent study conducted by the Stanford Institute for the Quantitative Study of Society confirms the finding that searching for information is second only to e-mail as an Internet activity (Nie & Ebring, 2000, online). In fact, Nie and Ebring conclude, "... the Internet today is a giant public library with a decidedly commercial tilt. The most widespread use of the Internet today is as an information search utility for products, travel, hobbies, and general information. Virtually all users interviewed responded that they engaged in one or more of these information gathering activities."
    Techniques for automated indexing and information retrieval (IR) have been developed, tested, and refined over the past 40 years, and are well documented (see, for example, Agosti & Smeaton, 1996; Baeza-Yates & Ribeiro-Neto, 1999a; Frakes & Baeza-Yates, 1992; Korfhage, 1997; Salton, 1989; Witten, Moffat, & Bell, 1999). With the introduction of the Web, and the capability to index and retrieve via search engines, these techniques have been extended to a new environment. They have been adopted, altered, and in some cases extended to include new methods. "In short, search engines are indispensable for searching the Web, they employ a variety of relatively advanced IR techniques, and there are some peculiar aspects of search engines that make searching the Web different than more conventional information retrieval" (Gordon & Pathak, 1999, p. 145). The environment for information retrieval on the World Wide Web differs from that of "conventional" information retrieval in a number of fundamental ways. The collection is very large and changes continuously, with pages being added, deleted, and altered. Wide variability in the size, structure, focus, quality, and usefulness of documents makes Web documents much more heterogeneous than a typical electronic document collection. The wide variety of document types includes images, video, audio, and scripts, as well as many different document languages. Duplication of documents and sites is common. Documents are interconnected through networks of hyperlinks. Because of the size and dynamic nature of the Web, preprocessing all documents requires considerable resources and is often not feasible, certainly not on the frequent basis required to ensure currency. Query length is usually much shorter than in other environments - only a few words - and user behavior differs from that in other environments. These differences make the Web a novel environment for information retrieval (Baeza-Yates & Ribeiro-Neto, 1999b; Bharat & Henzinger, 1998; Huang, 2000).
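    A minimal sketch of the automated indexing the review builds on: an inverted index mapping each term to the documents containing it, queried conjunctively; the sample documents are invented:

    ```python
    from collections import defaultdict

    docs = {
        "d1": "indexing and retrieval for the web",
        "d2": "web search engines index the web",
        "d3": "information retrieval evaluation",
    }

    # Build the inverted index: term -> set of document identifiers.
    inverted = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            inverted[term].add(doc_id)

    # Conjunctive query: keep only documents containing every query term.
    query = ["web", "retrieval"]
    print(set.intersection(*(inverted[t] for t in query)))   # {'d1'}
    ```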
  10. Marsh, S.; Dibben, M.R.: ¬The role of trust in information science and technology (2002) 0.02
    0.018727332 = product of:
      0.07490933 = sum of:
        0.038619664 = weight(_text_:wide in 4289) [ClassicSimilarity], result of:
          0.038619664 = score(doc=4289,freq=2.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.29372054 = fieldWeight in 4289, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4289)
        0.03628967 = weight(_text_:web in 4289) [ClassicSimilarity], result of:
          0.03628967 = score(doc=4289,freq=6.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.37471575 = fieldWeight in 4289, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4289)
      0.25 = coord(2/8)
    
    Abstract
    This chapter discusses the notion of trust as it relates to information science and technology, specifically user interfaces, autonomous agents, and information systems. We first present an in-depth discussion of the concept of trust in and of itself, moving on to applications and considerations of trust in relation to information technologies. We consider trust from a "soft" perspective - thus, although security concepts such as cryptography, virus protection, authentication, and so forth reinforce (or damage) the feelings of trust we may have in a system, they are not themselves constitutive of "trust." We discuss information technology from a human-centric viewpoint, where trust is a less well-structured but much more powerful phenomenon. With the proliferation of electronic commerce (e-commerce) and the World Wide Web (WWW, or Web), much has been made of the ability of individuals to explore the vast quantities of information available to them, to purchase goods (as diverse as vacations and cars) online, and to publish information on their personal Web sites.
  11. Dumais, S.T.: Latent semantic analysis (2003) 0.02
    0.017041288 = product of:
      0.045443438 = sum of:
        0.019309832 = weight(_text_:wide in 2462) [ClassicSimilarity], result of:
          0.019309832 = score(doc=2462,freq=2.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.14686027 = fieldWeight in 2462, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0234375 = fieldNorm(doc=2462)
        0.010475924 = weight(_text_:web in 2462) [ClassicSimilarity], result of:
          0.010475924 = score(doc=2462,freq=2.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.108171105 = fieldWeight in 2462, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0234375 = fieldNorm(doc=2462)
        0.015657684 = product of:
          0.031315368 = sum of:
            0.031315368 = weight(_text_:mining in 2462) [ClassicSimilarity], result of:
              0.031315368 = score(doc=2462,freq=2.0), product of:
                0.16744171 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.029675366 = queryNorm
                0.18702249 = fieldWeight in 2462, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=2462)
          0.5 = coord(1/2)
      0.375 = coord(3/8)
    
    Abstract
    Latent Semantic Analysis (LSA) was first introduced in Dumais, Furnas, Landauer, and Deerwester (1988) and Deerwester, Dumais, Furnas, Landauer, and Harshman (1990) as a technique for improving information retrieval. The key insight in LSA was to reduce the dimensionality of the information retrieval problem. Most approaches to retrieving information depend on a lexical match between words in the user's query and those in documents. Indeed, this lexical matching is the way that the popular Web and enterprise search engines work. Such systems are, however, far from ideal. We are all aware of the tremendous amount of irrelevant information that is retrieved when searching. We also fail to find much of the existing relevant material. LSA was designed to address these retrieval problems, using dimension reduction techniques. Fundamental characteristics of human word usage underlie these retrieval failures. People use a wide variety of words to describe the same object or concept (synonymy). Furnas, Landauer, Gomez, and Dumais (1987) showed that people generate the same keyword to describe well-known objects only 20 percent of the time. Poor agreement was also observed in studies of inter-indexer consistency (e.g., Chan, 1989; Tarr & Borko, 1974) in the generation of search terms (e.g., Fidel, 1985; Bates, 1986), and in the generation of hypertext links (Furner, Ellis, & Willett, 1999). Because searchers and authors often use different words, relevant materials are missed. Someone looking for documents on "human-computer interaction" will not find articles that use only the phrase "man-machine studies" or "human factors." People also use the same word to refer to different things (polysemy). Words like "saturn," "jaguar," or "chip" have several different meanings. A short query like "saturn" will thus return many irrelevant documents. The query "Saturn car" will return fewer irrelevant items, but it will miss some documents that use only the terms "Saturn automobile." In searching, there is a constant tension between being overly specific and missing relevant information, and being more general and returning irrelevant information.
    With the advent of large-scale collections of full text, statistical approaches are being used more and more to analyze the relationships among terms and documents. LSA takes this approach. LSA induces knowledge about the meanings of documents and words by analyzing large collections of texts. The approach simultaneously models the relationships among documents based on their constituent words, and the relationships between words based on their occurrence in documents. By using fewer dimensions for representation than there are unique words, LSA induces similarities among terms that are useful in solving the information retrieval problems described earlier. LSA is a fully automatic statistical approach to extracting relations among words by means of their contexts of use in documents, passages, or sentences. It makes no use of natural language processing techniques for analyzing morphological, syntactic, or semantic relations. Nor does it use humanly constructed resources like dictionaries, thesauri, lexical reference systems (e.g., WordNet), semantic networks, or other knowledge representations. Its only input is large amounts of texts. LSA is an unsupervised learning technique. It starts with a large collection of texts, builds a term-document matrix, and tries to uncover some similarity structures that are useful for information retrieval and related text-analysis problems. Several recent ARIST chapters have focused on text mining and discovery (Benoit, 2002; Solomon, 2002; Trybula, 2000). These chapters provide complementary coverage of the field of text analysis.
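    A minimal sketch of the LSA procedure described above, assuming NumPy: build a tiny term-document matrix, truncate its SVD, fold a query into the latent space, and rank documents by cosine similarity. The toy vocabulary echoes the "human-computer interaction" vs. "man-machine studies" example; everything else is invented:

    ```python
    import numpy as np

    terms = ["human", "computer", "interaction", "man", "machine", "studies"]
    # Columns are documents: d0 "human computer interaction",
    # d1 "man machine computer", d2 "man machine studies".
    A = np.array([
        [1, 0, 0],   # human
        [1, 1, 0],   # computer
        [1, 0, 0],   # interaction
        [0, 1, 1],   # man
        [0, 1, 1],   # machine
        [0, 0, 1],   # studies
    ], dtype=float)

    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = 2                                           # keep fewer dimensions than unique words
    U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]

    q = np.zeros(len(terms))                        # query "human computer"
    q[terms.index("human")] = q[terms.index("computer")] = 1.0
    q_hat = np.linalg.inv(S_k) @ U_k.T @ q          # fold the query into the latent space

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    # Documents with related vocabulary score above zero even without full word overlap.
    print([round(cosine(q_hat, Vt_k[:, d]), 3) for d in range(A.shape[1])])
    ```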
  12. Legg, C.: Ontologies on the Semantic Web (2007) 0.02
    0.016313408 = product of:
      0.06525363 = sum of:
        0.025746442 = weight(_text_:wide in 1979) [ClassicSimilarity], result of:
          0.025746442 = score(doc=1979,freq=2.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.1958137 = fieldWeight in 1979, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.03125 = fieldNorm(doc=1979)
        0.039507188 = weight(_text_:web in 1979) [ClassicSimilarity], result of:
          0.039507188 = score(doc=1979,freq=16.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.4079388 = fieldWeight in 1979, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=1979)
      0.25 = coord(2/8)
    
    Abstract
    As an informational technology, the World Wide Web has enjoyed spectacular success. In just ten years it has transformed the way information is produced, stored, and shared in arenas as diverse as shopping, family photo albums, and high-level academic research. The "Semantic Web" is touted by its developers as equally revolutionary, although it has not yet achieved anything like the Web's exponential uptake. It seeks to transcend a current limitation of the Web - that it largely requires indexing to be accomplished merely on specific character strings. Thus, a person searching for information about "turkey" (the bird) receives from current search engines many irrelevant pages about "Turkey" (the country) and nothing about the Spanish "pavo" even if he or she is a Spanish-speaker able to understand such pages. The Semantic Web vision is to develop technology to facilitate retrieval of information via meanings, not just spellings. For this to be possible, most commentators believe, Semantic Web applications will have to draw on some kind of shared, structured, machine-readable conceptual scheme. Thus, there has been a convergence between the Semantic Web research community and an older tradition with roots in classical Artificial Intelligence (AI) research (sometimes referred to as "knowledge representation") whose goal is to develop a formal ontology. A formal ontology is a machine-readable theory of the most fundamental concepts or "categories" required in order to understand information pertaining to any knowledge domain. A review of the attempts that have been made to realize this goal provides an opportunity to reflect in interestingly concrete ways on various research questions such as the following: - How explicit a machine-understandable theory of meaning is it possible or practical to construct? - How universal a machine-understandable theory of meaning is it possible or practical to construct? - How much (and what kind of) inference support is required to realize a machine-understandable theory of meaning? - What is it for a theory of meaning to be machine-understandable anyway?
    Theme
    Semantic Web
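    A minimal sketch of the "retrieval via meanings, not just spellings" idea from the abstract above, assuming the rdflib library; the URIs, classes, and labels are invented for illustration:

    ```python
    from rdflib import Graph, Literal, Namespace, RDF, RDFS

    EX = Namespace("http://example.org/")
    g = Graph()

    # Two resources that share the spelling "turkey"/"Turkey" but not the meaning.
    g.add((EX.TurkeyBird, RDF.type, EX.Bird))
    g.add((EX.TurkeyBird, RDFS.label, Literal("turkey", lang="en")))
    g.add((EX.TurkeyBird, RDFS.label, Literal("pavo", lang="es")))
    g.add((EX.TurkeyCountry, RDF.type, EX.Country))
    g.add((EX.TurkeyCountry, RDFS.label, Literal("Turkey", lang="en")))

    # A Spanish-language query for the bird finds the concept; the type check excludes the country.
    for concept in g.subjects(RDFS.label, Literal("pavo", lang="es")):
        if (concept, RDF.type, EX.Bird) in g:
            print(concept)   # http://example.org/TurkeyBird
    ```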
  13. Siqueira, J.; Martins, D.L.: Workflow models for aggregating cultural heritage data on the web : a systematic literature review (2022) 0.02
    0.015335915 = product of:
      0.06134366 = sum of:
        0.024691992 = weight(_text_:web in 464) [ClassicSimilarity], result of:
          0.024691992 = score(doc=464,freq=4.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.25496176 = fieldWeight in 464, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=464)
        0.036651667 = weight(_text_:data in 464) [ClassicSimilarity], result of:
          0.036651667 = score(doc=464,freq=10.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.39059696 = fieldWeight in 464, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=464)
      0.25 = coord(2/8)
    
    Abstract
    In recent years, different cultural institutions have made efforts to spread culture through the construction of a single search interface that integrates their digital objects and facilitates data retrieval for lay users. However, integrating cultural data is not a trivial task; therefore, this work performs a systematic literature review on data aggregation workflows, in order to answer five questions: What are the projects? What are the planned steps? Which technologies are used? Are the steps performed manually, automatically, or semi-automatically? Which perform semantic search? The searches were carried out in three databases: Networked Digital Library of Theses and Dissertations, Scopus and Web of Science. In Q01, 12 projects were selected. In Q02, 9 stages were identified: Harvesting, Ingestion, Mapping, Indexing, Storing, Monitoring, Enriching, Displaying, and Publishing LOD. In Q03, 19 different technologies were found. In Q04, we identified that most of the solutions are semi-automatic and, in Q05, that most of them perform a semantic search. The analysis of the workflows showed that there is no consensus regarding the stages, their nomenclature, or the technologies, and that the discussions of them are often superficial. It did, however, allow us to identify the main steps for implementing the aggregation of cultural data.
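    A minimal sketch of an aggregation workflow chaining three of the stages named above (Harvesting, Mapping, Indexing); the record fields, source names, and helper functions are hypothetical:

    ```python
    from typing import Dict, List

    Record = Dict[str, str]

    def harvest(sources: List[str]) -> List[Record]:
        # Stand-in for OAI-PMH or API harvesting: fabricate one provider-specific record per source.
        return [{"source": s, "titulo": f"Objeto digital de {s}"} for s in sources]

    def map_record(record: Record) -> Record:
        # Map provider-specific fields to a shared schema (here a Dublin Core-like "title").
        return {"provider": record["source"], "title": record["titulo"]}

    def index(records: List[Record]) -> Dict[str, List[int]]:
        # Build a small title index so one search interface can cover all providers.
        idx: Dict[str, List[int]] = {}
        for i, r in enumerate(records):
            for token in r["title"].lower().split():
                idx.setdefault(token, []).append(i)
        return idx

    records = [map_record(r) for r in harvest(["museum-a", "archive-b"])]
    print(index(records))
    ```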
  14. Chowdhury, G.G.: Natural language processing (2002) 0.01
    0.014892878 = product of:
      0.059571512 = sum of:
        0.038619664 = weight(_text_:wide in 4284) [ClassicSimilarity], result of:
          0.038619664 = score(doc=4284,freq=2.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.29372054 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
        0.020951848 = weight(_text_:web in 4284) [ClassicSimilarity], result of:
          0.020951848 = score(doc=4284,freq=2.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.21634221 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
      0.25 = coord(2/8)
    
    Abstract
    Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform desired tasks. The foundations of NLP lie in a number of disciplines, namely, computer and information sciences, linguistics, mathematics, electrical and electronic engineering, artificial intelligence and robotics, and psychology. Applications of NLP include a number of fields of study, such as machine translation, natural language text processing and summarization, user interfaces, multilingual and cross-language information retrieval (CLIR), speech recognition, artificial intelligence, and expert systems. One important application area that is relatively new and has not been covered in previous ARIST chapters on NLP relates to the proliferation of the World Wide Web and digital libraries.
  15. Yu, N.: Readings & Web resources for faceted classification 0.01
    0.012324934 = product of:
      0.049299736 = sum of:
        0.029630389 = weight(_text_:web in 4394) [ClassicSimilarity], result of:
          0.029630389 = score(doc=4394,freq=4.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.3059541 = fieldWeight in 4394, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4394)
        0.019669347 = weight(_text_:data in 4394) [ClassicSimilarity], result of:
          0.019669347 = score(doc=4394,freq=2.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.2096163 = fieldWeight in 4394, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4394)
      0.25 = coord(2/8)
    
    Abstract
    The term "facet" has been used in various places, while in most cases it is just a buzz word to replace what is indeed "aspect" or "category". The references below either define and explain the original concept of facet or provide guidelines for building 'real' faceted search/browse. I was interested in faceted classification because it seems to be a natural and efficient way for organizing and browsing Web collections. However, to automatically generate facets and their isolates is extremely difficult since it involves concept extraction and concept grouping, both of which are difficult problems by themselves. And it is almost impossible to achieve mutually exclusive and jointly exhaustive 'true' facets without human judgment. Nowadays, faceted search/browse widely exists, implicitly or explicitly, on a majority of retail websites due to the multi-aspects nature of the data. However, it is still rarely seen on any digital library sites. (I could be wrong since I haven't kept myself updated with this field for a while.)
  16. Weiss, A.K.; Carstens, T.V.: ¬The year's work in cataloging, 1999 (2001) 0.01
    0.012160225 = product of:
      0.0486409 = sum of:
        0.03456879 = weight(_text_:web in 6084) [ClassicSimilarity], result of:
          0.03456879 = score(doc=6084,freq=4.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.35694647 = fieldWeight in 6084, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6084)
        0.014072108 = product of:
          0.028144216 = sum of:
            0.028144216 = weight(_text_:22 in 6084) [ClassicSimilarity], result of:
              0.028144216 = score(doc=6084,freq=2.0), product of:
                0.103918076 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.029675366 = queryNorm
                0.2708308 = fieldWeight in 6084, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6084)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The challenge of cataloging Web sites and electronic resources was the most important issue facing the cataloging world in the last year. This article reviews attempts to analyze and revise the cataloging code in view of the new electronic environment. The difficulties of applying traditional library cataloging standards to Web resources have led some to favor metadata as the best means of providing access to these materials. The appropriate education and training for library cataloging personnel remains crucial during this transitional period. Articles on user understanding of Library of Congress subject headings and on cataloging practice are also reviewed.
    Date
    10. 9.2000 17:38:22
  17. Denton, W.: Putting facets on the Web : an annotated bibliography (2003) 0.01
    0.0115832295 = product of:
      0.046332918 = sum of:
        0.016091526 = weight(_text_:wide in 2467) [ClassicSimilarity], result of:
          0.016091526 = score(doc=2467,freq=2.0), product of:
            0.13148437 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029675366 = queryNorm
            0.122383565 = fieldWeight in 2467, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.01953125 = fieldNorm(doc=2467)
        0.03024139 = weight(_text_:web in 2467) [ClassicSimilarity], result of:
          0.03024139 = score(doc=2467,freq=24.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.3122631 = fieldWeight in 2467, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.01953125 = fieldNorm(doc=2467)
      0.25 = coord(2/8)
    
    Abstract
    This is a classified, annotated bibliography about how to design faceted classification systems and make them usable on the World Wide Web. It is the first of three works I will be doing. The second, based on the material here and elsewhere, will discuss how to actually make the faceted system and put it online. The third will be a report of how I did just that, what worked, what didn't, and what I learned. Almost every article or book listed here begins with an explanation of what a faceted classification system is, so I won't (but see Steckel in Background below if you don't already know). They all agree that faceted systems are very appropriate for the web. Even pre-web articles (such as Duncan's in Background, below) assert that hypertext and facets will go together well. Combined, it is possible to take a set of documents and classify them or apply subject headings to describe what they are about, then build a navigational structure so that any user, no matter how he or she approaches the material, no matter what his or her goals, can move and search in a way that makes sense to them, but still get to the same useful results as someone else following a different path to the same goal. There is no one way that everyone will always use when looking for information. The more flexible the organization of the information, the more accommodating it is. Facets are more flexible for hypertext browsing than any enumerative or hierarchical system.
    Consider movie listings in newspapers. Most Canadian newspapers list movie showtimes in two large blocks, for the two major theatre chains. The listings are ordered by region (in large cities), then theatre, then movie, and finally by showtime. Anyone wondering where and when a particular movie is playing must scan the complete listings. Determining what movies are playing in the next half hour is very difficult. When movie listings went onto the web, most sites used a simple faceted organization, always with movie name and theatre, and perhaps with region or neighbourhood (thankfully, theatre chains were left out). They make it easy to pick a theatre and see what movies are playing there, or to pick a movie and see what theatres are showing it. To complete the system, the sites should allow users to browse by neighbourhood and showtime, and to order the results in any way they desired. Thus people could easily find answers to such questions as, "Where is the new James Bond movie playing?" "What's showing at the Roxy tonight?" "I'm going to be out in Little Finland this afternoon with three hours to kill starting at 2 ... is anything interesting playing?" A hypertext, faceted classification system makes more useful information more easily available to the user. Reading the books and articles below in chronological order will show a certain progression: suggestions that faceting and hypertext might work well, confidence that facets would work well if only someone would make such a system, and finally the beginning of serious work on actually designing, building, and testing faceted web sites. There is a solid basis of how to make faceted classifications (see Vickery in Recommended), but their application online is just starting. Work on XFML (see Van Dijck's work in Recommended), the Exchangeable Faceted Metadata Language, will make this easier. If it follows previous patterns, parts of the Internet community will embrace the idea and make open source software available for others to reuse. It will be particularly beneficial if professionals in both information studies and computer science can work together to build working systems, standards, and code. Each can benefit from the other's expertise in what can be a very complicated and technical area. One particularly nice thing about this area of research is that people interested in combining facets and the web often have web sites where they post their writings.
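    A minimal sketch of the faceted movie-listing browsing described above; the listings and facet values are invented:

    ```python
    listings = [
        {"movie": "James Bond", "theatre": "Roxy", "neighbourhood": "Downtown", "showtime": "19:30"},
        {"movie": "James Bond", "theatre": "Odeon", "neighbourhood": "Little Finland", "showtime": "14:15"},
        {"movie": "Solaris", "theatre": "Roxy", "neighbourhood": "Downtown", "showtime": "21:00"},
    ]

    def browse(items, **facets):
        # Any combination of facets, chosen in any order, narrows the same collection.
        return [i for i in items if all(i[f] == v for f, v in facets.items())]

    print(browse(listings, movie="James Bond"))              # where is it playing?
    print(browse(listings, theatre="Roxy"))                  # what's showing at the Roxy?
    print(browse(listings, neighbourhood="Little Finland"))  # anything in Little Finland?
    ```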
    This bibliography is not meant to be exhaustive, but unfortunately it is not as complete as I wanted. Some books and articles are not included, but they may be used in my future work. (These include two books and one article by B.C. Vickery: Faceted Classification Schemes (New Brunswick, NJ: Rutgers, 1966), Classification and Indexing in Science, 3rd ed. (London: Butterworths, 1975), and "Knowledge Representation: A Brief Review" (Journal of Documentation 42 no. 3 (September 1986): 145-159); and A.C. Foskett's "The Future of Faceted Classification" in The Future of Classification, edited by Rita Marcella and Arthur Maltby (Aldershot, England: Gower, 2000): 69-80). Nevertheless, I hope this bibliography will be useful for those both new to or familiar with faceted hypertext systems. Some very basic resources are listed, as well as some very advanced ones. Some example web sites are mentioned, but there is no detailed technical discussion of any software. The user interface to any web site is extremely important, and this is briefly mentioned in two or three places (for example the discussion of lawforwa.org (see Example Web Sites)). The larger question of how to display information graphically and with hypertext is outside the scope of this bibliography. There are five sections: Recommended, Background, Not Relevant, Example Web Sites, and Mailing Lists. Background material is either introductory, advanced, or of peripheral interest, and can be read after the Recommended resources if the reader wants to know more. The Not Relevant category contains articles that may appear in bibliographies but are not relevant for my purposes.
  18. Genereux, C.: Building connections : a review of the serials literature 2004 through 2005 (2007) 0.01
    0.009969616 = product of:
      0.039878465 = sum of:
        0.027816659 = weight(_text_:data in 2548) [ClassicSimilarity], result of:
          0.027816659 = score(doc=2548,freq=4.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.29644224 = fieldWeight in 2548, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2548)
        0.012061807 = product of:
          0.024123615 = sum of:
            0.024123615 = weight(_text_:22 in 2548) [ClassicSimilarity], result of:
              0.024123615 = score(doc=2548,freq=2.0), product of:
                0.103918076 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.029675366 = queryNorm
                0.23214069 = fieldWeight in 2548, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2548)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    This review of 2004 and 2005 serials literature covers the themes of cost, management, and access. Interwoven through the serials literature of these two years is the importance of collaboration, communication, and linkages between scholars, publishers, subscription agents and other intermediaries, and librarians. The emphasis in the literature is on electronic serials and their impact on publishing, libraries, and vendors. In response to the crisis of escalating journal prices and libraries' dissatisfaction with the Big Deal licensing agreements, Open Access journals and publishing models were promoted. Libraries subscribed to or licensed increasing numbers of electronic serials. As a result, libraries sought ways to better manage licensing and subscription data (not handled by traditional integrated library systems) by implementing electronic resources management systems. In order to provide users with better, faster, and more current information on and access to electronic serials, libraries implemented tools and services to provide A-Z title lists, title-by-title coverage data, MARC records, and OpenURL link resolvers.
    Date
    10. 9.2000 17:38:22
  19. Chambers, S.; Myall, C.: Cataloging and classification : review of the literature 2007-8 (2010) 0.01
    0.009628983 = product of:
      0.038515933 = sum of:
        0.024443826 = weight(_text_:web in 4309) [ClassicSimilarity], result of:
          0.024443826 = score(doc=4309,freq=2.0), product of:
            0.096845865 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029675366 = queryNorm
            0.25239927 = fieldWeight in 4309, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4309)
        0.014072108 = product of:
          0.028144216 = sum of:
            0.028144216 = weight(_text_:22 in 4309) [ClassicSimilarity], result of:
              0.028144216 = score(doc=4309,freq=2.0), product of:
                0.103918076 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.029675366 = queryNorm
                0.2708308 = fieldWeight in 4309, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4309)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    This paper surveys library literature on cataloging and classification published in 2007-8, indicating its extent and range in terms of types of literature, major subject areas, and themes. The paper reviews pertinent literature in the following areas: the future of bibliographic control, general cataloging standards and texts, Functional Requirements for Bibliographic Records (FRBR), cataloging varied resources, metadata and cataloging in the Web world, classification and subject access, questions of diversity and diverse perspectives, additional reports of practice and research, catalogers' education and careers, keeping current through columns and blogs, and cataloging history.
    Date
    10. 9.2000 17:38:22
  20. Candela, L.; Castelli, D.; Manghi, P.; Tani, A.: Data journals : a survey (2015) 0.01
    0.008864855 = product of:
      0.07091884 = sum of:
        0.07091884 = weight(_text_:data in 2156) [ClassicSimilarity], result of:
          0.07091884 = score(doc=2156,freq=26.0), product of:
            0.093835 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.029675366 = queryNorm
            0.75578237 = fieldWeight in 2156, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2156)
      0.125 = coord(1/8)
    
    Abstract
    Data occupy a key role in our information society. However, although the amount of published data continues to grow and terms such as data deluge and big data today characterize numerous (research) initiatives, much work is still needed in the direction of publishing data in order to make them effectively discoverable, available, and reusable by others. Several barriers hinder data publishing, from lack of attribution and rewards, vague citation practices, and quality issues to a rather general lack of a data-sharing culture. Lately, data journals have overcome some of these barriers. In this study of more than 100 currently existing data journals, we describe the approaches they promote for data set description, availability, citation, quality, and open access. We close by identifying ways to expand and strengthen the data journals approach as a means to promote data set access and exploitation.

Languages

  • e 68
  • d 1

Types

  • a 65
  • b 11
  • el 2
  • m 1
  • r 1