Search (73 results, page 1 of 4)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.31

0.30909637 = product of:
  0.37091565 = sum of:
    0.07054476 = product of:
      0.21163426 = sum of:
        0.21163426 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.21163426 = score(doc=562,freq=2.0), product of:
            0.37656134 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.044416238 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
    0.031359423 = weight(_text_:web in 562) [ClassicSimilarity], result of:
      0.031359423 = score(doc=562,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.21634221 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.21163426 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.21163426 = score(doc=562,freq=2.0), product of:
        0.37656134 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.044416238 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.039323866 = weight(_text_:computer in 562) [ClassicSimilarity], result of:
      0.039323866 = score(doc=562,freq=2.0), product of:
        0.16231956 = queryWeight, product of:
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.044416238 = queryNorm
        0.24226204 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.01805336 = product of:
      0.03610672 = sum of:
        0.03610672 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.03610672 = score(doc=562,freq=2.0), product of:
            0.1555381 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044416238 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.5 = coord(1/2)
  0.8333333 = coord(5/6)

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
Imprint: Washington, DC : IEEE Computer Society
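
The explanation tree for this hit can be checked by hand. The sketch below assumes the classic TF-IDF formulas that reproduce the printed values (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative only and are not part of any Lucene API. It recomputes the weight of the term `3a` in doc 562 from the inputs shown in the tree.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency in the form that matches the tree above."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term-frequency damping: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight, as in the explanation tree."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm               # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight


# Inputs copied from the tree for term "3a" in doc 562.
score_3a = term_score(freq=2.0, doc_freq=24, max_docs=44218,
                      query_norm=0.044416238, field_norm=0.046875)
print(score_3a)
```

The result should agree with the printed 0.21163426 up to floating-point rounding, since the engine computes in single precision while Python uses double precision. Note that idf enters the score twice, once through queryWeight and once through fieldWeight, which is why the rare terms `3a` and `2f` dominate this hit.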

Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.13

0.1318309 = product of:
  0.19774634 = sum of:
    0.067437425 = weight(_text_:wide in 1673) [ClassicSimilarity], result of:
      0.067437425 = score(doc=1673,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.342674 = fieldWeight in 1673, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1673)
    0.063368805 = weight(_text_:web in 1673) [ClassicSimilarity], result of:
      0.063368805 = score(doc=1673,freq=6.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.43716836 = fieldWeight in 1673, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1673)
    0.04587784 = weight(_text_:computer in 1673) [ClassicSimilarity], result of:
      0.04587784 = score(doc=1673,freq=2.0), product of:
        0.16231956 = queryWeight, product of:
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.044416238 = queryNorm
        0.28263903 = fieldWeight in 1673, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1673)
    0.021062255 = product of:
      0.04212451 = sum of:
        0.04212451 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
          0.04212451 = score(doc=1673,freq=2.0), product of:
            0.1555381 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044416238 = queryNorm
            0.2708308 = fieldWeight in 1673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1673)
      0.5 = coord(1/2)
  0.6666667 = coord(4/6)

Abstract: The Wolverhampton Web Library (WWLib) is a WWW search engine that provides access to UK-based information. The experimental version, developed in 1995, was a success but highlighted the need for a much higher degree of automation. An interesting feature of the experimental WWLib was that it organised information according to DDC. Discusses the advantages of classification and describes the automatic classifier that is being developed in Java as part of the new, fully automated WWLib.
Date: 1. 8.1996 22:08:06
Footnote: Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia; see also: http://www7.scu.edu.au/programme/posters/1846/com1846.htm.
Source: Computer networks and ISDN systems. 30(1998) nos.1/7, pp. 646-648
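
The top-level score of this hit is the sum of its clause scores multiplied by a coordination factor, coord(matched/total). The sketch below uses the clause values printed in the tree for doc 1673; `coord` and `combine` are hypothetical helper names chosen for illustration, not engine functions.

```python
def coord(matched: int, total: int) -> float:
    """Fraction of query clauses that matched the document."""
    return matched / total


def combine(clause_scores: list[float], matched: int, total: int) -> float:
    """Sum the matching clause scores, then scale by the coordination factor."""
    return sum(clause_scores) * coord(matched, total)


# The "22" clause sits in a nested query where 1 of 2 sub-clauses matched.
nested_22 = combine([0.04212451], matched=1, total=2)

# Top level: wide, web, computer and the nested "22" clause; 4 of 6 clauses matched.
total_score = combine([0.067437425, 0.063368805, 0.04587784, nested_22],
                      matched=4, total=6)
print(total_score)
```

Under these assumptions the result agrees with the 0.1318309 shown at the top of the tree. The same factor explains why a hit with a high frequency for `web` can still rank low: matching only 2 of 6 clauses scales its sum by one third.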

Khoo, C.S.G.; Ng, K.; Ou, S.: An exploratory study of human clustering of Web pages (2003) 0.06
```
0.061496366 = product of:
  0.12299273 = sum of:
    0.03853567 = weight(_text_:wide in 2741) [ClassicSimilarity], result of:
      0.03853567 = score(doc=2741,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.1958137 = fieldWeight in 2741, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.03125 = fieldNorm(doc=2741)
    0.07242149 = weight(_text_:web in 2741) [ClassicSimilarity], result of:
      0.07242149 = score(doc=2741,freq=24.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.49962097 = fieldWeight in 2741, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=2741)
    0.012035574 = product of:
      0.024071148 = sum of:
        0.024071148 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
          0.024071148 = score(doc=2741,freq=2.0), product of:
            0.1555381 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044416238 = queryNorm
            0.15476047 = fieldWeight in 2741, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
      0.5 = coord(1/2)
  0.5 = coord(3/6)
```
Abstract: This study seeks to find out how human beings cluster Web pages naturally. Twenty Web pages retrieved by the Northern Light search engine for each of 10 queries were sorted by 3 subjects into categories that were natural or meaningful to them. It was found that different subjects clustered the same set of Web pages quite differently and created different categories. The average inter-subject similarity of the clusters created was a low 0.27. Subjects created an average of 5.4 clusters for each sorting. The categories constructed can be divided into 10 types. About 1/3 of the categories created were topical. Another 20% of the categories relate to the degree of relevance or usefulness. The rest of the categories were subject-independent categories such as format, purpose, authoritativeness and direction to other sources. The authors plan to develop automatic methods for categorizing Web pages using the common categories created by the subjects. It is hoped that the techniques developed can be used by Web search engines to automatically organize Web pages retrieved into categories that are natural to users.
1. Introduction The World Wide Web is an increasingly important source of information for people globally because of its ease of access, the ease of publishing, its ability to transcend geographic and national boundaries, its flexibility and heterogeneity and its dynamic nature. However, Web users also find it increasingly difficult to locate relevant and useful information in this vast information storehouse. Web search engines, despite their scope and power, appear to be quite ineffective. They retrieve too many pages, and though they attempt to rank retrieved pages in order of probable relevance, often the relevant documents do not appear in the top-ranked 10 or 20 documents displayed. Several studies have found that users do not know how to use the advanced features of Web search engines, and do not know how to formulate and re-formulate queries. Users also typically exert minimal effort in performing, evaluating and refining their searches, and are unwilling to scan more than 10 or 20 items retrieved (Jansen, Spink, Bateman & Saracevic, 1998). This suggests that the conventional ranked-list display of search results does not satisfy user requirements, and that better ways of presenting and summarizing search results have to be developed. One promising approach is to group retrieved pages into clusters or categories to allow users to navigate immediately to the "promising" clusters where the most useful Web pages are likely to be located. This approach has been adopted by a number of search engines (notably Northern Light) and search agents.
Date: 12. 9.2004 9:56:22

Kwon, O.W.; Lee, J.H.: Text categorization based on k-nearest neighbor approach for web site classification (2003) 0.05
```
0.0497939 = product of:
  0.1493817 = sum of:
    0.04816959 = weight(_text_:wide in 1070) [ClassicSimilarity], result of:
      0.04816959 = score(doc=1070,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.24476713 = fieldWeight in 1070, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1070)
    0.10121211 = weight(_text_:web in 1070) [ClassicSimilarity], result of:
      0.10121211 = score(doc=1070,freq=30.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.69824153 = fieldWeight in 1070, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1070)
  0.33333334 = coord(2/6)
```
Abstract: Automatic categorization is a viable method to deal with the scaling problem on the World Wide Web. For Web site classification, this paper proposes the use of Web pages linked with the home page in a different manner from the sole use of home pages in previous research. To implement our proposed method, we derive a scheme for Web site classification based on the k-nearest neighbor (k-NN) approach. It consists of three phases: Web page selection (connectivity analysis), Web page classification, and Web site classification. Given a Web site, the Web page selection chooses several representative Web pages using connectivity analysis. The k-NN classifier next classifies each of the selected Web pages. Finally, the classified Web pages are extended to a classification of the entire Web site. To improve performance, we supplement the k-NN approach with a feature selection method and a term weighting scheme using markup tags, and also reform its document-document similarity measure. In our experiments on a Korean commercial Web directory, the proposed system, using both a home page and its linked pages, improved the performance of micro-averaging breakeven point by 30.02%, compared with an ordinary classification which uses a home page only.

Wätjen, H.-J.: Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web : das DFG-Projekt GERHARD (1998) 0.05

0.04953496 = product of:
  0.14860488 = sum of:
    0.09633918 = weight(_text_:wide in 3066) [ClassicSimilarity], result of:
      0.09633918 = score(doc=3066,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.48953426 = fieldWeight in 3066, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.078125 = fieldNorm(doc=3066)
    0.052265707 = weight(_text_:web in 3066) [ClassicSimilarity], result of:
      0.052265707 = score(doc=3066,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.36057037 = fieldWeight in 3066, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.078125 = fieldNorm(doc=3066)
  0.33333334 = coord(2/6)

Möller, G.: Automatic classification of the World Wide Web using Universal Decimal Classification (1999) 0.05

0.04953496 = product of:
  0.14860488 = sum of:
    0.09633918 = weight(_text_:wide in 494) [ClassicSimilarity], result of:
      0.09633918 = score(doc=494,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.48953426 = fieldWeight in 494, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.078125 = fieldNorm(doc=494)
    0.052265707 = weight(_text_:web in 494) [ClassicSimilarity], result of:
      0.052265707 = score(doc=494,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.36057037 = fieldWeight in 494, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.078125 = fieldNorm(doc=494)
  0.33333334 = coord(2/6)

Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.05

0.048049595 = product of:
  0.09609919 = sum of:
    0.05449767 = weight(_text_:wide in 3284) [ClassicSimilarity], result of:
      0.05449767 = score(doc=3284,freq=4.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.2769224 = fieldWeight in 3284, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.03125 = fieldNorm(doc=3284)
    0.029565949 = weight(_text_:web in 3284) [ClassicSimilarity], result of:
      0.029565949 = score(doc=3284,freq=4.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.2039694 = fieldWeight in 3284, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=3284)
    0.012035574 = product of:
      0.024071148 = sum of:
        0.024071148 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
          0.024071148 = score(doc=3284,freq=2.0), product of:
            0.1555381 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044416238 = queryNorm
            0.15476047 = fieldWeight in 3284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=3284)
      0.5 = coord(1/2)
  0.5 = coord(3/6)

Abstract: The volume of publications to be classified has, at the latest since the advent of the World Wide Web, been growing faster than it can be subject-indexed intellectually. Methods are therefore being sought to automate the classification of text objects, or at least to support intellectual classification. Methods for automatic document classification (Information Retrieval, IR for short) have existed since 1968, and methods for automatic text classification (ATC: Automated Text Categorization) since 1992. As more and more digital objects have become available on the World Wide Web, work on automatic text classification has increased markedly since about 1998. Since 1996 this has also included work on the automatic DDC or RVK classification of bibliographic title records and full-text documents. To our knowledge, the systems developed so far are experimental rather than in continuous operation. The VZG project Colibri/DDC has also been concerned with automatic DDC classification, among other things, since 2006. The related investigations and developments serve to answer the research question: "Is it possible to automatically achieve a DDC title classification, coherent in content, of all GVK-PLUS title records?"
Date: 22. 1.2010 14:41:24

Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.04

0.043985642 = product of:
  0.13195692 = sum of:
    0.09537092 = weight(_text_:wide in 7209) [ClassicSimilarity], result of:
      0.09537092 = score(doc=7209,freq=4.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.4846142 = fieldWeight in 7209, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
    0.036585998 = weight(_text_:web in 7209) [ClassicSimilarity], result of:
      0.036585998 = score(doc=7209,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.25239927 = fieldWeight in 7209, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
  0.33333334 = coord(2/6)

Abstract: The Nordic WAIS/WWW project sponsored by NORDINFO is a joint project between Lund University Library and the National Technological Library of Denmark. It aims to improve the existing networked information discovery and retrieval tools Wide Area Information System (WAIS) and World Wide Web (WWW), and to move towards unifying WWW and WAIS. Details current results focusing on the WAIS side of the project. Describes research into automatic indexing and classification of WAIS sources, development of an orientation tool for WAIS, and development of a WAIS index of WWW resources.

Wätjen, H.-J.: GERHARD : Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web (1998) 0.04

0.039725944 = product of:
  0.11917783 = sum of:
    0.067437425 = weight(_text_:wide in 3064) [ClassicSimilarity], result of:
      0.067437425 = score(doc=3064,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.342674 = fieldWeight in 3064, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3064)
    0.05174041 = weight(_text_:web in 3064) [ClassicSimilarity], result of:
      0.05174041 = score(doc=3064,freq=4.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.35694647 = fieldWeight in 3064, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3064)
  0.33333334 = coord(2/6)

Abstract: The intellectual indexing of the Internet is in crisis. Yahoo and other services cannot keep pace with the growth of the Web. GERHARD is currently the only search and navigation service worldwide that also classifies, fully automatically, the Internet resources collected by a robot, using computational-linguistic and statistical methods. Well over a million HTML documents from academically relevant servers in Germany can be searched in the database as with other search engines, but can also be explored by navigating the trilingual Universal Decimal Classification (ETH-Bibliothek Zürich).

Subramanian, S.; Shafer, K.E.: Clustering (1998) 0.04

0.039268494 = product of:
  0.11780548 = sum of:
    0.052265707 = weight(_text_:web in 1103) [ClassicSimilarity], result of:
      0.052265707 = score(doc=1103,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.36057037 = fieldWeight in 1103, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.078125 = fieldNorm(doc=1103)
    0.06553978 = weight(_text_:computer in 1103) [ClassicSimilarity], result of:
      0.06553978 = score(doc=1103,freq=2.0), product of:
        0.16231956 = queryWeight, product of:
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.044416238 = queryNorm
        0.40377006 = fieldWeight in 1103, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.078125 = fieldNorm(doc=1103)
  0.33333334 = coord(2/6)

Abstract: This article presents our exploration of computer science clustering algorithms as they relate to the Scorpion system. Scorpion is a research project at OCLC that explores the indexing and cataloging of electronic resources. For a more complete description of the Scorpion, please visit the Scorpion Web site at <http://purl.oclc.org/scorpion>

Classification, automation, and new media : Proceedings of the 24th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Passau, March 15 - 17, 2000 (2002) 0.04

0.037795175 = product of:
  0.11338552 = sum of:
    0.06812209 = weight(_text_:wide in 5997) [ClassicSimilarity], result of:
      0.06812209 = score(doc=5997,freq=4.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.34615302 = fieldWeight in 5997, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5997)
    0.045263432 = weight(_text_:web in 5997) [ClassicSimilarity], result of:
      0.045263432 = score(doc=5997,freq=6.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.3122631 = fieldWeight in 5997, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5997)
  0.33333334 = coord(2/6)

Content: Data Analysis, Statistics, and Classification.- Pattern Recognition and Automation.- Data Mining, Information Processing, and Automation.- New Media, Web Mining, and Automation.- Applications in Management Science, Finance, and Marketing.- Applications in Medicine, Biology, Archaeology, and Others.- Author Index.- Subject Index.
RSWK: World Wide Web / Wissensorganisation / Kongress / Passau <2000>
Subject: World Wide Web / Wissensorganisation / Kongress / Passau <2000>

Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.04

0.035583735 = product of:
  0.1067512 = sum of:
    0.08869784 = weight(_text_:web in 2158) [ClassicSimilarity], result of:
      0.08869784 = score(doc=2158,freq=16.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.6119082 = fieldWeight in 2158, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=2158)
    0.01805336 = product of:
      0.03610672 = sum of:
        0.03610672 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
          0.03610672 = score(doc=2158,freq=2.0), product of:
            0.1555381 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044416238 = queryNorm
            0.23214069 = fieldWeight in 2158, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2158)
      0.5 = coord(1/2)
  0.33333334 = coord(2/6)

Abstract: This paper introduces a project to develop a reliable, cost-effective method for classifying Internet texts into register categories, and apply that approach to the analysis of a large corpus of web documents. To date, the project has proceeded in 2 key phases. First, we developed a bottom-up method for web register classification, asking end users of the web to utilize a decision-tree survey to code relevant situational characteristics of web documents, resulting in a bottom-up identification of register and subregister categories. We present details regarding the development and testing of this method through a series of 10 pilot studies. Then, in the second phase of our project we applied this procedure to a corpus of 53,000 web documents. An analysis of the results demonstrates the effectiveness of these methods for web register classification and provides a preliminary description of the types and distribution of registers on the web.
Date: 4. 8.2015 19:22:04

Krüger, C.: Evaluation des WWW-Suchdienstes GERHARD unter besonderer Beachtung automatischer Indexierung (1999) 0.04
```
0.03502651 = product of:
  0.105079524 = sum of:
    0.06812209 = weight(_text_:wide in 1777) [ClassicSimilarity], result of:
      0.06812209 = score(doc=1777,freq=4.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.34615302 = fieldWeight in 1777, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1777)
    0.036957435 = weight(_text_:web in 1777) [ClassicSimilarity], result of:
      0.036957435 = score(doc=1777,freq=4.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.25496176 = fieldWeight in 1777, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1777)
  0.33333334 = coord(2/6)
```
Abstract: This thesis describes and evaluates the WWW search service GERHARD (German Harvest Automated Retrieval and Directory). GERHARD is a search and navigation system for the German World Wide Web that collects only academically relevant documents and classifies them automatically, using computational-linguistic and statistical methods, with the help of a library classification system. The DFG project GERHARD is an attempt to develop an alternative to conventional methods of Internet indexing in the form of a World Wide Web service based on an automatic classification procedure. GERHARD is the only directory of Internet resources in the German-speaking world whose creation and updating are fully automatic (that is, performed by machine). GERHARD restricts itself to listing documents on academic WWW servers. The basic idea was to replace cost-intensive intellectual indexing and classification of Internet pages with computational-linguistic and statistical methods, so that the listed Internet resources are mapped automatically onto the vocabulary of a library classification system. The WWW address (URL) of GERHARD is http://www.gerhard.de. This diploma thesis describes the service, with particular emphasis on the underlying indexing and classification system, and then tests the effectiveness of GERHARD by means of a small retrieval test.

Miyamoto, S.: Information clustering based on fuzzy multisets (2003) 0.03

0.034674477 = product of:
  0.10402343 = sum of:
    0.067437425 = weight(_text_:wide in 1071) [ClassicSimilarity], result of:
      0.067437425 = score(doc=1071,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.342674 = fieldWeight in 1071, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1071)
    0.036585998 = weight(_text_:web in 1071) [ClassicSimilarity], result of:
      0.036585998 = score(doc=1071,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.25239927 = fieldWeight in 1071, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1071)
  0.33333334 = coord(2/6)
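
The score tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of the search system. All input numbers are taken from the tree for doc 1071.

```python
import math

# Values copied from the explain output for doc 1071.
MAX_DOCS = 44218
QUERY_NORM = 0.044416238
FIELD_NORM = 0.0546875


def idf(doc_freq: int, max_docs: int) -> float:
    """ClassicSimilarity inverse document frequency (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(freq: float, doc_freq: int, field_norm: float) -> float:
    """weight = queryWeight * fieldWeight for a single query term."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM                      # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight


wide = term_weight(freq=2.0, doc_freq=1430, field_norm=FIELD_NORM)
web = term_weight(freq=2.0, doc_freq=4597, field_norm=FIELD_NORM)

# coord(2/6): two of the six query terms matched this document.
score = (wide + web) * (2 / 6)
print(f"wide={wide:.6f} web={web:.6f} score={score:.6f}")
```

Up to floating-point rounding (Lucene computes in single precision), the result should agree with the reported score of 0.034674477.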

Abstract: A fuzzy multiset model for information clustering is proposed with application to information retrieval on the World Wide Web. Noting that a search engine retrieves multiple occurrences of the same subjects with possibly different degrees of relevance, we observe that fuzzy multisets provide an appropriate model of information retrieval on the WWW. Information clustering, which means both term clustering and document clustering, is considered. Three methods are proposed: hard c-means, fuzzy c-means, and an agglomerative method using cluster centers. Two distances between fuzzy multisets and algorithms for calculating cluster centers are defined. Theoretical properties concerning the clustering algorithms are studied. Illustrative examples are given to show how the algorithms work.

Golub, K.: Automated subject classification of textual documents in the context of Web-based hierarchical browsing (2011) 0.03
```
0.034050807 = product of:
  0.10215242 = sum of:
    0.057803504 = weight(_text_:wide in 4558) [ClassicSimilarity], result of:
      0.057803504 = score(doc=4558,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.29372054 = fieldWeight in 4558, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.046875 = fieldNorm(doc=4558)
    0.04434892 = weight(_text_:web in 4558) [ClassicSimilarity], result of:
      0.04434892 = score(doc=4558,freq=4.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.3059541 = fieldWeight in 4558, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=4558)
  0.33333334 = coord(2/6)
```
Abstract

While automated methods for information organization have been around for several decades now, exponential growth of the World Wide Web has put them into the forefront of research in different communities, within which several approaches can be identified: 1) machine learning (algorithms that allow computers to improve their performance based on learning from pre-existing data); 2) document clustering (algorithms for unsupervised document organization and automated topic extraction); and 3) string matching (algorithms that match given strings within larger text). Here the aim was to automatically organize textual documents into hierarchical structures for subject browsing. The string-matching approach was tested using a controlled vocabulary (containing pre-selected and pre-defined authorized terms, each corresponding to only one concept). The results imply that an appropriate controlled vocabulary, with a sufficient number of entry terms designating classes, could in itself be a solution for automated classification. Then, if the same controlled vocabulary had an appropriate hierarchical structure, it would at the same time provide a good browsing structure for the collection of automatically classified documents.
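
To make the string-matching idea concrete, here is a minimal sketch of classification against a controlled vocabulary. It is not the method evaluated in the thesis: the vocabulary entries, the class names, and the scoring by raw match counts are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary: entry term -> class it designates.
VOCABULARY = {
    "machine learning": "Computer science",
    "document clustering": "Information retrieval",
    "information retrieval": "Information retrieval",
    "controlled vocabulary": "Library science",
}


def classify(text: str) -> list[tuple[str, int]]:
    """Return (class, match count) pairs, best-matching class first."""
    lowered = text.lower()
    counts: Counter[str] = Counter()
    for term, cls in VOCABULARY.items():
        # Match the entry term as a whole phrase, not as part of a longer word.
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[cls] += hits
    return counts.most_common()


sample = "Document clustering and information retrieval use a controlled vocabulary."
print(classify(sample))
```

If the vocabulary also recorded a parent for each class, the same lookup could place a document in a browsing hierarchy, which is the dual use the abstract points to.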

HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.03

0.031876236 = product of:
  0.09562871 = sum of:
    0.06553978 = weight(_text_:computer in 2748) [ClassicSimilarity], result of:
      0.06553978 = score(doc=2748,freq=2.0), product of:
        0.16231956 = queryWeight, product of:
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.044416238 = queryNorm
        0.40377006 = fieldWeight in 2748, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.078125 = fieldNorm(doc=2748)
    0.030088935 = product of:
      0.06017787 = sum of:
        0.06017787 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
          0.06017787 = score(doc=2748,freq=2.0), product of:
            0.1555381 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.044416238 = queryNorm
            0.38690117 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.5 = coord(1/2)
  0.33333334 = coord(2/6)

Date: 1. 2.2016 18:25:22
Series: Lecture notes in computer science ; 9398

Golub, K.: Automated subject classification of textual web documents (2006) 0.03
```
0.028345197 = product of:
  0.08503559 = sum of:
    0.052265707 = weight(_text_:web in 5600) [ClassicSimilarity], result of:
      0.052265707 = score(doc=5600,freq=8.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.36057037 = fieldWeight in 5600, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5600)
    0.03276989 = weight(_text_:computer in 5600) [ClassicSimilarity], result of:
      0.03276989 = score(doc=5600,freq=2.0), product of:
        0.16231956 = queryWeight, product of:
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.044416238 = queryNorm
        0.20188503 = fieldWeight in 5600, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.6545093 = idf(docFreq=3109, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5600)
  0.33333334 = coord(2/6)
```
Abstract

Purpose - To provide an integrated perspective to similarities and differences between approaches to automated classification in different research communities (machine learning, information retrieval and library science), and point to problems with the approaches and automated classification as such. Design/methodology/approach - A range of works dealing with automated classification of full-text web documents are discussed. Explorations of individual approaches are given in the following sections: special features (description, differences, evaluation), application and characteristics of web pages. Findings - Provides major similarities and differences between the three approaches: document pre-processing and utilization of web-specific document characteristics is common to all the approaches; major differences are in applied algorithms, employment or not of the vector space model and of controlled vocabularies. Problems of automated classification are recognized. Research limitations/implications - The paper does not attempt to provide an exhaustive bibliography of related resources. Practical implications - As an integrated overview of approaches from different research communities with application examples, it is very useful for students in library and information science and computer science, as well as for practitioners. Researchers from one community have the information on how similar tasks are conducted in different communities. Originality/value - To the author's knowledge, no review paper on automated text classification attempted to discuss more than one community's approach from an integrated perspective.
Yao, H.; Etzkorn, L.H.; Virani, S.: Automated classification and retrieval of reusable software components (2008) 0.02
```
0.02476748 = product of:
  0.07430244 = sum of:
    0.04816959 = weight(_text_:wide in 1382) [ClassicSimilarity], result of:
      0.04816959 = score(doc=1382,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.24476713 = fieldWeight in 1382, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1382)
    0.026132854 = weight(_text_:web in 1382) [ClassicSimilarity], result of:
      0.026132854 = score(doc=1382,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.18028519 = fieldWeight in 1382, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1382)
  0.33333334 = coord(2/6)
```
Abstract

The authors describe their research which improves software reuse by using an automated approach to semantically search for and retrieve reusable software components in large software component repositories and on the World Wide Web (WWW). Using automation and smart (semantic) techniques, their approach speeds up the search and retrieval of reusable software components, while retaining good accuracy, and therefore improves the affordability of software reuse. Program understanding of software components and natural language understanding of user queries were employed. Then the software component descriptions were compared by matching the resulting semantic representations of the user queries to the semantic representations of the software components to search for software components that best match the user queries. A proof of concept system was developed to test the authors' approach. The results of this proof of concept system were compared to human experts, and statistical analysis was performed on the collected experimental data. The results from these experiments demonstrate that this automated semantic-based approach for reusable software component classification and retrieval is successful when compared to the labor-intensive results from the experts, thus showing that this approach can significantly benefit software reuse classification and retrieval.
Search Engines and Beyond : Developing efficient knowledge management systems, April 19-20 1999, Boston, Mass (1999) 0.02
```
0.019813985 = product of:
  0.059441954 = sum of:
    0.03853567 = weight(_text_:wide in 2596) [ClassicSimilarity], result of:
      0.03853567 = score(doc=2596,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.1958137 = fieldWeight in 2596, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.03125 = fieldNorm(doc=2596)
    0.020906283 = weight(_text_:web in 2596) [ClassicSimilarity], result of:
      0.020906283 = score(doc=2596,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.14422815 = fieldWeight in 2596, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=2596)
  0.33333334 = coord(2/6)
```
Content

Ramana Rao (Inxight, Palo Alto, CA): 7 ± 2 Insights on achieving Effective Information Access

Session One: Updates and a twelve month perspective
Danny Sullivan (Search Engine Watch, US / England): Portalization and other search trends
Carol Tenopir (University of Tennessee): Search realities faced by end users and professional searchers

Session Two: Today's search engines and beyond
Daniel Hoogterp (Retrieval Technologies, McLean, VA): Effective presentation and utilization of search techniques
Rick Kenny (Fulcrum Technologies, Ontario, Canada): Beyond document clustering: The knowledge impact statement
Gary Stock (Ingenius, Kalamazoo, MI): Automated change monitoring
Gary Culliss (Direct Hit, Wellesley Hills, MA): User popularity ranked search engines
Byron Dom (IBM, CA): Automatically finding the best pages on the World Wide Web (CLEVER)
Peter Tomassi (LookSmart, San Francisco, CA): Adding human intellect to search technology

Session Three: Panel discussion: Human v automated categorization and editing
Ev Brenner (New York, NY), Chairman
James Callan (University of Massachusetts, MA)
Marc Krellenstein (Northern Light Technology, Cambridge, MA)
Dan Miller (Ask Jeeves, Berkeley, CA)

Session Four: Updates and a twelve month perspective
Steve Arnold (AIT, Harrods Creek, KY): Review: The leading edge in search and retrieval software
Ellen Voorhees (NIST, Gaithersburg, MD): TREC update

Session Five: Search engines now and beyond
Intelligent Agents - John Snyder (Muscat, Cambridge, England): Practical issues behind intelligent agents
Text summarization - Therese Firmin (Dept of Defense, Ft George G. Meade, MD): The TIPSTER/SUMMAC evaluation of automatic text summarization systems
Cross language searching - Elizabeth Liddy (TextWise, Syracuse, NY): A conceptual interlingua approach to cross-language retrieval
Video search and retrieval - Armon Amir (IBM, Almaden, CA): CueVideo: Modular system for automatic indexing and browsing of video/audio
Speech recognition - Michael Witbrock (Lycos, Waltham, MA): Retrieval of spoken documents
Visualization - James A. Wise (Integral Visuals, Richland, WA): Information visualization in the new millennium: Emerging science or passing fashion?
Text mining - David Evans (Claritech, Pittsburgh, PA): Text mining - towards decision support
Groß, T.; Faden, M.: Automatische Indexierung elektronischer Dokumente an der Deutschen Zentralbibliothek für Wirtschaftswissenschaften : Bericht über die Jahrestagung der Internationalen Buchwissenschaftlichen Gesellschaft (2010) 0.02
```
0.019813985 = product of:
  0.059441954 = sum of:
    0.03853567 = weight(_text_:wide in 4051) [ClassicSimilarity], result of:
      0.03853567 = score(doc=4051,freq=2.0), product of:
        0.19679762 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.044416238 = queryNorm
        0.1958137 = fieldWeight in 4051, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.03125 = fieldNorm(doc=4051)
    0.020906283 = weight(_text_:web in 4051) [ClassicSimilarity], result of:
      0.020906283 = score(doc=4051,freq=2.0), product of:
        0.14495286 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.044416238 = queryNorm
        0.14422815 = fieldWeight in 4051, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=4051)
  0.33333334 = coord(2/6)
```
Abstract

The growing availability of digital information in recent years, together with the prospect of a further rise in the so-called data deluge, culminates in a fundamental and steadily intensifying problem of structuring information. The constant increase of digital information resources on the World Wide Web ensures access to a wide range of information at any time and from any place; what remains unresolved is structured access, particularly to scholarly resources. Given the rising number of electronic items and the stagnating or shrinking staff resources in subject indexing, no library or library network can manage, now or in the future, to record all digital data, structure it, and relate the items to one another. In the information society of the 21st century, however, it is becoming increasingly important to structure the scholarly information that has been lost in the flood promptly, appropriately, and completely, and thus make it usable again as a basis for generating knowledge. Standardized subject indexing of digital information resources is therefore a decisive and success-critical factor for the Deutsche Zentralbibliothek für Wirtschaftswissenschaften (ZBW), as an important information infrastructure institution in this field, in competition with other information service providers. Because traditional intellectual subject indexing does not scale arbitrarily (as the number of online documents grows, the need for subject specialists grows proportionally if a given quality standard is to be maintained), other subject indexing methods will be needed in the future. Automated keyword assignment methods are seen as the only way to make library subject indexing fit for the future in the digital age.
In addition, machine-based approaches can help level out the heterogeneity (indexing inconsistencies) between individual indexers and thus contribute to more homogeneous indexing of the library's holdings.
