Search (11 results, page 1 of 1)

Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.04

0.038049873 = product of:
  0.10146633 = sum of:
    0.045056276 = weight(_text_:wide in 1673) [ClassicSimilarity], result of:
      0.045056276 = score(doc=1673,freq=2.0), product of:
        0.13148437 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029675366 = queryNorm
        0.342674 = fieldWeight in 1673, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1673)
    0.042337947 = weight(_text_:web in 1673) [ClassicSimilarity], result of:
      0.042337947 = score(doc=1673,freq=6.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.43716836 = fieldWeight in 1673, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1673)
    0.014072108 = product of:
      0.028144216 = sum of:
        0.028144216 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
          0.028144216 = score(doc=1673,freq=2.0), product of:
            0.103918076 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029675366 = queryNorm
            0.2708308 = fieldWeight in 1673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1673)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: The Wolverhampton Web Library (WWLib) is a WWW search engine that provides access to UK-based information. The experimental version, developed in 1995, was a success but highlighted the need for a much higher degree of automation. An interesting feature of the experimental WWLib was that it organised information according to DDC. Discusses the advantages of classification and describes the automatic classifier that is being developed in Java as part of the new, fully automated WWLib.
Date: 1. 8.1996 22:08:06
Footnote: Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia; see also: http://www7.scu.edu.au/programme/posters/1846/com1846.htm.
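
The score explanation for this first entry (doc 1673) can be reproduced by hand. The following is a minimal sketch, not the search engine's actual code: the function names `idf` and `term_score` are made up for illustration, the idf formula `1 + ln(maxDocs / (docFreq + 1))` is the classic TF-IDF form that matches the idf values listed, and `fieldNorm` and `queryNorm` are copied from the listing rather than recomputed.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, in the form that matches the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one query term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explanation for doc 1673.
MAX_DOCS = 44218
QUERY_NORM = 0.029675366
FIELD_NORM = 0.0546875

wide = term_score(2.0, 1430, MAX_DOCS, QUERY_NORM, FIELD_NORM)
web = term_score(6.0, 4597, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query that matched 1 of 2 sub-clauses.
term_22 = 0.5 * term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Three of the eight top-level query clauses matched: coord(3/8).
doc_score = (wide + web + term_22) * (3 / 8)

print(f"wide={wide:.6f} web={web:.6f} 22={term_22:.6f} total={doc_score:.6f}")
```

The same arithmetic explains why the last entries in the list score so low: a single matching term is multiplied by coord(1/8).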

Möller, G.: Automatic classification of the World Wide Web using Universal Decimal Classification (1999) 0.02

0.024821464 = product of:
  0.099285856 = sum of:
    0.0643661 = weight(_text_:wide in 494) [ClassicSimilarity], result of:
      0.0643661 = score(doc=494,freq=2.0), product of:
        0.13148437 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029675366 = queryNorm
        0.48953426 = fieldWeight in 494, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.078125 = fieldNorm(doc=494)
    0.03491975 = weight(_text_:web in 494) [ClassicSimilarity], result of:
      0.03491975 = score(doc=494,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.36057037 = fieldWeight in 494, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.078125 = fieldNorm(doc=494)
  0.25 = coord(2/8)

Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.02

0.022040755 = product of:
  0.08816302 = sum of:
    0.06371919 = weight(_text_:wide in 7209) [ClassicSimilarity], result of:
      0.06371919 = score(doc=7209,freq=4.0), product of:
        0.13148437 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029675366 = queryNorm
        0.4846142 = fieldWeight in 7209, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
    0.024443826 = weight(_text_:web in 7209) [ClassicSimilarity], result of:
      0.024443826 = score(doc=7209,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.25239927 = fieldWeight in 7209, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
  0.25 = coord(2/8)

Abstract: The Nordic WAIS/WWW project sponsored by NORDINFO is a joint project between Lund University Library and the National Technological Library of Denmark. It aims to improve the existing networked information discovery and retrieval tools Wide Area Information System (WAIS) and World Wide Web (WWW), and to move towards unifying WWW and WAIS. Details current results focusing on the WAIS side of the project. Describes research into automatic indexing and classification of WAIS sources, development of an orientation tool for WAIS, and development of a WAIS index of WWW resources

Wätjen, H.-J.: GERHARD : Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web (1998) 0.02

0.019906268 = product of:
  0.07962507 = sum of:
    0.045056276 = weight(_text_:wide in 3064) [ClassicSimilarity], result of:
      0.045056276 = score(doc=3064,freq=2.0), product of:
        0.13148437 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029675366 = queryNorm
        0.342674 = fieldWeight in 3064, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3064)
    0.03456879 = weight(_text_:web in 3064) [ClassicSimilarity], result of:
      0.03456879 = score(doc=3064,freq=4.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.35694647 = fieldWeight in 3064, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3064)
  0.25 = coord(2/8)

Abstract: The intellectual indexing of the Internet is in crisis. Yahoo and other services cannot keep pace with the growth of the Web. GERHARD is currently the only search and navigation service worldwide that also classifies, fully automatically, the Internet resources gathered by a robot, using computational-linguistic and statistical methods. Well over a million HTML documents from academically relevant servers in Germany can be searched in the database as with other search engines, but can also be explored by navigating the trilingual Universal Decimal Classification (ETH-Bibliothek Zürich).

Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.02
```
0.019184694 = product of:
  0.05115918 = sum of:
    0.019309832 = weight(_text_:wide in 1253) [ClassicSimilarity], result of:
      0.019309832 = score(doc=1253,freq=2.0), product of:
        0.13148437 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029675366 = queryNorm
        0.14686027 = fieldWeight in 1253, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1253)
    0.0148151945 = weight(_text_:web in 1253) [ClassicSimilarity], result of:
      0.0148151945 = score(doc=1253,freq=4.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.15297705 = fieldWeight in 1253, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1253)
    0.017034154 = weight(_text_:data in 1253) [ClassicSimilarity], result of:
      0.017034154 = score(doc=1253,freq=6.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.18153305 = fieldWeight in 1253, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1253)
  0.375 = coord(3/8)
```
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increase, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC), within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR). Our work with the Alexandria Digital Library (ADL) Project focuses on geo-referenced information, whether text, maps, aerial photographs, or satellite images. As a result, we have emphasized techniques which work with both text and non-text, such as combined textual and graphical queries, multi-dimensional indexing, and IR methods which are not solely dependent on words or phrases. Part of this work involves locating relevant online sources of information. In particular, we have designed and are currently testing aspects of an architecture, Pharos, which we believe will scale up to 1,000,000 heterogeneous sources. Pharos accommodates heterogeneity in content and format, both among multiple sources and within a single source.
That is, we consider sources to include Web sites, FTP archives, newsgroups, and full digital libraries; all of these systems can include a wide variety of content and multimedia data formats. Pharos is based on the use of hierarchical classification schemes. These include not only well-known 'subject' (or 'concept') based schemes such as the Dewey Decimal System and the LCC, but also, for example, geographic classifications, which might be constructed as layers of smaller and smaller hierarchical longitude/latitude boxes. Pharos is designed to work with sophisticated queries which utilize subjects, geographical locations, temporal specifications, and other types of information domains. The Pharos architecture requires that hierarchically structured collection metadata be extracted so that it can be partitioned in such a way as to greatly enhance scalability. Automated classification is important to Pharos because it allows information sources to extract automatically the requisite collection metadata that must be distributed.

Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.01
```
0.012192126 = product of:
  0.048768505 = sum of:
    0.020951848 = weight(_text_:web in 316) [ClassicSimilarity], result of:
      0.020951848 = score(doc=316,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.21634221 = fieldWeight in 316, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=316)
    0.027816659 = weight(_text_:data in 316) [ClassicSimilarity], result of:
      0.027816659 = score(doc=316,freq=4.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.29644224 = fieldWeight in 316, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=316)
  0.25 = coord(2/8)
```
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increase, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC) [10], within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR).

Dubin, D.: Dimensions and discriminability (1998) 0.01

0.00925492 = product of:
  0.03701968 = sum of:
    0.022947572 = weight(_text_:data in 2338) [ClassicSimilarity], result of:
      0.022947572 = score(doc=2338,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.24455236 = fieldWeight in 2338, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2338)
    0.014072108 = product of:
      0.028144216 = sum of:
        0.028144216 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
          0.028144216 = score(doc=2338,freq=2.0), product of:
            0.103918076 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029675366 = queryNorm
            0.2708308 = fieldWeight in 2338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2338)
      0.5 = coord(1/2)
  0.25 = coord(2/8)

Date: 22. 9.1997 19:16:05
Source: Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al

McKiernan, G.: Automated categorisation of Web resources : a profile of selected projects, research, products, and services (1996) 0.00

0.0043649687 = product of:
  0.03491975 = sum of:
    0.03491975 = weight(_text_:web in 2533) [ClassicSimilarity], result of:
      0.03491975 = score(doc=2533,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.36057037 = fieldWeight in 2533, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.078125 = fieldNorm(doc=2533)
  0.125 = coord(1/8)

Vizine-Goetz, D.: NetLab / OCLC collaboration seeks to improve Web searching (1999) 0.00

0.0043649687 = product of:
  0.03491975 = sum of:
    0.03491975 = weight(_text_:web in 4180) [ClassicSimilarity], result of:
      0.03491975 = score(doc=4180,freq=2.0), product of:
        0.096845865 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029675366 = queryNorm
        0.36057037 = fieldWeight in 4180, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.078125 = fieldNorm(doc=4180)
  0.125 = coord(1/8)

Rose, J.R.; Gasteiger, J.: HORACE: an automatic system for the hierarchical classification of chemical reactions (1994) 0.00
```
0.0028684465 = product of:
  0.022947572 = sum of:
    0.022947572 = weight(_text_:data in 7696) [ClassicSimilarity], result of:
      0.022947572 = score(doc=7696,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.24455236 = fieldWeight in 7696, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7696)
  0.125 = coord(1/8)
```
Abstract

Describes an automatic classification system for classifying chemical reactions. A detailed study of the classification of chemical reactions, based on topological and physicochemical features, is followed by an analysis of the hierarchical classification produced by the HORACE algorithm (Hierarchical Organization of Reactions through Attribute and Condition Eduction), which combines both approaches in a synergistic manner. The searching and updating of reaction hierarchies is demonstrated with the hierarchies produced for two data sets by the HORACE algorithm. Shows that reaction hierarchies provide efficient access to reaction information and indicate the main reaction types for a given reaction scheme, define the scope of a reaction type, enable searchers to find unusual reactions, and can help in locating the reactions most relevant for a given problem.

Ruocco, A.S.; Frieder, O.: Clustering and classification of large document bases in a parallel environment (1997) 0.00
```
0.0028684465 = product of:
  0.022947572 = sum of:
    0.022947572 = weight(_text_:data in 1661) [ClassicSimilarity], result of:
      0.022947572 = score(doc=1661,freq=2.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.24455236 = fieldWeight in 1661, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1661)
  0.125 = coord(1/8)
```
Abstract

Proposes the use of parallel computing systems to overcome the computationally intense clustering process. Examines two operations: clustering a document set and classifying the document set. Uses a subset of the TIPSTER corpus, specifically, articles from the Wall Street Journal. Document set classification was performed without the large storage requirements for ancillary data matrices. The time performance of the parallel systems was an improvement over sequential system times, and produced the same clustering and classification scheme. Results show near-linear speedup in higher-threshold clustering applications.
