Search (8 results, page 1 of 1)

Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.01

0.0061047617 = product of:
  0.030523809 = sum of:
    0.030523809 = product of:
      0.061047617 = sum of:
        0.061047617 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
          0.061047617 = score(doc=611,freq=2.0), product of:
            0.15778607 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04505818 = queryNorm
            0.38690117 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 22. 8.2009 12:54:24
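
The explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the usual Lucene ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function name `classic_term_score` and its parameters are hypothetical, chosen for illustration only; `queryNorm`, `fieldNorm`, and the two `coord` factors are taken directly from the listing rather than derived.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute a single-term score following the structure of the explain tree.

    Hypothetical helper: query_norm, field_norm and the coord factors are
    copied from the listing, not derived here.
    """
    tf = math.sqrt(freq)                              # tf(freq=2.0) -> 1.4142135
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # idf(docFreq=3622, maxDocs=44218)
    query_weight = idf * query_norm                   # "queryWeight" line
    field_weight = tf * idf * field_norm              # "fieldWeight" line
    score = query_weight * field_weight               # "score(doc=..., freq=...)" line
    for factor in coords:                             # coord(1/2), then coord(1/5)
        score *= factor
    return score


# Values copied from the first result (term "22" in doc 611).
score = classic_term_score(
    freq=2.0,
    doc_freq=3622,
    max_docs=44218,
    query_norm=0.04505818,
    field_norm=0.078125,
    coords=(0.5, 0.2),
)
print(f"{score:.10f}")
```

Because Lucene works in single precision and Python in double precision, the printed value should agree with the listed 0.0061047617 only up to small rounding differences. The same helper applies to the other records by swapping in their `freq`, `docFreq`, and `fieldNorm` values.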

Autonomy, Inc.: Automatic classification (n.d.) 0.01

0.0056314636 = product of:
  0.028157318 = sum of:
    0.028157318 = product of:
      0.056314636 = sum of:
        0.056314636 = weight(_text_:data in 1666) [ClassicSimilarity], result of:
          0.056314636 = score(doc=1666,freq=4.0), product of:
            0.14247625 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.04505818 = queryNorm
            0.3952563 = fieldWeight in 1666, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=1666)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Abstract: Autonomy's Classification solutions remove the necessity for organizations to rely on human intervention or manual processing of information, such as manual tagging, typically required to make most other e-business applications work. Autonomy's ability to consistently and accurately classify data automatically is a unique infrastructure solution that overcomes the predicaments surrounding the exponential growth of unstructured data.

Automatic classification research at OCLC (2002) 0.00

0.004273333 = product of:
  0.021366665 = sum of:
    0.021366665 = product of:
      0.04273333 = sum of:
        0.04273333 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
          0.04273333 = score(doc=1563,freq=2.0), product of:
            0.15778607 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04505818 = queryNorm
            0.2708308 = fieldWeight in 1563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1563)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 5. 5.2003 9:22:09

Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.00
```
0.0042235977 = product of:
  0.021117989 = sum of:
    0.021117989 = product of:
      0.042235978 = sum of:
        0.042235978 = weight(_text_:data in 316) [ClassicSimilarity], result of:
          0.042235978 = score(doc=316,freq=4.0), product of:
            0.14247625 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.04505818 = queryNorm
            0.29644224 = fieldWeight in 316, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=316)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increase, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC) [10], within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR).
Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.00
```
0.0025864148 = product of:
  0.012932074 = sum of:
    0.012932074 = product of:
      0.025864149 = sum of:
        0.025864149 = weight(_text_:data in 1253) [ClassicSimilarity], result of:
          0.025864149 = score(doc=1253,freq=6.0), product of:
            0.14247625 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.04505818 = queryNorm
            0.18153305 = fieldWeight in 1253, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1253)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increase, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC), within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR). Our work with the Alexandria Digital Library (ADL) Project focuses on geo-referenced information, whether text, maps, aerial photographs, or satellite images. As a result, we have emphasized techniques which work with both text and non-text, such as combined textual and graphical queries, multi-dimensional indexing, and IR methods which are not solely dependent on words or phrases. Part of this work involves locating relevant online sources of information. In particular, we have designed and are currently testing aspects of an architecture, Pharos, which we believe will scale up to 1,000,000 heterogeneous sources. Pharos accommodates heterogeneity in content and format, both across multiple sources and within a single source. That is, we consider sources to include Web sites, FTP archives, newsgroups, and full digital libraries; all of these systems can include a wide variety of content and multimedia data formats. Pharos is based on the use of hierarchical classification schemes. These include not only well-known 'subject' (or 'concept') based schemes such as the Dewey Decimal System and the LCC, but also, for example, geographic classifications, which might be constructed as layers of smaller and smaller hierarchical longitude/latitude boxes. Pharos is designed to work with sophisticated queries which utilize subjects, geographical locations, temporal specifications, and other types of information domains. The Pharos architecture requires that hierarchically structured collection metadata be extracted so that it can be partitioned in such a way as to greatly enhance scalability. Automated classification is important to Pharos because it allows information sources to automatically extract the requisite collection metadata that must be distributed.
Wartena, C.; Sommer, M.: Automatic classification of scientific records using the German Subject Heading Authority File (SWD) (2012) 0.00
```
0.0024887787 = product of:
  0.012443894 = sum of:
    0.012443894 = product of:
      0.024887787 = sum of:
        0.024887787 = weight(_text_:data in 472) [ClassicSimilarity], result of:
          0.024887787 = score(doc=472,freq=2.0), product of:
            0.14247625 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.04505818 = queryNorm
            0.17468026 = fieldWeight in 472, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=472)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```
Abstract

The following paper deals with an automatic text classification method that does not require training documents. The method uses the German Subject Heading Authority File (SWD), provided by the linked data service of the German National Library. Recently the SWD was enriched with notations of the Dewey Decimal Classification (DDC). As a consequence, it became possible to use the subject headings as textual representations of the DDC notations. Basically, we derive the classification of a text from the classification of the words in the text given by the thesaurus. The method was tested by classifying 3826 OAI records from 7 different repositories. Mean reciprocal rank and recall were chosen as evaluation measures. Direct comparison with a machine learning method has shown that this method is definitely competitive. Thus we can conclude that the enriched version of the SWD provides high-quality information with broad coverage for the classification of German scientific articles.
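
For readers unfamiliar with the two evaluation measures named in this abstract, the following is a minimal sketch of how mean reciprocal rank and recall at a cutoff are commonly computed. It is not the authors' evaluation code; the function names and the toy DDC-style notations are hypothetical and serve only as an illustration.

```python
def mean_reciprocal_rank(ranked_predictions, gold_labels):
    """Average of 1/rank of the first correct label per record (0 if none is found)."""
    total = 0.0
    for predictions, gold in zip(ranked_predictions, gold_labels):
        for rank, label in enumerate(predictions, start=1):
            if label in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_predictions)


def recall_at_k(ranked_predictions, gold_labels, k):
    """Fraction of all gold labels that appear among the top-k predictions."""
    found = 0
    expected = 0
    for predictions, gold in zip(ranked_predictions, gold_labels):
        found += len(set(predictions[:k]) & gold)
        expected += len(gold)
    return found / expected


# Hypothetical toy data: ranked class suggestions for three records.
ranked = [["004", "020", "510"], ["510", "004"], ["330"]]
gold = [{"020"}, {"510"}, {"020"}]

mrr = mean_reciprocal_rank(ranked, gold)
recall_top2 = recall_at_k(ranked, gold, k=2)
```

In this toy case the correct class is found at rank 2, at rank 1, and not at all, so the reciprocal ranks are 1/2, 1, and 0.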

Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.00

0.0024419045 = product of:
  0.012209523 = sum of:
    0.012209523 = product of:
      0.024419045 = sum of:
        0.024419045 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
          0.024419045 = score(doc=3284,freq=2.0), product of:
            0.15778607 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04505818 = queryNorm
            0.15476047 = fieldWeight in 3284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=3284)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 22. 1.2010 14:41:24

Koch, T.; Ardö, A.; Brümmer, A.: The building and maintenance of robot based internet search services : A review of current indexing and data collection methods. Prepared to meet the requirements of Work Package 3 of EU Telematics for Research, project DESIRE. Version D3.11v0.3 (Draft version 3) (1996) 0.00

0.001991023 = product of:
  0.009955115 = sum of:
    0.009955115 = product of:
      0.01991023 = sum of:
        0.01991023 = weight(_text_:data in 1669) [ClassicSimilarity], result of:
          0.01991023 = score(doc=1669,freq=2.0), product of:
            0.14247625 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.04505818 = queryNorm
            0.1397442 = fieldWeight in 1669, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1669)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
