Search (7 results, page 1 of 1)

Koch, T.; Ardö, A.: Automatic classification of full-text HTML-documents from one specific subject area : DESIRE II D3.6a, Working Paper 2 (2000) 0.01

0.013708099 = product of:
  0.054832395 = sum of:
    0.054832395 = product of:
      0.10966479 = sum of:
        0.10966479 = weight(_text_:software in 1667) [ClassicSimilarity], result of:
          0.10966479 = score(doc=1667,freq=6.0), product of:
            0.18056466 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.045514934 = queryNorm
            0.6073436 = fieldWeight in 1667, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0625 = fieldNorm(doc=1667)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Content: 1 Introduction / 2 Method overview / 3 Ei thesaurus preprocessing / 4 Automatic classification process: 4.1 Matching -- 4.2 Weighting -- 4.3 Preparation for display / 5 Results of the classification process / 6 Evaluations / 7 Software / 8 Other applications / 9 Experiments with universal classification systems / References / Appendix A: Ei classification service: Software / Appendix B: Use of the classification software as subject filter in a WWW harvester.

Adams, K.C.: Word wranglers : Automatic classification tools transform enterprise documents from "bags of words" into knowledge resources (2003) 0.01
```
0.011060675 = product of:
  0.0442427 = sum of:
    0.0442427 = product of:
      0.0884854 = sum of:
        0.0884854 = weight(_text_:software in 1665) [ClassicSimilarity], result of:
          0.0884854 = score(doc=1665,freq=10.0), product of:
            0.18056466 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.045514934 = queryNorm
            0.49004826 = fieldWeight in 1665, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1665)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Abstract

Taxonomies are an important part of any knowledge management (KM) system, and automatic classification software is emerging as a "killer app" for consumer and enterprise portals. A number of companies such as Inxight Software , Mohomine, Metacode, and others claim to interpret the semantic content of any textual document and automatically classify text on the fly. The promise that software could automatically produce a Yahoo-style directory is a siren call not many IT managers are able to resist. KM needs have grown more complex due to the increasing amount of digital information, the declining effectiveness of keyword searching, and heterogeneous document formats in corporate databases. This environment requires innovative KM tools, and automatic classification technology is an example of this new kind of software. These products can be divided into three categories according to their underlying technology - rules-based, catalog-by-example, and statistical clustering. Evolving trends in this market include framing classification as a cyborg (computer- and human-based) activity and the increasing use of extensible markup language (XML) and support vector machine (SVM) technology. In this article, we'll survey the rapidly changing automatic classification software market and examine the features and capabilities of leading classification products.

Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.01

0.007708307 = product of:
  0.030833228 = sum of:
    0.030833228 = product of:
      0.061666455 = sum of:
        0.061666455 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
          0.061666455 = score(doc=611,freq=2.0), product of:
            0.15938555 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045514934 = queryNorm
            0.38690117 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 8.2009 12:54:24

Koch, T.; Ardö, A.; Brümmer, A.: ¬The building and maintenance of robot based internet search services : A review of current indexing and data collection methods. Prepared to meet the requirements of Work Package 3 of EU Telematics for Research, project DESIRE. Version D3.11v0.3 (Draft version 3) (1996) 0.01
```
0.005596308 = product of:
  0.022385232 = sum of:
    0.022385232 = product of:
      0.044770464 = sum of:
        0.044770464 = weight(_text_:software in 1669) [ClassicSimilarity], result of:
          0.044770464 = score(doc=1669,freq=4.0), product of:
            0.18056466 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.045514934 = queryNorm
            0.24794699 = fieldWeight in 1669, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03125 = fieldNorm(doc=1669)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Abstract

After a short outline of problems, possibilities and difficulties of systematic information retrieval on the Internet and a description of efforts for development in this area, a specification of the terminology for this report is required. Although the process of retrieval is generally seen as an iterative process of browsing and information retrieval and several important services on the net have taken this fact into consideration, the emphasis of this report lays on the general retrieval tools for the whole of Internet. In order to be able to evaluate the differences, possibilities and restrictions of the different services it is necessary to begin with organizing the existing varieties in a typological/ taxonomical survey. The possibilities and weaknesses will be briefly compared and described for the most important services in the categories robot-based WWW-catalogues of different types, list- or form-based catalogues and simultaneous or collected search services respectively. It will however for different reasons not be possible to rank them in order of "best" services. Still more important are the weaknesses and problems common for all attempts of indexing the Internet. The problems of the quality of the input, the technical performance and the general problem of indexing virtual hypertext are shown to be at least as difficult as the different aspects of harvesting, indexing and information retrieval. Some of the attempts made in the area of further development of retrieval services will be mentioned in relation to descriptions of the contents of documents and standardization efforts. Internet harvesting and indexing technology and retrieval software is thoroughly reviewed. Details about all services and software are listed in analytical forms in Annex 1-3.

Automatic classification research at OCLC (2002) 0.01

0.005395815 = product of:
  0.02158326 = sum of:
    0.02158326 = product of:
      0.04316652 = sum of:
        0.04316652 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
          0.04316652 = score(doc=1563,freq=2.0), product of:
            0.15938555 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045514934 = queryNorm
            0.2708308 = fieldWeight in 1563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1563)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 5. 5.2003 9:22:09

Search Engines and Beyond : Developing efficient knowledge management systems, April 19-20 1999, Boston, Mass (1999) 0.00
```
0.0039571873 = product of:
  0.01582875 = sum of:
    0.01582875 = product of:
      0.0316575 = sum of:
        0.0316575 = weight(_text_:software in 2596) [ClassicSimilarity], result of:
          0.0316575 = score(doc=2596,freq=2.0), product of:
            0.18056466 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.045514934 = queryNorm
            0.17532499 = fieldWeight in 2596, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03125 = fieldNorm(doc=2596)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Content

Ramana Rao (Inxight, Palo Alto, CA) 7 ± 2 Insights on achieving Effective Information Access Session One: Updates and a twelve month perspective Danny Sullivan (Search Engine Watch, US / England) Portalization and other search trends Carol Tenopir (University of Tennessee) Search realities faced by end users and professional searchers Session Two: Today's search engines and beyond Daniel Hoogterp (Retrieval Technologies, McLean, VA) Effective presentation and utilization of search techniques Rick Kenny (Fulcrum Technologies, Ontario, Canada) Beyond document clustering: The knowledge impact statement Gary Stock (Ingenius, Kalamazoo, MI) Automated change monitoring Gary Culliss (Direct Hit, Wellesley Hills, MA) User popularity ranked search engines Byron Dom (IBM, CA) Automatically finding the best pages on the World Wide Web (CLEVER) Peter Tomassi (LookSmart, San Francisco, CA) Adding human intellect to search technology Session Three: Panel discussion: Human v automated categorization and editing Ev Brenner (New York, NY)- Chairman James Callan (University of Massachusetts, MA) Marc Krellenstein (Northern Light Technology, Cambridge, MA) Dan Miller (Ask Jeeves, Berkeley, CA) Session Four: Updates and a twelve month perspective Steve Arnold (AIT, Harrods Creek, KY) Review: The leading edge in search and retrieval software Ellen Voorhees (NIST, Gaithersburg, MD) TREC update Session Five: Search engines now and beyond Intelligent Agents John Snyder (Muscat, Cambridge, England) Practical issues behind intelligent agents Text summarization Therese Firmin, (Dept of Defense, Ft George G. Meade, MD) The TIPSTER/SUMMAC evaluation of automatic text summarization systems Cross language searching Elizabeth Liddy (TextWise, Syracuse, NY) A conceptual interlingua approach to cross-language retrieval. Video search and retrieval Armon Amir (IBM, Almaden, CA) CueVideo: Modular system for automatic indexing and browsing of video/audio Speech recognition Michael Witbrock (Lycos, Waltham, MA) Retrieval of spoken documents Visualization James A. Wise (Integral Visuals, Richland, WA) Information visualization in the new millennium: Emerging science or passing fashion? Text mining David Evans (Claritech, Pittsburgh, PA) Text mining - towards decision support

Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.00

0.0030833227 = product of:
  0.012333291 = sum of:
    0.012333291 = product of:
      0.024666581 = sum of:
        0.024666581 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
          0.024666581 = score(doc=3284,freq=2.0), product of:
            0.15938555 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045514934 = queryNorm
            0.15476047 = fieldWeight in 3284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=3284)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 1.2010 14:41:24

Search (7 results, page 1 of 1)

Authors

Years

Languages

Types

Themes