Search (60 results, page 1 of 3)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.26

0.25833747 = product of:
  0.3875062 = sum of:
    0.054560162 = product of:
      0.16368048 = sum of:
        0.16368048 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.16368048 = score(doc=562,freq=2.0), product of:
            0.29123706 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03435205 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
    0.16368048 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.16368048 = score(doc=562,freq=2.0), product of:
        0.29123706 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03435205 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.16368048 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.16368048 = score(doc=562,freq=2.0), product of:
        0.29123706 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03435205 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.005585074 = product of:
      0.02792537 = sum of:
        0.02792537 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.02792537 = score(doc=562,freq=2.0), product of:
            0.120295025 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03435205 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.2 = coord(1/5)
  0.6666667 = coord(4/6)

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
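
The score breakdown for the Hotho and Bloehdorn record above can be re-added by hand. The following is a minimal sketch, not the search engine's own code: it copies the leaf weights from the listing and applies the nested coord factors in the order the tree shows. The constant and function names (`W_3A`, `W_2F`, `W_22`, `coord`, `combine_doc_562`) are made up for this illustration.

```python
# Leaf weights copied from the explain listing for doc 562.
W_3A = 0.16368048  # weight(_text_:3a in 562)
W_2F = 0.16368048  # weight(_text_:2f in 562), listed twice in the tree
W_22 = 0.02792537  # weight(_text_:22 in 562)


def coord(matched: int, total: int) -> float:
    """Coordination factor: fraction of query clauses that matched."""
    return matched / total


def combine_doc_562() -> float:
    """Recombine the listed leaf weights the way the explain tree nests them."""
    clause_3a = W_3A * coord(1, 3)   # inner product with coord(1/3)
    clause_22 = W_22 * coord(1, 5)   # inner product with coord(1/5)
    total = clause_3a + W_2F + W_2F + clause_22  # the outer "sum of"
    return total * coord(4, 6)       # outer product with coord(4/6)


score = combine_doc_562()
print(f"{score:.4f}")
```

The result should agree with the listed 0.25833747 up to single-precision rounding, since the listing's values appear to be 32-bit floats.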

Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.02

0.019849308 = product of:
  0.05954792 = sum of:
    0.053032 = weight(_text_:suchmaschinen in 1673) [ClassicSimilarity], result of:
      0.053032 = score(doc=1673,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.3455367 = fieldWeight in 1673, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1673)
    0.00651592 = product of:
      0.0325796 = sum of:
        0.0325796 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
          0.0325796 = score(doc=1673,freq=2.0), product of:
            0.120295025 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03435205 = queryNorm
            0.2708308 = fieldWeight in 1673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1673)
      0.2 = coord(1/5)
  0.33333334 = coord(2/6)

Date: 1. 8.1996 22:08:06
Theme: Suchmaschinen

Ardö, A.; Koch, T.: Automatic classification applied to full-text Internet documents in a robot-generated subject index (1999) 0.02

0.015152 = product of:
  0.090912 = sum of:
    0.090912 = weight(_text_:suchmaschinen in 382) [ClassicSimilarity], result of:
      0.090912 = score(doc=382,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.59234864 = fieldWeight in 382, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.09375 = fieldNorm(doc=382)
  0.16666667 = coord(1/6)

Theme: Suchmaschinen
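
The single-term breakdown for the Ardö and Koch record above can be reproduced from its raw inputs. The sketch below is an illustration under stated assumptions: it takes tf as the square root of the term frequency and idf as 1 + ln(maxDocs / (docFreq + 1)), which is the form that reproduces the idf values shown in this listing; other Lucene versions may define idf slightly differently. The function name `classic_term_score` is hypothetical, and `queryNorm` and `fieldNorm` are taken from the listing as given rather than derived.

```python
import math


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       field_norm: float, query_norm: float) -> float:
    """Leaf score = queryWeight * fieldWeight, as laid out in the explain tree."""
    tf = math.sqrt(freq)                             # 1.4142135 for freq=2.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # assumed idf form
    query_weight = idf * query_norm                  # "queryWeight, product of"
    field_weight = tf * idf * field_norm             # "fieldWeight, product of"
    return query_weight * field_weight


# Inputs copied from the listing for doc 382, term "suchmaschinen".
term = classic_term_score(freq=2.0, doc_freq=1378, max_docs=44218,
                          field_norm=0.09375, query_norm=0.03435205)
final = term * (1 / 6)  # coord(1/6): one of six query clauses matched
print(f"{term:.4f} {final:.4f}")
```

Because every other "suchmaschinen" hit in this list has the same freq, idf and queryNorm, the differences in their scores come from `fieldNorm` alone, which typically reflects field length.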

Krellenstein, M.: Document classification at Northern Light (1999) 0.02

0.015152 = product of:
  0.090912 = sum of:
    0.090912 = weight(_text_:suchmaschinen in 4435) [ClassicSimilarity], result of:
      0.090912 = score(doc=4435,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.59234864 = fieldWeight in 4435, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.09375 = fieldNorm(doc=4435)
  0.16666667 = coord(1/6)

Theme: Suchmaschinen

Wätjen, H.-J.: Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web : das DFG-Projekt GERHARD (1998) 0.01

0.012626667 = product of:
  0.07576 = sum of:
    0.07576 = weight(_text_:suchmaschinen in 3066) [ClassicSimilarity], result of:
      0.07576 = score(doc=3066,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.49362388 = fieldWeight in 3066, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.078125 = fieldNorm(doc=3066)
  0.16666667 = coord(1/6)

Footnote: Paper presented at the 20th Online-Tagung der Deutschen Gesellschaft für Dokumentation, 5-7 May 1998. Session 3: WWW-Suchmaschinen

Wätjen, H.-J.: GERHARD : Automatisches Sammeln, Klassifizieren und Indexieren von wissenschaftlich relevanten Informationsressourcen im deutschen World Wide Web (1998) 0.01
```
0.008838667 = product of:
  0.053032 = sum of:
    0.053032 = weight(_text_:suchmaschinen in 3064) [ClassicSimilarity], result of:
      0.053032 = score(doc=3064,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.3455367 = fieldWeight in 3064, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3064)
  0.16666667 = coord(1/6)
```
Abstract

The intellectual indexing of the Internet is in crisis. Yahoo and other services cannot keep pace with the growth of the Web. GERHARD is currently the only search and navigation service worldwide that also classifies, fully automatically, the Internet resources collected by a robot, using computational-linguistic and statistical methods. Well over one million HTML documents from academically relevant servers in Germany can be searched in the database, as with other search engines, but can also be explored by navigating the trilingual Universal Decimal Classification (ETH-Bibliothek Zürich).

Ozmutlu, S.; Cosar, G.C.: Analyzing the results of automatic new topic identification (2008) 0.01

0.007576 = product of:
  0.045456 = sum of:
    0.045456 = weight(_text_:suchmaschinen in 2604) [ClassicSimilarity], result of:
      0.045456 = score(doc=2604,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.29617432 = fieldWeight in 2604, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.046875 = fieldNorm(doc=2604)
  0.16666667 = coord(1/6)

Theme: Suchmaschinen

Puzicha, J.: Informationen finden! : Intelligente Suchmaschinentechnologie & automatische Kategorisierung (2007) 0.01
```
0.007576 = product of:
  0.045456 = sum of:
    0.045456 = weight(_text_:suchmaschinen in 2817) [ClassicSimilarity], result of:
      0.045456 = score(doc=2817,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.29617432 = fieldWeight in 2817, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.046875 = fieldNorm(doc=2817)
  0.16666667 = coord(1/6)
```
Abstract

As explained in this text, the effectiveness of search and classification systems is determined by the following: 1) the task at hand, 2) the accuracy of the system, 3) the degree of automation to be achieved, 4) the ease of integration into existing systems. These criteria assume that every system, regardless of technology, is able to meet the product's basic requirements in terms of functionality, scalability, and input method. These product characteristics are explained in more detail in the Recommind product literature. Building on these capabilities, the preceding discussion should nevertheless have revealed some clear trends. It is not surprising that recent developments in machine learning and other areas of computer science provide a theoretical starting point for the development of search engine and classification technology. In particular, recent advances in statistical methods (PLSA) and other mathematical tools (SVMs) have achieved breakthrough-level result quality. Added to this are the flexibility in application offered by the self-training and category recognition of PLSA systems, as well as a new generation of previously unattained productivity improvements.

Krüger, C.: Evaluation des WWW-Suchdienstes GERHARD unter besonderer Beachtung automatischer Indexierung (1999) 0.01

0.0063133333 = product of:
  0.03788 = sum of:
    0.03788 = weight(_text_:suchmaschinen in 1777) [ClassicSimilarity], result of:
      0.03788 = score(doc=1777,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.24681194 = fieldWeight in 1777, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1777)
  0.16666667 = coord(1/6)

Theme: Suchmaschinen

Savic, D.: Designing an expert system for classifying office documents (1994) 0.01

0.005102382 = product of:
  0.030614292 = sum of:
    0.030614292 = product of:
      0.07653573 = sum of:
        0.03896392 = weight(_text_:28 in 2655) [ClassicSimilarity], result of:
          0.03896392 = score(doc=2655,freq=2.0), product of:
            0.12305808 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03435205 = queryNorm
            0.31663033 = fieldWeight in 2655, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0625 = fieldNorm(doc=2655)
        0.03757181 = weight(_text_:29 in 2655) [ClassicSimilarity], result of:
          0.03757181 = score(doc=2655,freq=2.0), product of:
            0.12083977 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03435205 = queryNorm
            0.31092256 = fieldWeight in 2655, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=2655)
      0.4 = coord(2/5)
  0.16666667 = coord(1/6)

Source: Records management quarterly. 28(1994) no.3, S.20-29

Oberhauser, O.: Automatisches Klassifizieren : Verfahren zur Erschließung elektronischer Dokumente (2004) 0.01
```
0.0050506666 = product of:
  0.030304 = sum of:
    0.030304 = weight(_text_:suchmaschinen in 2487) [ClassicSimilarity], result of:
      0.030304 = score(doc=2487,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.19744955 = fieldWeight in 2487, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03125 = fieldNorm(doc=2487)
  0.16666667 = coord(1/6)
```
Abstract

Automatic classification of text documents means the machine assignment of one or more notations from a given classification system to natural-language texts by means of a suitable algorithm. In the form of a comprehensive literature study, this work establishes the current state of knowledge on the possible uses of automatic classification for the subject indexing of electronic documents, in particular Web resources. This concerns the methodological aspect on the one hand and the experience gained in relevant projects and applications on the other. Methodologically, statistical methods are today considered state of the art: they are based on machine learning and use already classified example documents to build a model, a "classifier", that can be used to classify new documents. However, the four "major" projects on the automatic classification of Web resources carried out in the 1990s at the universities of Lund, Wolverhampton and Oldenburg and at OCLC (Dublin, OH), which are analysed in detail in this work, still used simpler or older methodological approaches. These projects represent an important gain in experience, particularly because they used established library classification systems, even though they have not so far led to permanent services of satisfactory quality for the indexing of electronic resources. The analysis of the other relevant applications and projects shows that the most active efforts to deploy systems for the automatic classificatory indexing of electronic documents in ongoing operations are currently found in patent and media documentation.
Semi-automatic systems dominate, however; these support human indexers with classification suggestions, because the classification quality currently achievable is usually not yet sufficient for full automation. Further interesting applications and projects can be found in the area of Web portals, search engines and (commercial) information services, whereas in librarianship, for example, hardly any noteworthy interest in the automatic classification of books or bibliographic records can be observed. The study concludes with a discussion of the most important projects and applications and of some questions and topics relevant to automatic classification.

Search Engines and Beyond : Developing efficient knowledge management systems, April 19-20 1999, Boston, Mass (1999) 0.01

0.0050506666 = product of:
  0.030304 = sum of:
    0.030304 = weight(_text_:suchmaschinen in 2596) [ClassicSimilarity], result of:
      0.030304 = score(doc=2596,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.19744955 = fieldWeight in 2596, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03125 = fieldNorm(doc=2596)
  0.16666667 = coord(1/6)

Theme: Suchmaschinen

Savic, D.: Automatic classification of office documents : review of available methods and techniques (1995) 0.00

0.004464585 = product of:
  0.026787508 = sum of:
    0.026787508 = product of:
      0.06696877 = sum of:
        0.034093432 = weight(_text_:28 in 2219) [ClassicSimilarity], result of:
          0.034093432 = score(doc=2219,freq=2.0), product of:
            0.12305808 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03435205 = queryNorm
            0.27705154 = fieldWeight in 2219, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2219)
        0.032875333 = weight(_text_:29 in 2219) [ClassicSimilarity], result of:
          0.032875333 = score(doc=2219,freq=2.0), product of:
            0.12083977 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03435205 = queryNorm
            0.27205724 = fieldWeight in 2219, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2219)
      0.4 = coord(2/5)
  0.16666667 = coord(1/6)

Date: 23. 7.1996 10:28:09
Source: Records management quarterly. 29(1995) no.4, S.3-18

Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.00

0.0044448692 = product of:
  0.026669214 = sum of:
    0.026669214 = product of:
      0.06667303 = sum of:
        0.034093432 = weight(_text_:28 in 2560) [ClassicSimilarity], result of:
          0.034093432 = score(doc=2560,freq=2.0), product of:
            0.12305808 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03435205 = queryNorm
            0.27705154 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
        0.0325796 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
          0.0325796 = score(doc=2560,freq=2.0), product of:
            0.120295025 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03435205 = queryNorm
            0.2708308 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
      0.4 = coord(2/5)
  0.16666667 = coord(1/6)

Date: 28. 9.2003 11:42:17
22. 9.2008 18:31:54

Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.00

0.0038098875 = product of:
  0.022859324 = sum of:
    0.022859324 = product of:
      0.057148308 = sum of:
        0.02922294 = weight(_text_:28 in 3051) [ClassicSimilarity], result of:
          0.02922294 = score(doc=3051,freq=2.0), product of:
            0.12305808 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03435205 = queryNorm
            0.23747274 = fieldWeight in 3051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.046875 = fieldNorm(doc=3051)
        0.02792537 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
          0.02792537 = score(doc=3051,freq=2.0), product of:
            0.120295025 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03435205 = queryNorm
            0.23214069 = fieldWeight in 3051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3051)
      0.4 = coord(2/5)
  0.16666667 = coord(1/6)

Date: 22. 8.2009 19:51:28

Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.00

0.0031889891 = product of:
  0.019133935 = sum of:
    0.019133935 = product of:
      0.047834836 = sum of:
        0.024352452 = weight(_text_:28 in 1853) [ClassicSimilarity], result of:
          0.024352452 = score(doc=1853,freq=2.0), product of:
            0.12305808 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03435205 = queryNorm
            0.19789396 = fieldWeight in 1853, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1853)
        0.023482382 = weight(_text_:29 in 1853) [ClassicSimilarity], result of:
          0.023482382 = score(doc=1853,freq=2.0), product of:
            0.12083977 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03435205 = queryNorm
            0.19432661 = fieldWeight in 1853, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1853)
      0.4 = coord(2/5)
  0.16666667 = coord(1/6)

Date: 6. 1.1997 18:30:28
Source: Knowledge organization. 29(2002) nos.3/4, S.181-197

Giorgetti, D.; Sebastiani, F.: Automating survey coding by multiclass text categorization techniques (2003) 0.00
```
0.0031889891 = product of:
  0.019133935 = sum of:
    0.019133935 = product of:
      0.047834836 = sum of:
        0.024352452 = weight(_text_:28 in 5172) [ClassicSimilarity], result of:
          0.024352452 = score(doc=5172,freq=2.0), product of:
            0.12305808 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03435205 = queryNorm
            0.19789396 = fieldWeight in 5172, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5172)
        0.023482382 = weight(_text_:29 in 5172) [ClassicSimilarity], result of:
          0.023482382 = score(doc=5172,freq=2.0), product of:
            0.12083977 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03435205 = queryNorm
            0.19432661 = fieldWeight in 5172, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5172)
      0.4 = coord(2/5)
  0.16666667 = coord(1/6)
```
Abstract

In this issue Giorgetti and Sebastiani suggest that answers to open-ended questions in survey instruments can be coded automatically by creating classifiers which learn from training sets of manually coded answers. The manual effort required is only that of classifying a representative set of documents, not creating a dictionary of words that trigger an assignment. They use a naive Bayesian probabilistic learner from McCallum's RAINBOW package and the multi-class support vector machine learner from Hsu and Lin's BSVM package, both examples of text categorization techniques. Data from the 1996 General Social Survey by the U.S. National Opinion Research Center provided a set of answers to three questions (previously tested by Viechnicki using a dictionary approach), their associated manually assigned category codes, and a complete set of predefined category codes. The learners were run on three random disjoint subsets of the answer sets to create the classifiers and a remaining set was used as a test set. The dictionary approach is outperformed by 18% for RAINBOW and by 17% for BSVM, while the standard deviation of the results is reduced by 28% and 34% respectively over the dictionary approach.

Date: 9. 7.2006 10:29:12

Oberhauser, O.: Automatisches Klassifizieren : Entwicklungsstand - Methodik - Anwendungsbereiche (2005) 0.00
```
0.0031566666 = product of:
  0.01894 = sum of:
    0.01894 = weight(_text_:suchmaschinen in 38) [ClassicSimilarity], result of:
      0.01894 = score(doc=38,freq=2.0), product of:
        0.15347718 = queryWeight, product of:
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.03435205 = queryNorm
        0.12340597 = fieldWeight in 38, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4677734 = idf(docFreq=1378, maxDocs=44218)
          0.01953125 = fieldNorm(doc=38)
  0.16666667 = coord(1/6)
```
Abstract

Automatic classification of text documents means the machine assignment of one or more notations from a given classification system to natural-language texts by means of a suitable algorithm. In the form of a comprehensive literature study, this work establishes the current state of knowledge on the possible uses of automatic classification for the subject indexing of electronic documents, in particular Web resources. This concerns the methodological aspect on the one hand and the experience gained in relevant projects and applications on the other. Methodologically, statistical methods are today considered state of the art: they are based on machine learning and use already classified example documents to build a model, a "classifier", that can be used to classify new documents. However, the four "major" projects on the automatic classification of Web resources carried out in the 1990s at the universities of Lund, Wolverhampton and Oldenburg and at OCLC (Dublin, OH), which are analysed in detail in this work, still used simpler or older methodological approaches. These projects represent an important gain in experience, particularly because they used established library classification systems, even though they have not so far led to permanent services of satisfactory quality for the indexing of electronic resources. The analysis of the other relevant applications and projects shows that the most active efforts to deploy systems for the automatic classificatory indexing of electronic documents in ongoing operations are currently found in patent and media documentation.
Semi-automatic systems dominate, however; these support human indexers with classification suggestions, because the classification quality currently achievable is usually not yet sufficient for full automation. Further interesting applications and projects can be found in the area of Web portals, search engines and (commercial) information services, whereas in librarianship, for example, hardly any noteworthy interest in the automatic classification of books or bibliographic records can be observed. The study concludes with a discussion of the most important projects and applications and of some questions and topics relevant to automatic classification.

Khoo, C.S.G.; Ng, K.; Ou, S.: An exploratory study of human clustering of Web pages (2003) 0.00

0.0025399253 = product of:
  0.015239551 = sum of:
    0.015239551 = product of:
      0.038098875 = sum of:
        0.01948196 = weight(_text_:28 in 2741) [ClassicSimilarity], result of:
          0.01948196 = score(doc=2741,freq=2.0), product of:
            0.12305808 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03435205 = queryNorm
            0.15831517 = fieldWeight in 2741, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
        0.018616915 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
          0.018616915 = score(doc=2741,freq=2.0), product of:
            0.120295025 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03435205 = queryNorm
            0.15476047 = fieldWeight in 2741, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
      0.4 = coord(2/5)
  0.16666667 = coord(1/6)

Date: 6. 1.1997 18:30:28
12. 9.2004 9:56:22

Panyr, J.: STEINADLER: ein Verfahren zur automatischen Deskribierung und zur automatischen thematischen Klassifikation (1978) 0.00

0.0025047874 = product of:
  0.015028724 = sum of:
    0.015028724 = product of:
      0.07514362 = sum of:
        0.07514362 = weight(_text_:29 in 5169) [ClassicSimilarity], result of:
          0.07514362 = score(doc=5169,freq=2.0), product of:
            0.12083977 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03435205 = queryNorm
            0.6218451 = fieldWeight in 5169, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.125 = fieldNorm(doc=5169)
      0.2 = coord(1/5)
  0.16666667 = coord(1/6)

Source: Nachrichten für Dokumentation. 29(1978), S.92-96
