Search (52 results, page 1 of 3)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.18

0.1823518 = product of:
  0.24313574 = sum of:
    0.059267204 = product of:
      0.17780161 = sum of:
        0.17780161 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.17780161 = score(doc=562,freq=2.0), product of:
            0.31636283 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03731569 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
    0.17780161 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.17780161 = score(doc=562,freq=2.0), product of:
        0.31636283 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03731569 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.006066913 = product of:
      0.030334564 = sum of:
        0.030334564 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.030334564 = score(doc=562,freq=2.0), product of:
            0.13067318 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03731569 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.2 = coord(1/5)
  0.75 = coord(3/4)

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
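
The score breakdown for this record can be checked by hand. The sketch below recomputes it in Python, assuming the usual Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative only, and queryNorm, fieldNorm and the coord factors are copied from the explain output rather than derived.

```python
import math

MAX_DOCS = 44218          # maxDocs, copied from the explain output
QUERY_NORM = 0.03731569   # queryNorm, copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight, as laid out in the explain tree
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Three matching terms in doc 562, each with freq=2.0 and fieldNorm=0.046875
w_3a = term_score(2.0, 24, 0.046875)
w_2f = term_score(2.0, 24, 0.046875)
w_22 = term_score(2.0, 3622, 0.046875)

# Inner coord factors (1/3 and 1/5) and the outer coord(3/4) from the tree
total = (w_3a * (1 / 3) + w_2f + w_22 * (1 / 5)) * (3 / 4)
print(round(total, 7))
```

The result should agree with the listed 0.1823518 up to small rounding differences, since Lucene computes these values in single precision while Python uses double precision.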

Oberhauser, O.: Automatisches Klassifizieren : Verfahren zur Erschließung elektronischer Dokumente (2004) 0.01
```
0.012944549 = product of:
  0.051778197 = sum of:
    0.051778197 = weight(_text_:lernen in 2487) [ClassicSimilarity], result of:
      0.051778197 = score(doc=2487,freq=2.0), product of:
        0.20909165 = queryWeight, product of:
          5.6033173 = idf(docFreq=442, maxDocs=44218)
          0.03731569 = queryNorm
        0.24763398 = fieldWeight in 2487, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.6033173 = idf(docFreq=442, maxDocs=44218)
          0.03125 = fieldNorm(doc=2487)
  0.25 = coord(1/4)
```
Abstract

Automatic classification of text documents means the machine assignment of one or more notations from a given classification system to natural-language texts by means of a suitable algorithm. In the form of a comprehensive literature study, this work establishes the current state of knowledge on the possible uses of automatic classification for the subject indexing of electronic documents, especially web resources. This concerns both the methodological aspect and the experience gained in relevant projects and applications. Methodologically, statistical methods are today considered "state of the art"; they are based on machine learning and use already classified example documents to build a model, a "classifier", that can be used to classify new documents. However, the four "major" projects on the automatic classification of web resources carried out in the 1990s at the universities of Lund, Wolverhampton and Oldenburg and at OCLC (Dublin, OH), which are analysed in detail in this work, still used simpler or older methodological approaches. These projects represent an important gain in experience, particularly because they used established library classification systems, even though they have not so far led to permanent services of satisfactory quality for indexing electronic resources. The analysis of the other relevant applications and projects shows that the most active efforts to put systems for the automatic classificatory indexing of electronic documents into ongoing operational use are currently found in patent and media documentation.
Semi-automatic systems dominate, however; they support human indexers with classification suggestions, because the classification quality currently achievable is usually not yet sufficient for full automation. Further interesting applications and projects are found in web portals, search engines and (commercial) information services, whereas in librarianship, for example, hardly any notable interest in the automatic classification of books or bibliographic records can be observed. The study concludes with a discussion of the most important projects and applications and of several questions and topics relevant to automatic classification.

Savic, D.: Designing an expert system for classifying office documents (1994) 0.01

0.008313867 = product of:
  0.03325547 = sum of:
    0.03325547 = product of:
      0.08313867 = sum of:
        0.04232544 = weight(_text_:28 in 2655) [ClassicSimilarity], result of:
          0.04232544 = score(doc=2655,freq=2.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.31663033 = fieldWeight in 2655, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0625 = fieldNorm(doc=2655)
        0.040813226 = weight(_text_:29 in 2655) [ClassicSimilarity], result of:
          0.040813226 = score(doc=2655,freq=2.0), product of:
            0.13126493 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03731569 = queryNorm
            0.31092256 = fieldWeight in 2655, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=2655)
      0.4 = coord(2/5)
  0.25 = coord(1/4)

Source: Records management quarterly. 28(1994) no.3, S.20-29

Oberhauser, O.: Automatisches Klassifizieren : Entwicklungsstand - Methodik - Anwendungsbereiche (2005) 0.01
```
0.008090343 = product of:
  0.032361373 = sum of:
    0.032361373 = weight(_text_:lernen in 38) [ClassicSimilarity], result of:
      0.032361373 = score(doc=38,freq=2.0), product of:
        0.20909165 = queryWeight, product of:
          5.6033173 = idf(docFreq=442, maxDocs=44218)
          0.03731569 = queryNorm
        0.15477124 = fieldWeight in 38, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.6033173 = idf(docFreq=442, maxDocs=44218)
          0.01953125 = fieldNorm(doc=38)
  0.25 = coord(1/4)
```
Abstract

Automatic classification of text documents means the machine assignment of one or more notations from a given classification system to natural-language texts by means of a suitable algorithm. In the form of a comprehensive literature study, this work establishes the current state of knowledge on the possible uses of automatic classification for the subject indexing of electronic documents, especially web resources. This concerns both the methodological aspect and the experience gained in relevant projects and applications. Methodologically, statistical methods are today considered "state of the art"; they are based on machine learning and use already classified example documents to build a model, a "classifier", that can be used to classify new documents. However, the four "major" projects on the automatic classification of web resources carried out in the 1990s at the universities of Lund, Wolverhampton and Oldenburg and at OCLC (Dublin, OH), which are analysed in detail in this work, still used simpler or older methodological approaches. These projects represent an important gain in experience, particularly because they used established library classification systems, even though they have not so far led to permanent services of satisfactory quality for indexing electronic resources. The analysis of the other relevant applications and projects shows that the most active efforts to put systems for the automatic classificatory indexing of electronic documents into ongoing operational use are currently found in patent and media documentation.
Semi-automatic systems dominate, however; they support human indexers with classification suggestions, because the classification quality currently achievable is usually not yet sufficient for full automation. Further interesting applications and projects are found in web portals, search engines and (commercial) information services, whereas in librarianship, for example, hardly any notable interest in the automatic classification of books or bibliographic records can be observed. The study concludes with a discussion of the most important projects and applications and of several questions and topics relevant to automatic classification.
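
The two Oberhauser records (doc 2487 and doc 38) match the same single term with the same frequency, so their scores differ only through fieldNorm. A minimal sketch of that comparison follows, assuming the standard ClassicSimilarity formulas; `classic_score` is a hypothetical helper, and all numeric inputs are copied from the explain output.

```python
import math


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    # Assumed ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    tf = math.sqrt(freq)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight * coord


# Values shared by both records for the term "lernen"
common = dict(freq=2.0, doc_freq=442, max_docs=44218,
              query_norm=0.03731569, coord=0.25)

score_2487 = classic_score(field_norm=0.03125, **common)     # doc 2487 (2004)
score_38 = classic_score(field_norm=0.01953125, **common)    # doc 38 (2005)

# Every other factor cancels, so the ratio equals the ratio of the fieldNorms
ratio = score_2487 / score_38
print(score_2487, score_38, ratio)
```

Because fieldNorm in ClassicSimilarity shrinks as the indexed field grows longer, the lower score for doc 38 suggests that its record holds more indexed text than that of doc 2487, although the explain output alone does not confirm this.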

Savic, D.: Automatic classification of office documents : review of available methods and techniques (1995) 0.01

0.0072746337 = product of:
  0.029098535 = sum of:
    0.029098535 = product of:
      0.07274634 = sum of:
        0.03703476 = weight(_text_:28 in 2219) [ClassicSimilarity], result of:
          0.03703476 = score(doc=2219,freq=2.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.27705154 = fieldWeight in 2219, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2219)
        0.03571157 = weight(_text_:29 in 2219) [ClassicSimilarity], result of:
          0.03571157 = score(doc=2219,freq=2.0), product of:
            0.13126493 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03731569 = queryNorm
            0.27205724 = fieldWeight in 2219, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2219)
      0.4 = coord(2/5)
  0.25 = coord(1/4)

Date: 23. 7.1996 10:28:09
Source: Records management quarterly. 29(1995) no.4, S.3-18

Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.01

0.007242508 = product of:
  0.028970033 = sum of:
    0.028970033 = product of:
      0.07242508 = sum of:
        0.03703476 = weight(_text_:28 in 2560) [ClassicSimilarity], result of:
          0.03703476 = score(doc=2560,freq=2.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.27705154 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
        0.035390325 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
          0.035390325 = score(doc=2560,freq=2.0), product of:
            0.13067318 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03731569 = queryNorm
            0.2708308 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
      0.4 = coord(2/5)
  0.25 = coord(1/4)

Date: 28. 9.2003 11:42:17
22. 9.2008 18:31:54

Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.01

0.0062078643 = product of:
  0.024831457 = sum of:
    0.024831457 = product of:
      0.06207864 = sum of:
        0.031744078 = weight(_text_:28 in 3051) [ClassicSimilarity], result of:
          0.031744078 = score(doc=3051,freq=2.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.23747274 = fieldWeight in 3051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.046875 = fieldNorm(doc=3051)
        0.030334564 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
          0.030334564 = score(doc=3051,freq=2.0), product of:
            0.13067318 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03731569 = queryNorm
            0.23214069 = fieldWeight in 3051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3051)
      0.4 = coord(2/5)
  0.25 = coord(1/4)

Date: 22. 8.2009 19:51:28

Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.01

0.0051961667 = product of:
  0.020784667 = sum of:
    0.020784667 = product of:
      0.051961668 = sum of:
        0.0264534 = weight(_text_:28 in 1853) [ClassicSimilarity], result of:
          0.0264534 = score(doc=1853,freq=2.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.19789396 = fieldWeight in 1853, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1853)
        0.025508268 = weight(_text_:29 in 1853) [ClassicSimilarity], result of:
          0.025508268 = score(doc=1853,freq=2.0), product of:
            0.13126493 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03731569 = queryNorm
            0.19432661 = fieldWeight in 1853, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1853)
      0.4 = coord(2/5)
  0.25 = coord(1/4)

Date: 6. 1.1997 18:30:28
Source: Knowledge organization. 29(2002) nos.3/4, S.181-197

Giorgetti, D.; Sebastiani, F.: Automating survey coding by multiclass text categorization techniques (2003) 0.01
```
0.0051961667 = product of:
  0.020784667 = sum of:
    0.020784667 = product of:
      0.051961668 = sum of:
        0.0264534 = weight(_text_:28 in 5172) [ClassicSimilarity], result of:
          0.0264534 = score(doc=5172,freq=2.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.19789396 = fieldWeight in 5172, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5172)
        0.025508268 = weight(_text_:29 in 5172) [ClassicSimilarity], result of:
          0.025508268 = score(doc=5172,freq=2.0), product of:
            0.13126493 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03731569 = queryNorm
            0.19432661 = fieldWeight in 5172, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5172)
      0.4 = coord(2/5)
  0.25 = coord(1/4)
```
Abstract

In this issue, Giorgetti and Sebastiani suggest that answers to open-ended questions in survey instruments can be coded automatically by creating classifiers that learn from training sets of manually coded answers. The manual effort required is only that of classifying a representative set of documents, not creating a dictionary of words that trigger an assignment. They use a naive Bayesian probabilistic learner from McCallum's RAINBOW package and the multi-class support vector machine learner from Hsu and Lin's BSVM package, both examples of text categorization techniques. Data from the 1996 General Social Survey by the U.S. National Opinion Research Center provided a set of answers to three questions (previously tested by Viechnicki using a dictionary approach), their associated manually assigned category codes, and a complete set of predefined category codes. The learners were run on three random disjoint subsets of the answer sets to create the classifiers, and a remaining set was used as a test set. The dictionary approach is outperformed by 18% for RAINBOW and by 17% for BSVM, while the standard deviation of the results is reduced by 28% and 34%, respectively, relative to the dictionary approach.

Date: 9. 7.2006 10:29:12

Khoo, C.S.G.; Ng, K.; Ou, S.: ¬An exploratory study of human clustering of Web pages (2003) 0.00

0.0041385763 = product of:
  0.016554305 = sum of:
    0.016554305 = product of:
      0.041385762 = sum of:
        0.02116272 = weight(_text_:28 in 2741) [ClassicSimilarity], result of:
          0.02116272 = score(doc=2741,freq=2.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.15831517 = fieldWeight in 2741, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
        0.020223042 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
          0.020223042 = score(doc=2741,freq=2.0), product of:
            0.13067318 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03731569 = queryNorm
            0.15476047 = fieldWeight in 2741, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
      0.4 = coord(2/5)
  0.25 = coord(1/4)

Date: 6. 1.1997 18:30:28
12. 9.2004 9:56:22

Panyr, J.: STEINADLER: ein Verfahren zur automatischen Deskribierung und zur automatischen thematischen Klassifikation (1978) 0.00

0.004081323 = product of:
  0.016325291 = sum of:
    0.016325291 = product of:
      0.08162645 = sum of:
        0.08162645 = weight(_text_:29 in 5169) [ClassicSimilarity], result of:
          0.08162645 = score(doc=5169,freq=2.0), product of:
            0.13126493 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03731569 = queryNorm
            0.6218451 = fieldWeight in 5169, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.125 = fieldNorm(doc=5169)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Source: Nachrichten für Dokumentation. 29(1978), S.92-96

Kleinoeder, H.H.; Puzicha, J.: Automatische Katalogisierung am Beispiel einer Pilotanwendung (2002) 0.00

0.0037034762 = product of:
  0.014813905 = sum of:
    0.014813905 = product of:
      0.07406952 = sum of:
        0.07406952 = weight(_text_:28 in 1154) [ClassicSimilarity], result of:
          0.07406952 = score(doc=1154,freq=2.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.5541031 = fieldWeight in 1154, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.109375 = fieldNorm(doc=1154)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Date: 11. 7.2003 13:27:28

Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.00

0.0030334564 = product of:
  0.012133826 = sum of:
    0.012133826 = product of:
      0.060669128 = sum of:
        0.060669128 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
          0.060669128 = score(doc=1046,freq=2.0), product of:
            0.13067318 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03731569 = queryNorm
            0.46428138 = fieldWeight in 1046, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=1046)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Date: 5. 5.2003 14:17:22

Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.00

0.0025278805 = product of:
  0.010111522 = sum of:
    0.010111522 = product of:
      0.05055761 = sum of:
        0.05055761 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
          0.05055761 = score(doc=611,freq=2.0), product of:
            0.13067318 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03731569 = queryNorm
            0.38690117 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Date: 22. 8.2009 12:54:24

HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.00

0.0025278805 = product of:
  0.010111522 = sum of:
    0.010111522 = product of:
      0.05055761 = sum of:
        0.05055761 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
          0.05055761 = score(doc=2748,freq=2.0), product of:
            0.13067318 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03731569 = queryNorm
            0.38690117 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Date: 1. 2.2016 18:25:22

Golub, K.; Hamon, T.; Ardö, A.: Automated classification of textual documents based on a controlled vocabulary in engineering (2007) 0.00

0.0022446455 = product of:
  0.008978582 = sum of:
    0.008978582 = product of:
      0.044892907 = sum of:
        0.044892907 = weight(_text_:28 in 1461) [ClassicSimilarity], result of:
          0.044892907 = score(doc=1461,freq=4.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.3358372 = fieldWeight in 1461, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.046875 = fieldNorm(doc=1461)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Date: 6. 1.1997 18:30:28
28. 2.2008 14:21:51

Desale, S.K.; Kumbhar, R.: Research on automatic classification of documents in library environment : a literature review (2013) 0.00

0.0022446455 = product of:
  0.008978582 = sum of:
    0.008978582 = product of:
      0.044892907 = sum of:
        0.044892907 = weight(_text_:28 in 1071) [ClassicSimilarity], result of:
          0.044892907 = score(doc=1071,freq=4.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.3358372 = fieldWeight in 1071, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.046875 = fieldNorm(doc=1071)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Date: 6. 1.1997 18:30:28
19. 9.2013 19:28:15

Aphinyanaphongs, Y.; Fu, L.D.; Li, Z.; Peskin, E.R.; Efstathiadis, E.; Aliferis, C.F.; Statnikov, A.: ¬A comprehensive empirical comparison of modern supervised classification and feature selection methods for text categorization (2014) 0.00
```
0.0022446455 = product of:
  0.008978582 = sum of:
    0.008978582 = product of:
      0.044892907 = sum of:
        0.044892907 = weight(_text_:28 in 1496) [ClassicSimilarity], result of:
          0.044892907 = score(doc=1496,freq=4.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.3358372 = fieldWeight in 1496, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.046875 = fieldNorm(doc=1496)
      0.2 = coord(1/5)
  0.25 = coord(1/4)
```
Abstract

An important aspect of performing text categorization is selecting appropriate supervised classification and feature selection methods. A comprehensive benchmark is needed to inform best practices in this broad application field. Previous benchmarks have evaluated performance for only a few supervised classification and feature selection methods and limited ways to optimize them. The present work updates prior benchmarks by increasing the number of classifiers and feature selection methods by an order of magnitude, including recently developed, state-of-the-art methods. Specifically, this study used 229 text categorization data sets/tasks, and evaluated 28 classification methods (both well-established and proprietary/commercial) and 19 feature selection methods according to 4 classification performance metrics. We report several key findings that will be helpful in establishing best methodological practices for text categorization.

Date: 26. 9.2014 18:28:57

Na, J.-C.; Sui, H.; Khoo, C.; Chan, S.; Zhou, Y.: Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews (2004) 0.00

0.0018705378 = product of:
  0.007482151 = sum of:
    0.007482151 = product of:
      0.037410755 = sum of:
        0.037410755 = weight(_text_:28 in 2624) [ClassicSimilarity], result of:
          0.037410755 = score(doc=2624,freq=4.0), product of:
            0.13367462 = queryWeight, product of:
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.03731569 = queryNorm
            0.2798643 = fieldWeight in 2624, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5822632 = idf(docFreq=3342, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2624)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Date: 6. 1.1997 18:30:28
28. 8.2004 19:45:47

Ruocco, A.S.; Frieder, O.: Clustering and classification of large document bases in a parallel environment (1997) 0.00

0.0017855786 = product of:
  0.007142314 = sum of:
    0.007142314 = product of:
      0.03571157 = sum of:
        0.03571157 = weight(_text_:29 in 1661) [ClassicSimilarity], result of:
          0.03571157 = score(doc=1661,freq=2.0), product of:
            0.13126493 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.03731569 = queryNorm
            0.27205724 = fieldWeight in 1661, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1661)
      0.2 = coord(1/5)
  0.25 = coord(1/4)

Date: 29. 7.1998 17:45:02
