Search (52 results, page 1 of 3)

  • theme_ss:"Automatisches Klassifizieren"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.19
    Score breakdown (Lucene ClassicSimilarity, doc 562):
    0.19072604 = coord(3/4) × [ 1/3 × 0.17925744 (term "3a") + 0.17925744 (term "2f") + 1/2 × 0.030582938 (term "22") ]
    Each term weight is queryWeight × fieldWeight. For "2f": queryWeight = idf(8.478011, from docFreq=24 of maxDocs=44218) × queryNorm(0.037621226) = 0.3189532; fieldWeight = tf(sqrt(2.0) = 1.4142135) × idf(8.478011) × fieldNorm(0.046875) = 0.56201804; their product is 0.17925744. The term "22" scores lower because of its much lower idf (3.5018296, docFreq=3622).
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
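    Example
    The decimal value after each title is the entry's Lucene relevance score; the breakdown above shows how ClassicSimilarity assembles it for this first hit. As an illustration (using only the numbers from that breakdown), a few lines of Python reproduce the arithmetic:

      import math

      # ClassicSimilarity term weight = queryWeight * fieldWeight.
      idf = 8.478011                 # idf(docFreq=24, maxDocs=44218)
      query_norm = 0.037621226
      field_norm = 0.046875          # fieldNorm(doc=562)
      freq = 2.0                     # term frequency in the document

      query_weight = idf * query_norm            # 0.3189532
      tf = math.sqrt(freq)                       # 1.4142135
      field_weight = tf * idf * field_norm       # 0.56201804
      term_score = query_weight * field_weight   # 0.17925744

      # Document score: coord(3/4) times the sum of the matching term scores.
      score = 0.75 * (term_score / 3 + term_score + 0.030582938 / 2)
      print(round(score, 6))                     # 0.190726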
  2. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.03
    
    Abstract
    Classifying objects (e.g. fauna, flora, texts) is a process based on human intelligence. Computer science, and in particular the field of Artificial Intelligence (AI), investigates, among other things, to what extent processes that require human intelligence can be automated. It has turned out that solving everyday problems poses a greater challenge than solving specialized problems such as building a chess computer ("Rybka", for instance, has been the reigning computer chess world champion since June 2007). To what extent everyday problems can be solved with methods of Artificial Intelligence is, for the general case, still an open question. Natural language processing, e.g. language understanding, plays an essential role in solving everyday problems. Realizing "common sense" in a machine (in the Cyc knowledge base, in the form of facts and rules) has been Lenat's goal since 1984. Regarding "Cyc", the AI showcase project, there are Cyc optimists and Cyc pessimists. Understanding natural language (e.g. work titles, abstracts, prefaces, tables of contents) is also necessary for the intellectual classification of bibliographic title records or online publications, in order to classify these text objects correctly. Since 2007, the Deutsche Nationalbibliothek has been intellectually classifying nearly all publications with the Dewey Decimal Classification (DDC).
    Date
    22. 1.2010 14:41:24
  3. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.03
    
    Abstract
    The classification of bibliographic units is indispensable for systematic access to a library's holdings and for their shelf arrangement. Until now this task has been carried out manually by subject experts, either individually according to a locally developed classification scheme or cooperatively according to a shared one. This work presents a procedure for automating the classification process. It employs case-based reasoning, a technique developed in the context of artificial intelligence research. For every work for which bibliographic data is available, the procedure delivers one or more candidate classifications. Experiments compare the results of the automatic classification with those produced by subject experts. These experiments demonstrate the high quality of the automatic classification and show that the procedure can relieve subject experts of a significant share of the classification work. Even the nearly complete reclassification of a library catalogue is possible, with some limitations.
    Date
    22. 8.2009 19:51:28
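    Example
    A minimal sketch of the case-based idea from the abstract: retrieve the most similar already-classified records and reuse their notations. The sample records, RVK notations and the token-overlap similarity are illustrative assumptions, not Pfeffer's actual implementation:

      from collections import Counter

      def tokens(record):
          # Crude feature extraction: lower-cased words of a title/keyword string.
          return {t.lower() for t in record.split() if len(t) > 2}

      def jaccard(a, b):
          return len(a & b) / len(a | b) if a | b else 0.0

      def suggest_notations(case_base, new_record, k=5):
          # case_base: list of (bibliographic text, RVK notation) pairs.
          query = tokens(new_record)
          ranked = sorted(case_base,
                          key=lambda case: jaccard(query, tokens(case[0])),
                          reverse=True)
          votes = Counter(notation for _, notation in ranked[:k])
          return votes.most_common()   # candidate classifications, best first

      case_base = [                    # hypothetical, already-classified records
          ("Einführung in die Künstliche Intelligenz", "ST 300"),
          ("Maschinelles Lernen und neuronale Netze", "ST 301"),
          ("Systematische Aufstellung in Bibliotheken", "AN 70400"),
      ]
      print(suggest_notations(case_base, "Künstliche Intelligenz : ein Lehrbuch", k=1))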
  4. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.02
    
    Abstract
    We describe the latent semantic indexing subspace signature model (LSISSM) for semantic content representation of unstructured text. Grounded in singular value decomposition, the model represents terms and documents by the distribution signatures of their statistical contribution across the top-ranking latent concept dimensions. LSISSM matches term signatures with document signatures according to their mapping coherence between the latent semantic indexing (LSI) term subspace and the LSI document subspace. LSISSM performs feature reduction and finds a low-rank approximation of scalable and sparse term-document matrices. Experiments demonstrate that this approach significantly improves the performance of major clustering algorithms such as standard K-means and self-organizing maps compared with the vector space model and the traditional LSI model. The unique contribution-ranking mechanism in LSISSM also improves the initialization of standard K-means compared with the random seeding procedure, which sometimes causes low efficiency and effectiveness of clustering. A two-stage initialization strategy based on LSISSM significantly reduces the running time of standard K-means procedures.
    Date
    23. 3.2013 13:22:36
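    Example
    For context, a sketch of the conventional pipeline that the abstract measures LSISSM against: project TF-IDF document vectors into an LSI subspace via truncated SVD, then cluster with randomly seeded standard K-means (this is the baseline, not the LSISSM model itself; scikit-learn and toy documents are assumed):

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.decomposition import TruncatedSVD
      from sklearn.cluster import KMeans

      docs = [
          "latent semantic indexing of unstructured text",
          "singular value decomposition of term document matrices",
          "clustering web pages with k means",
          "self organizing maps for document clustering",
      ]

      tfidf = TfidfVectorizer().fit_transform(docs)             # sparse term-document matrix
      lsi = TruncatedSVD(n_components=2).fit_transform(tfidf)   # top latent dimensions
      labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(lsi)
      print(labels)                                             # one cluster id per document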
  5. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.02
    
    Date
    22. 9.2008 18:31:54
  6. Pfeffer, M.: Automatische Vergabe von RVK-Notationen anhand von bibliografischen Daten mittels fallbasiertem Schließen (2007) 0.01
    
    Abstract
    The classification of bibliographic units is indispensable for systematic access to a library's holdings and for their shelf arrangement. Until now this task has been carried out manually by subject experts, either individually according to a locally developed classification scheme or cooperatively according to a shared one. This work presents a procedure for automating the classification process. It employs case-based reasoning, a technique developed in the context of artificial intelligence research. For every work for which bibliographic data is available, the procedure delivers one or more candidate classifications. Experiments compare the results of the automatic classification with those produced by subject experts. These experiments demonstrate the high quality of the automatic classification and show that the procedure can relieve subject experts of a significant share of the classification work. Even the nearly complete reclassification of a library catalogue is possible, with some limitations.
  7. Kasprzik, A.: Automatisierte und semiautomatisierte Klassifizierung : eine Analyse aktueller Projekte (2014) 0.01
    
    Abstract
    The rapid growth in the volume of digitally available documents, combined with the shortage of time and staff at research libraries, suggests the use of semi- or fully automatic procedures for verbal and classificatory subject indexing. After a brief general introduction to the standard methodology, this article examines a number of automated classification projects from the period 2007-2012 and from the German-speaking countries. Most of the projects presented use machine learning methods from Artificial Intelligence, usually work with adapted versions of a commercial software product, and generally target the Dewey Decimal Classification (DDC). Metadata records, abstracts, tables of contents and full texts in various data formats serve as input. The concluding analysis arranges the projects according to a number of different criteria and summarizes the current state of affairs and the greatest challenges for automated classification procedures.
  8. Khoo, C.S.G.; Ng, K.; Ou, S.: An exploratory study of human clustering of Web pages (2003) 0.01
    
    Date
    12. 9.2004 9:56:22
  9. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01
    
    Date
    5. 5.2003 14:17:22
  10. Sparck Jones, K.: Automatic classification (1976) 0.01
    
  11. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.01
    
    Date
    22. 8.2009 12:54:24
  12. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.01
    
    Date
    1. 2.2016 18:25:22
  13. Yu, W.; Gong, Y.: Document clustering by concept factorization (2004) 0.01
    
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin et al.
  14. Kwon, O.W.; Lee, J.H.: Text categorization based on k-nearest neighbor approach for web site classification (2003) 0.00
    
    Abstract
    Automatic categorization is a viable method to deal with the scaling problem on the World Wide Web. For Web site classification, this paper proposes the use of Web pages linked with the home page, in a different manner from the sole use of home pages in previous research. To implement our proposed method, we derive a scheme for Web site classification based on the k-nearest neighbor (k-NN) approach. It consists of three phases: Web page selection (connectivity analysis), Web page classification, and Web site classification. Given a Web site, the Web page selection chooses several representative Web pages using connectivity analysis. The k-NN classifier next classifies each of the selected Web pages. Finally, the classified Web pages are extended to a classification of the entire Web site. To improve performance, we supplement the k-NN approach with a feature selection method and a term weighting scheme using markup tags, and also reform its document-document similarity measure. In our experiments on a Korean commercial Web directory, the proposed system, using both a home page and its linked pages, improved the micro-averaged breakeven point by 30.02% compared with an ordinary classification which uses a home page only.
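    Example
    The page-classification core of this pipeline, sketched as a cosine-distance k-NN over plain TF-IDF vectors; the paper additionally applies markup-tag term weighting, feature selection and a reformed similarity measure, and the pages and labels below are invented:

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.neighbors import KNeighborsClassifier

      train_pages = [
          "online shop shopping cart checkout payment",
          "buy cheap electronics free shipping deals",
          "university course lecture research faculty",
          "journal article peer review academic study",
      ]
      labels = ["commerce", "commerce", "academic", "academic"]

      vec = TfidfVectorizer()
      X = vec.fit_transform(train_pages)
      knn = KNeighborsClassifier(n_neighbors=3, metric="cosine").fit(X, labels)

      new_page = vec.transform(["research lecture on academic peer review"])
      print(knn.predict(new_page))     # ['academic']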
  15. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.00
    
    Pages
    pp. 1-22
  16. Dubin, D.: Dimensions and discriminability (1998) 0.00
    
    Date
    22. 9.1997 19:16:05
  17. Automatic classification research at OCLC (2002) 0.00
    
    Date
    5. 5.2003 9:22:09
  18. Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.00
    
    Date
    1. 8.1996 22:08:06
  19. Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.00
    
    Date
    22. 7.2006 16:24:52
  20. Wätjen, H.-J.; Diekmann, B.; Möller, G.; Carstensen, K.-U.: Bericht zum DFG-Projekt: GERHARD : German Harvest Automated Retrieval and Directory (1998) 0.00
    

Languages

  • e 43
  • d 8
  • a 1

Types

  • a 45
  • el 9
  • r 1
  • x 1