Document (#26564)

Author
Walther, R.
Title
Möglichkeiten und Grenzen automatischer Klassifikationen von Web-Dokumenten
Imprint
Bern : Rechts- und Wirtschaftswissenschaftlichen Fakultät
Year
2001
Pages
97 pp.
Abstract
Automatic classification of web and other text documents makes it possible to provide ordered access to internal and external company information. Research on automatic classification has intensified in recent years. The result is a range of methods that are now used in practice, individually or in combination. In addition to general principles, this licentiate thesis examines several methods of automatic classification in more detail and discusses their possibilities and limits. It also presents the results of a survey of vendors of software solutions for the automatic classification of text documents. The findings serve myax internet AG as a basis for developing its own classification product.
Content
Also available at: http://www.ie.iwi.unibe.ch/roundtable/april/hostettler/lizwalther.pdf
Footnote
Licentiate thesis at the Rechts- und Wirtschaftswissenschaftliche Fakultät of the Universität Bern, Institut für Wirtschaftsinformatik (Prof. G. Knolmayer)
Theme
Automatisches Klassifizieren
Internet

Similar documents (author)

  1. Walther, J.: La construction d'un langage documentaire pluridisciplinaire (1992) 5.59
    5.5903964 = sum of:
      5.5903964 = weight(author_txt:walther in 2309) [ClassicSimilarity], result of:
        5.5903964 = fieldWeight in 2309, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.944634 = idf(docFreq=14, maxDocs=42306)
          0.625 = fieldNorm(doc=2309)
    
  2. Walther, C.: Wie Deutschland zur Dezimalklassifikation kam (1957) 5.59
    5.5903964 = sum of:
      5.5903964 = weight(author_txt:walther in 5009) [ClassicSimilarity], result of:
        5.5903964 = fieldWeight in 5009, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.944634 = idf(docFreq=14, maxDocs=42306)
          0.625 = fieldNorm(doc=5009)
    
  3. Walther, R.: In vierundzwanzig Bänden um die Welt : Die Neuauflage des 'Großen Brockhaus': wie die Enzyklopädie das Wissen der Gegenwart inventarisiert (1996) 5.59
    5.5903964 = sum of:
      5.5903964 = weight(author_txt:walther in 6349) [ClassicSimilarity], result of:
        5.5903964 = fieldWeight in 6349, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.944634 = idf(docFreq=14, maxDocs=42306)
          0.625 = fieldNorm(doc=6349)
    
  4. Walther, R.: Wille und Kraft aller einzelnen Glieder : Mit Abschluß seiner 20. Auflage wird der 'Brockhaus' eingestellt (1999) 5.59
    5.5903964 = sum of:
      5.5903964 = weight(author_txt:walther in 4389) [ClassicSimilarity], result of:
        5.5903964 = fieldWeight in 4389, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.944634 = idf(docFreq=14, maxDocs=42306)
          0.625 = fieldNorm(doc=4389)
    
  5. Walther, R.: Wanderung aus gestorbenen Systemen : Bibliotheken bemühen sich, digital archivierte Texte trotz des Wandels der Technik zugänglich zu halten (2003) 5.59
    5.5903964 = sum of:
      5.5903964 = weight(author_txt:walther in 2484) [ClassicSimilarity], result of:
        5.5903964 = fieldWeight in 2484, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.944634 = idf(docFreq=14, maxDocs=42306)
          0.625 = fieldNorm(doc=2484)
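
Note on the author scores: all five entries share the value 5.59 because each matches the single term "walther" once and has the same field norm. The arithmetic can be reproduced with the short sketch below. It assumes the classic TF-IDF definitions that are consistent with the figures shown (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library.

```python
import math

def tf(freq: float) -> float:
    # Assumed classic TF-IDF term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed inverse document frequency; it reproduces the idf values in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the breakdown above.
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

# Values taken from the author entries: freq=1.0, docFreq=14, maxDocs=42306, fieldNorm=0.625
author_score = field_weight(1.0, 14, 42306, 0.625)
print(f"{author_score:.2f}")
```

Because the query consists of one term with no query-side weighting shown, the field weight is the whole score here.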
    

Similar documents (content)

  1. Hoffmann, R.: Entwicklung einer benutzerunterstützten automatisierten Klassifikation von Web - Dokumenten : Untersuchung gegenwärtiger Methoden zur automatisierten Dokumentklassifikation und Implementierung eines Prototyps zum verbesserten Information Retrieval für das xFIND System (2002) 0.28
    0.28034994 = sum of:
      0.28034994 = product of:
        1.0012498 = sum of:
          0.06832435 = weight(abstract_txt:automatischer in 198) [ClassicSimilarity], result of:
            0.06832435 = score(doc=198,freq=1.0), product of:
              0.17441818 = queryWeight, product of:
                1.1454052 = boost
                8.356848 = idf(docFreq=26, maxDocs=42306)
                0.01822175 = queryNorm
              0.39172724 = fieldWeight in 198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.356848 = idf(docFreq=26, maxDocs=42306)
                0.046875 = fieldNorm(doc=198)
          0.040127184 = weight(abstract_txt:möglichkeiten in 198) [ClassicSimilarity], result of:
            0.040127184 = score(doc=198,freq=1.0), product of:
              0.15411462 = queryWeight, product of:
                1.5226504 = boost
                5.55461 = idf(docFreq=444, maxDocs=42306)
                0.01822175 = queryNorm
              0.26037234 = fieldWeight in 198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.55461 = idf(docFreq=444, maxDocs=42306)
                0.046875 = fieldNorm(doc=198)
          0.091800176 = weight(abstract_txt:methoden in 198) [ClassicSimilarity], result of:
            0.091800176 = score(doc=198,freq=4.0), product of:
              0.16856228 = queryWeight, product of:
                1.5924231 = boost
                5.8091397 = idf(docFreq=344, maxDocs=42306)
                0.01822175 = queryNorm
              0.54460686 = fieldWeight in 198, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.8091397 = idf(docFreq=344, maxDocs=42306)
                0.046875 = fieldNorm(doc=198)
          0.08925563 = weight(abstract_txt:dokumenten in 198) [ClassicSimilarity], result of:
            0.08925563 = score(doc=198,freq=2.0), product of:
              0.20843236 = queryWeight, product of:
                1.7707646 = boost
                6.4597273 = idf(docFreq=179, maxDocs=42306)
                0.01822175 = queryNorm
              0.4282235 = fieldWeight in 198, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.4597273 = idf(docFreq=179, maxDocs=42306)
                0.046875 = fieldNorm(doc=198)
          0.17002328 = weight(abstract_txt:automatischen in 198) [ClassicSimilarity], result of:
            0.17002328 = score(doc=198,freq=5.0), product of:
              0.23599365 = queryWeight, product of:
                1.8842062 = boost
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.01822175 = queryNorm
              0.720457 = fieldWeight in 198, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.046875 = fieldNorm(doc=198)
          0.07836974 = weight(abstract_txt:automatische in 198) [ClassicSimilarity], result of:
            0.07836974 = score(doc=198,freq=1.0), product of:
              0.24079658 = queryWeight, product of:
                1.9032832 = boost
                6.943154 = idf(docFreq=110, maxDocs=42306)
                0.01822175 = queryNorm
              0.32546034 = fieldWeight in 198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.943154 = idf(docFreq=110, maxDocs=42306)
                0.046875 = fieldNorm(doc=198)
          0.46334943 = weight(title_txt:klassifikation in 198) [ClassicSimilarity], result of:
            0.46334943 = score(doc=198,freq=1.0), product of:
              0.3936653 = queryWeight, product of:
                3.441572 = boost
                6.2774057 = idf(docFreq=215, maxDocs=42306)
                0.01822175 = queryNorm
              1.1770136 = fieldWeight in 198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2774057 = idf(docFreq=215, maxDocs=42306)
                0.1875 = fieldNorm(doc=198)
        0.28 = coord(7/25)
    
  2. Oberhauser, O.: Klassifikation in Online-Informationssystemen (1986) 0.24
    0.24034229 = sum of:
      0.24034229 = product of:
        1.5021393 = sum of:
          0.048105013 = weight(abstract_txt:sind in 2589) [ClassicSimilarity], result of:
            0.048105013 = score(doc=2589,freq=4.0), product of:
              0.07793977 = queryWeight, product of:
                1.0828239 = boost
                3.950128 = idf(docFreq=2213, maxDocs=42306)
                0.01822175 = queryNorm
              0.6172075 = fieldWeight in 2589, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.950128 = idf(docFreq=2213, maxDocs=42306)
                0.078125 = fieldNorm(doc=2589)
          0.06687864 = weight(abstract_txt:möglichkeiten in 2589) [ClassicSimilarity], result of:
            0.06687864 = score(doc=2589,freq=1.0), product of:
              0.15411462 = queryWeight, product of:
                1.5226504 = boost
                5.55461 = idf(docFreq=444, maxDocs=42306)
                0.01822175 = queryNorm
              0.43395388 = fieldWeight in 2589, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.55461 = idf(docFreq=444, maxDocs=42306)
                0.078125 = fieldNorm(doc=2589)
          0.1515573 = weight(abstract_txt:klassifikationen in 2589) [ClassicSimilarity], result of:
            0.1515573 = score(doc=2589,freq=1.0), product of:
              0.2658909 = queryWeight, product of:
                2.0 = boost
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.01822175 = queryNorm
              0.5699981 = fieldWeight in 2589, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.078125 = fieldNorm(doc=2589)
          1.2355984 = weight(title_txt:klassifikation in 2589) [ClassicSimilarity], result of:
            1.2355984 = score(doc=2589,freq=1.0), product of:
              0.3936653 = queryWeight, product of:
                3.441572 = boost
                6.2774057 = idf(docFreq=215, maxDocs=42306)
                0.01822175 = queryNorm
              3.1387029 = fieldWeight in 2589, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2774057 = idf(docFreq=215, maxDocs=42306)
                0.5 = fieldNorm(doc=2589)
        0.16 = coord(4/25)
    
  3. Degens, P.O.: Hierarchische Klassifikation (1980) 0.23
    0.23258354 = sum of:
      0.23258354 = product of:
        1.9381962 = sum of:
          0.09363009 = weight(abstract_txt:möglichkeiten in 90) [ClassicSimilarity], result of:
            0.09363009 = score(doc=90,freq=1.0), product of:
              0.15411462 = queryWeight, product of:
                1.5226504 = boost
                5.55461 = idf(docFreq=444, maxDocs=42306)
                0.01822175 = queryNorm
              0.6075354 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.55461 = idf(docFreq=444, maxDocs=42306)
                0.109375 = fieldNorm(doc=90)
          0.3000681 = weight(abstract_txt:klassifikationen in 90) [ClassicSimilarity], result of:
            0.3000681 = score(doc=90,freq=2.0), product of:
              0.2658909 = queryWeight, product of:
                2.0 = boost
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.01822175 = queryNorm
              1.1285385 = fieldWeight in 90, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.109375 = fieldNorm(doc=90)
          1.544498 = weight(title_txt:klassifikation in 90) [ClassicSimilarity], result of:
            1.544498 = score(doc=90,freq=1.0), product of:
              0.3936653 = queryWeight, product of:
                3.441572 = boost
                6.2774057 = idf(docFreq=215, maxDocs=42306)
                0.01822175 = queryNorm
              3.9233785 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2774057 = idf(docFreq=215, maxDocs=42306)
                0.625 = fieldNorm(doc=90)
        0.12 = coord(3/25)
    
  4. Manecke, H.-J.: Klassifikation, Klassieren (2004) 0.21
    0.21273954 = sum of:
      0.21273954 = product of:
        1.7728295 = sum of:
          0.024996096 = weight(abstract_txt:sind in 3903) [ClassicSimilarity], result of:
            0.024996096 = score(doc=3903,freq=3.0), product of:
              0.07793977 = queryWeight, product of:
                1.0828239 = boost
                3.950128 = idf(docFreq=2213, maxDocs=42306)
                0.01822175 = queryNorm
              0.32071042 = fieldWeight in 3903, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.950128 = idf(docFreq=2213, maxDocs=42306)
                0.046875 = fieldNorm(doc=3903)
          0.20333545 = weight(abstract_txt:klassifikationen in 3903) [ClassicSimilarity], result of:
            0.20333545 = score(doc=3903,freq=5.0), product of:
              0.2658909 = queryWeight, product of:
                2.0 = boost
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.01822175 = queryNorm
              0.76473266 = fieldWeight in 3903, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.046875 = fieldNorm(doc=3903)
          1.544498 = weight(title_txt:klassifikation in 3903) [ClassicSimilarity], result of:
            1.544498 = score(doc=3903,freq=1.0), product of:
              0.3936653 = queryWeight, product of:
                3.441572 = boost
                6.2774057 = idf(docFreq=215, maxDocs=42306)
                0.01822175 = queryNorm
              3.9233785 = fieldWeight in 3903, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2774057 = idf(docFreq=215, maxDocs=42306)
                0.625 = fieldNorm(doc=3903)
        0.12 = coord(3/25)
    
  5. Helmbrecht-Schaar, A.: Entwicklung eines Verfahrens der automatischen Klassifizierung für Textdokumente aus dem Fachbereich Informatik mithilfe eines fachspezifischen Klassifikationssystems (2007) 0.19
    0.18875802 = sum of:
      0.18875802 = product of:
        0.78649175 = sum of:
          0.09458068 = weight(abstract_txt:möglichkeiten in 3411) [ClassicSimilarity], result of:
            0.09458068 = score(doc=3411,freq=2.0), product of:
              0.15411462 = queryWeight, product of:
                1.5226504 = boost
                5.55461 = idf(docFreq=444, maxDocs=42306)
                0.01822175 = queryNorm
              0.6137035 = fieldWeight in 3411, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.55461 = idf(docFreq=444, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.10518877 = weight(abstract_txt:dokumenten in 3411) [ClassicSimilarity], result of:
            0.10518877 = score(doc=3411,freq=1.0), product of:
              0.20843236 = queryWeight, product of:
                1.7707646 = boost
                6.4597273 = idf(docFreq=179, maxDocs=42306)
                0.01822175 = queryNorm
              0.5046662 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4597273 = idf(docFreq=179, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.11504383 = weight(abstract_txt:grenzen in 3411) [ClassicSimilarity], result of:
            0.11504383 = score(doc=3411,freq=1.0), product of:
              0.22125569 = queryWeight, product of:
                1.8244228 = boost
                6.655472 = idf(docFreq=147, maxDocs=42306)
                0.01822175 = queryNorm
              0.51995873 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.655472 = idf(docFreq=147, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.12672788 = weight(abstract_txt:automatischen in 3411) [ClassicSimilarity], result of:
            0.12672788 = score(doc=3411,freq=1.0), product of:
              0.23599365 = queryWeight, product of:
                1.8842062 = boost
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.01822175 = queryNorm
              0.53699696 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.873561 = idf(docFreq=118, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.13061623 = weight(abstract_txt:automatische in 3411) [ClassicSimilarity], result of:
            0.13061623 = score(doc=3411,freq=1.0), product of:
              0.24079658 = queryWeight, product of:
                1.9032832 = boost
                6.943154 = idf(docFreq=110, maxDocs=42306)
                0.01822175 = queryNorm
              0.5424339 = fieldWeight in 3411, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.943154 = idf(docFreq=110, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
          0.21433437 = weight(abstract_txt:klassifikationen in 3411) [ClassicSimilarity], result of:
            0.21433437 = score(doc=3411,freq=2.0), product of:
              0.2658909 = queryWeight, product of:
                2.0 = boost
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.01822175 = queryNorm
              0.80609894 = fieldWeight in 3411, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.295975 = idf(docFreq=77, maxDocs=42306)
                0.078125 = fieldNorm(doc=3411)
        0.24 = coord(6/25)
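
Note on the content scores: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by coord (the number of matched query terms divided by the 25 terms in the query). The sketch below recomputes entry 3 (Degens, doc=90) from the boost, docFreq, freq and fieldNorm values in its breakdown. It assumes the same classic TF-IDF definitions as the listing; TermMatch, term_score and document_score are hypothetical names chosen for illustration.

```python
import math
from dataclasses import dataclass

MAX_DOCS = 42306          # maxDocs, as shown in every idf line
QUERY_NORM = 0.01822175   # queryNorm, as shown in every queryWeight block

@dataclass
class TermMatch:
    boost: float       # per-term boost from the explain output
    doc_freq: int      # number of documents containing the term
    freq: float        # occurrences of the term in the matched field
    field_norm: float  # length normalisation stored for the field

def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed idf formula; it reproduces the idf values in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(m: TermMatch) -> float:
    term_idf = idf(m.doc_freq)
    query_weight = m.boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(m.freq) * term_idf * m.field_norm
    return query_weight * field_weight

def document_score(matches: list[TermMatch], query_terms: int) -> float:
    # coord rewards documents that match a larger share of the query terms.
    coord = len(matches) / query_terms
    return coord * sum(term_score(m) for m in matches)

# Entry 3 (doc=90): abstract "möglichkeiten", abstract "klassifikationen", title "klassifikation"
degens = [
    TermMatch(1.5226504, 444, 1.0, 0.109375),
    TermMatch(2.0, 77, 2.0, 0.109375),
    TermMatch(3.441572, 215, 1.0, 0.625),
]
score = document_score(degens, query_terms=25)
print(f"{score:.4f}")
```

The sketch also shows why a single title match dominates these rankings: the title term carries the largest boost and, for short titles, a much larger field norm than the abstract terms.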