Search (23 results, page 1 of 2)

  • theme_ss:"Automatisches Klassifizieren"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.101439916 = sum of:
      0.08076982 = product of:
        0.24230945 = sum of:
          0.24230945 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24230945 = score(doc=562,freq=2.0), product of:
              0.43114176 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.050854117 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.020670092 = product of:
        0.041340183 = sum of:
          0.041340183 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.041340183 = score(doc=562,freq=2.0), product of:
              0.17808245 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050854117 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
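    The score breakdown above is Lucene ClassicSimilarity (TF-IDF) explain output: for each matching term, fieldWeight = sqrt(freq) * idf * fieldNorm and queryWeight = idf * queryNorm; the term score is their product, and per-clause coord factors scale the sum. A minimal sketch re-deriving the top hit's score from the constants in the tree (function names are illustrative, not Lucene API):

    ```python
    import math

    def term_score(freq, idf, query_norm, field_norm):
        # ClassicSimilarity per-term score:
        #   fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
        #   queryWeight = idf * queryNorm
        #   score       = queryWeight * fieldWeight
        field_weight = math.sqrt(freq) * idf * field_norm
        query_weight = idf * query_norm
        return query_weight * field_weight

    # Constants copied from the explain tree for doc 562 above.
    score_3a = term_score(freq=2.0, idf=8.478011,
                          query_norm=0.050854117, field_norm=0.046875)
    score_22 = term_score(freq=2.0, idf=3.5018296,
                          query_norm=0.050854117, field_norm=0.046875)

    # Apply the coord(1/3) and coord(1/2) factors and sum the clauses.
    total = score_3a * (1 / 3) + score_22 * (1 / 2)
    print(total)  # close to the reported 0.101439916
    ```

    The two idf values differ sharply (8.48 vs. 3.50) because "_text_:3a" occurs in only 24 of 44,218 documents while "_text_:22" occurs in 3,622, which is why the rare term dominates the final score.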
  2. Leroy, G.; Miller, T.; Rosemblat, G.; Browne, A.: A balanced approach to health information evaluation : a vocabulary-based naïve Bayes classifier and readability formulas (2008) 0.03
    0.034448884 = product of:
      0.06889777 = sum of:
        0.06889777 = product of:
          0.13779554 = sum of:
            0.13779554 = weight(_text_:90 in 1998) [ClassicSimilarity], result of:
              0.13779554 = score(doc=1998,freq=4.0), product of:
                0.2733978 = queryWeight, product of:
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.050854117 = queryNorm
                0.50401115 = fieldWeight in 1998, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1998)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Since millions seek health information online, it is vital for this information to be comprehensible. Most studies use readability formulas, which ignore vocabulary, and conclude that online health information is too difficult. We developed a vocabulary-based, naïve Bayes classifier to distinguish between three difficulty levels in text. It proved 98% accurate in a 250-document evaluation. We compared our classifier with readability formulas for 90 new documents with different origins and asked representative human evaluators, an expert and a consumer, to judge each document. Average readability grade levels for educational and commercial pages were 10th grade or higher, too difficult according to current literature. In contrast, the classifier showed that 70-90% of these pages were written at an intermediate, appropriate level, indicating that vocabulary usage is frequently appropriate in text considered too difficult by readability formula evaluations. The expert considered the pages more difficult for a consumer than the consumer did.
  3. Panyr, J.: Automatische Indexierung und Klassifikation (1983) 0.03
    0.032478724 = product of:
      0.06495745 = sum of:
        0.06495745 = product of:
          0.1299149 = sum of:
            0.1299149 = weight(_text_:90 in 7692) [ClassicSimilarity], result of:
              0.1299149 = score(doc=7692,freq=2.0), product of:
                0.2733978 = queryWeight, product of:
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.050854117 = queryNorm
                0.4751863 = fieldWeight in 7692, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7692)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Pages
    S.90-111
  4. Rose, J.R.; Gasteiger, J.: HORACE: an automatic system for the hierarchical classification of chemical reactions (1994) 0.03
    0.028418882 = product of:
      0.056837764 = sum of:
        0.056837764 = product of:
          0.11367553 = sum of:
            0.11367553 = weight(_text_:90 in 7696) [ClassicSimilarity], result of:
              0.11367553 = score(doc=7696,freq=2.0), product of:
                0.2733978 = queryWeight, product of:
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.050854117 = queryNorm
                0.415788 = fieldWeight in 7696, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7696)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of chemical information and computer sciences. 34(1994) no.1, S.74-90
  5. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.02
    0.020670092 = product of:
      0.041340183 = sum of:
        0.041340183 = product of:
          0.08268037 = sum of:
            0.08268037 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.08268037 = score(doc=1046,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 14:17:22
  6. Calado, P.; Cristo, M.; Gonçalves, M.A.; Moura, E.S. de; Ribeiro-Neto, B.; Ziviani, N.: Link-based similarity measures for the classification of Web documents (2006) 0.02
    0.020299202 = product of:
      0.040598404 = sum of:
        0.040598404 = product of:
          0.08119681 = sum of:
            0.08119681 = weight(_text_:90 in 4921) [ClassicSimilarity], result of:
              0.08119681 = score(doc=4921,freq=2.0), product of:
                0.2733978 = queryWeight, product of:
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.050854117 = queryNorm
                0.29699144 = fieldWeight in 4921, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4921)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Traditional text-based document classifiers tend to perform poorly on the Web. Text in Web documents is usually noisy and often does not contain enough information to determine their topic. However, the Web provides a different source that can be useful to document classification: its hyperlink structure. In this work, the authors evaluate how the link structure of the Web can be used to determine a measure of similarity appropriate for document classification. They experiment with five different similarity measures and determine their adequacy for predicting the topic of a Web page. Tests performed on a Web directory show that link information alone allows classifying documents with an average precision of 86%. Further, when combined with a traditional text-based classifier, precision increases to values of up to 90%, representing gains that range from 63 to 132% over the use of text-based classification alone. Because the measures proposed in this article are straightforward to compute, they provide a practical and effective solution for Web classification and related information retrieval tasks. Further, the authors provide an important set of guidelines on how link structure can be used effectively to classify Web documents.
  7. Wang, J.: An extensive study on automated Dewey Decimal Classification (2009) 0.02
    0.020299202 = product of:
      0.040598404 = sum of:
        0.040598404 = product of:
          0.08119681 = sum of:
            0.08119681 = weight(_text_:90 in 3172) [ClassicSimilarity], result of:
              0.08119681 = score(doc=3172,freq=2.0), product of:
                0.2733978 = queryWeight, product of:
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.050854117 = queryNorm
                0.29699144 = fieldWeight in 3172, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3172)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this paper, we present a theoretical analysis and extensive experiments on the automated assignment of Dewey Decimal Classification (DDC) classes to bibliographic data with a supervised machine-learning approach. Library classification systems, such as the DDC, impose great obstacles on state-of-the-art text categorization (TC) technologies, including deep hierarchy, data sparseness, and skewed distribution. We first analyze statistically the document and category distributions over the DDC, and discuss the obstacles imposed by bibliographic corpora and library classification schemes on TC technology. To overcome these obstacles, we propose an innovative algorithm to reshape the DDC structure into a balanced virtual tree by balancing the category distribution and flattening the hierarchy. To improve the classification effectiveness to a level acceptable to real-world applications, we propose an interactive classification model that is able to predict a class of any depth within a limited number of user interactions. The experiments are conducted on a large bibliographic collection created by the Library of Congress within the science and technology domains over 10 years. With no more than three interactions, a classification accuracy of nearly 90% is achieved, thus providing a practical solution to the automatic bibliographic classification problem.
  8. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.02
    0.017225077 = product of:
      0.034450155 = sum of:
        0.034450155 = product of:
          0.06890031 = sum of:
            0.06890031 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.06890031 = score(doc=611,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 12:54:24
  9. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.02
    0.017225077 = product of:
      0.034450155 = sum of:
        0.034450155 = product of:
          0.06890031 = sum of:
            0.06890031 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.06890031 = score(doc=2748,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  10. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.01
    0.012057554 = product of:
      0.024115108 = sum of:
        0.024115108 = product of:
          0.048230216 = sum of:
            0.048230216 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
              0.048230216 = score(doc=141,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.2708308 = fieldWeight in 141, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=141)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Pages
    S.1-22
  11. Dubin, D.: Dimensions and discriminability (1998) 0.01
    0.012057554 = product of:
      0.024115108 = sum of:
        0.024115108 = product of:
          0.048230216 = sum of:
            0.048230216 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
              0.048230216 = score(doc=2338,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.2708308 = fieldWeight in 2338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2338)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.1997 19:16:05
  12. Automatic classification research at OCLC (2002) 0.01
    0.012057554 = product of:
      0.024115108 = sum of:
        0.024115108 = product of:
          0.048230216 = sum of:
            0.048230216 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.048230216 = score(doc=1563,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 9:22:09
  13. Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.01
    0.012057554 = product of:
      0.024115108 = sum of:
        0.024115108 = product of:
          0.048230216 = sum of:
            0.048230216 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
              0.048230216 = score(doc=1673,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.2708308 = fieldWeight in 1673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1673)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 8.1996 22:08:06
  14. Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.01
    0.012057554 = product of:
      0.024115108 = sum of:
        0.024115108 = product of:
          0.048230216 = sum of:
            0.048230216 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.048230216 = score(doc=5273,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 16:24:52
  15. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.01
    0.012057554 = product of:
      0.024115108 = sum of:
        0.024115108 = product of:
          0.048230216 = sum of:
            0.048230216 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.048230216 = score(doc=2560,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.2008 18:31:54
  16. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.01
    0.010335046 = product of:
      0.020670092 = sum of:
        0.020670092 = product of:
          0.041340183 = sum of:
            0.041340183 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
              0.041340183 = score(doc=2760,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.23214069 = fieldWeight in 2760, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2760)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 19:11:54
  17. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.01
    0.010335046 = product of:
      0.020670092 = sum of:
        0.020670092 = product of:
          0.041340183 = sum of:
            0.041340183 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
              0.041340183 = score(doc=3051,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.23214069 = fieldWeight in 3051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 19:51:28
  18. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.01
    0.010335046 = product of:
      0.020670092 = sum of:
        0.020670092 = product of:
          0.041340183 = sum of:
            0.041340183 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
              0.041340183 = score(doc=690,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.23214069 = fieldWeight in 690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    23. 3.2013 13:22:36
  19. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.01
    0.010335046 = product of:
      0.020670092 = sum of:
        0.020670092 = product of:
          0.041340183 = sum of:
            0.041340183 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
              0.041340183 = score(doc=2158,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.23214069 = fieldWeight in 2158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2158)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    4. 8.2015 19:22:04
  20. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.01
    0.008612539 = product of:
      0.017225077 = sum of:
        0.017225077 = product of:
          0.034450155 = sum of:
            0.034450155 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
              0.034450155 = score(doc=2765,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.19345059 = fieldWeight in 2765, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2765)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 19:14:43