Search (65 results, page 2 of 4)

  • theme_ss:"Automatisches Klassifizieren"
  1. Savic, D.: Designing an expert system for classifying office documents (1994) 0.01
    0.0053307354 = product of:
      0.015992206 = sum of:
        0.015992206 = product of:
          0.047976613 = sum of:
            0.047976613 = weight(_text_:29 in 2655) [ClassicSimilarity], result of:
              0.047976613 = score(doc=2655,freq=2.0), product of:
                0.15430406 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0438652 = queryNorm
                0.31092256 = fieldWeight in 2655, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2655)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Records management quarterly. 28(1994) no.3, pp.20-29
  2. Pong, J.Y.-H.; Kwok, R.C.-W.; Lau, R.Y.-K.; Hao, J.-X.; Wong, P.C.-C.: ¬A comparative study of two automatic document classification methods in a library setting (2008) 0.00
    0.0048523275 = product of:
      0.014556982 = sum of:
        0.014556982 = product of:
          0.043670945 = sum of:
            0.043670945 = weight(_text_:k in 2532) [ClassicSimilarity], result of:
              0.043670945 = score(doc=2532,freq=4.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.2788889 = fieldWeight in 2532, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2532)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    In current library practice, trained human experts usually carry out document cataloguing and indexing based on a manual approach. With the explosive growth in the number of electronic documents available on the Internet and digital libraries, it is increasingly difficult for library practitioners to categorize both electronic documents and traditional library materials using just a manual approach. To improve the effectiveness and efficiency of document categorization in the library setting, more in-depth studies of using automatic document classification methods to categorize library items are required. Machine learning research has advanced rapidly in recent years. However, applying machine learning techniques to improve library practice is still a relatively unexplored area. This paper illustrates the design and development of a machine-learning-based automatic document classification system to alleviate the manual categorization problem encountered within the library setting. Two supervised machine learning algorithms have been tested. Our empirical tests show that supervised machine learning algorithms in general, and the k-nearest neighbours (KNN) algorithm in particular, can be used to develop an effective document classification system to enhance current library practice. Moreover, some concrete recommendations regarding how to practically apply the KNN algorithm to develop automatic document classification in a library setting are made. To the best of our knowledge, this is the first in-depth study of applying the KNN algorithm to automatic document classification based on the widely used LCC classification scheme adopted by many large libraries.
  3. Han, K.; Rezapour, R.; Nakamura, K.; Devkota, D.; Miller, D.C.; Diesner, J.: ¬An expert-in-the-loop method for domain-specific document categorization based on small training data (2023) 0.00
    0.0048523275 = product of:
      0.014556982 = sum of:
        0.014556982 = product of:
          0.043670945 = sum of:
            0.043670945 = weight(_text_:k in 967) [ClassicSimilarity], result of:
              0.043670945 = score(doc=967,freq=4.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.2788889 = fieldWeight in 967, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=967)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  4. Yang, Y.; Liu, X.: ¬A re-examination of text categorization methods (1999) 0.00
    0.0048035593 = product of:
      0.014410677 = sum of:
        0.014410677 = product of:
          0.04323203 = sum of:
            0.04323203 = weight(_text_:k in 3386) [ClassicSimilarity], result of:
              0.04323203 = score(doc=3386,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.27608594 = fieldWeight in 3386, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3386)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper reports a controlled study with statistical significance tests on five text categorization methods: the Support Vector Machines (SVM), a k-Nearest Neighbor (kNN) classifier, a neural network (NNet) approach, the Linear Least-squares Fit (LLSF) mapping and a Naive Bayes (NB) classifier. We focus on the robustness of these methods in dealing with a skewed category distribution, and their performance as a function of the training-set category frequency. Our results show that SVM, kNN and LLSF significantly outperform NNet and NB when the number of positive training instances per category is small (less than ten), and that all the methods perform comparably when the categories are sufficiently common (over 300 instances).
  5. Sebastiani, F.: Classification of text, automatic (2006) 0.00
    0.0048035593 = product of:
      0.014410677 = sum of:
        0.014410677 = product of:
          0.04323203 = sum of:
            0.04323203 = weight(_text_:k in 5003) [ClassicSimilarity], result of:
              0.04323203 = score(doc=5003,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.27608594 = fieldWeight in 5003, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5003)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Encyclopedia of language and linguistics. 2nd ed. Ed.: K. Brown. Vol. 14
  6. Savic, D.: Automatic classification of office documents : review of available methods and techniques (1995) 0.00
    0.004664393 = product of:
      0.013993179 = sum of:
        0.013993179 = product of:
          0.041979536 = sum of:
            0.041979536 = weight(_text_:29 in 2219) [ClassicSimilarity], result of:
              0.041979536 = score(doc=2219,freq=2.0), product of:
                0.15430406 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0438652 = queryNorm
                0.27205724 = fieldWeight in 2219, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2219)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Records management quarterly. 29(1995) no.4, pp.3-18
  7. Ruocco, A.S.; Frieder, O.: Clustering and classification of large document bases in a parallel environment (1997) 0.00
    0.004664393 = product of:
      0.013993179 = sum of:
        0.013993179 = product of:
          0.041979536 = sum of:
            0.041979536 = weight(_text_:29 in 1661) [ClassicSimilarity], result of:
              0.041979536 = score(doc=1661,freq=2.0), product of:
                0.15430406 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0438652 = queryNorm
                0.27205724 = fieldWeight in 1661, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1661)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 7.1998 17:45:02
  8. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.00
    0.004664393 = product of:
      0.013993179 = sum of:
        0.013993179 = product of:
          0.041979536 = sum of:
            0.041979536 = weight(_text_:29 in 1595) [ClassicSimilarity], result of:
              0.041979536 = score(doc=1595,freq=2.0), product of:
                0.15430406 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0438652 = queryNorm
                0.27205724 = fieldWeight in 1595, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1595)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    11. 5.2003 18:29:44
  9. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.00
    0.0046224343 = product of:
      0.013867302 = sum of:
        0.013867302 = product of:
          0.041601904 = sum of:
            0.041601904 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
              0.041601904 = score(doc=141,freq=2.0), product of:
                0.15360846 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0438652 = queryNorm
                0.2708308 = fieldWeight in 141, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=141)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Pages
    pp.1-22
  10. Dubin, D.: Dimensions and discriminability (1998) 0.00
    0.0046224343 = product of:
      0.013867302 = sum of:
        0.013867302 = product of:
          0.041601904 = sum of:
            0.041601904 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
              0.041601904 = score(doc=2338,freq=2.0), product of:
                0.15360846 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0438652 = queryNorm
                0.2708308 = fieldWeight in 2338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2338)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    22. 9.1997 19:16:05
  11. Automatic classification research at OCLC (2002) 0.00
    0.0046224343 = product of:
      0.013867302 = sum of:
        0.013867302 = product of:
          0.041601904 = sum of:
            0.041601904 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.041601904 = score(doc=1563,freq=2.0), product of:
                0.15360846 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0438652 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    5. 5.2003 9:22:09
  12. Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.00
    0.0046224343 = product of:
      0.013867302 = sum of:
        0.013867302 = product of:
          0.041601904 = sum of:
            0.041601904 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
              0.041601904 = score(doc=1673,freq=2.0), product of:
                0.15360846 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0438652 = queryNorm
                0.2708308 = fieldWeight in 1673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1673)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    1. 8.1996 22:08:06
  13. Yoon, Y.; Lee, C.; Lee, G.G.: ¬An effective procedure for constructing a hierarchical text classification system (2006) 0.00
    0.0046224343 = product of:
      0.013867302 = sum of:
        0.013867302 = product of:
          0.041601904 = sum of:
            0.041601904 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.041601904 = score(doc=5273,freq=2.0), product of:
                0.15360846 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0438652 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    22. 7.2006 16:24:52
  14. Sun, A.; Lim, E.-P.; Ng, W.-K.: Performance measurement framework for hierarchical text classification (2003) 0.00
    0.004117336 = product of:
      0.012352008 = sum of:
        0.012352008 = product of:
          0.037056025 = sum of:
            0.037056025 = weight(_text_:k in 1808) [ClassicSimilarity], result of:
              0.037056025 = score(doc=1808,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.23664509 = fieldWeight in 1808, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1808)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  15. Golub, K.: Automated subject classification of textual Web pages, based on a controlled vocabulary : challenges and recommendations (2006) 0.00
    0.004117336 = product of:
      0.012352008 = sum of:
        0.012352008 = product of:
          0.037056025 = sum of:
            0.037056025 = weight(_text_:k in 5897) [ClassicSimilarity], result of:
              0.037056025 = score(doc=5897,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.23664509 = fieldWeight in 5897, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5897)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  16. Hagedorn, K.; Chapman, S.; Newman, D.: Enhancing search and browse using automated clustering of subject metadata (2007) 0.00
    0.004117336 = product of:
      0.012352008 = sum of:
        0.012352008 = product of:
          0.037056025 = sum of:
            0.037056025 = weight(_text_:k in 1168) [ClassicSimilarity], result of:
              0.037056025 = score(doc=1168,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.23664509 = fieldWeight in 1168, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1168)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  17. Golub, K.; Hamon, T.; Ardö, A.: Automated classification of textual documents based on a controlled vocabulary in engineering (2007) 0.00
    0.004117336 = product of:
      0.012352008 = sum of:
        0.012352008 = product of:
          0.037056025 = sum of:
            0.037056025 = weight(_text_:k in 1461) [ClassicSimilarity], result of:
              0.037056025 = score(doc=1461,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.23664509 = fieldWeight in 1461, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1461)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  18. Reiner, U.: DDC-based search in the data of the German National Bibliography (2008) 0.00
    0.004117336 = product of:
      0.012352008 = sum of:
        0.012352008 = product of:
          0.037056025 = sum of:
            0.037056025 = weight(_text_:k in 2166) [ClassicSimilarity], result of:
              0.037056025 = score(doc=2166,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.23664509 = fieldWeight in 2166, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2166)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    New perspectives on subject indexing and classification: essays in honour of Magda Heiner-Freiling. Ed.: K. Knull-Schlomann et al.
  19. Golub, K.: Automated subject classification of textual documents in the context of Web-based hierarchical browsing (2011) 0.00
    0.004117336 = product of:
      0.012352008 = sum of:
        0.012352008 = product of:
          0.037056025 = sum of:
            0.037056025 = weight(_text_:k in 4558) [ClassicSimilarity], result of:
              0.037056025 = score(doc=4558,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.23664509 = fieldWeight in 4558, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4558)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  20. Sojka, P.; Lee, M.; Rehurek, R.; Hatlapatka, R.; Kucbel, M.; Bouche, T.; Goutorbe, C.; Anghelache, R.; Wojciechowski, K.: Toolset for entity and semantic associations : Final Release (2013) 0.00
    0.004117336 = product of:
      0.012352008 = sum of:
        0.012352008 = product of:
          0.037056025 = sum of:
            0.037056025 = weight(_text_:k in 1057) [ClassicSimilarity], result of:
              0.037056025 = score(doc=1057,freq=2.0), product of:
                0.15658903 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0438652 = queryNorm
                0.23664509 = fieldWeight in 1057, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1057)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
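
The score trees in the list above all follow the same arithmetic, so a small sketch may help in reading them. It recomputes the final score of result 1 (doc 2655) from the leaf values shown in its tree. The function name and parameters are hypothetical, and the idf formula, 1 + ln(maxDocs / (docFreq + 1)), is an assumption chosen because it reproduces the idf values listed here; it is not taken from the search system's configuration.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, field_norm,
                             query_norm, coords=()):
    """Recompute one 'explain' tree from its leaf values (illustrative only)."""
    # tf(freq) is the square root of the raw term frequency.
    tf = math.sqrt(freq)
    # Assumed idf formula; it matches the idf lines shown in the listing.
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    # queryWeight = idf * queryNorm
    query_weight = idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * idf * field_norm
    score = query_weight * field_weight
    # Each coord(n/m) line in the tree multiplies the running score.
    for coord in coords:
        score *= coord
    return score


# Leaf values copied from result 1 (doc 2655), term "29".
score_2655 = classic_similarity_score(
    freq=2.0, doc_freq=3565, max_docs=44218,
    field_norm=0.0625, query_norm=0.0438652,
    coords=(1 / 3, 1 / 3),
)
print(f"{score_2655:.7f}")
```

The result should land very close to the 0.0053307354 shown at the top of that tree; small differences in the last digits are expected because the listing was presumably computed in single precision and queryNorm is shown rounded.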
    

Languages

  • e 55
  • d 9
  • a 1

Types

  • a 58
  • el 9
  • r 1
  • x 1