Search (15 results, page 1 of 1)

  • theme_ss:"Automatisches Klassifizieren"
  • year_i:[2000 TO 2010}
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.102153145 = sum of:
      0.08133772 = product of:
        0.24401315 = sum of:
          0.24401315 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24401315 = score(doc=562,freq=2.0), product of:
              0.43417317 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.051211677 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.020815425 = product of:
        0.04163085 = sum of:
          0.04163085 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.04163085 = score(doc=562,freq=2.0), product of:
              0.17933457 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051211677 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
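
    The explain tree above can be re-derived by hand. The following is a minimal sketch, not the search engine's own code. It assumes the classic TF-IDF form that reproduces the figures shown: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function names are hypothetical; queryNorm, fieldNorm, and the coord factors are copied from the listing rather than computed.

```python
import math

# Values copied from the explain output for doc 562.
QUERY_NORM = 0.051211677
MAX_DOCS = 44218


def idf(doc_freq, max_docs):
    """Inverse document frequency in the form that matches the listing (assumed)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    """Score of one term in one field: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Clause 1: term "3a" (docFreq=24), scaled by coord(1/3).
clause_3a = term_score(2.0, 24, 0.046875) * (1 / 3)
# Clause 2: term "22" (docFreq=3622), scaled by coord(1/2).
clause_22 = term_score(2.0, 3622, 0.046875) * (1 / 2)

total = clause_3a + clause_22
print(f"3a clause: {clause_3a:.6f}")
print(f"22 clause: {clause_22:.6f}")
print(f"total:     {total:.6f}")
```

    The three printed values should agree with 0.08133772, 0.020815425, and 0.102153145 from the tree to within rounding; the engine works in single precision, so the last digits can differ.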
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  2. Frank, E.; Paynter, G.W.: Predicting Library of Congress Classifications from Library of Congress Subject Headings (2004) 0.03
    0.028232787 = product of:
      0.056465574 = sum of:
        0.056465574 = product of:
          0.11293115 = sum of:
            0.11293115 = weight(_text_:headings in 2218) [ClassicSimilarity], result of:
              0.11293115 = score(doc=2218,freq=4.0), product of:
                0.24837378 = queryWeight, product of:
                  4.849944 = idf(docFreq=940, maxDocs=44218)
                  0.051211677 = queryNorm
                0.45468226 = fieldWeight in 2218, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.849944 = idf(docFreq=940, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2218)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper addresses the problem of automatically assigning a Library of Congress Classification (LCC) to a work given its set of Library of Congress Subject Headings (LCSH). LCCs are organized in a tree: The root node of this hierarchy comprises all possible topics, and leaf nodes correspond to the most specialized topic areas defined. We describe a procedure that, given a resource identified by its LCSH, automatically places that resource in the LCC hierarchy. The procedure uses machine learning techniques and training data from a large library catalog to learn a model that maps from sets of LCSH to classifications from the LCC tree. We present empirical results for our technique showing its accuracy on an independent collection of 50,000 LCSH/LCC pairs.
  3. Godby, C. J.; Stuler, J.: ¬The Library of Congress Classification as a knowledge base for automatic subject categorization (2001) 0.03
    0.026618127 = product of:
      0.053236254 = sum of:
        0.053236254 = product of:
          0.10647251 = sum of:
            0.10647251 = weight(_text_:headings in 1567) [ClassicSimilarity], result of:
              0.10647251 = score(doc=1567,freq=2.0), product of:
                0.24837378 = queryWeight, product of:
                  4.849944 = idf(docFreq=940, maxDocs=44218)
                  0.051211677 = queryNorm
                0.42867854 = fieldWeight in 1567, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.849944 = idf(docFreq=940, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1567)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper describes a set of experiments in adapting a subset of the Library of Congress Classification for use as a database for automatic classification. A high degree of concept integrity was obtained when subject headings were mapped from OCLC's WorldCat database and filtered using the log-likelihood statistic.
  4. Godby, C.J.; Stuler, J.: ¬The Library of Congress Classification as a knowledge base for automatic subject categorization : subject access issues (2003) 0.02
    0.023290861 = product of:
      0.046581723 = sum of:
        0.046581723 = product of:
          0.093163446 = sum of:
            0.093163446 = weight(_text_:headings in 3962) [ClassicSimilarity], result of:
              0.093163446 = score(doc=3962,freq=2.0), product of:
                0.24837378 = queryWeight, product of:
                  4.849944 = idf(docFreq=940, maxDocs=44218)
                  0.051211677 = queryNorm
                0.37509373 = fieldWeight in 3962, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.849944 = idf(docFreq=940, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3962)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper describes a set of experiments in adapting a subset of the Library of Congress Classification for use as a database for automatic classification. A high degree of concept integrity was obtained when subject headings were mapped from OCLC's WorldCat database and filtered using the log-likelihood statistic.
  5. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.02
    0.020815425 = product of:
      0.04163085 = sum of:
        0.04163085 = product of:
          0.0832617 = sum of:
            0.0832617 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.0832617 = score(doc=1046,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 14:17:22
  6. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.02
    0.017346188 = product of:
      0.034692377 = sum of:
        0.034692377 = product of:
          0.06938475 = sum of:
            0.06938475 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.06938475 = score(doc=611,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 12:54:24
  7. Humphrey, S.M.; Névéol, A.; Browne, A.; Gobeil, J.; Ruch, P.; Darmoni, S.J.: Comparing a rule-based versus statistical system for automatic categorization of MEDLINE documents according to biomedical specialty (2009) 0.02
    0.01663633 = product of:
      0.03327266 = sum of:
        0.03327266 = product of:
          0.06654532 = sum of:
            0.06654532 = weight(_text_:headings in 3300) [ClassicSimilarity], result of:
              0.06654532 = score(doc=3300,freq=2.0), product of:
                0.24837378 = queryWeight, product of:
                  4.849944 = idf(docFreq=940, maxDocs=44218)
                  0.051211677 = queryNorm
                0.2679241 = fieldWeight in 3300, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.849944 = idf(docFreq=940, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3300)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Automatic document categorization is an important research problem in Information Science and Natural Language Processing. Many applications, including Word Sense Disambiguation and Information Retrieval in large collections, can benefit from such categorization. This paper focuses on automatic categorization of documents from the biomedical literature into broad discipline-based categories. Two different systems are described and contrasted: CISMeF, which uses rules based on human indexing of the documents by the Medical Subject Headings (MeSH) controlled vocabulary in order to assign metaterms (MTs), and Journal Descriptor Indexing (JDI), based on human categorization of about 4,000 journals and statistical associations between journal descriptors (JDs) and textwords in the documents. We evaluate and compare the performance of these systems against a gold standard of humanly assigned categories for 100 MEDLINE documents, using six measures selected from trec_eval. The results show that for five of the measures performance is comparable, and for one measure JDI is superior. We conclude that these results favor JDI, given the significantly greater intellectual overhead involved in human indexing and maintaining a rule base for mapping MeSH terms to MTs. We also note a JDI method that associates JDs with MeSH indexing rather than textwords, and it may be worthwhile to investigate whether this JDI method (statistical) and CISMeF (rule-based) might be combined and then evaluated to show whether they are complementary to one another.
  8. Automatic classification research at OCLC (2002) 0.01
    0.012142331 = product of:
      0.024284663 = sum of:
        0.024284663 = product of:
          0.048569325 = sum of:
            0.048569325 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.048569325 = score(doc=1563,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 9:22:09
  9. Yoon, Y.; Lee, C.; Lee, G.G.: ¬An effective procedure for constructing a hierarchical text classification system (2006) 0.01
    0.012142331 = product of:
      0.024284663 = sum of:
        0.024284663 = product of:
          0.048569325 = sum of:
            0.048569325 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.048569325 = score(doc=5273,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 16:24:52
  10. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.01
    0.012142331 = product of:
      0.024284663 = sum of:
        0.024284663 = product of:
          0.048569325 = sum of:
            0.048569325 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.048569325 = score(doc=2560,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.2008 18:31:54
  11. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.01
    0.010407712 = product of:
      0.020815425 = sum of:
        0.020815425 = product of:
          0.04163085 = sum of:
            0.04163085 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
              0.04163085 = score(doc=2760,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.23214069 = fieldWeight in 2760, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2760)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 19:11:54
  12. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.01
    0.010407712 = product of:
      0.020815425 = sum of:
        0.020815425 = product of:
          0.04163085 = sum of:
            0.04163085 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
              0.04163085 = score(doc=3051,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.23214069 = fieldWeight in 3051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 19:51:28
  13. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.01
    0.008673094 = product of:
      0.017346188 = sum of:
        0.017346188 = product of:
          0.034692377 = sum of:
            0.034692377 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
              0.034692377 = score(doc=2765,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.19345059 = fieldWeight in 2765, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2765)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 19:14:43
  14. Khoo, C.S.G.; Ng, K.; Ou, S.: ¬An exploratory study of human clustering of Web pages (2003) 0.01
    0.006938475 = product of:
      0.01387695 = sum of:
        0.01387695 = product of:
          0.0277539 = sum of:
            0.0277539 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
              0.0277539 = score(doc=2741,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.15476047 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    12. 9.2004 9:56:22
  15. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.01
    0.006938475 = product of:
      0.01387695 = sum of:
        0.01387695 = product of:
          0.0277539 = sum of:
            0.0277539 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
              0.0277539 = score(doc=3284,freq=2.0), product of:
                0.17933457 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211677 = queryNorm
                0.15476047 = fieldWeight in 3284, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3284)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2010 14:41:24
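
    For the hits that match only the term "22" with freq=2.0 under two nested coord(1/2) factors (results 5, 6, and 8 to 15), every factor in the explain tree is identical except fieldNorm, so their relative order follows fieldNorm alone. The sketch below illustrates this for a sample of those documents. It is a hedged reconstruction from the numbers in the listing, and the names `score_22` and `field_norms` are hypothetical.

```python
import math

# Constants copied from the explain output for the term "22".
QUERY_NORM = 0.051211677
IDF_22 = 3.5018296
TF = math.sqrt(2.0)   # tf(freq=2.0)
COORD = 0.5 * 0.5     # two nested coord(1/2) factors

# fieldNorm per document id, as shown in the listing (a sample of the hits).
field_norms = {
    1046: 0.09375,
    611: 0.078125,
    1563: 0.0546875,
    2760: 0.046875,
    2765: 0.0390625,
    3284: 0.03125,
}


def score_22(field_norm):
    """Score of a doc matching only "22": queryWeight * fieldWeight * coord."""
    query_weight = IDF_22 * QUERY_NORM
    field_weight = TF * IDF_22 * field_norm
    return query_weight * field_weight * COORD


scores = {doc: score_22(norm) for doc, norm in field_norms.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for doc in ranked:
    print(doc, f"{scores[doc]:.6f}")
```

    Because fieldNorm in this scoring model shrinks as the indexed field gets longer, shorter records rank higher here even though the term frequency is the same in every hit.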