Search (20 results, page 1 of 1)

  • × theme_ss:"Automatisches Klassifizieren"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.10162766 = sum of:
      0.08091931 = product of:
        0.24275793 = sum of:
          0.24275793 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24275793 = score(doc=562,freq=2.0), product of:
              0.43193975 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.05094824 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.020708349 = product of:
        0.041416697 = sum of:
          0.041416697 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.041416697 = score(doc=562,freq=2.0), product of:
              0.17841205 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05094824 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
    
    Content
    Vgl.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  2. Hung, C.-M.; Chien, L.-F.: Web-based text classification in the absence of manually labeled training documents (2007) 0.02
    0.023198929 = product of:
      0.046397857 = sum of:
        0.046397857 = product of:
          0.092795715 = sum of:
            0.092795715 = weight(_text_:news in 87) [ClassicSimilarity], result of:
              0.092795715 = score(doc=87,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.34747815 = fieldWeight in 87, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.046875 = fieldNorm(doc=87)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Most text classification techniques assume that manually labeled documents (corpora) can be easily obtained while learning text classifiers. However, labeled training documents are sometimes unavailable or inadequate even if they are available. The goal of this article is to present a self-learned approach to extract high-quality training documents from the Web when the required manually labeled documents are unavailable or of poor quality. To learn a text classifier automatically, we need only a set of user-defined categories and some highly related keywords. Extensive experiments are conducted to evaluate the performance of the proposed approach using the test set from the Reuters-21578 news data set. The experiments show that very promising results can be achieved only by using automatically extracted documents from the Web.
  3. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.02
    0.020708349 = product of:
      0.041416697 = sum of:
        0.041416697 = product of:
          0.082833394 = sum of:
            0.082833394 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.082833394 = score(doc=1046,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 14:17:22
  4. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.02
    0.017256958 = product of:
      0.034513917 = sum of:
        0.034513917 = product of:
          0.06902783 = sum of:
            0.06902783 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.06902783 = score(doc=611,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 12:54:24
  5. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.02
    0.017256958 = product of:
      0.034513917 = sum of:
        0.034513917 = product of:
          0.06902783 = sum of:
            0.06902783 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.06902783 = score(doc=2748,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  6. Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.02
    0.01640412 = product of:
      0.03280824 = sum of:
        0.03280824 = product of:
          0.06561648 = sum of:
            0.06561648 = weight(_text_:news in 1253) [ClassicSimilarity], result of:
              0.06561648 = score(doc=1253,freq=4.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.24570416 = fieldWeight in 1253, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=1253)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We are currently experimenting with newsgroups as collections. We have built an initial prototype which automatically classifies and summarizes newsgroups within the LCC. (The prototype can be tested below, and more details may be found at http://pharos.alexandria.ucsb.edu/). The prototype uses electronic library catalog records as a `training set' and Latent Semantic Indexing (LSI) for IR. We use the training set to build a rich set of classification terminology, and associate these terms with the relevant categories in the LCC. This association between terms and classification categories allows us to relate users' queries to nodes in the LCC so that users can select appropriate query categories. Newsgroups are similarly associated with classification categories. Pharos then matches the categories selected by users to relevant newsgroups. In principle, this approach allows users to exclude newsgroups that might have been selected based on an unintended meaning of a query term, and to include newsgroups with relevant content even though the exact query terms may not have been used. This work is extensible to other types of classification, including geographical, temporal, and image feature. Before discussing the methodology of the collection summarization and selection, we first present an online demonstration below. The demonstration is not intended to be a complete end-user interface. Rather, it is intended merely to offer a view of the process to suggest the "look and feel" of the prototype. The demo works as follows. First supply it with a few keywords of interest. The system will then use those terms to try to return to you the most relevant subject categories within the LCC. Assuming that the system recognizes any of your terms (it has over 400,000 terms indexed), it will give you a list of 15 LCC categories sorted by relevancy ranking. From there, you have two choices. The first choice, by clicking on the "News" links, is to get a list of newsgroups which the system has identified as relevant to the LCC category you select. The other choice, by clicking on the LCC ID links, is to enter the LCC hierarchy starting at the category of your choice and navigate the tree until you locate the best category for your query. From there, again, you can get a list of newsgroups by clicking on the "News" links. After having shown this demonstration to many people, we would like to suggest that you first give it easier examples before trying to break it. For example, "prostate cancer" (discussed below), "remote sensing", "investment banking", and "gershwin" all work reasonably well.
  7. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.01
    0.01207987 = product of:
      0.02415974 = sum of:
        0.02415974 = product of:
          0.04831948 = sum of:
            0.04831948 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
              0.04831948 = score(doc=141,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.2708308 = fieldWeight in 141, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=141)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Pages
    S.1-22
  8. Dubin, D.: Dimensions and discriminability (1998) 0.01
    0.01207987 = product of:
      0.02415974 = sum of:
        0.02415974 = product of:
          0.04831948 = sum of:
            0.04831948 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
              0.04831948 = score(doc=2338,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.2708308 = fieldWeight in 2338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2338)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.1997 19:16:05
  9. Automatic classification research at OCLC (2002) 0.01
    0.01207987 = product of:
      0.02415974 = sum of:
        0.02415974 = product of:
          0.04831948 = sum of:
            0.04831948 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.04831948 = score(doc=1563,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 9:22:09
  10. Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.01
    0.01207987 = product of:
      0.02415974 = sum of:
        0.02415974 = product of:
          0.04831948 = sum of:
            0.04831948 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
              0.04831948 = score(doc=1673,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.2708308 = fieldWeight in 1673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1673)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 8.1996 22:08:06
  11. Yoon, Y.; Lee, C.; Lee, G.G.: ¬An effective procedure for constructing a hierarchical text classification system (2006) 0.01
    0.01207987 = product of:
      0.02415974 = sum of:
        0.02415974 = product of:
          0.04831948 = sum of:
            0.04831948 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.04831948 = score(doc=5273,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 16:24:52
  12. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.01
    0.01207987 = product of:
      0.02415974 = sum of:
        0.02415974 = product of:
          0.04831948 = sum of:
            0.04831948 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.04831948 = score(doc=2560,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.2008 18:31:54
  13. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.01
    0.010354174 = product of:
      0.020708349 = sum of:
        0.020708349 = product of:
          0.041416697 = sum of:
            0.041416697 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
              0.041416697 = score(doc=2760,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.23214069 = fieldWeight in 2760, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2760)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 19:11:54
  14. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.01
    0.010354174 = product of:
      0.020708349 = sum of:
        0.020708349 = product of:
          0.041416697 = sum of:
            0.041416697 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
              0.041416697 = score(doc=3051,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.23214069 = fieldWeight in 3051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 19:51:28
  15. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.01
    0.010354174 = product of:
      0.020708349 = sum of:
        0.020708349 = product of:
          0.041416697 = sum of:
            0.041416697 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
              0.041416697 = score(doc=690,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.23214069 = fieldWeight in 690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    23. 3.2013 13:22:36
  16. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.01
    0.010354174 = product of:
      0.020708349 = sum of:
        0.020708349 = product of:
          0.041416697 = sum of:
            0.041416697 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
              0.041416697 = score(doc=2158,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.23214069 = fieldWeight in 2158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2158)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    4. 8.2015 19:22:04
  17. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.01
    0.008628479 = product of:
      0.017256958 = sum of:
        0.017256958 = product of:
          0.034513917 = sum of:
            0.034513917 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
              0.034513917 = score(doc=2765,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.19345059 = fieldWeight in 2765, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2765)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 19:14:43
  18. Liu, R.-L.: ¬A passage extractor for classification of disease aspect information (2013) 0.01
    0.008628479 = product of:
      0.017256958 = sum of:
        0.017256958 = product of:
          0.034513917 = sum of:
            0.034513917 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
              0.034513917 = score(doc=1107,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.19345059 = fieldWeight in 1107, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1107)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    28.10.2013 19:22:57
  19. Khoo, C.S.G.; Ng, K.; Ou, S.: ¬An exploratory study of human clustering of Web pages (2003) 0.01
    0.006902783 = product of:
      0.013805566 = sum of:
        0.013805566 = product of:
          0.027611133 = sum of:
            0.027611133 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
              0.027611133 = score(doc=2741,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.15476047 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    12. 9.2004 9:56:22
  20. Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.01
    0.006902783 = product of:
      0.013805566 = sum of:
        0.013805566 = product of:
          0.027611133 = sum of:
            0.027611133 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
              0.027611133 = score(doc=3284,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.15476047 = fieldWeight in 3284, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3284)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2010 14:41:24