Search (38 results, page 1 of 2)

  • theme_ss:"Automatisches Klassifizieren"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.20
    0.20105061 = product of:
      0.26806748 = sum of:
        0.06427538 = product of:
          0.19282612 = sum of:
            0.19282612 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.19282612 = score(doc=562,freq=2.0), product of:
                0.34309596 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04046892 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.19282612 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.19282612 = score(doc=562,freq=2.0), product of:
            0.34309596 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04046892 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.01096596 = product of:
          0.03289788 = sum of:
            0.03289788 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.03289788 = score(doc=562,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.75 = coord(3/4)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
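The per-term figures in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas that the displayed numbers are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function names are illustrative and are not part of any Lucene API.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, in the form that matches the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # tf(freq=2.0) = 1.4142135
    term_idf = idf(doc_freq, max_docs)         # idf(docFreq=24, maxDocs=44218)
    query_weight = term_idf * query_norm       # 0.34309596 in the listing
    field_weight = tf * term_idf * field_norm  # 0.56201804 in the listing
    return query_weight * field_weight


# Values copied from the weight(_text_:3a in 562) branch.
score = term_score(freq=2.0, doc_freq=24, max_docs=44218,
                   query_norm=0.04046892, field_norm=0.046875)
print(round(score, 4))  # 0.1928
```

Lucene does this arithmetic in 32-bit floats, so a recomputation in Python doubles agrees only to about six significant digits.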
  2. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.10
    0.100737825 = product of:
      0.20147565 = sum of:
        0.19050969 = weight(_text_:register in 2158) [ClassicSimilarity], result of:
          0.19050969 = score(doc=2158,freq=10.0), product of:
            0.22805978 = queryWeight, product of:
              5.6354303 = idf(docFreq=428, maxDocs=44218)
              0.04046892 = queryNorm
            0.8353498 = fieldWeight in 2158, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.6354303 = idf(docFreq=428, maxDocs=44218)
              0.046875 = fieldNorm(doc=2158)
        0.01096596 = product of:
          0.03289788 = sum of:
            0.03289788 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
              0.03289788 = score(doc=2158,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.23214069 = fieldWeight in 2158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2158)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    This paper introduces a project to develop a reliable, cost-effective method for classifying Internet texts into register categories, and apply that approach to the analysis of a large corpus of web documents. To date, the project has proceeded in 2 key phases. First, we developed a bottom-up method for web register classification, asking end users of the web to utilize a decision-tree survey to code relevant situational characteristics of web documents, resulting in a bottom-up identification of register and subregister categories. We present details regarding the development and testing of this method through a series of 10 pilot studies. Then, in the second phase of our project we applied this procedure to a corpus of 53,000 web documents. An analysis of the results demonstrates the effectiveness of these methods for web register classification and provides a preliminary description of the types and distribution of registers on the web.
    Date
    4. 8.2015 19:22:04
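The outer layers of the explain tree combine term scores with a coordination factor: each boolean group sums the scores of its matching clauses and multiplies the sum by coord(matched/total). The sketch below recomputes the total for document 2158 from the two term scores shown in the listing. It is an illustration of that arithmetic under this reading of the tree, not Lucene code, and the helper names are hypothetical.

```python
def coord(matched: int, total: int) -> float:
    """Fraction of a boolean query's clauses that matched the document."""
    return matched / total


def boolean_score(clause_scores: list[float], total_clauses: int) -> float:
    """Sum of the matching clause scores, scaled by the coord factor."""
    return sum(clause_scores) * coord(len(clause_scores), total_clauses)


# Nested group: only the term "22" matched, 1 of 3 clauses -> coord(1/3).
nested = boolean_score([0.03289788], total_clauses=3)

# Top level: "register" plus the nested group matched, 2 of 4 -> coord(2/4).
total = boolean_score([0.19050969, nested], total_clauses=4)

print(round(total, 4))  # 0.1007
```

The same two-level pattern accounts for the other hits on this page; most of them match a single clause out of four, hence the recurring `0.25 = coord(1/4)`.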
  3. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.03
    0.030122228 = product of:
      0.12048891 = sum of:
        0.12048891 = weight(_text_:register in 3015) [ClassicSimilarity], result of:
          0.12048891 = score(doc=3015,freq=4.0), product of:
            0.22805978 = queryWeight, product of:
              5.6354303 = idf(docFreq=428, maxDocs=44218)
              0.04046892 = queryNorm
            0.5283216 = fieldWeight in 3015, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6354303 = idf(docFreq=428, maxDocs=44218)
              0.046875 = fieldNorm(doc=3015)
      0.25 = coord(1/4)
    
    Abstract
    We analyze the linguistic evolution of selected scientific disciplines over a 30-year time span (1970s to 2000s). Our focus is on four highly specialized disciplines at the boundaries of computer science that emerged during that time: computational linguistics, bioinformatics, digital construction, and microelectronics. Our analysis is driven by the question whether these disciplines develop a distinctive language use, both individually and collectively, over the given time period. The data set is the English Scientific Text Corpus (scitex), which includes texts from the 1970s/1980s and early 2000s. Our theoretical basis is register theory. In terms of methods, we combine corpus-based methods of feature extraction (various aggregated features [part-of-speech based], n-grams, lexico-grammatical patterns) and automatic text classification. The results of our research are directly relevant to the study of linguistic variation and languages for specific purposes (LSP) and have implications for various natural language processing (NLP) tasks, for example, authorship attribution, text mining, or training NLP tools.
  4. Panyr, J.: STEINADLER: ein Verfahren zur automatischen Deskribierung und zur automatischen thematischen Klassifikation (1978) 0.01
    0.0073770015 = product of:
      0.029508006 = sum of:
        0.029508006 = product of:
          0.088524014 = sum of:
            0.088524014 = weight(_text_:29 in 5169) [ClassicSimilarity], result of:
              0.088524014 = score(doc=5169,freq=2.0), product of:
                0.142357 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04046892 = queryNorm
                0.6218451 = fieldWeight in 5169, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.125 = fieldNorm(doc=5169)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Source
    Nachrichten für Dokumentation. 29(1978), pp. 92-96
  5. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01
    0.00548298 = product of:
      0.02193192 = sum of:
        0.02193192 = product of:
          0.06579576 = sum of:
            0.06579576 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.06579576 = score(doc=1046,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    5. 5.2003 14:17:22
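Several hits on this page match the same term ("22") with the same frequency and still receive different scores. The only factor that differs between them is fieldNorm, Lucene's per-document field length normalization, which is generally larger for shorter fields. The sketch below holds everything else fixed and varies fieldNorm, using the values shown for documents 562, 1046, 611 and 141. The formula is the same assumed TF-IDF product as in the explain trees, and the names are illustrative.

```python
import math

IDF_22 = 3.5018296       # idf(docFreq=3622, maxDocs=44218), from the listing
QUERY_NORM = 0.04046892  # queryNorm, identical for every hit of this query


def term_score(freq: float, idf: float, query_norm: float, field_norm: float) -> float:
    """queryWeight * fieldWeight for a single term."""
    return (idf * query_norm) * (math.sqrt(freq) * idf * field_norm)


# fieldNorm values as displayed for four documents matching "22" with freq=2.0.
field_norms = {562: 0.046875, 1046: 0.09375, 611: 0.078125, 141: 0.0546875}

scores = {doc: term_score(2.0, IDF_22, QUERY_NORM, norm)
          for doc, norm in field_norms.items()}

for doc, value in scores.items():
    print(doc, round(value, 4))
```

Because the score is linear in fieldNorm, document 1046 (fieldNorm 0.09375) scores exactly twice as high for this term as document 562 (fieldNorm 0.046875).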
  6. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.00
    0.00456915 = product of:
      0.0182766 = sum of:
        0.0182766 = product of:
          0.0548298 = sum of:
            0.0548298 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.0548298 = score(doc=611,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 8.2009 12:54:24
  7. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.00
    0.00456915 = product of:
      0.0182766 = sum of:
        0.0182766 = product of:
          0.0548298 = sum of:
            0.0548298 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.0548298 = score(doc=2748,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    1. 2.2016 18:25:22
  8. Savic, D.: Designing an expert system for classifying office documents (1994) 0.00
    0.0036885007 = product of:
      0.014754003 = sum of:
        0.014754003 = product of:
          0.044262007 = sum of:
            0.044262007 = weight(_text_:29 in 2655) [ClassicSimilarity], result of:
              0.044262007 = score(doc=2655,freq=2.0), product of:
                0.142357 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04046892 = queryNorm
                0.31092256 = fieldWeight in 2655, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2655)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Source
    Records management quarterly. 28(1994) no.3, pp. 20-29
  9. Savic, D.: Automatic classification of office documents : review of available methods and techniques (1995) 0.00
    0.0032274378 = product of:
      0.012909751 = sum of:
        0.012909751 = product of:
          0.038729254 = sum of:
            0.038729254 = weight(_text_:29 in 2219) [ClassicSimilarity], result of:
              0.038729254 = score(doc=2219,freq=2.0), product of:
                0.142357 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04046892 = queryNorm
                0.27205724 = fieldWeight in 2219, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2219)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Source
    Records management quarterly. 29(1995) no.4, pp. 3-18
  10. Ruocco, A.S.; Frieder, O.: Clustering and classification of large document bases in a parallel environment (1997) 0.00
    0.0032274378 = product of:
      0.012909751 = sum of:
        0.012909751 = product of:
          0.038729254 = sum of:
            0.038729254 = weight(_text_:29 in 1661) [ClassicSimilarity], result of:
              0.038729254 = score(doc=1661,freq=2.0), product of:
                0.142357 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04046892 = queryNorm
                0.27205724 = fieldWeight in 1661, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1661)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    29. 7.1998 17:45:02
  11. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.00
    0.0032274378 = product of:
      0.012909751 = sum of:
        0.012909751 = product of:
          0.038729254 = sum of:
            0.038729254 = weight(_text_:29 in 1595) [ClassicSimilarity], result of:
              0.038729254 = score(doc=1595,freq=2.0), product of:
                0.142357 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04046892 = queryNorm
                0.27205724 = fieldWeight in 1595, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1595)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    11. 5.2003 18:29:44
  12. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.00
    0.0031984048 = product of:
      0.012793619 = sum of:
        0.012793619 = product of:
          0.038380858 = sum of:
            0.038380858 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
              0.038380858 = score(doc=141,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.2708308 = fieldWeight in 141, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=141)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Pages
    pp. 1-22
  13. Dubin, D.: Dimensions and discriminability (1998) 0.00
    0.0031984048 = product of:
      0.012793619 = sum of:
        0.012793619 = product of:
          0.038380858 = sum of:
            0.038380858 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
              0.038380858 = score(doc=2338,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.2708308 = fieldWeight in 2338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2338)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 9.1997 19:16:05
  14. Automatic classification research at OCLC (2002) 0.00
    0.0031984048 = product of:
      0.012793619 = sum of:
        0.012793619 = product of:
          0.038380858 = sum of:
            0.038380858 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.038380858 = score(doc=1563,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    5. 5.2003 9:22:09
  15. Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.00
    0.0031984048 = product of:
      0.012793619 = sum of:
        0.012793619 = product of:
          0.038380858 = sum of:
            0.038380858 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
              0.038380858 = score(doc=1673,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.2708308 = fieldWeight in 1673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1673)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    1. 8.1996 22:08:06
  16. Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.00
    0.0031984048 = product of:
      0.012793619 = sum of:
        0.012793619 = product of:
          0.038380858 = sum of:
            0.038380858 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.038380858 = score(doc=5273,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 7.2006 16:24:52
  17. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.00
    0.0031984048 = product of:
      0.012793619 = sum of:
        0.012793619 = product of:
          0.038380858 = sum of:
            0.038380858 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.038380858 = score(doc=2560,freq=2.0), product of:
                0.14171526 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04046892 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 9.2008 18:31:54
  18. Drori, O.; Alon, N.: Using document classification for displaying search results (2003) 0.00
    0.0027663754 = product of:
      0.011065502 = sum of:
        0.011065502 = product of:
          0.033196505 = sum of:
            0.033196505 = weight(_text_:29 in 1565) [ClassicSimilarity], result of:
              0.033196505 = score(doc=1565,freq=2.0), product of:
                0.142357 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04046892 = queryNorm
                0.23319192 = fieldWeight in 1565, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1565)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Source
    Journal of information science. 29(2003) no.2, pp. 97-106
  19. Chung, Y.-M.; Noh, Y.-H.: Developing a specialized directory system by automatically classifying Web documents (2003) 0.00
    0.0027663754 = product of:
      0.011065502 = sum of:
        0.011065502 = product of:
          0.033196505 = sum of:
            0.033196505 = weight(_text_:29 in 1566) [ClassicSimilarity], result of:
              0.033196505 = score(doc=1566,freq=2.0), product of:
                0.142357 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04046892 = queryNorm
                0.23319192 = fieldWeight in 1566, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1566)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Source
    Journal of information science. 29(2003) no.2, pp. 117-126
  20. Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.00
    0.0027663754 = product of:
      0.011065502 = sum of:
        0.011065502 = product of:
          0.033196505 = sum of:
            0.033196505 = weight(_text_:29 in 3464) [ClassicSimilarity], result of:
              0.033196505 = score(doc=3464,freq=2.0), product of:
                0.142357 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04046892 = queryNorm
                0.23319192 = fieldWeight in 3464, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3464)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    1. 6.2010 9:29:57