Search (21 results, page 1 of 2)

  • theme_ss:"Automatisches Klassifizieren"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.18
    0.18370643 = product of:
      0.24494192 = sum of:
        0.05873049 = product of:
          0.17619146 = sum of:
            0.17619146 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.17619146 = score(doc=562,freq=2.0), product of:
                0.31349787 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03697776 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.17619146 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.17619146 = score(doc=562,freq=2.0), product of:
            0.31349787 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03697776 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.010019952 = product of:
          0.030059857 = sum of:
            0.030059857 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.030059857 = score(doc=562,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.75 = coord(3/4)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  2. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.02
    0.017192384 = product of:
      0.06876954 = sum of:
        0.06876954 = weight(_text_:evolution in 3015) [ClassicSimilarity], result of:
          0.06876954 = score(doc=3015,freq=2.0), product of:
            0.19585751 = queryWeight, product of:
              5.29663 = idf(docFreq=601, maxDocs=44218)
              0.03697776 = queryNorm
            0.35112026 = fieldWeight in 3015, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.29663 = idf(docFreq=601, maxDocs=44218)
              0.046875 = fieldNorm(doc=3015)
      0.25 = coord(1/4)
    
    Abstract
    We analyze the linguistic evolution of selected scientific disciplines over a 30-year time span (1970s to 2000s). Our focus is on four highly specialized disciplines at the boundaries of computer science that emerged during that time: computational linguistics, bioinformatics, digital construction, and microelectronics. Our analysis is driven by the question of whether these disciplines develop a distinctive language use, both individually and collectively, over the given time period. The data set is the English Scientific Text Corpus (scitex), which includes texts from the 1970s/1980s and early 2000s. Our theoretical basis is register theory. In terms of methods, we combine corpus-based methods of feature extraction (various aggregated features [part-of-speech based], n-grams, lexico-grammatical patterns) and automatic text classification. The results of our research are directly relevant to the study of linguistic variation and languages for specific purposes (LSP) and have implications for various natural language processing (NLP) tasks, for example, authorship attribution, text mining, or training NLP tools.
  3. Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.01
    0.014326988 = product of:
      0.05730795 = sum of:
        0.05730795 = weight(_text_:evolution in 1853) [ClassicSimilarity], result of:
          0.05730795 = score(doc=1853,freq=2.0), product of:
            0.19585751 = queryWeight, product of:
              5.29663 = idf(docFreq=601, maxDocs=44218)
              0.03697776 = queryNorm
            0.2926002 = fieldWeight in 1853, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.29663 = idf(docFreq=601, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1853)
      0.25 = coord(1/4)
    
    Abstract
    In a scientific and technological watch (STW) task, an expert user needs to survey the evolution of research topics in his area of specialisation in order to detect interesting changes. The majority of methods proposing evaluation metrics (bibliometrics and scientometrics studies) for STW rely solely on statistical data analysis methods (co-citation analysis, co-word analysis). Such methods usually work on structured databases where the units of analysis (words, keywords) are already attributed to documents by human indexers. The advent of huge amounts of unstructured textual data has rendered necessary the integration of natural language processing (NLP) techniques to first extract meaningful units from texts. We propose a method for STW which is NLP-oriented. The method not only analyses texts linguistically in order to extract terms from them, but also uses linguistic relations (syntactic variations) as the basis for clustering. Terms and variation relations are formalised as weighted di-graphs which the clustering algorithm, CPCL (Classification by Preferential Clustered Link), will seek to reduce in order to produce classes. These classes ideally represent the research topics present in the corpus. The results of the classification are subjected to validation by an expert in STW.
  4. Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.01
    0.014326988 = product of:
      0.05730795 = sum of:
        0.05730795 = weight(_text_:evolution in 3627) [ClassicSimilarity], result of:
          0.05730795 = score(doc=3627,freq=2.0), product of:
            0.19585751 = queryWeight, product of:
              5.29663 = idf(docFreq=601, maxDocs=44218)
              0.03697776 = queryNorm
            0.2926002 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.29663 = idf(docFreq=601, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
      0.25 = coord(1/4)
    
  5. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01
    0.005009976 = product of:
      0.020039905 = sum of:
        0.020039905 = product of:
          0.060119715 = sum of:
            0.060119715 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.060119715 = score(doc=1046,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    5. 5.2003 14:17:22
  6. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.00
    0.0041749803 = product of:
      0.016699921 = sum of:
        0.016699921 = product of:
          0.050099764 = sum of:
            0.050099764 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.050099764 = score(doc=611,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 8.2009 12:54:24
  7. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.00
    0.0041749803 = product of:
      0.016699921 = sum of:
        0.016699921 = product of:
          0.050099764 = sum of:
            0.050099764 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.050099764 = score(doc=2748,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    1. 2.2016 18:25:22
  8. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
              0.035069834 = score(doc=141,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 141, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=141)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Pages
    S.1-22
  9. Dubin, D.: Dimensions and discriminability (1998) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
              0.035069834 = score(doc=2338,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 2338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2338)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 9.1997 19:16:05
  10. Automatic classification research at OCLC (2002) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.035069834 = score(doc=1563,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    5. 5.2003 9:22:09
  11. Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
              0.035069834 = score(doc=1673,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 1673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1673)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    1. 8.1996 22:08:06
  12. Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.035069834 = score(doc=5273,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 7.2006 16:24:52
  13. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.035069834 = score(doc=2560,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 9.2008 18:31:54
  14. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.00
    0.002504988 = product of:
      0.010019952 = sum of:
        0.010019952 = product of:
          0.030059857 = sum of:
            0.030059857 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
              0.030059857 = score(doc=2760,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.23214069 = fieldWeight in 2760, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2760)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 3.2009 19:11:54
  15. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.00
    0.002504988 = product of:
      0.010019952 = sum of:
        0.010019952 = product of:
          0.030059857 = sum of:
            0.030059857 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
              0.030059857 = score(doc=3051,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.23214069 = fieldWeight in 3051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3051)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 8.2009 19:51:28
  16. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.00
    0.002504988 = product of:
      0.010019952 = sum of:
        0.010019952 = product of:
          0.030059857 = sum of:
            0.030059857 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
              0.030059857 = score(doc=690,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.23214069 = fieldWeight in 690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    23. 3.2013 13:22:36
  17. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.00
    0.002504988 = product of:
      0.010019952 = sum of:
        0.010019952 = product of:
          0.030059857 = sum of:
            0.030059857 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
              0.030059857 = score(doc=2158,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.23214069 = fieldWeight in 2158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2158)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    4. 8.2015 19:22:04
  18. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.00
    0.0020874902 = product of:
      0.008349961 = sum of:
        0.008349961 = product of:
          0.025049882 = sum of:
            0.025049882 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
              0.025049882 = score(doc=2765,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.19345059 = fieldWeight in 2765, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2765)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 3.2009 19:14:43
  19. Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.00
    0.0020874902 = product of:
      0.008349961 = sum of:
        0.008349961 = product of:
          0.025049882 = sum of:
            0.025049882 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
              0.025049882 = score(doc=1107,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.19345059 = fieldWeight in 1107, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1107)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    28.10.2013 19:22:57
  20. Khoo, C.S.G.; Ng, K.; Ou, S.: An exploratory study of human clustering of Web pages (2003) 0.00
    0.0016699921 = product of:
      0.0066799684 = sum of:
        0.0066799684 = product of:
          0.020039905 = sum of:
            0.020039905 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
              0.020039905 = score(doc=2741,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.15476047 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    12. 9.2004 9:56:22
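
Note on the score explanations: the trees above follow the TF-IDF arithmetic of Lucene's ClassicSimilarity, where tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. As a minimal sketch, the Python below recomputes the score of result 1 (doc 562) from the numbers printed in the listing. The function names (tf, idf, term_score) are illustrative and not part of any Lucene API, and queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math


def tf(freq: float) -> float:
    # ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, as in each "weight(...)" node.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# Values copied from the explanation of result 1 (doc 562).
MAX_DOCS = 44218
QUERY_NORM = 0.03697776
FIELD_NORM = 0.046875

score_3a = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)    # _text_:3a
score_2f = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)    # _text_:2f
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # _text_:22

# Nested clauses carry coord(1/3); the outer sum carries coord(3/4).
total = (score_3a * (1 / 3) + score_2f + score_22 * (1 / 3)) * (3 / 4)

print(round(total, 4))  # 0.1837, matching the 0.18 shown for result 1
```

The same term_score call with the "22" term and a different fieldNorm reproduces the lower-ranked entries; for example, fieldNorm 0.09375 gives the 0.0601 leaf score shown for result 5.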