Search (9 results, page 1 of 1)

  • × language_ss:"e"
  • × theme_ss:"Automatisches Klassifizieren"
  • × year_i:[2000 TO 2010}
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.18
    0.18370643 = product of:
      0.24494192 = sum of:
        0.05873049 = product of:
          0.17619146 = sum of:
            0.17619146 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.17619146 = score(doc=562,freq=2.0), product of:
                0.31349787 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03697776 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.17619146 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.17619146 = score(doc=562,freq=2.0), product of:
            0.31349787 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03697776 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.010019952 = product of:
          0.030059857 = sum of:
            0.030059857 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.030059857 = score(doc=562,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.75 = coord(3/4)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  2. Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.01
    0.014326988 = product of:
      0.05730795 = sum of:
        0.05730795 = weight(_text_:evolution in 1853) [ClassicSimilarity], result of:
          0.05730795 = score(doc=1853,freq=2.0), product of:
            0.19585751 = queryWeight, product of:
              5.29663 = idf(docFreq=601, maxDocs=44218)
              0.03697776 = queryNorm
            0.2926002 = fieldWeight in 1853, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.29663 = idf(docFreq=601, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1853)
      0.25 = coord(1/4)
    
    Abstract
    In a scientific and technological watch (STW) task, an expert user needs to survey the evolution of research topics in his area of specialisation in order to detect interesting changes. The majority of methods proposing evaluation metrics (bibliometrics and scientometrics studies) for STW rely solely on statistical data analysis methods (co-citation analysis, co-word analysis). Such methods usually work on structured databases where the units of analysis (words, keywords) are already attributed to documents by human indexers. The advent of huge amounts of unstructured textual data has rendered necessary the integration of natural language processing (NLP) techniques to first extract meaningful units from texts. We propose a method for STW which is NLP-oriented. The method not only analyses texts linguistically in order to extract terms from them, but also uses linguistic relations (syntactic variations) as the basis for clustering. Terms and variation relations are formalised as weighted di-graphs which the clustering algorithm, CPCL (Classification by Preferential Clustered Link), will seek to reduce in order to produce classes. These classes ideally represent the research topics present in the corpus. The results of the classification are subjected to validation by an expert in STW.
  3. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01
    0.005009976 = product of:
      0.020039905 = sum of:
        0.020039905 = product of:
          0.060119715 = sum of:
            0.060119715 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.060119715 = score(doc=1046,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    5. 5.2003 14:17:22
  4. Automatic classification research at OCLC (2002) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.035069834 = score(doc=1563,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    5. 5.2003 9:22:09
  5. Yoon, Y.; Lee, C.; Lee, G.G.: ¬An effective procedure for constructing a hierarchical text classification system (2006) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.035069834 = score(doc=5273,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 7.2006 16:24:52
  6. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.00
    0.0029224863 = product of:
      0.011689945 = sum of:
        0.011689945 = product of:
          0.035069834 = sum of:
            0.035069834 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.035069834 = score(doc=2560,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 9.2008 18:31:54
  7. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.00
    0.002504988 = product of:
      0.010019952 = sum of:
        0.010019952 = product of:
          0.030059857 = sum of:
            0.030059857 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
              0.030059857 = score(doc=2760,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.23214069 = fieldWeight in 2760, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2760)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 3.2009 19:11:54
  8. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.00
    0.0020874902 = product of:
      0.008349961 = sum of:
        0.008349961 = product of:
          0.025049882 = sum of:
            0.025049882 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
              0.025049882 = score(doc=2765,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.19345059 = fieldWeight in 2765, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2765)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 3.2009 19:14:43
  9. Khoo, C.S.G.; Ng, K.; Ou, S.: ¬An exploratory study of human clustering of Web pages (2003) 0.00
    0.0016699921 = product of:
      0.0066799684 = sum of:
        0.0066799684 = product of:
          0.020039905 = sum of:
            0.020039905 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
              0.020039905 = score(doc=2741,freq=2.0), product of:
                0.12948982 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03697776 = queryNorm
                0.15476047 = fieldWeight in 2741, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2741)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    12. 9.2004 9:56:22
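
The score trees above all follow the same arithmetic, so a small sketch may help when reading them. It assumes the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and takes queryNorm and fieldNorm directly from the explain output, since both depend on the full query and on index-time data not shown here. The function names are hypothetical and are not part of any Lucene or Solr API.

```python
import math


def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, field_norm, query_norm):
    # score = queryWeight * fieldWeight, as in each "weight(...)" node.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


def coord(overlap, max_overlap):
    # Fraction of query clauses that matched the document.
    return overlap / max_overlap


QUERY_NORM = 0.03697776  # copied from the explain output
MAX_DOCS = 44218

# Result 3 (doc 1046): one matching term "22", nested coord(1/3) and coord(1/4).
score_3 = (
    term_score(2.0, 3622, MAX_DOCS, 0.09375, QUERY_NORM)
    * coord(1, 3)
    * coord(1, 4)
)

# Result 1 (doc 562): terms "3a", "2f" and "22", outer coord(3/4).
t_3a = term_score(2.0, 24, MAX_DOCS, 0.046875, QUERY_NORM)
t_2f = term_score(2.0, 24, MAX_DOCS, 0.046875, QUERY_NORM)
t_22 = term_score(2.0, 3622, MAX_DOCS, 0.046875, QUERY_NORM)
score_1 = (t_3a * coord(1, 3) + t_2f + t_22 * coord(1, 3)) * coord(3, 4)

print(score_3, score_1)
```

Under these assumptions the two values should land close to the 0.005009976 and 0.18370643 reported for results 3 and 1. Small differences in the last digits are expected, because Lucene computes these figures in single precision.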