Search (9 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Automatisches Klassifizieren"
  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.05
    0.0489438 = product of:
      0.081572995 = sum of:
        0.060778037 = product of:
          0.18233411 = sum of:
            0.18233411 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.18233411 = score(doc=562,freq=2.0), product of:
                0.32442752 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03826694 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.0052410355 = weight(_text_:e in 562) [ClassicSimilarity], result of:
          0.0052410355 = score(doc=562,freq=2.0), product of:
            0.055003747 = queryWeight, product of:
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.03826694 = queryNorm
            0.09528506 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.43737 = idf(docFreq=28552, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.015553925 = product of:
          0.03110785 = sum of:
            0.03110785 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.03110785 = score(doc=562,freq=2.0), product of:
                0.1340043 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03826694 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
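    The breakdown above is Lucene's ClassicSimilarity (TF-IDF) explain output. As a minimal sketch, the leaf weight for the rare term `_text_:3a` can be reproduced from its components; the constants (docFreq, maxDocs, queryNorm, fieldNorm, freq) are copied directly from the explanation above, and this is an illustration of the formula, not Lucene itself:

    ```python
    import math

    # ClassicSimilarity building blocks, reproducing
    # weight(_text_:3a in 562) from the explain tree above.

    def idf(doc_freq: int, max_docs: int) -> float:
        # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
        return 1.0 + math.log(max_docs / (doc_freq + 1))

    def tf(freq: float) -> float:
        # ClassicSimilarity: tf = sqrt(freq)
        return math.sqrt(freq)

    query_norm = 0.03826694   # queryNorm from the explain output
    field_norm = 0.046875     # fieldNorm(doc=562)

    i = idf(24, 44218)                       # ≈ 8.478011
    query_weight = i * query_norm            # ≈ 0.32442752
    field_weight = tf(2.0) * i * field_norm  # ≈ 0.56201804
    score = query_weight * field_weight      # ≈ 0.18233411
    ```

    The coord factors in the tree (e.g. coord(1/3), coord(3/5)) then scale each sum by the fraction of query clauses that actually matched the document.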
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8.1.2013 10:22:32
    Language
    e
  2. Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.01
    
    Abstract
    In a scientific and technological watch (STW) task, an expert user needs to survey the evolution of research topics in his area of specialisation in order to detect interesting changes. The majority of methods proposing evaluation metrics (bibliometrics and scientometrics studies) for STW rely solely on statistical data analysis methods (co-citation analysis, co-word analysis). Such methods usually work on structured databases where the units of analysis (words, keywords) are already attributed to documents by human indexers. The advent of huge amounts of unstructured textual data has rendered necessary the integration of natural language processing (NLP) techniques to first extract meaningful units from texts. We propose a method for STW which is NLP-oriented. The method not only analyses texts linguistically in order to extract terms from them, but also uses linguistic relations (syntactic variations) as the basis for clustering. Terms and variation relations are formalised as weighted di-graphs, which the clustering algorithm, CPCL (Classification by Preferential Clustered Link), will seek to reduce in order to produce classes. These classes ideally represent the research topics present in the corpus. The results of the classification are subjected to validation by an expert in STW.
    Language
    e
  3. Zhang, X.: Rough set theory based automatic text categorization (2005) 0.00
    
    Language
    e
  4. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.00
    
    Language
    e
  5. Sebastiani, F.: Machine learning in automated text categorization (2002) 0.00
    
    Language
    e
  6. Sebastiani, F.: ¬A tutorial on automated text categorisation (1999) 0.00
    
    Language
    e
  7. Duwairi, R.M.: Machine learning for Arabic text categorization (2006) 0.00
    
    Language
    e
  8. Ko, Y.: ¬A new term-weighting scheme for text classification using the odds of positive and negative class probabilities (2015) 0.00
    
    Language
    e
  9. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.00
    
    Language
    e