Search (13 results, page 1 of 1)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.23

0.23485142 = product of:
  0.31313524 = sum of:
    0.0735765 = product of:
      0.2207295 = sum of:
        0.2207295 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.2207295 = score(doc=562,freq=2.0), product of:
            0.3927445 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046325076 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
    0.2207295 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.2207295 = score(doc=562,freq=2.0), product of:
        0.3927445 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046325076 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.018829225 = product of:
      0.03765845 = sum of:
        0.03765845 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.03765845 = score(doc=562,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.5 = coord(1/2)
  0.75 = coord(3/4)

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
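
The score breakdown for this hit can be recomputed by hand. The sketch below is only an illustration, not the catalog's actual code: it assumes the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the idf and tf values shown in the listing. The function names are hypothetical; queryNorm, fieldNorm and the coord fractions are copied from the explanation rather than derived.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed formula; it matches e.g. idf(docFreq=24, maxDocs=44218) = 8.478011
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    # Assumed formula; it matches tf(freq=2.0) = 1.4142135
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """score = queryWeight * fieldWeight, as in the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Constants copied from the listing for doc 562
MAX_DOCS = 44218
QUERY_NORM = 0.046325076
FIELD_NORM = 0.046875

s_3a = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)    # term "3a"
s_2f = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)    # term "2f"
s_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # term "22"

# Combine the clauses with the coord factors from the listing:
# coord(1/3) on "3a", coord(1/2) on "22", coord(3/4) on the whole query.
total = (s_3a * (1 / 3) + s_2f + s_22 * (1 / 2)) * (3 / 4)
print(total)
```

Up to single-precision rounding, the printed total should agree with the 0.23485142 shown at the top of the explanation.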

Wu, K.J.; Chen, M.-C.; Sun, Y.: Automatic topics discovery from hyperlinked documents (2004) 0.01
```
0.01220762 = product of:
  0.04883048 = sum of:
    0.04883048 = weight(_text_:social in 2563) [ClassicSimilarity], result of:
      0.04883048 = score(doc=2563,freq=2.0), product of:
        0.1847249 = queryWeight, product of:
          3.9875789 = idf(docFreq=2228, maxDocs=44218)
          0.046325076 = queryNorm
        0.26434162 = fieldWeight in 2563, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9875789 = idf(docFreq=2228, maxDocs=44218)
          0.046875 = fieldNorm(doc=2563)
  0.25 = coord(1/4)
```
Abstract

Topic discovery is an important means for marketing, e-Business and social science studies. It can also be applied to various purposes, such as identifying a group with certain properties and observing the emergence and decline of a certain cyber community. Previous topic discovery work (J.M. Kleinberg, Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, p. 668) requires manual judgment of the usefulness of outcomes and is thus incapable of handling the explosive growth of the Internet. In this paper, we propose the Automatic Topic Discovery (ATD) method, which combines a method of base set construction, a clustering algorithm and an iterative principal eigenvector computation method to discover the topics relevant to a given query without manual examination. Given a query, ATD returns the topics associated with the query and the top representative pages for each topic. Our experiments show that the ATD method performs better than the traditional eigenvector method in terms of computation time and topic discovery quality.

Giorgetti, D.; Sebastiani, F.: Automating survey coding by multiclass text categorization techniques (2003) 0.01
```
0.010173016 = product of:
  0.040692065 = sum of:
    0.040692065 = weight(_text_:social in 5172) [ClassicSimilarity], result of:
      0.040692065 = score(doc=5172,freq=2.0), product of:
        0.1847249 = queryWeight, product of:
          3.9875789 = idf(docFreq=2228, maxDocs=44218)
          0.046325076 = queryNorm
        0.22028469 = fieldWeight in 5172, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9875789 = idf(docFreq=2228, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5172)
  0.25 = coord(1/4)
```
Abstract

In this issue, Giorgetti and Sebastiani suggest that answers to open-ended questions in survey instruments can be coded automatically by creating classifiers which learn from training sets of manually coded answers. The manual effort required is only that of classifying a representative set of documents, not creating a dictionary of words that trigger an assignment. They use a naive Bayesian probabilistic learner from McCallum's RAINBOW package and the multi-class support vector machine learner from Hsu and Lin's BSVM package, both examples of text categorization techniques. Data from the 1996 General Social Survey by the U.S. National Opinion Research Center provided a set of answers to three questions (previously tested by Viechnicki using a dictionary approach), their associated manually assigned category codes, and a complete set of predefined category codes. The learners were run on three random disjoint subsets of the answer sets to create the classifiers, and a remaining set was used as a test set. The dictionary approach is outperformed by 18% for RAINBOW and by 17% for BSVM, while the standard deviation of the results is reduced by 28% and 34%, respectively, over the dictionary approach.

Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.01

0.009414612 = product of:
  0.03765845 = sum of:
    0.03765845 = product of:
      0.0753169 = sum of:
        0.0753169 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
          0.0753169 = score(doc=1046,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.46428138 = fieldWeight in 1046, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=1046)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 5. 5.2003 14:17:22
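
Each explanation in this listing is a tree: a line ending in "product of:" or "sum of:" should equal the product or sum of the lines indented directly beneath it. As a rough sketch (the `Node`, `parse_explain` and `check` names are hypothetical, and the two-space indentation step and the tf = sqrt(freq) rule are assumptions read off the listing), the tree for doc 1046 can be parsed and checked for internal consistency like this:

```python
import math
from dataclasses import dataclass, field

EXPLAIN = """\
0.009414612 = product of:
  0.03765845 = sum of:
    0.03765845 = product of:
      0.0753169 = sum of:
        0.0753169 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
          0.0753169 = score(doc=1046,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.46428138 = fieldWeight in 1046, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=1046)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
"""

@dataclass
class Node:
    value: float
    label: str
    children: list = field(default_factory=list)

def parse_explain(text: str) -> Node:
    """Build a tree from indented 'value = label' lines (2 spaces per level)."""
    root, stack = None, []  # stack holds (depth, node) for the open ancestors
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // 2
        value_str, label = line.strip().split(" = ", 1)
        node = Node(float(value_str), label)
        while stack and stack[-1][0] >= depth:
            stack.pop()  # close siblings and deeper levels
        if stack:
            stack[-1][1].children.append(node)
        else:
            root = node
        stack.append((depth, node))
    return root

def check(node: Node) -> list:
    """Return labels of nodes whose value disagrees with their children."""
    kids = [c.value for c in node.children]
    if node.label.endswith("product of:"):
        expected = math.prod(kids)
    elif node.label.endswith("sum of:"):
        expected = sum(kids)
    elif node.label.endswith("result of:"):
        expected = kids[0]
    elif node.label.endswith("with freq of:"):
        expected = math.sqrt(kids[0])  # assumed tf = sqrt(termFreq)
    else:
        expected = node.value  # leaf: nothing to verify
    bad = [] if math.isclose(node.value, expected, rel_tol=1e-5) else [node.label]
    for child in node.children:
        bad.extend(check(child))
    return bad

tree = parse_explain(EXPLAIN)
mismatches = check(tree)
print(mismatches)
```

The relative tolerance allows for the single-precision rounding of the displayed values; an empty list means every node is consistent with its children.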

Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.01

0.007845511 = product of:
  0.031382043 = sum of:
    0.031382043 = product of:
      0.062764086 = sum of:
        0.062764086 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
          0.062764086 = score(doc=611,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.38690117 = fieldWeight in 611, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=611)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 8.2009 12:54:24

Automatic classification research at OCLC (2002) 0.01

0.005491857 = product of:
  0.021967428 = sum of:
    0.021967428 = product of:
      0.043934856 = sum of:
        0.043934856 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
          0.043934856 = score(doc=1563,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.2708308 = fieldWeight in 1563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1563)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 5. 5.2003 9:22:09

Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.01

0.005491857 = product of:
  0.021967428 = sum of:
    0.021967428 = product of:
      0.043934856 = sum of:
        0.043934856 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
          0.043934856 = score(doc=5273,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.2708308 = fieldWeight in 5273, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5273)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 7.2006 16:24:52

Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.01

0.005491857 = product of:
  0.021967428 = sum of:
    0.021967428 = product of:
      0.043934856 = sum of:
        0.043934856 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
          0.043934856 = score(doc=2560,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.2708308 = fieldWeight in 2560, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2560)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 9.2008 18:31:54

Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.00

0.004707306 = product of:
  0.018829225 = sum of:
    0.018829225 = product of:
      0.03765845 = sum of:
        0.03765845 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
          0.03765845 = score(doc=2760,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.23214069 = fieldWeight in 2760, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2760)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 3.2009 19:11:54

Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.00

0.004707306 = product of:
  0.018829225 = sum of:
    0.018829225 = product of:
      0.03765845 = sum of:
        0.03765845 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
          0.03765845 = score(doc=3051,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.23214069 = fieldWeight in 3051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3051)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 8.2009 19:51:28

Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.00

0.0039227554 = product of:
  0.015691021 = sum of:
    0.015691021 = product of:
      0.031382043 = sum of:
        0.031382043 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
          0.031382043 = score(doc=2765,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.19345059 = fieldWeight in 2765, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2765)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 3.2009 19:14:43

Khoo, C.S.G.; Ng, K.; Ou, S.: An exploratory study of human clustering of Web pages (2003) 0.00

0.003138204 = product of:
  0.012552816 = sum of:
    0.012552816 = product of:
      0.025105633 = sum of:
        0.025105633 = weight(_text_:22 in 2741) [ClassicSimilarity], result of:
          0.025105633 = score(doc=2741,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.15476047 = fieldWeight in 2741, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2741)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 12. 9.2004 9:56:22

Reiner, U.: Automatische DDC-Klassifizierung bibliografischer Titeldatensätze der Deutschen Nationalbibliografie (2009) 0.00

0.003138204 = product of:
  0.012552816 = sum of:
    0.012552816 = product of:
      0.025105633 = sum of:
        0.025105633 = weight(_text_:22 in 3284) [ClassicSimilarity], result of:
          0.025105633 = score(doc=3284,freq=2.0), product of:
            0.16222252 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046325076 = queryNorm
            0.15476047 = fieldWeight in 3284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=3284)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 1.2010 14:41:24
