Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004)
0.31
0.3089268 = product of:
0.6178536 = sum of:
0.023380058 = product of:
0.11690029 = sum of:
0.11690029 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
0.11690029 = score(doc=562,freq=2.0), product of:
0.20800096 = queryWeight, product of:
8.478011 = idf(docFreq=24, maxDocs=44218)
0.02453417 = queryNorm
0.56201804 = fieldWeight in 562, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
8.478011 = idf(docFreq=24, maxDocs=44218)
0.046875 = fieldNorm(doc=562)
0.2 = coord(1/5)
0.11690029 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
0.11690029 = score(doc=562,freq=2.0), product of:
0.20800096 = queryWeight, product of:
8.478011 = idf(docFreq=24, maxDocs=44218)
0.02453417 = queryNorm
0.56201804 = fieldWeight in 562, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
8.478011 = idf(docFreq=24, maxDocs=44218)
0.046875 = fieldNorm(doc=562)
0.11690029 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
0.11690029 = score(doc=562,freq=2.0), product of:
0.20800096 = queryWeight, product of:
8.478011 = idf(docFreq=24, maxDocs=44218)
0.02453417 = queryNorm
0.56201804 = fieldWeight in 562, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
8.478011 = idf(docFreq=24, maxDocs=44218)
0.046875 = fieldNorm(doc=562)
0.11690029 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
0.11690029 = score(doc=562,freq=2.0), product of:
0.20800096 = queryWeight, product of:
8.478011 = idf(docFreq=24, maxDocs=44218)
0.02453417 = queryNorm
0.56201804 = fieldWeight in 562, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
8.478011 = idf(docFreq=24, maxDocs=44218)
0.046875 = fieldNorm(doc=562)
0.11690029 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
0.11690029 = score(doc=562,freq=2.0), product of:
0.20800096 = queryWeight, product of:
8.478011 = idf(docFreq=24, maxDocs=44218)
0.02453417 = queryNorm
0.56201804 = fieldWeight in 562, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
8.478011 = idf(docFreq=24, maxDocs=44218)
0.046875 = fieldNorm(doc=562)
0.11690029 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
0.11690029 = score(doc=562,freq=2.0), product of:
0.20800096 = queryWeight, product of:
8.478011 = idf(docFreq=24, maxDocs=44218)
0.02453417 = queryNorm
0.56201804 = fieldWeight in 562, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
8.478011 = idf(docFreq=24, maxDocs=44218)
0.046875 = fieldNorm(doc=562)
0.009972124 = product of:
0.019944249 = sum of:
0.019944249 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
0.019944249 = score(doc=562,freq=2.0), product of:
0.085914485 = queryWeight, product of:
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.02453417 = queryNorm
0.23214069 = fieldWeight in 562, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.5018296 = idf(docFreq=3622, maxDocs=44218)
0.046875 = fieldNorm(doc=562)
0.5 = coord(1/2)
0.5 = coord(7/14)
- Content
- Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
- Date
- 8. 1.2013 10:22:32
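The score explanation above can be checked by hand. The following is a minimal sketch of the arithmetic, assuming the usual ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and assuming the nesting implied by the coord lines, since the indentation of the explanation is not preserved in the listing. The function names are illustrative and are not part of any Lucene API.

```python
import math

QUERY_NORM = 0.02453417   # queryNorm as printed in the explanation
FIELD_NORM = 0.046875     # fieldNorm(doc=562) as printed
MAX_DOCS = 44218

def classic_idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq):
    # score = queryWeight * fieldWeight
    idf = classic_idf(doc_freq)
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

score_3a = term_score(2.0, 24)     # weight(_text_:3a in 562)
score_2f = term_score(2.0, 24)     # weight(_text_:2f in 562), listed five times
score_22 = term_score(2.0, 3622)   # weight(_text_:22 in 562)

# Assumed nesting: the 3a clause is scaled by coord(1/5), the 22 clause by
# coord(1/2), and the overall sum by coord(7/14).
inner_sum = (1 / 5) * score_3a + 5 * score_2f + (1 / 2) * score_22
doc_score = (7 / 14) * inner_sum

print(round(score_2f, 6), round(inner_sum, 6), round(doc_score, 6))
```

Small differences in the last digits are expected, because the listing reports single-precision values while this sketch uses double precision.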
Ko, Y.: A new term-weighting scheme for text classification using the odds of positive and negative class probabilities (2015)
0.01
0.0062771165 = product of:
0.043939814 = sum of:
0.014176315 = weight(_text_:information in 2339) [ClassicSimilarity], result of:
0.014176315 = score(doc=2339,freq=16.0), product of:
0.04306919 = queryWeight, product of:
1.7554779 = idf(docFreq=20772, maxDocs=44218)
0.02453417 = queryNorm
0.3291521 = fieldWeight in 2339, product of:
4.0 = tf(freq=16.0), with freq of:
16.0 = termFreq=16.0
1.7554779 = idf(docFreq=20772, maxDocs=44218)
0.046875 = fieldNorm(doc=2339)
0.029763501 = weight(_text_:retrieval in 2339) [ClassicSimilarity], result of:
0.029763501 = score(doc=2339,freq=8.0), product of:
0.07421378 = queryWeight, product of:
3.024915 = idf(docFreq=5836, maxDocs=44218)
0.02453417 = queryNorm
0.40105087 = fieldWeight in 2339, product of:
2.828427 = tf(freq=8.0), with freq of:
8.0 = termFreq=8.0
3.024915 = idf(docFreq=5836, maxDocs=44218)
0.046875 = fieldNorm(doc=2339)
0.14285715 = coord(2/14)
- Abstract
- Text classification (TC) is a core technique for text mining and information retrieval. It has been applied in many research and industrial areas. Term-weighting schemes assign an appropriate weight to each term to obtain a high TC performance. Although term weighting is one of the important modules for TC, and TC has different peculiarities from those in information retrieval, many term-weighting schemes used in information retrieval, such as term frequency-inverse document frequency (tf-idf), have been used in TC in the same manner. The peculiarity of TC that differs most from information retrieval is the existence of class information. This article proposes a new term-weighting scheme that uses class information in the form of positive and negative class distributions. The proposed scheme, log tf-TRR, consistently performs better than other schemes that use class information as well as traditional schemes such as tf-idf.
- Source
- Journal of the Association for Information Science and Technology. 66(2015) no.12, pp. 2553-2565
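For orientation, the traditional tf-idf baseline named in the abstract can be sketched as follows. This is one common textbook variant (raw term count times ln(N / df)) and is only an assumption about the baseline; it is not the log tf-TRR scheme proposed in the article, whose formula is not given in this record. The function name and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf_weights(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Weight = raw count * ln(N / df),
    a common tf-idf variant that uses no class information.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weighted = []
    for tokens in docs:
        counts = Counter(tokens)
        weighted.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted

# Hypothetical toy corpus, for illustration only
docs = [
    ["text", "classification", "term"],
    ["term", "weighting", "term"],
    ["information", "retrieval"],
]
weights = tf_idf_weights(docs)
```

A term that occurs in every document gets weight 0 under this variant, which illustrates why a scheme without class information cannot tell whether a frequent term is concentrated in one class.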