Search (43 results, page 1 of 3)

  • × theme_ss:"Automatisches Klassifizieren"
  • × year_i:[2010 TO 2020}
  1. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.04
    0.036428396 = product of:
      0.07285679 = sum of:
        0.07285679 = sum of:
          0.010504999 = weight(_text_:e in 2748) [ClassicSimilarity], result of:
            0.010504999 = score(doc=2748,freq=2.0), product of:
              0.06614887 = queryWeight, product of:
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.04602077 = queryNorm
              0.15880844 = fieldWeight in 2748, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.078125 = fieldNorm(doc=2748)
          0.06235179 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
            0.06235179 = score(doc=2748,freq=2.0), product of:
              0.1611569 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04602077 = queryNorm
              0.38690117 = fieldWeight in 2748, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.078125 = fieldNorm(doc=2748)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
    Language
    e
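The relevance number after each hit is a Lucene ClassicSimilarity (TF-IDF) score, and the indented tree explains it: each matching term contributes score = queryWeight × fieldWeight, where queryWeight = idf × queryNorm and fieldWeight = tf(freq) × idf × fieldNorm with tf(freq) = √freq; coord(1/2) then halves the sum because only one of the two top-level clause groups matched. A minimal sketch reproducing the numbers for hit 1 (doc 2748), with every constant copied from the explain tree above:

```python
import math

# Reproduces the ClassicSimilarity explain arithmetic for hit 1 (doc 2748).
# All constants are copied from the explain output; nothing is invented.

def field_weight(freq, idf, field_norm):
    # fieldWeight = tf(freq) * idf * fieldNorm, with tf(freq) = sqrt(freq)
    return math.sqrt(freq) * idf * field_norm

def term_score(freq, idf, query_norm, field_norm):
    # per-term score = queryWeight * fieldWeight, queryWeight = idf * queryNorm
    return (idf * query_norm) * field_weight(freq, idf, field_norm)

QUERY_NORM = 0.04602077

# weight(_text_:e in 2748)
s_e = term_score(freq=2.0, idf=1.43737, query_norm=QUERY_NORM, field_norm=0.078125)
# weight(_text_:22 in 2748)
s_22 = term_score(freq=2.0, idf=3.5018296, query_norm=QUERY_NORM, field_norm=0.078125)

# coord(1/2) halves the clause sum, giving the displayed document score
total = 0.5 * (s_e + s_22)
print(s_e, s_22, total)
```

The same arithmetic accounts for every explain tree on this page; only freq, idf, and fieldNorm vary per document and term.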
  2. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.02
    0.021857034 = product of:
      0.04371407 = sum of:
        0.04371407 = sum of:
          0.006302999 = weight(_text_:e in 690) [ClassicSimilarity], result of:
            0.006302999 = score(doc=690,freq=2.0), product of:
              0.06614887 = queryWeight, product of:
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.04602077 = queryNorm
              0.09528506 = fieldWeight in 690, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.046875 = fieldNorm(doc=690)
          0.03741107 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
            0.03741107 = score(doc=690,freq=2.0), product of:
              0.1611569 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04602077 = queryNorm
              0.23214069 = fieldWeight in 690, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=690)
      0.5 = coord(1/2)
    
    Date
    23. 3.2013 13:22:36
    Language
    e
  3. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.02
    0.021857034 = product of:
      0.04371407 = sum of:
        0.04371407 = sum of:
          0.006302999 = weight(_text_:e in 2158) [ClassicSimilarity], result of:
            0.006302999 = score(doc=2158,freq=2.0), product of:
              0.06614887 = queryWeight, product of:
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.04602077 = queryNorm
              0.09528506 = fieldWeight in 2158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.046875 = fieldNorm(doc=2158)
          0.03741107 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
            0.03741107 = score(doc=2158,freq=2.0), product of:
              0.1611569 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04602077 = queryNorm
              0.23214069 = fieldWeight in 2158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2158)
      0.5 = coord(1/2)
    
    Date
    4. 8.2015 19:22:04
    Language
    e
  4. Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.02
    0.018214198 = product of:
      0.036428396 = sum of:
        0.036428396 = sum of:
          0.0052524996 = weight(_text_:e in 1107) [ClassicSimilarity], result of:
            0.0052524996 = score(doc=1107,freq=2.0), product of:
              0.06614887 = queryWeight, product of:
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.04602077 = queryNorm
              0.07940422 = fieldWeight in 1107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1107)
          0.031175895 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
            0.031175895 = score(doc=1107,freq=2.0), product of:
              0.1611569 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04602077 = queryNorm
              0.19345059 = fieldWeight in 1107, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1107)
      0.5 = coord(1/2)
    
    Date
    28.10.2013 19:22:57
    Language
    e
  5. Cortez, E.; Herrera, M.R.; Silva, A.S. da; Moura, E.S. de; Neubert, M.: Lightweight methods for large-scale product categorization (2011) 0.00
    0.0031514994 = product of:
      0.006302999 = sum of:
        0.006302999 = product of:
          0.012605998 = sum of:
            0.012605998 = weight(_text_:e in 4758) [ClassicSimilarity], result of:
              0.012605998 = score(doc=4758,freq=8.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.19057012 = fieldWeight in 4758, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4758)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this article, we present a study of classification methods for large-scale categorization of product offers on e-shopping web sites. We examine the performance of previously proposed approaches and deploy a probabilistic approach to model the classification problem. We also study an alternative way of modeling information about the descriptions of product offers and investigate the use of the price and store of product offers as features in the classification process. Our experiments used two collections of over a million product offers, previously categorized by human editors, and taxonomies of hundreds of categories from a real e-shopping web site. In these experiments, our method achieved an improvement of up to 9% in categorization quality over the best baseline we found.
    Language
    e
  6. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.00
    0.002729279 = product of:
      0.005458558 = sum of:
        0.005458558 = product of:
          0.010917116 = sum of:
            0.010917116 = weight(_text_:e in 3015) [ClassicSimilarity], result of:
              0.010917116 = score(doc=3015,freq=6.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.16503859 = fieldWeight in 3015, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3015)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  7. Aphinyanaphongs, Y.; Fu, L.D.; Li, Z.; Peskin, E.R.; Efstathiadis, E.; Aliferis, C.F.; Statnikov, A.: A comprehensive empirical comparison of modern supervised classification and feature selection methods for text categorization (2014) 0.00
    0.0022284468 = product of:
      0.0044568935 = sum of:
        0.0044568935 = product of:
          0.008913787 = sum of:
            0.008913787 = weight(_text_:e in 1496) [ClassicSimilarity], result of:
              0.008913787 = score(doc=1496,freq=4.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.13475344 = fieldWeight in 1496, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1496)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  8. Barbu, E.: What kind of knowledge is in Wikipedia? : unsupervised extraction of properties for similar concepts (2014) 0.00
    0.0022284468 = product of:
      0.0044568935 = sum of:
        0.0044568935 = product of:
          0.008913787 = sum of:
            0.008913787 = weight(_text_:e in 1547) [ClassicSimilarity], result of:
              0.008913787 = score(doc=1547,freq=4.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.13475344 = fieldWeight in 1547, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1547)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  9. HaCohen-Kerner, Y.; Beck, H.; Yehudai, E.; Rosenstein, M.; Mughaz, D.: Cuisine : classification using stylistic feature sets and/or name-based feature sets (2010) 0.00
    0.0018570389 = product of:
      0.0037140779 = sum of:
        0.0037140779 = product of:
          0.0074281557 = sum of:
            0.0074281557 = weight(_text_:e in 3706) [ClassicSimilarity], result of:
              0.0074281557 = score(doc=3706,freq=4.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.112294525 = fieldWeight in 3706, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3706)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  10. Qu, B.; Cong, G.; Li, C.; Sun, A.; Chen, H.: An evaluation of classification models for question topic categorization (2012) 0.00
    0.0018570389 = product of:
      0.0037140779 = sum of:
        0.0037140779 = product of:
          0.0074281557 = sum of:
            0.0074281557 = weight(_text_:e in 237) [ClassicSimilarity], result of:
              0.0074281557 = score(doc=237,freq=4.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.112294525 = fieldWeight in 237, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=237)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We study the problem of question topic classification using a very large real-world Community Question Answering (CQA) dataset from Yahoo! Answers. The dataset comprises 3.9 million questions, organized into more than 1,000 categories in a hierarchy. To the best of our knowledge, this is the first systematic evaluation of the performance of different classification methods on question topic classification, and on short texts more generally. Specifically, we empirically evaluate the following in classifying questions into CQA categories: (a) the usefulness of n-gram features and bag-of-word features; (b) the performance of three standard classification algorithms (naive Bayes, maximum entropy, and support vector machines); (c) the performance of the state-of-the-art hierarchical classification algorithms; (d) the effect of training data size on performance; and (e) the effectiveness of the different components of CQA data, including subject, content, asker, and the best answer. The experimental results show what aspects are important for question topic classification in terms of both effectiveness and efficiency. We believe that the experimental findings from this study will be useful in real-world classification problems.
    Language
    e
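Points (a) and (b) of the abstract above, n-gram versus bag-of-words features and the three standard classifiers, can be sketched with scikit-learn. The toy questions and topic labels below are invented for illustration; the study itself used 3.9 million Yahoo! Answers questions:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data standing in for CQA questions and their categories.
questions = [
    "how do i fix a flat bicycle tire",
    "what is the best bicycle for commuting",
    "how do i bake sourdough bread at home",
    "what temperature should i bake bread at",
]
topics = ["cycling", "cycling", "cooking", "cooking"]

# Maximum entropy is implemented as logistic regression; LinearSVC is the SVM.
for clf in (MultinomialNB(), LogisticRegression(), LinearSVC()):
    # ngram_range=(1, 2) adds bigram features on top of bag-of-words unigrams
    model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), clf)
    model.fit(questions, topics)
    print(type(clf).__name__, model.predict(["how to bake a baguette"])[0])
```

Swapping `ngram_range=(1, 2)` for the default `(1, 1)` reproduces the plain bag-of-words baseline the abstract compares against.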
  11. Liu, R.-L.: Context-based term frequency assessment for text classification (2010) 0.00
    0.0015757497 = product of:
      0.0031514994 = sum of:
        0.0031514994 = product of:
          0.006302999 = sum of:
            0.006302999 = weight(_text_:e in 3331) [ClassicSimilarity], result of:
              0.006302999 = score(doc=3331,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.09528506 = fieldWeight in 3331, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3331)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  12. Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.00
    0.0015757497 = product of:
      0.0031514994 = sum of:
        0.0031514994 = product of:
          0.006302999 = sum of:
            0.006302999 = weight(_text_:e in 3464) [ClassicSimilarity], result of:
              0.006302999 = score(doc=3464,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.09528506 = fieldWeight in 3464, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3464)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  13. Golub, K.: Automated subject classification of textual documents in the context of Web-based hierarchical browsing (2011) 0.00
    0.0015757497 = product of:
      0.0031514994 = sum of:
        0.0031514994 = product of:
          0.006302999 = sum of:
            0.006302999 = weight(_text_:e in 4558) [ClassicSimilarity], result of:
              0.006302999 = score(doc=4558,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.09528506 = fieldWeight in 4558, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4558)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  14. Malo, P.; Sinha, A.; Wallenius, J.; Korhonen, P.: Concept-based document classification using Wikipedia and value function (2011) 0.00
    0.0015757497 = product of:
      0.0031514994 = sum of:
        0.0031514994 = product of:
          0.006302999 = sum of:
            0.006302999 = weight(_text_:e in 4948) [ClassicSimilarity], result of:
              0.006302999 = score(doc=4948,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.09528506 = fieldWeight in 4948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4948)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  15. Schaalje, G.B.; Blades, N.J.; Funai, T.: An open-set size-adjusted Bayesian classifier for authorship attribution (2013) 0.00
    0.0015757497 = product of:
      0.0031514994 = sum of:
        0.0031514994 = product of:
          0.006302999 = sum of:
            0.006302999 = weight(_text_:e in 1041) [ClassicSimilarity], result of:
              0.006302999 = score(doc=1041,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.09528506 = fieldWeight in 1041, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1041)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  16. Sojka, P.; Lee, M.; Rehurek, R.; Hatlapatka, R.; Kucbel, M.; Bouche, T.; Goutorbe, C.; Anghelache, R.; Wojciechowski, K.: Toolset for entity and semantic associations : Final Release (2013) 0.00
    0.0015757497 = product of:
      0.0031514994 = sum of:
        0.0031514994 = product of:
          0.006302999 = sum of:
            0.006302999 = weight(_text_:e in 1057) [ClassicSimilarity], result of:
              0.006302999 = score(doc=1057,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.09528506 = fieldWeight in 1057, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1057)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  17. Desale, S.K.; Kumbhar, R.: Research on automatic classification of documents in library environment : a literature review (2013) 0.00
    0.0015757497 = product of:
      0.0031514994 = sum of:
        0.0031514994 = product of:
          0.006302999 = sum of:
            0.006302999 = weight(_text_:e in 1071) [ClassicSimilarity], result of:
              0.006302999 = score(doc=1071,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.09528506 = fieldWeight in 1071, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1071)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  18. Ko, Y.: A new term-weighting scheme for text classification using the odds of positive and negative class probabilities (2015) 0.00
    0.0015757497 = product of:
      0.0031514994 = sum of:
        0.0031514994 = product of:
          0.006302999 = sum of:
            0.006302999 = weight(_text_:e in 2339) [ClassicSimilarity], result of:
              0.006302999 = score(doc=2339,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.09528506 = fieldWeight in 2339, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2339)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  19. Kishida, K.: High-speed rough clustering for very large document collections (2010) 0.00
    0.0013131249 = product of:
      0.0026262498 = sum of:
        0.0026262498 = product of:
          0.0052524996 = sum of:
            0.0052524996 = weight(_text_:e in 3463) [ClassicSimilarity], result of:
              0.0052524996 = score(doc=3463,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.07940422 = fieldWeight in 3463, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3463)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
  20. Fagni, T.; Sebastiani, F.: Selecting negative examples for hierarchical text classification: An experimental comparison (2010) 0.00
    0.0013131249 = product of:
      0.0026262498 = sum of:
        0.0026262498 = product of:
          0.0052524996 = sum of:
            0.0052524996 = weight(_text_:e in 4101) [ClassicSimilarity], result of:
              0.0052524996 = score(doc=4101,freq=2.0), product of:
                0.06614887 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.04602077 = queryNorm
                0.07940422 = fieldWeight in 4101, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4101)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e

Types

  • a 41
  • el 2
  • s 1