Search (15 results, page 1 of 1)

  • theme_ss:"Automatisches Klassifizieren"
  • year_i:[2010 TO 2020}
  1. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.02
    0.022825247 = product of:
      0.045650493 = sum of:
        0.045650493 = sum of:
          0.010381345 = weight(_text_:d in 2158) [ClassicSimilarity], result of:
            0.010381345 = score(doc=2158,freq=2.0), product of:
              0.08242767 = queryWeight, product of:
                1.899872 = idf(docFreq=17979, maxDocs=44218)
                0.04338591 = queryNorm
              0.1259449 = fieldWeight in 2158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.899872 = idf(docFreq=17979, maxDocs=44218)
                0.046875 = fieldNorm(doc=2158)
          0.03526915 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
            0.03526915 = score(doc=2158,freq=2.0), product of:
              0.15193006 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04338591 = queryNorm
              0.23214069 = fieldWeight in 2158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2158)
      0.5 = coord(1/2)
    
    Date
    4.8.2015 19:22:04
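    The indented breakdown above is Lucene's ClassicSimilarity "explain" output: per term, tf = sqrt(freq), queryWeight = idf × queryNorm, and fieldWeight = tf × idf × fieldNorm; the per-term score is queryWeight × fieldWeight, and the trailing coord(1/2) halves the sum because only one of the query's two top-level clauses matched. A minimal sketch reproducing the numbers shown for result 1 (doc 2158):

    ```python
    import math

    def term_score(freq, idf, query_norm, field_norm):
        """Per-term score in Lucene's ClassicSimilarity (TF-IDF):
        queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm,
        with tf = sqrt(freq); the term score is their product."""
        tf = math.sqrt(freq)
        return (idf * query_norm) * (tf * idf * field_norm)

    # Constants taken from the explain tree of result 1 (doc 2158):
    QUERY_NORM = 0.04338591
    score_d = term_score(freq=2.0, idf=1.899872, query_norm=QUERY_NORM,
                         field_norm=0.046875)   # ≈ 0.010381345
    score_22 = term_score(freq=2.0, idf=3.5018296, query_norm=QUERY_NORM,
                          field_norm=0.046875)  # ≈ 0.03526915

    # coord(1/2): one of two top-level query clauses matched, so the sum is halved.
    total = 0.5 * (score_d + score_22)  # ≈ 0.022825247, the displayed score
    ```

    The same arithmetic reproduces every other explain tree on this page; only freq, idf, and fieldNorm vary per document and field.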
  2. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.01
    0.01469548 = product of:
      0.02939096 = sum of:
        0.02939096 = product of:
          0.05878192 = sum of:
            0.05878192 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.05878192 = score(doc=2748,freq=2.0), product of:
                0.15193006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04338591 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1.2.2016 18:25:22
  3. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.01
    0.008817287 = product of:
      0.017634574 = sum of:
        0.017634574 = product of:
          0.03526915 = sum of:
            0.03526915 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
              0.03526915 = score(doc=690,freq=2.0), product of:
                0.15193006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04338591 = queryNorm
                0.23214069 = fieldWeight in 690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    23.3.2013 13:22:36
  4. Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.01
    0.00734774 = product of:
      0.01469548 = sum of:
        0.01469548 = product of:
          0.02939096 = sum of:
            0.02939096 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
              0.02939096 = score(doc=1107,freq=2.0), product of:
                0.15193006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04338591 = queryNorm
                0.19345059 = fieldWeight in 1107, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1107)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    28.10.2013 19:22:57
  5. Liu, R.-L.: Context-based term frequency assessment for text classification (2010) 0.00
    0.0044952547 = product of:
      0.008990509 = sum of:
        0.008990509 = product of:
          0.017981019 = sum of:
            0.017981019 = weight(_text_:d in 3331) [ClassicSimilarity], result of:
              0.017981019 = score(doc=3331,freq=6.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.21814299 = fieldWeight in 3331, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3331)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Automatic text classification (TC) is essential for the management of information. To classify a document d properly, it is essential to identify the semantics of each term t in d, and those semantics depend heavily on the context (neighboring terms) of t in d. We therefore present CTFA (Context-based Term Frequency Assessment), a technique that improves text classifiers by considering term contexts in test documents. The results of context recognition are used to assess term frequencies, so CTFA can work with any text classifier that bases its TC decisions on term frequencies, without modifying the classifier. Moreover, CTFA is efficient and requires neither large amounts of memory nor domain-specific knowledge. Empirical results show that CTFA successfully enhances the performance of several kinds of text classifiers on different experimental data.
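    The abstract describes adjusting raw term frequencies by context before handing them to a frequency-based classifier. A hypothetical illustration of that general idea (not the paper's actual CTFA algorithm; the window size and the overlap heuristic here are invented for illustration):

    ```python
    def context_weighted_tf(tokens, target, expected_context, window=2):
        """Count occurrences of `target`, weighting each occurrence by how
        many of its neighbours (within `window` positions) appear in
        `expected_context`. An invented heuristic standing in for the
        paper's context-recognition step."""
        weighted = 0.0
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            if not neighbours:
                weighted += 1.0  # no context available: count it fully
                continue
            overlap = sum(1 for n in neighbours if n in expected_context)
            weighted += overlap / len(neighbours)
        return weighted

    # "bank" near financial context counts; "bank" near river context does not:
    doc = ["deposit", "money", "bank", "account", "walk", "river", "bank", "shore"]
    tf = context_weighted_tf(doc, "bank", {"deposit", "money", "account", "loan"})
    # raw frequency is 2.0, but the context-weighted frequency is lower
    ```

    The weighted frequency can then be fed to any classifier that consumes term frequencies, which matches the abstract's claim that the underlying classifier needs no modification.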
  6. Golub, K.; Soergel, D.; Buchanan, G.; Tudhope, D.; Lykke, M.; Hiom, D.: A framework for evaluating automatic indexing or classification in the context of retrieval (2016) 0.00
    0.0037460455 = product of:
      0.007492091 = sum of:
        0.007492091 = product of:
          0.014984182 = sum of:
            0.014984182 = weight(_text_:d in 3311) [ClassicSimilarity], result of:
              0.014984182 = score(doc=3311,freq=6.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.18178582 = fieldWeight in 3311, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3311)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  7. Jersek, T.: Automatische DDC-Klassifizierung mit Lingo : Vorgehensweise und Ergebnisse (2012) 0.00
    0.0034604485 = product of:
      0.006920897 = sum of:
        0.006920897 = product of:
          0.013841794 = sum of:
            0.013841794 = weight(_text_:d in 122) [ClassicSimilarity], result of:
              0.013841794 = score(doc=122,freq=2.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.16792654 = fieldWeight in 122, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.0625 = fieldNorm(doc=122)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    d
  8. Golub, K.; Hansson, J.; Soergel, D.; Tudhope, D.: Managing classification in libraries : a methodological outline for evaluating automatic subject indexing and classification in Swedish library catalogues (2015) 0.00
    0.0030586333 = product of:
      0.0061172666 = sum of:
        0.0061172666 = product of:
          0.012234533 = sum of:
            0.012234533 = weight(_text_:d in 2300) [ClassicSimilarity], result of:
              0.012234533 = score(doc=2300,freq=4.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.1484275 = fieldWeight in 2300, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2300)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  9. Sommer, M.: Automatische Generierung von DDC-Notationen für Hochschulveröffentlichungen (2012) 0.00
    0.0025953362 = product of:
      0.0051906724 = sum of:
        0.0051906724 = product of:
          0.010381345 = sum of:
            0.010381345 = weight(_text_:d in 587) [ClassicSimilarity], result of:
              0.010381345 = score(doc=587,freq=2.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.1259449 = fieldWeight in 587, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.046875 = fieldNorm(doc=587)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    d
  10. Kasprzik, A.: Automatisierte und semiautomatisierte Klassifizierung : eine Analyse aktueller Projekte (2014) 0.00
    0.0025953362 = product of:
      0.0051906724 = sum of:
        0.0051906724 = product of:
          0.010381345 = sum of:
            0.010381345 = weight(_text_:d in 2470) [ClassicSimilarity], result of:
              0.010381345 = score(doc=2470,freq=2.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.1259449 = fieldWeight in 2470, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2470)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    d
  11. Groß, T.; Faden, M.: Automatische Indexierung elektronischer Dokumente an der Deutschen Zentralbibliothek für Wirtschaftswissenschaften : Bericht über die Jahrestagung der Internationalen Buchwissenschaftlichen Gesellschaft (2010) 0.00
    0.0024469066 = product of:
      0.0048938133 = sum of:
        0.0048938133 = product of:
          0.009787627 = sum of:
            0.009787627 = weight(_text_:d in 4051) [ClassicSimilarity], result of:
              0.009787627 = score(doc=4051,freq=4.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.118742 = fieldWeight in 4051, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    d
    Location
    D
  12. HaCohen-Kerner, Y.; Beck, H.; Yehudai, E.; Rosenstein, M.; Mughaz, D.: Cuisine : classification using stylistic feature sets and/or name-based feature sets (2010) 0.00
    0.0021627804 = product of:
      0.0043255608 = sum of:
        0.0043255608 = product of:
          0.0086511215 = sum of:
            0.0086511215 = weight(_text_:d in 3706) [ClassicSimilarity], result of:
              0.0086511215 = score(doc=3706,freq=2.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.104954086 = fieldWeight in 3706, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3706)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  13. Qu, B.; Cong, G.; Li, C.; Sun, A.; Chen, H.: An evaluation of classification models for question topic categorization (2012) 0.00
    0.0021627804 = product of:
      0.0043255608 = sum of:
        0.0043255608 = product of:
          0.0086511215 = sum of:
            0.0086511215 = weight(_text_:d in 237) [ClassicSimilarity], result of:
              0.0086511215 = score(doc=237,freq=2.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.104954086 = fieldWeight in 237, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=237)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We study the problem of question topic classification using a very large real-world Community Question Answering (CQA) dataset from Yahoo! Answers. The dataset comprises 3.9 million questions organized into more than 1,000 categories in a hierarchy. To the best of our knowledge, this is the first systematic evaluation of the performance of different classification methods on question topic classification and, more generally, on short texts. Specifically, we empirically evaluate the following in classifying questions into CQA categories: (a) the usefulness of n-gram features and bag-of-word features; (b) the performance of three standard classification algorithms (naive Bayes, maximum entropy, and support vector machines); (c) the performance of the state-of-the-art hierarchical classification algorithms; (d) the effect of training data size on performance; and (e) the effectiveness of the different components of CQA data, including subject, content, asker, and the best answer. The experimental results show what aspects are important for question topic classification in terms of both effectiveness and efficiency. We believe that the experimental findings from this study will be useful in real-world classification problems.
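    Of the standard algorithms the abstract lists under (b), multinomial naive Bayes over bag-of-words features (a) is the simplest to sketch. A self-contained illustration with add-one (Laplace) smoothing and invented toy data standing in for CQA question categories (not the paper's implementation or dataset):

    ```python
    import math
    from collections import Counter, defaultdict

    def train_nb(docs):
        """Train a multinomial naive Bayes classifier with add-one smoothing.
        docs: list of (token_list, label) pairs. Returns the model parts."""
        label_counts = Counter(label for _, label in docs)
        term_counts = defaultdict(Counter)
        vocab = set()
        for tokens, label in docs:
            term_counts[label].update(tokens)
            vocab.update(tokens)
        log_prior = {l: math.log(n / len(docs)) for l, n in label_counts.items()}
        log_lik = {}
        for l in label_counts:
            denom = sum(term_counts[l].values()) + len(vocab)  # smoothing
            log_lik[l] = {t: math.log((term_counts[l][t] + 1) / denom)
                          for t in vocab}
        return log_prior, log_lik, vocab

    def classify(tokens, log_prior, log_lik, vocab):
        """Return the label maximizing log P(label) + sum log P(token|label);
        out-of-vocabulary tokens are simply skipped."""
        def score(label):
            return log_prior[label] + sum(
                log_lik[label][t] for t in tokens if t in vocab)
        return max(log_prior, key=score)

    # Invented toy questions standing in for CQA categories:
    docs = [
        (["install", "python", "error"], "computers"),
        (["python", "import", "error"], "computers"),
        (["best", "pizza", "recipe"], "food"),
        (["pizza", "dough", "oven"], "food"),
    ]
    model = train_nb(docs)
    label = classify(["python", "error", "help"], *model)  # -> "computers"
    ```

    The bag-of-words representation discards word order; the abstract's point (a) is that n-gram features, which retain some local order, can be substituted for the token lists here without changing the classifier.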
  14. Alberts, I.; Forest, D.: Email pragmatics and automatic classification : a study in the organizational context (2012) 0.00
    0.0021627804 = product of:
      0.0043255608 = sum of:
        0.0043255608 = product of:
          0.0086511215 = sum of:
            0.0086511215 = weight(_text_:d in 238) [ClassicSimilarity], result of:
              0.0086511215 = score(doc=238,freq=2.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.104954086 = fieldWeight in 238, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=238)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  15. Vilares, D.; Alonso, M.A.; Gómez-Rodríguez, C.: On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages (2015) 0.00
    0.0021627804 = product of:
      0.0043255608 = sum of:
        0.0043255608 = product of:
          0.0086511215 = sum of:
            0.0086511215 = weight(_text_:d in 2161) [ClassicSimilarity], result of:
              0.0086511215 = score(doc=2161,freq=2.0), product of:
                0.08242767 = queryWeight, product of:
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.04338591 = queryNorm
                0.104954086 = fieldWeight in 2161, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.899872 = idf(docFreq=17979, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2161)
          0.5 = coord(1/2)
      0.5 = coord(1/2)