Search (22 results, page 1 of 2)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.18

0.17800082 = product of:
  0.35600165 = sum of:
    0.3041166 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
      0.3041166 = score(doc=562,freq=2.0), product of:
        0.5411154 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.06382575 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.051885046 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
      0.051885046 = score(doc=562,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.23214069 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
  0.5 = coord(2/4)
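
The explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), score = coord × Σ queryWeight × fieldWeight), which reproduce the figures shown for doc 562. The function names are illustrative and not part of any library.

```python
import math

def idf(doc_freq, max_docs):
    # Classic inverse document frequency; matches 8.478011 for docFreq=24
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    tf = math.sqrt(freq)                       # 1.4142135 for freq=2.0
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 562
MAX_DOCS = 44218
QUERY_NORM = 0.06382575
FIELD_NORM = 0.046875

matched = [
    term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM),    # term "3a"
    term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM),  # term "22"
]
coord = len(matched) / 4       # 2 of 4 query clauses matched
total = coord * sum(matched)   # close to the listed 0.17800082
print(total)
```

Small differences in the last digits are expected, since the listed values appear to be single-precision floats while Python uses double precision.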

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32

Pech, G.; Delgado, C.; Sorella, S.P.: Classifying papers into subfields using Abstracts, Titles, Keywords and KeyWords Plus through pattern detection and optimization procedures : an application in Physics (2022) 0.03
```
0.030565115 = product of:
  0.12226046 = sum of:
    0.12226046 = weight(_text_:fields in 744) [ClassicSimilarity], result of:
      0.12226046 = score(doc=744,freq=4.0), product of:
        0.31604284 = queryWeight, product of:
          4.951651 = idf(docFreq=849, maxDocs=44218)
          0.06382575 = queryNorm
        0.38684773 = fieldWeight in 744, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.951651 = idf(docFreq=849, maxDocs=44218)
          0.0390625 = fieldNorm(doc=744)
  0.25 = coord(1/4)
```
Abstract

Classifying papers according to the fields of knowledge is critical to clearly understand the dynamics of scientific (sub)fields, their leading questions, and trends. Most studies rely on journal categories defined by popular databases such as WoS or Scopus, but some experts find that those categories may not correctly map the existing subfields nor identify the subfield of a specific article. This study addresses the classification problem using data from each paper (Abstract, Title, Keywords, and the KeyWords Plus) and the help of experts to identify the existing subfields and journals exclusive of each subfield. These "exclusive journals" are critical to obtain, through a pattern detection procedure that uses machine learning techniques (from software NVivo), a list of the frequent terms that are specific to each subfield. With that list of terms and with the help of optimization procedures, we can identify to which subfield each paper most likely belongs. This study can contribute to support scientific policy-makers, funding, and research institutions (via more accurate academic performance evaluations), to support editors in their tasks to redefine the scopes of journals, and to support popular databases in their processes of refining categories.
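
The idea of matching a paper against subfield-specific term lists can be illustrated with a toy example. This is only a sketch under stated assumptions, not the authors' procedure: the term lists are hypothetical, the NVivo pattern detection and the optimization step are not modeled, and a paper is simply assigned to the subfield whose terms occur most often in its text.

```python
import re
from collections import Counter

# Hypothetical term lists; the study derives its lists from "exclusive journals"
SUBFIELD_TERMS = {
    "condensed matter": {"lattice", "superconductivity", "phonon"},
    "particle physics": {"quark", "higgs", "collider"},
}

def tokenize(text):
    # Lowercase alphabetic tokens only
    return re.findall(r"[a-z]+", text.lower())

def classify(text):
    counts = Counter(tokenize(text))
    # Score each subfield by the total frequency of its terms in the text
    scores = {
        subfield: sum(counts[term] for term in terms)
        for subfield, terms in SUBFIELD_TERMS.items()
    }
    return max(scores, key=scores.get)

print(classify("Higgs boson production at the collider"))
```

In practice the text would be the concatenated Abstract, Title, Keywords, and KeyWords Plus of each paper, and ties or zero scores would need explicit handling.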

Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.03

0.025942523 = product of:
  0.10377009 = sum of:
    0.10377009 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
      0.10377009 = score(doc=1046,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.46428138 = fieldWeight in 1046, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=1046)
  0.25 = coord(1/4)
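
Every hit that matches only the term "22" shares the same queryWeight, tf, idf, and coord, so fieldNorm alone decides its rank. A short sketch, using values copied from the explain output and an illustrative function name, makes this visible:

```python
# Constants shared by every hit that matches only the term "22"
QUERY_WEIGHT = 0.2235069   # idf * queryNorm
TF = 2.0 ** 0.5            # sqrt(termFreq) with termFreq = 2.0
IDF = 3.5018296
COORD = 0.25               # 1 of 4 query clauses matched

def score_for(field_norm):
    # fieldWeight = tf * idf * fieldNorm; final score = coord * queryWeight * fieldWeight
    return COORD * QUERY_WEIGHT * TF * IDF * field_norm

high = score_for(0.09375)   # fieldNorm of doc 1046
low = score_for(0.046875)   # fieldNorm of doc 2760
print(high, low, high / low)
```

Doubling fieldNorm doubles the score, which is why doc 1046 (0.025942523) scores twice as high as doc 2760 (0.0129712615).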

Date: 5. 5.2003 14:17:22

Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.03
```
0.02593536 = product of:
  0.10374144 = sum of:
    0.10374144 = weight(_text_:fields in 3464) [ClassicSimilarity], result of:
      0.10374144 = score(doc=3464,freq=2.0), product of:
        0.31604284 = queryWeight, product of:
          4.951651 = idf(docFreq=849, maxDocs=44218)
          0.06382575 = queryNorm
        0.32825118 = fieldWeight in 3464, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.951651 = idf(docFreq=849, maxDocs=44218)
          0.046875 = fieldNorm(doc=3464)
  0.25 = coord(1/4)
```
Abstract

We propose a new hybrid clustering framework to incorporate text mining with bibliometrics in journal set analysis. The framework integrates two different approaches: clustering ensemble and kernel-fusion clustering. To improve the flexibility and the efficiency of processing large-scale data, we propose an information-based weighting scheme to leverage the effect of multiple data sources in hybrid clustering. Three different algorithms are extended by the proposed weighting scheme and they are employed on a large journal set retrieved from the Web of Science (WoS) database. The clustering performance of the proposed algorithms is systematically evaluated using multiple evaluation methods, and they were cross-compared with alternative methods. Experimental results demonstrate that the proposed weighted hybrid clustering strategy is superior to other methods in clustering performance and efficiency. The proposed approach also provides a more refined structural mapping of journal sets, which is useful for monitoring and detecting new trends in different scientific fields.

Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.02

0.02161877 = product of:
  0.08647508 = sum of:
    0.08647508 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
      0.08647508 = score(doc=611,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.38690117 = fieldWeight in 611, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.078125 = fieldNorm(doc=611)
  0.25 = coord(1/4)

Date: 22. 8.2009 12:54:24

HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.02

0.02161877 = product of:
  0.08647508 = sum of:
    0.08647508 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
      0.08647508 = score(doc=2748,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.38690117 = fieldWeight in 2748, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.078125 = fieldNorm(doc=2748)
  0.25 = coord(1/4)

Date: 1. 2.2016 18:25:22

Alberts, I.; Forest, D.: Email pragmatics and automatic classification : a study in the organizational context (2012) 0.02
```
0.0216128 = product of:
  0.0864512 = sum of:
    0.0864512 = weight(_text_:fields in 238) [ClassicSimilarity], result of:
      0.0864512 = score(doc=238,freq=2.0), product of:
        0.31604284 = queryWeight, product of:
          4.951651 = idf(docFreq=849, maxDocs=44218)
          0.06382575 = queryNorm
        0.27354267 = fieldWeight in 238, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.951651 = idf(docFreq=849, maxDocs=44218)
          0.0390625 = fieldNorm(doc=238)
  0.25 = coord(1/4)
```
Abstract

This paper presents a two-phased research project aiming to improve email triage for public administration managers. The first phase developed a typology of email classification patterns through a qualitative study involving 34 participants. Inspired by the fields of pragmatics and speech act theory, this typology comprising four top level categories and 13 subcategories represents the typical email triage behaviors of managers in an organizational context. The second study phase was conducted on a corpus of 1,703 messages using email samples of two managers. Using the k-NN (k-nearest neighbor) algorithm, statistical treatments automatically classified the email according to lexical and nonlexical features representative of managers' triage patterns. The automatic classification of email according to the lexicon of the messages was found to be substantially more efficient when k = 2 and n = 2,000. For four categories, the average recall rate was 94.32%, the average precision rate was 94.50%, and the accuracy rate was 94.54%. For 13 categories, the average recall rate was 91.09%, the average precision rate was 84.18%, and the accuracy rate was 88.70%. It appears that a message's nonlexical features are also deeply influenced by email pragmatics. Features related to the recipient and the sender were the most relevant for characterizing email.
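
For readers unfamiliar with k-NN classification on lexical features, here is a minimal sketch with k = 2. It is an illustration under stated assumptions, not the study's implementation: the training messages and category labels are invented, features are plain word counts, similarity is cosine, and ties between the two neighbors are resolved in favor of the nearest one.

```python
import math
from collections import Counter

def vectorize(text):
    # Bag-of-words vector: word -> count
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_classify(text, training, k=2):
    query = vectorize(text)
    # Rank training messages by similarity to the query, most similar first
    ranked = sorted(training, key=lambda ex: cosine(query, vectorize(ex[0])), reverse=True)
    top = ranked[:k]
    votes = Counter(label for _, label in top)
    best = max(votes.values())
    # Majority vote; on a tie, the label of the nearest neighbor wins
    for _, label in top:
        if votes[label] == best:
            return label

# Hypothetical training data with invented category labels
TRAINING = [
    ("please approve the budget request", "request"),
    ("request for approval of travel budget", "request"),
    ("meeting minutes attached for your information", "information"),
    ("for your information the report is attached", "information"),
]

print(knn_classify("please approve this travel request", TRAINING))
```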

Mu, T.; Goulermas, J.Y.; Korkontzelos, I.; Ananiadou, S.: Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities (2016) 0.02
```
0.0216128 = product of:
  0.0864512 = sum of:
    0.0864512 = weight(_text_:fields in 2496) [ClassicSimilarity], result of:
      0.0864512 = score(doc=2496,freq=2.0), product of:
        0.31604284 = queryWeight, product of:
          4.951651 = idf(docFreq=849, maxDocs=44218)
          0.06382575 = queryNorm
        0.27354267 = fieldWeight in 2496, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.951651 = idf(docFreq=849, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2496)
  0.25 = coord(1/4)
```
Abstract

Descriptive document clustering aims at discovering clusters of semantically interrelated documents together with meaningful labels to summarize the content of each document cluster. In this work, we propose a novel descriptive clustering framework, referred to as CEDL. It relies on the formulation and generation of 2 types of heterogeneous objects, which correspond to documents and candidate phrases, using multilevel similarity information. CEDL is composed of 5 main processing stages. First, it simultaneously maps the documents and candidate phrases into a common co-embedded space that preserves higher-order, neighbor-based proximities between the combined sets of documents and phrases. Then, it discovers an approximate cluster structure of documents in the common space. The third stage extracts promising topic phrases by constructing a discriminant model where documents along with their cluster memberships are used as training instances. Subsequently, the final cluster labels are selected from the topic phrases using a ranking scheme using multiple scores based on the extracted co-embedding information and the discriminant output. The final stage polishes the initial clusters to reduce noise and accommodate the multitopic nature of documents. The effectiveness and competitiveness of CEDL is demonstrated qualitatively and quantitatively with experiments using document databases from different application fields.

Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.02

0.015133139 = product of:
  0.060532555 = sum of:
    0.060532555 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
      0.060532555 = score(doc=141,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.2708308 = fieldWeight in 141, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0546875 = fieldNorm(doc=141)
  0.25 = coord(1/4)

Pages: 1-22

Dubin, D.: Dimensions and discriminability (1998) 0.02

0.015133139 = product of:
  0.060532555 = sum of:
    0.060532555 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
      0.060532555 = score(doc=2338,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.2708308 = fieldWeight in 2338, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2338)
  0.25 = coord(1/4)

Date: 22. 9.1997 19:16:05

Automatic classification research at OCLC (2002) 0.02

0.015133139 = product of:
  0.060532555 = sum of:
    0.060532555 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
      0.060532555 = score(doc=1563,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.2708308 = fieldWeight in 1563, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1563)
  0.25 = coord(1/4)

Date: 5. 5.2003 9:22:09

Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.02

0.015133139 = product of:
  0.060532555 = sum of:
    0.060532555 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
      0.060532555 = score(doc=1673,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.2708308 = fieldWeight in 1673, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1673)
  0.25 = coord(1/4)

Date: 1. 8.1996 22:08:06

Yoon, Y.; Lee, C.; Lee, G.G.: ¬An effective procedure for constructing a hierarchical text classification system (2006) 0.02

0.015133139 = product of:
  0.060532555 = sum of:
    0.060532555 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
      0.060532555 = score(doc=5273,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.2708308 = fieldWeight in 5273, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5273)
  0.25 = coord(1/4)

Date: 22. 7.2006 16:24:52

Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.02

0.015133139 = product of:
  0.060532555 = sum of:
    0.060532555 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
      0.060532555 = score(doc=2560,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.2708308 = fieldWeight in 2560, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2560)
  0.25 = coord(1/4)

Date: 22. 9.2008 18:31:54

Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.01

0.0129712615 = product of:
  0.051885046 = sum of:
    0.051885046 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
      0.051885046 = score(doc=2760,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.23214069 = fieldWeight in 2760, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.046875 = fieldNorm(doc=2760)
  0.25 = coord(1/4)

Date: 22. 3.2009 19:11:54

Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.01

0.0129712615 = product of:
  0.051885046 = sum of:
    0.051885046 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
      0.051885046 = score(doc=3051,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.23214069 = fieldWeight in 3051, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.046875 = fieldNorm(doc=3051)
  0.25 = coord(1/4)

Date: 22. 8.2009 19:51:28

Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.01

0.0129712615 = product of:
  0.051885046 = sum of:
    0.051885046 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
      0.051885046 = score(doc=690,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.23214069 = fieldWeight in 690, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.046875 = fieldNorm(doc=690)
  0.25 = coord(1/4)

Date: 23. 3.2013 13:22:36

Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.01

0.0129712615 = product of:
  0.051885046 = sum of:
    0.051885046 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
      0.051885046 = score(doc=2158,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.23214069 = fieldWeight in 2158, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.046875 = fieldNorm(doc=2158)
  0.25 = coord(1/4)

Date: 4. 8.2015 19:22:04

Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.01

0.010809385 = product of:
  0.04323754 = sum of:
    0.04323754 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
      0.04323754 = score(doc=2765,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.19345059 = fieldWeight in 2765, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2765)
  0.25 = coord(1/4)

Date: 22. 3.2009 19:14:43

Liu, R.-L.: ¬A passage extractor for classification of disease aspect information (2013) 0.01

0.010809385 = product of:
  0.04323754 = sum of:
    0.04323754 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
      0.04323754 = score(doc=1107,freq=2.0), product of:
        0.2235069 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.06382575 = queryNorm
        0.19345059 = fieldWeight in 1107, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1107)
  0.25 = coord(1/4)

Date: 28.10.2013 19:22:57
