Search (22 results, page 1 of 2)

  • theme_ss:"Automatisches Klassifizieren"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.10192762 = sum of:
      0.08115815 = product of:
        0.24347445 = sum of:
          0.24347445 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24347445 = score(doc=562,freq=2.0), product of:
              0.43321466 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.05109862 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.020769471 = product of:
        0.041538943 = sum of:
          0.041538943 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.041538943 = score(doc=562,freq=2.0), product of:
              0.17893866 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05109862 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf
    Date
    8. 1.2013 10:22:32
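    Note
    The score breakdown shown with each hit is a Lucene ClassicSimilarity (TF-IDF) explain() tree: each leaf multiplies a queryWeight (idf x queryNorm) by a fieldWeight (tf x idf x fieldNorm), and coord() down-weights documents that match only some of the query clauses. As a sanity check, here is a minimal Python sketch that recomputes the leaf weight(_text_:22 in 562) of this entry from the constants in the tree above, using the documented ClassicSimilarity formulas idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq).

      import math

      # Constants copied from the explain tree for weight(_text_:22 in 562).
      max_docs   = 44218       # documents in the index
      doc_freq   = 3622        # documents containing the term "22"
      freq       = 2.0         # occurrences of "22" in document 562
      field_norm = 0.046875    # stored length normalization for this field
      query_norm = 0.05109862  # normalization shared by all query terms

      idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 3.5018296
      tf  = math.sqrt(freq)                            # 1.4142135

      query_weight = idf * query_norm                  # 0.17893866 = queryWeight
      field_weight = tf * idf * field_norm             # 0.23214069 = fieldWeight
      score        = query_weight * field_weight       # 0.041538943

      print(f"idf={idf:.7f}  tf={tf:.7f}")
      print(f"queryWeight={query_weight:.8f}  fieldWeight={field_weight:.8f}")
      print(f"score={score:.9f}")
      # Only one of the two clauses in this sub-query matched, so the tree
      # applies 0.5 = coord(1/2): 0.5 * 0.041538943 = 0.020769471.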
  2. Fang, H.: Classifying research articles in multidisciplinary sciences journals into subject categories (2015) 0.04
    0.039783288 = product of:
      0.079566576 = sum of:
        0.079566576 = product of:
          0.15913315 = sum of:
            0.15913315 = weight(_text_:journals in 2194) [ClassicSimilarity], result of:
              0.15913315 = score(doc=2194,freq=10.0), product of:
                0.25656942 = queryWeight, product of:
                  5.021064 = idf(docFreq=792, maxDocs=44218)
                  0.05109862 = queryNorm
                0.6202343 = fieldWeight in 2194, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  5.021064 = idf(docFreq=792, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2194)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In the Thomson Reuters Web of Science database, the subject categories of a journal are applied to all articles in the journal. However, many articles in multidisciplinary sciences journals may be represented by only a small number of subject categories. To provide more accurate information on the research areas of articles in such journals, we can classify articles in these journals into subject categories as defined by Web of Science, based on their references. For an article in a multidisciplinary sciences journal, the method counts the subject categories of all of the article's references indexed by Web of Science and uses the most numerous subject categories of the references to determine the most appropriate classification of the article. We used articles in an issue of Proceedings of the National Academy of Sciences (PNAS) to validate the method, comparing the obtained results with the categories of the articles as defined by PNAS and with their content. This study shows that the method provides more precise search results for the subject category of interest in bibliometric investigations by recognizing articles in multidisciplinary sciences journals whose work relates to a particular subject category.
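    Note
    The classification step described in this abstract amounts to a majority vote over the Web of Science subject categories of an article's references. The following minimal sketch is a reconstruction from the abstract, not the author's code; the function and variable names are hypothetical.

      from collections import Counter

      def classify_by_references(reference_categories):
          """Return the most frequent WoS subject categories among an
          article's references (majority vote; ties are all returned)."""
          counts = Counter(cat
                           for ref in reference_categories  # one list per reference
                           for cat in ref)                  # a reference may carry several categories
          if not counts:
              return []
          top = max(counts.values())
          return [cat for cat, n in counts.items() if n == top]

      # Example: three of four references are indexed under Neurosciences.
      refs = [["Neurosciences"],
              ["Neurosciences", "Biochemistry & Molecular Biology"],
              ["Cell Biology"],
              ["Neurosciences"]]
      print(classify_by_references(refs))  # ['Neurosciences']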
  3. Pech, G.; Delgado, C.; Sorella, S.P.: Classifying papers into subfields using Abstracts, Titles, Keywords and KeyWords Plus through pattern detection and optimization procedures : an application in Physics (2022) 0.03
    0.030816004 = product of:
      0.061632007 = sum of:
        0.061632007 = product of:
          0.123264015 = sum of:
            0.123264015 = weight(_text_:journals in 744) [ClassicSimilarity], result of:
              0.123264015 = score(doc=744,freq=6.0), product of:
                0.25656942 = queryWeight, product of:
                  5.021064 = idf(docFreq=792, maxDocs=44218)
                  0.05109862 = queryNorm
                0.48043144 = fieldWeight in 744, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.021064 = idf(docFreq=792, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=744)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Classifying papers according to fields of knowledge is critical to a clear understanding of the dynamics of scientific (sub)fields, their leading questions, and their trends. Most studies rely on journal categories defined by popular databases such as WoS or Scopus, but some experts find that those categories may neither correctly map the existing subfields nor identify the subfield of a specific article. This study addresses the classification problem using data from each paper (Abstract, Title, Keywords, and KeyWords Plus) and the help of experts to identify the existing subfields and the journals exclusive to each subfield. These "exclusive journals" are critical for obtaining, through a pattern-detection procedure that uses machine learning techniques (from the software NVivo), a list of the frequent terms that are specific to each subfield. With that list of terms, and with the help of optimization procedures, we can identify the subfield to which each paper most likely belongs. This study can contribute to supporting scientific policy-makers, funding agencies, and research institutions (via more accurate academic performance evaluations), to supporting editors in their task of redefining the scopes of journals, and to supporting popular databases in their processes of refining categories.
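    Note
    The final assignment step reads as scoring each paper's text against the subfield-specific term lists and choosing the best-matching subfield. The sketch below follows that reading; the scoring rule is an assumption for illustration, and the paper's actual optimization procedure is not reproduced.

      def assign_subfield(text, subfield_terms):
          """Count how many of each subfield's specific terms occur in the
          paper's text and return the best-scoring subfield (assumed rule)."""
          tokens = set(text.lower().split())
          scores = {subfield: len(tokens & terms)
                    for subfield, terms in subfield_terms.items()}
          return max(scores, key=scores.get)

      # Hypothetical term lists derived from each subfield's "exclusive journals".
      terms = {
          "lattice field theory": {"lattice", "quark", "gluon", "confinement"},
          "astrophysics": {"supernova", "galaxy", "redshift", "accretion"},
      }
      paper = "We study quark confinement on the lattice via gluon propagators"
      print(assign_subfield(paper, terms))  # lattice field theory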
  4. Desale, S.K.; Kumbhar, R.: Research on automatic classification of documents in library environment : a literature review (2013) 0.02
    0.021349952 = product of:
      0.042699903 = sum of:
        0.042699903 = product of:
          0.08539981 = sum of:
            0.08539981 = weight(_text_:journals in 1071) [ClassicSimilarity], result of:
              0.08539981 = score(doc=1071,freq=2.0), product of:
                0.25656942 = queryWeight, product of:
                  5.021064 = idf(docFreq=792, maxDocs=44218)
                  0.05109862 = queryNorm
                0.33285263 = fieldWeight in 1071, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.021064 = idf(docFreq=792, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1071)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper aims to provide an overview of automatic classification research that focuses on issues related to the automatic classification of documents in a library environment. The review covers literature published in mainstream library and information science studies, including both academic and professional LIS journals and other documents. It reveals that three main types of research are being done on automatic classification: 1) hierarchical classification using different library classification schemes, 2) text categorization and document categorization using different types of classifiers, with or without training documents, and 3) automatic bibliographic classification. Predominantly, this research is directed towards solving the problems of organizing digital documents in an online environment; very little of it is devoted to the problems of arranging physical documents.
  5. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.02
    0.020769471 = product of:
      0.041538943 = sum of:
        0.041538943 = product of:
          0.083077885 = sum of:
            0.083077885 = weight(_text_:22 in 1046) [ClassicSimilarity], result of:
              0.083077885 = score(doc=1046,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.46428138 = fieldWeight in 1046, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1046)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 14:17:22
  6. Humphrey, S.M.; Névéol, A.; Browne, A.; Gobeil, J.; Ruch, P.; Darmoni, S.J.: Comparing a rule-based versus statistical system for automatic categorization of MEDLINE documents according to biomedical specialty (2009) 0.02
    0.017791625 = product of:
      0.03558325 = sum of:
        0.03558325 = product of:
          0.0711665 = sum of:
            0.0711665 = weight(_text_:journals in 3300) [ClassicSimilarity], result of:
              0.0711665 = score(doc=3300,freq=2.0), product of:
                0.25656942 = queryWeight, product of:
                  5.021064 = idf(docFreq=792, maxDocs=44218)
                  0.05109862 = queryNorm
                0.2773772 = fieldWeight in 3300, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.021064 = idf(docFreq=792, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3300)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Automatic document categorization is an important research problem in information science and natural language processing. Many applications, including word sense disambiguation and information retrieval in large collections, can benefit from such categorization. This paper focuses on the automatic categorization of documents from the biomedical literature into broad discipline-based categories. Two different systems are described and contrasted: CISMeF, which uses rules based on human indexing of the documents with the Medical Subject Headings (MeSH) controlled vocabulary to assign metaterms (MTs), and Journal Descriptor Indexing (JDI), based on human categorization of about 4,000 journals and statistical associations between journal descriptors (JDs) and textwords in the documents. We evaluate and compare the performance of these systems against a gold standard of humanly assigned categories for 100 MEDLINE documents, using six measures selected from trec_eval. The results show that performance is comparable for five of the measures, and for one measure JDI is superior. We conclude that these results favor JDI, given the significantly greater intellectual overhead involved in human indexing and in maintaining a rule base for mapping MeSH terms to MTs. We also note a JDI variant that associates JDs with MeSH indexing rather than with textwords; it may be worthwhile to investigate whether this statistical JDI method and the rule-based CISMeF might be combined and evaluated to show that they complement one another.
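    Note
    Of the two systems, JDI is the statistical one: it ranks journal descriptors for a document from precomputed JD-textword association weights. The sketch below shows one plausible ranking step; the averaging rule and the toy weight table are assumptions for illustration, while the actual JDI associations are trained from the human categorization of about 4,000 journals.

      def rank_journal_descriptors(doc_tokens, jd_word_assoc):
          """Score each journal descriptor (JD) by the mean association
          weight of the document's textwords (assumed aggregation rule)."""
          ranked = []
          for jd, weights in jd_word_assoc.items():
              score = sum(weights.get(t, 0.0) for t in doc_tokens) / len(doc_tokens)
              ranked.append((jd, score))
          return sorted(ranked, key=lambda pair: pair[1], reverse=True)

      # Toy association table: weight of each textword given a JD.
      assoc = {
          "Neurology": {"stroke": 0.9, "cortex": 0.8, "tumor": 0.2},
          "Oncology":  {"tumor": 0.9, "chemotherapy": 0.8, "stroke": 0.1},
      }
      doc = ["stroke", "cortex", "patient"]
      print(rank_journal_descriptors(doc, assoc))  # Neurology ranks first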
  7. Reiner, U.: Automatische DDC-Klassifizierung von bibliografischen Titeldatensätzen (2009) 0.02
    0.017307894 = product of:
      0.03461579 = sum of:
        0.03461579 = product of:
          0.06923158 = sum of:
            0.06923158 = weight(_text_:22 in 611) [ClassicSimilarity], result of:
              0.06923158 = score(doc=611,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.38690117 = fieldWeight in 611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=611)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 12:54:24
  8. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.02
    0.017307894 = product of:
      0.03461579 = sum of:
        0.03461579 = product of:
          0.06923158 = sum of:
            0.06923158 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.06923158 = score(doc=2748,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  9. Bock, H.-H.: Datenanalyse zur Strukturierung und Ordnung von Information (1989) 0.01
    0.012115525 = product of:
      0.02423105 = sum of:
        0.02423105 = product of:
          0.0484621 = sum of:
            0.0484621 = weight(_text_:22 in 141) [ClassicSimilarity], result of:
              0.0484621 = score(doc=141,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.2708308 = fieldWeight in 141, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=141)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Pages
    pp. 1-22
  10. Dubin, D.: Dimensions and discriminability (1998) 0.01
    0.012115525 = product of:
      0.02423105 = sum of:
        0.02423105 = product of:
          0.0484621 = sum of:
            0.0484621 = weight(_text_:22 in 2338) [ClassicSimilarity], result of:
              0.0484621 = score(doc=2338,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.2708308 = fieldWeight in 2338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2338)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.1997 19:16:05
  11. Automatic classification research at OCLC (2002) 0.01
    0.012115525 = product of:
      0.02423105 = sum of:
        0.02423105 = product of:
          0.0484621 = sum of:
            0.0484621 = weight(_text_:22 in 1563) [ClassicSimilarity], result of:
              0.0484621 = score(doc=1563,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.2708308 = fieldWeight in 1563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 5.2003 9:22:09
  12. Jenkins, C.: Automatic classification of Web resources using Java and Dewey Decimal Classification (1998) 0.01
    0.012115525 = product of:
      0.02423105 = sum of:
        0.02423105 = product of:
          0.0484621 = sum of:
            0.0484621 = weight(_text_:22 in 1673) [ClassicSimilarity], result of:
              0.0484621 = score(doc=1673,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.2708308 = fieldWeight in 1673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1673)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 8.1996 22:08:06
  13. Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.01
    0.012115525 = product of:
      0.02423105 = sum of:
        0.02423105 = product of:
          0.0484621 = sum of:
            0.0484621 = weight(_text_:22 in 5273) [ClassicSimilarity], result of:
              0.0484621 = score(doc=5273,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.2708308 = fieldWeight in 5273, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5273)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 16:24:52
  14. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.01
    0.012115525 = product of:
      0.02423105 = sum of:
        0.02423105 = product of:
          0.0484621 = sum of:
            0.0484621 = weight(_text_:22 in 2560) [ClassicSimilarity], result of:
              0.0484621 = score(doc=2560,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.2708308 = fieldWeight in 2560, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2560)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 9.2008 18:31:54
  15. Liu, R.-L.: Context recognition for hierarchical text classification (2009) 0.01
    0.010384736 = product of:
      0.020769471 = sum of:
        0.020769471 = product of:
          0.041538943 = sum of:
            0.041538943 = weight(_text_:22 in 2760) [ClassicSimilarity], result of:
              0.041538943 = score(doc=2760,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.23214069 = fieldWeight in 2760, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2760)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 19:11:54
  16. Pfeffer, M.: Automatische Vergabe von RVK-Notationen mittels fallbasiertem Schließen (2009) 0.01
    0.010384736 = product of:
      0.020769471 = sum of:
        0.020769471 = product of:
          0.041538943 = sum of:
            0.041538943 = weight(_text_:22 in 3051) [ClassicSimilarity], result of:
              0.041538943 = score(doc=3051,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.23214069 = fieldWeight in 3051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 8.2009 19:51:28
  17. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.01
    0.010384736 = product of:
      0.020769471 = sum of:
        0.020769471 = product of:
          0.041538943 = sum of:
            0.041538943 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
              0.041538943 = score(doc=690,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.23214069 = fieldWeight in 690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    23. 3.2013 13:22:36
  18. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.01
    0.010384736 = product of:
      0.020769471 = sum of:
        0.020769471 = product of:
          0.041538943 = sum of:
            0.041538943 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
              0.041538943 = score(doc=2158,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.23214069 = fieldWeight in 2158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2158)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    4. 8.2015 19:22:04
  19. Mengle, S.; Goharian, N.: Passage detection using text classification (2009) 0.01
    0.008653947 = product of:
      0.017307894 = sum of:
        0.017307894 = product of:
          0.03461579 = sum of:
            0.03461579 = weight(_text_:22 in 2765) [ClassicSimilarity], result of:
              0.03461579 = score(doc=2765,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.19345059 = fieldWeight in 2765, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2765)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2009 19:14:43
  20. Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.01
    0.008653947 = product of:
      0.017307894 = sum of:
        0.017307894 = product of:
          0.03461579 = sum of:
            0.03461579 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
              0.03461579 = score(doc=1107,freq=2.0), product of:
                0.17893866 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05109862 = queryNorm
                0.19345059 = fieldWeight in 1107, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1107)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    28.10.2013 19:22:57