Search (125 results, page 1 of 7)

  • theme_ss:"Automatisches Klassifizieren"
  1. Orwig, R.E.; Chen, H.; Nunamaker, J.F.: A graphical, self-organizing approach to classifying electronic meeting output (1997) 0.08
    Abstract
    Describes research in the application of a Kohonen Self-Organizing Map (SOM) to the problem of classification of electronic brainstorming output and an evaluation of the results. Describes an electronic meeting system and describes the classification problem that exists in the group problem solving process. Surveys the literature concerning classification. Describes the application of the Kohonen SOM to the meeting output classification problem. Describes an experiment that evaluated the classification performed by the Kohonen SOM by comparing it with those of a human expert and a Hopfield neural network. Discusses conclusions and directions for future research
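     The Kohonen SOM named in the abstract can be made concrete. The sketch below is a generic self-organizing map in Python (NumPy only), not the authors' implementation; the grid size, decay schedules, and function names are illustrative choices, and meeting comments would first be reduced to term vectors before being mapped. Comments assigned to the same best-matching unit then form one output category.

```python
import numpy as np

def train_som(data, grid=(4, 4), epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a Kohonen self-organizing map on the row vectors in `data`."""
    rng = np.random.default_rng(seed)
    n_units, dim = grid[0] * grid[1], data.shape[1]
    weights = rng.random((n_units, dim))
    # Grid coordinates of each unit, used by the neighbourhood function.
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])], float)
    for t in range(epochs):
        lr = lr0 * np.exp(-t / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-t / epochs)  # shrinking neighbourhood radius
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)     # squared grid distance
            h = np.exp(-d2 / (2 * sigma ** 2))                 # Gaussian neighbourhood
            weights += lr * h[:, None] * (x - weights)
    return weights

def assign_clusters(data, weights):
    """Map each vector to the unit whose weight vector is closest."""
    return np.argmin(((data[:, None, :] - weights[None]) ** 2).sum(axis=2), axis=1)
```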
    Source
    Journal of the American Society for Information Science. 48(1997) no.2, S.157-170
  2. Subramanian, S.; Shafer, K.E.: Clustering (1998) 0.05
    Abstract
     This article presents our exploration of computer science clustering algorithms as they relate to the Scorpion system. Scorpion is a research project at OCLC that explores the indexing and cataloging of electronic resources. For a more complete description of the Scorpion project, please visit the Scorpion Web site at <http://purl.oclc.org/scorpion>
    Source
    http://www.oclc.org/research/publications/arr/1997/
  3. Shafer, K.E.: Evaluating Scorpion results (1998) 0.05
    Abstract
    Scorpion is a research project at OCLC that builds tools for automatic subject assignment by combining library science and information retrieval techniques. A thesis of Scorpion is that the Dewey Decimal Classification (Dewey) can be used to perform automatic subject assignment for electronic items.
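     The abstract states Scorpion's thesis but not its mechanism. A plausible minimal reading, treating the incoming document as a query run against a database of Dewey class captions and returning the top-ranked classes as subjects, is sketched below; the miniature DEWEY table, the function names, and the tf-idf weighting are illustrative assumptions, not Scorpion's actual design.

```python
import math
from collections import Counter

# Hypothetical miniature Dewey database: class number -> caption terms.
DEWEY = {
    "004": "data processing computer science computers software",
    "020": "library information sciences cataloging classification",
    "150": "psychology cognition perception learning",
}

def tf_idf(docs):
    """tf-idf vectors for a dict of id -> text."""
    df = Counter(t for text in docs.values() for t in set(text.split()))
    n = len(docs)
    return {k: {t: c * math.log(1 + n / df[t])
                for t, c in Counter(text.split()).items()}
            for k, text in docs.items()}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def assign_subjects(document, k=2):
    """Rank Dewey classes against the document; the document acts as the query."""
    class_vectors = tf_idf(DEWEY)
    query = Counter(document.lower().split())   # raw term frequencies
    ranked = sorted(class_vectors.items(),
                    key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [cls for cls, _ in ranked[:k]]

print(assign_subjects("automatic classification of library resources"))  # ['020', ...]
```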
    Source
    http://www.oclc.org/research/publications/arr/1997/
  4. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.05
    Content
     Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf
    Date
    8. 1.2013 10:22:32
  5. Subramanian, S.; Shafer, K.E.: Clustering (2001) 0.05
    Date
    5. 5.2003 14:17:22
    Footnote
     Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part II
  6. Meder, N.: Artificial intelligence as a tool of classification, or: the network of language games as cognitive paradigm (1985) 0.04
    Abstract
     It is shown that the cognitive paradigm may serve as a point of orientation for automatic classification. On the basis of research in Artificial Intelligence, the cognitive paradigm, as opposed to the behavioristic paradigm, was developed as a multiplicity of competing world-views. This is the thesis of DeMey in his book "The cognitive paradigm". Multiplicity in a loosely-coupled network of cognitive knots is also the principle of dynamic restlessness. In competition with cognitive views, a classification system that follows various models may learn through concrete information retrieval. In the course of these actions, the user implicitly builds a new classification order
  7. Desale, S.K.; Kumbhar, R.: Research on automatic classification of documents in library environment : a literature review (2013) 0.04
    Abstract
     This paper aims to provide an overview of automatic classification research that focuses on issues related to the automatic classification of documents in a library environment. The review covers literature published in mainstream library and information science studies, drawing on both academic and professional LIS journals and other documents. It reveals three main types of research on automatic classification: 1) hierarchical classification using different library classification schemes, 2) text categorization and document categorization using different types of classifiers, with or without training documents, and 3) automatic bibliographic classification. Predominantly, this research is directed towards solving the problems of organizing digital documents in an online environment; very little research is devoted to the arrangement of physical documents.
  8. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.04
    Abstract
     This paper presents a method that exploits the hierarchical structure of an indexing vocabulary to guide the development and training of machine learning methods for automatic text categorization. We present the design of a hierarchical classifier based on the divide-and-conquer principle. The method is evaluated using backpropagation neural networks as the machine learning algorithm, which learn to assign MeSH categories to a subset of MEDLINE records. Comparisons with the traditional Rocchio algorithm adapted for text categorization, as well as with flat neural network classifiers, are provided. The results indicate that the use of hierarchical structures improves performance significantly.
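     The divide-and-conquer design generalizes beyond MeSH: one classifier per internal node of the category tree routes a document downward until a leaf is reached. The sketch below is a minimal reading of that scheme, not the authors' system; the tree encoding, class names, features, and network size are illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier

def parent_of(label, tree):
    return next(node for node, kids in tree.items() if label in kids)

class HierarchicalTextClassifier:
    """Divide-and-conquer: a small backprop network at each internal node
    routes a document to one of the node's children until a leaf is reached."""

    def __init__(self, tree):          # tree: node -> list of child categories
        self.tree = tree
        self.vec = TfidfVectorizer()
        self.models = {}

    def fit(self, texts, leaf_labels):
        X = self.vec.fit_transform(texts)
        # Recover each document's root-to-leaf path from its leaf label.
        paths = []
        for leaf in leaf_labels:
            path, node = [], leaf
            while node != "root":
                path.append(node)
                node = parent_of(node, self.tree)
            paths.append(["root"] + path[::-1])       # e.g. ["root", "A", "A1"]
        # Train one classifier per internal node on the documents beneath it.
        for node in self.tree:
            idx = [i for i, p in enumerate(paths) if node in p[:-1]]
            y = [paths[i][paths[i].index(node) + 1] for i in idx]
            clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                random_state=0)
            self.models[node] = clf.fit(X[idx], y)
        return self

    def predict(self, texts):
        X = self.vec.transform(texts)
        labels = []
        for i in range(X.shape[0]):
            node = "root"
            while node in self.models:                # descend until a leaf
                node = self.models[node].predict(X[i])[0]
            labels.append(node)
        return labels
```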
    Source
    Advances in classification research, vol.10: proceedings of the 10th ASIS SIG/CR Classification Research Workshop. Ed.: Albrechtsen, H. u. J.E. Mai
  9. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.04
    Date
    1. 2.2016 18:25:22
    Series
    Lecture notes in computer science ; 9398
  10. Golub, K.: Automated subject classification of textual web documents (2006) 0.04
    Abstract
     Purpose - To provide an integrated perspective on similarities and differences between approaches to automated classification in different research communities (machine learning, information retrieval and library science), and to point to problems with these approaches and with automated classification as such.
     Design/methodology/approach - A range of works dealing with automated classification of full-text web documents are discussed. Explorations of individual approaches are given in the following sections: special features (description, differences, evaluation), application and characteristics of web pages.
     Findings - Provides major similarities and differences between the three approaches: document pre-processing and utilization of web-specific document characteristics are common to all of them; major differences lie in the applied algorithms and in the employment or not of the vector space model and of controlled vocabularies. Problems of automated classification are recognized.
     Research limitations/implications - The paper does not attempt to provide an exhaustive bibliography of related resources.
     Practical implications - As an integrated overview of approaches from different research communities with application examples, it is very useful for students in library and information science and computer science, as well as for practitioners. Researchers from one community gain information on how similar tasks are conducted in other communities.
     Originality/value - To the author's knowledge, no review paper on automated text classification has attempted to discuss more than one community's approach from an integrated perspective.
  11. Fang, H.: Classifying research articles in multidisciplinary sciences journals into subject categories (2015) 0.04
    Abstract
     In the Thomson Reuters Web of Science database, the subject categories of a journal are applied to all articles in the journal. However, many articles in multidisciplinary sciences journals may be represented by only a small number of subject categories. To provide more accurate information on the research areas of articles in such journals, we can classify articles in these journals into subject categories, as defined by Web of Science, based on their references. For an article in a multidisciplinary sciences journal, the method counts the subject categories in all of the article's references indexed by Web of Science and uses the most numerous subject categories of the references to determine the most appropriate classification of the article. We used articles in an issue of the Proceedings of the National Academy of Sciences (PNAS) to validate the correctness of the method by comparing the obtained results with the categories of the articles as defined by PNAS and with their content. This study shows that the method provides more precise search results for the subject category of interest in bibliometric investigations, through the recognition of articles in multidisciplinary sciences journals whose work relates to a particular subject category.
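     The counting step the abstract describes reduces to a majority vote over the categories of the cited references. A direct sketch (the category strings and the function signature are ours, not the paper's):

```python
from collections import Counter

def classify_by_references(ref_categories, top_n=1):
    """ref_categories: one list of Web of Science subject categories
    per cited reference. Returns the most frequent category(ies)."""
    counts = Counter(cat for ref in ref_categories for cat in ref)
    return [cat for cat, _ in counts.most_common(top_n)]

# Example: an article citing four papers.
refs = [["Biochemistry & Molecular Biology"],
        ["Biochemistry & Molecular Biology", "Cell Biology"],
        ["Cell Biology"],
        ["Biochemistry & Molecular Biology"]]
print(classify_by_references(refs))   # ['Biochemistry & Molecular Biology']
```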
    Object
     Web of Science
  12. Automatic classification research at OCLC (2002) 0.03
    Abstract
     OCLC enlists the cooperation of the world's libraries to make the written record of humankind's cultural heritage more accessible through electronic media. Part of this goal can be accomplished through the application of the principles of knowledge organization. We believe that cultural artifacts are effectively lost unless they are indexed, cataloged and classified. Accordingly, OCLC has developed products, sponsored research projects, and encouraged participation in international standards communities whose outcome has been improved library classification schemes, cataloging productivity tools, and new proposals for the creation and maintenance of metadata. Though cataloging and classification require expert intellectual effort, we recognize that at least some of the work must be automated if we hope to keep pace with cultural change
    Date
    5. 5.2003 9:22:09
  13. Dubin, D.: Dimensions and discriminability (1998) 0.03
    Date
    22. 9.1997 19:16:05
    Imprint
    Urbana-Champaign, IL : Illinois University at Urbana-Champaign, Graduate School of Library and Information Science
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
  14. Kanaan, G.; Al-Shalabi, R.; Ghwanmeh, S.; Al-Ma'adeed, H.: A comparison of text-classification techniques applied to Arabic text (2009) 0.03
    Abstract
     Many algorithms have been implemented for the problem of text classification. Most of the work in this area has been carried out for English text, and very little research has addressed Arabic text. The nature of Arabic text differs from that of English text, and preprocessing Arabic text is more challenging. This paper presents an implementation of three automatic text-classification techniques for Arabic text. A corpus of 1445 Arabic text documents belonging to nine categories has been automatically classified using the kNN, Rocchio, and naïve Bayes algorithms. The results reveal that naïve Bayes was the best performer, followed by kNN and Rocchio.
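     Of the three classifiers the study compares, naïve Bayes is the simplest to make concrete. A minimal multinomial naïve Bayes with Laplace smoothing is sketched below; whitespace tokenization is a placeholder, and real Arabic text would need the normalization and stemming the paper calls challenging.

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        n = len(labels)
        self.prior = {c: math.log(labels.count(c) / n) for c in set(labels)}
        self.counts = {c: Counter() for c in self.prior}
        for doc, c in zip(docs, labels):
            self.counts[c].update(doc.split())
        self.vocab = {t for cnt in self.counts.values() for t in cnt}
        return self

    def predict(self, doc):
        def log_posterior(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            return self.prior[c] + sum(
                math.log((self.counts[c][t] + 1) / total)
                for t in doc.split() if t in self.vocab)
        return max(self.prior, key=log_posterior)
```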
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.9, S.1836-1844
  15. Denoyer, L.; Gallinari, P.: Bayesian network model for semi-structured document classification (2004) 0.03
    Abstract
     Recently, a new community has started to emerge around the development of new information research methods for searching and analyzing semi-structured and XML-like documents. The goal is to handle both content and structural information, and to deal with different types of information content (text, image, etc.). We consider here the task of structured document classification. We propose a generative model, based on Bayesian networks, that handles both structure and content. We then show how to transform this generative model into a discriminant classifier using the Fisher kernel method. The model is then extended to deal with different types of content information (here text and images). The model was tested on three databases: the classical WebKB corpus composed of HTML pages, the new INEX corpus, which has become a reference in the field of ad hoc retrieval for XML documents, and a multimedia corpus of Web pages.
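     The Fisher kernel step is generic: fit a generative model, use the gradient of each document's log-likelihood with respect to the model parameters as its feature vector, and train a discriminant classifier on those gradients. The sketch below substitutes a single flat multinomial for the authors' structured Bayesian network, so it only illustrates the recipe; all names are ours.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_multinomial(X):
    """Smoothed ML estimate of one multinomial over the vocabulary;
    X is a dense (n_docs, vocab_size) term-count matrix."""
    counts = X.sum(axis=0) + 1.0
    return counts / counts.sum()

def fisher_scores(X, p):
    """Gradient of each document's log-likelihood w.r.t. the softmax
    parameters of the multinomial: score_t = count_t - N * p_t,
    where N is the document's length."""
    N = X.sum(axis=1, keepdims=True)
    return X - N * p

def fisher_kernel_classifier(X, y):
    """Generative model -> gradient features -> discriminant classifier."""
    p = fit_multinomial(X)
    return LogisticRegression(max_iter=1000).fit(fisher_scores(X, p), y), p
```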
  16. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.03
    Abstract
     We analyze the linguistic evolution of selected scientific disciplines over a 30-year time span (1970s to 2000s). Our focus is on four highly specialized disciplines at the boundaries of computer science that emerged during that time: computational linguistics, bioinformatics, digital construction, and microelectronics. Our analysis is driven by the question whether these disciplines develop a distinctive language use, both individually and collectively, over the given time period. The data set is the English Scientific Text Corpus (scitex), which includes texts from the 1970s/1980s and early 2000s. Our theoretical basis is register theory. In terms of methods, we combine corpus-based feature extraction (various aggregated part-of-speech-based features, n-grams, lexico-grammatical patterns) and automatic text classification. The results of our research are directly relevant to the study of linguistic variation and languages for specific purposes (LSP) and have implications for various natural language processing (NLP) tasks, for example authorship attribution, text mining, or training NLP tools.
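     The pipeline, feature extraction followed by text classification as a probe for distinctiveness, can be caricatured in a few lines. Word n-grams stand in here for the paper's richer part-of-speech and lexico-grammatical features; the function name and parameters are ours.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def discipline_separability(texts, disciplines):
    """Cross-validated accuracy of predicting a text's discipline from
    shallow surface features; high accuracy suggests the disciplines
    have developed distinctive language use."""
    X = CountVectorizer(ngram_range=(1, 3), min_df=2).fit_transform(texts)
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, disciplines, cv=5).mean()
```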
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.7, S.1668-1678
  17. Huang, Y.-L.: A theoretic and empirical research of cluster indexing for Mandarine Chinese full text document (1998) 0.03
    Source
     Bulletin of Library and Information Science. 1998, no.24, S.44-68
  18. Yi, K.: Automatic text classification using library classification schemes : trends, issues and challenges (2007) 0.03
    Abstract
     The proliferation of digital resources and their integration into a traditional library setting has created a pressing need for automated tools that organize textual information based on library classification schemes. Automated text classification is the research field concerned with developing tools, methods, and models for this task. This article describes the currently popular approaches to text classification and the major text classification projects and applications that are based on library classification schemes. Related issues and challenges are discussed, and a number of considerations for addressing the challenges are examined.
    Date
    22. 9.2008 18:31:54
  19. Chung, Y.M.; Lee, J.Y.: A corpus-based approach to comparative evaluation of statistical term association measures (2001) 0.03
    Abstract
     Statistical association measures have been widely applied in information retrieval research, usually employing a clustering of documents or terms on the basis of their relationships. Applications of association measures to term clustering include automatic thesaurus construction and query expansion. This research evaluates the similarity of six association measures by comparing the relationships and behavior they demonstrate in various analyses of a test corpus. Analysis techniques include comparisons of highly ranked term pairs and term clusters, analyses of the correlation among the association measures using Pearson's correlation coefficient and MDS mapping, and an analysis of the impact of term frequency on the association values by means of z-scores. The major findings of the study are as follows. First, the most similar association measures are mutual information and Yule's coefficient of colligation Y, whereas the cosine and Jaccard coefficients, as well as the chi-square statistic and the likelihood ratio, demonstrate quite similar behavior for terms with high frequency. Second, among all the measures, the chi-square statistic is the least affected by the frequency of terms. Third, although the cosine and Jaccard coefficients tend to emphasize high-frequency terms, mutual information and Yule's Y seem to overestimate rare terms
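     For readers who want the measures concretely: for a term pair, let a = documents containing both terms, b and c = documents containing only one of them, and d = documents containing neither. The standard 2x2-contingency-table forms are computed below; the exact variants the authors compare (for example, their mutual information and likelihood-ratio formulations) may differ slightly.

```python
import math

def association_measures(a, b, c, d):
    """Association of terms t1, t2 from a 2x2 contingency table:
    a = docs with both, b = t1 only, c = t2 only, d = neither."""
    n = a + b + c + d
    cos = a / math.sqrt((a + b) * (a + c))
    jac = a / (a + b + c)
    pmi = math.log2(a * n / ((a + b) * (a + c)))          # pointwise MI
    yule_y = ((math.sqrt(a * d) - math.sqrt(b * c))
              / (math.sqrt(a * d) + math.sqrt(b * c)))
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Dunning's log-likelihood ratio: G^2 = 2 * sum O * ln(O / E)
    rows, cols, obs = (a + b, c + d), (a + c, b + d), ((a, b), (c, d))
    g2 = 2 * sum(o * math.log(o / (rows[i] * cols[j] / n))
                 for i, row in enumerate(obs)
                 for j, o in enumerate(row) if o)
    return {"cosine": cos, "jaccard": jac, "MI": pmi,
            "yule_Y": yule_y, "chi2": chi2, "loglik": g2}

print(association_measures(a=20, b=30, c=10, d=940))
```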
    Source
     Journal of the American Society for Information Science and Technology. 52(2001) no.4, S.283-296
  20. Yoon, Y.; Lee, C.; Lee, G.G.: An effective procedure for constructing a hierarchical text classification system (2006) 0.03
    Date
    22. 7.2006 16:24:52
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.3, S.431-442

Languages

  • e 118
  • d 5
  • a 1
  • chi 1

Types

  • a 112
  • el 11
  • m 2
  • r 1
  • s 1
  • x 1