Search (131 results, page 6 of 7)

Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.00

2.5196982E-4 = product of:
  0.0057953056 = sum of:
    0.0057953056 = product of:
      0.011590611 = sum of:
        0.011590611 = weight(_text_:international in 4095) [ClassicSimilarity], result of:
          0.011590611 = score(doc=4095,freq=2.0), product of:
            0.078619614 = queryWeight, product of:
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.023567878 = queryNorm
            0.14742646 = fieldWeight in 4095, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.03125 = fieldNorm(doc=4095)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: IEEE International Conference on Big Data (Big Data) (2017)
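
The score explanations in this listing all follow the same arithmetic, so the first one can be recomputed by hand. The sketch below assumes the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any Lucene API, and the input values are copied from the explanation above.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                   # tf(freq=2.0) = 1.4142135
    idf = classic_idf(doc_freq, max_docs)  # idf(docFreq=4276, maxDocs=44218)
    query_weight = idf * query_norm        # queryWeight
    field_weight = tf * idf * field_norm   # fieldWeight
    return query_weight * field_weight


# Values taken from the explanation for doc 4095, term "international".
term_score = classic_term_score(freq=2.0, doc_freq=4276, max_docs=44218,
                                query_norm=0.023567878, field_norm=0.03125)

# The two coord factors scale the term score: coord(1/2), then coord(1/23).
total = term_score * (1 / 2) * (1 / 23)
print(f"{total:.7e}")
```

The result should agree with the listed 2.5196982E-4 up to rounding, since Lucene computes in 32-bit floats while Python uses 64-bit floats.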

AlQenaei, Z.M.; Monarchi, D.E.: ¬The use of learning techniques to analyze the results of a manual classification system (2016) 0.00
```
2.4153895E-4 = product of:
  0.0055553955 = sum of:
    0.0055553955 = product of:
      0.011110791 = sum of:
        0.011110791 = weight(_text_:1 in 2836) [ClassicSimilarity], result of:
          0.011110791 = score(doc=2836,freq=4.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.19191428 = fieldWeight in 2836, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2836)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)
```
Abstract

Classification is the process of assigning objects to pre-defined classes based on observations or characteristics of those objects, and there are many approaches to performing this task. The overall objective of this study is to demonstrate the use of two learning techniques to analyze the results of a manual classification system. Our sample consisted of 1,026 documents, from the ACM Computing Classification System, classified by their authors as belonging to one of the groups of the classification system: "H.3 Information Storage and Retrieval." A singular value decomposition of the documents' weighted term-frequency matrix was used to represent each document in a 50-dimensional vector space. The analysis of the representation using both supervised (decision tree) and unsupervised (clustering) techniques suggests that two pairs of the ACM classes are closely related to each other in the vector space. Class 1 (Content Analysis and Indexing) is closely related to Class 3 (Information Search and Retrieval), and Class 4 (Systems and Software) is closely related to Class 5 (Online Information Services). Further analysis was performed to test the diffusion of the words in the two classes using both cosine and Euclidean distance.
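
As a rough illustration of the two distance measures named in the abstract, the sketch below compares hypothetical class centroids. The three-dimensional vectors are invented toy data (the study used a 50-dimensional space), and comparing centroids is an assumption made here for illustration, not necessarily the authors' procedure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical document vectors for two classes (toy data, not from the study).
class_a = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
class_b = [[0.7, 0.3, 0.1], [0.9, 0.2, 0.0]]

center_a, center_b = centroid(class_a), centroid(class_b)
print(cosine_similarity(center_a, center_b))
print(euclidean_distance(center_a, center_b))
```

A cosine similarity near 1 together with a small Euclidean distance would indicate that two classes occupy nearly the same region of the vector space.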

Source: Knowledge organization. 43(2016) no.1, S.56-63

May, A.D.: Automatic classification of e-mail messages by message type (1997) 0.00

2.3911135E-4 = product of:
  0.005499561 = sum of:
    0.005499561 = product of:
      0.010999122 = sum of:
        0.010999122 = weight(_text_:1 in 6493) [ClassicSimilarity], result of:
          0.010999122 = score(doc=6493,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.18998542 = fieldWeight in 6493, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6493)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Journal of the American Society for Information Science. 48(1997) no.1, S.32-39

Rose, J.R.; Gasteiger, J.: HORACE: an automatic system for the hierarchical classification of chemical reactions (1994) 0.00

2.3911135E-4 = product of:
  0.005499561 = sum of:
    0.005499561 = product of:
      0.010999122 = sum of:
        0.010999122 = weight(_text_:1 in 7696) [ClassicSimilarity], result of:
          0.010999122 = score(doc=7696,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.18998542 = fieldWeight in 7696, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7696)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Journal of chemical information and computer sciences. 34(1994) no.1, S.74-90

Dang, E.K.F.; Luk, R.W.P.; Ho, K.S.; Chan, S.C.F.; Lee, D.L.: ¬A new measure of clustering effectiveness : algorithms and experimental studies (2008) 0.00
```
2.3911135E-4 = product of:
  0.005499561 = sum of:
    0.005499561 = product of:
      0.010999122 = sum of:
        0.010999122 = weight(_text_:1 in 1367) [ClassicSimilarity], result of:
          0.010999122 = score(doc=1367,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.18998542 = fieldWeight in 1367, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1367)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)
```
Abstract

We propose a new optimal clustering effectiveness measure, called CS1, based on a combination of clusters rather than selecting a single optimal cluster as in the traditional MK1 measure. For hierarchical clustering, we present an algorithm to compute CS1, defined by seeking the optimal combinations of disjoint clusters obtained by cutting the hierarchical structure at a certain similarity level. By reformulating the optimization to a 0-1 linear fractional programming problem, we demonstrate that an exact solution can be obtained by a linear time algorithm. We further discuss how our approach can be generalized to more general problems involving overlapping clusters, and we show how optimal estimates can be obtained by greedy algorithms.

Hu, G.; Zhou, S.; Guan, J.; Hu, X.: Towards effective document clustering : a constrained K-means based approach (2008) 0.00

2.3911135E-4 = product of:
  0.005499561 = sum of:
    0.005499561 = product of:
      0.010999122 = sum of:
        0.010999122 = weight(_text_:1 in 2113) [ClassicSimilarity], result of:
          0.010999122 = score(doc=2113,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.18998542 = fieldWeight in 2113, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2113)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 1. 8.2008 12:15:45

Borko, H.: Research in computer based classification systems (1985) 0.00

2.2047358E-4 = product of:
  0.005070892 = sum of:
    0.005070892 = product of:
      0.010141784 = sum of:
        0.010141784 = weight(_text_:international in 3647) [ClassicSimilarity], result of:
          0.010141784 = score(doc=3647,freq=2.0), product of:
            0.078619614 = queryWeight, product of:
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.023567878 = queryNorm
            0.12899815 = fieldWeight in 3647, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.33588 = idf(docFreq=4276, maxDocs=44218)
              0.02734375 = fieldNorm(doc=3647)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Footnote: Original in: Classification research: Proceedings of the Second International Study Conference held at Hotel Prins Hamlet, Elsinore, Denmark, 14th-18th Sept. 1964. Ed.: Pauline Atherton. Copenhagen: Munksgaard 1965. S.220-238.

Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 316) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=316,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 316, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=316)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: D-Lib magazine. 4(1998) no.1

Koch, T.; Ardö, A.; Noodén, L.: ¬The construction of a robot-generated subject index : DESIRE II D3.6a, Working Paper 1 (1999) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 1668) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=1668,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 1668, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=1668)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Golub, K.: Automated subject classification of textual Web pages, based on a controlled vocabulary : challenges and recommendations (2006) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 5897) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=5897,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 5897, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=5897)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: New review of hypermedia and multimedia. 12(2006) no.1, S.11-27

Hung, C.-M.; Chien, L.-F.: Web-based text classification in the absence of manually labeled training documents (2007) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 87) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=87,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 87, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=87)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Journal of the American Society for Information Science and Technology. 58(2007) no.1, S.88-96

Liu, R.-L.: Dynamic category profiling for text filtering and classification (2007) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 900) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=900,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 900, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=900)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Information processing and management. 43(2007) no.1, S.154-168

Malenica, M.; Smuc, T.; Snajder, J.; Basic, B.D.: Language morphology offset : text classification on a Croatian-English parallel corpus (2008) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 2035) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=2035,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 2035, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=2035)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Information processing and management. 44(2008) no.1, S.325-339

Zhou, G.D.; Zhang, M.; Ji, D.H.; Zhu, Q.M.: Hierarchical learning strategy in semantic relation extraction (2008) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 2077) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=2077,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 2077, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=2077)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 1. 8.2008 9:11:41

Montesi, M.; Navarrete, T.: Classifying web genres in context : A case study documenting the web genres used by a software engineer (2008) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 2100) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=2100,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 2100, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=2100)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 1. 8.2008 12:17:23

Ko, Y.; Seo, J.: Text classification from unlabeled documents with bootstrapping and feature projection techniques (2009) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 2452) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=2452,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 2452, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=2452)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Information processing and management. 45(2009) no.1, S.70-83

Ozmutlu, S.; Cosar, G.C.: Analyzing the results of automatic new topic identification (2008) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 2604) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=2604,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 2604, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=2604)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Date: 1. 1.2009 10:26:42

Gauch, S.; Chandramouli, A.; Ranganathan, S.: Training a hierarchical classifier using inter document relationships (2009) 0.00

2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 2697) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=2697,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 2697, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=2697)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)

Source: Journal of the American Society for Information Science and Technology. 60(2009) no.1, S.47-58

Golub, K.: Automated subject classification of textual documents in the context of Web-based hierarchical browsing (2011) 0.00
```
2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 4558) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=4558,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 4558, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=4558)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)
```
Abstract

While automated methods for information organization have been around for several decades now, exponential growth of the World Wide Web has put them into the forefront of research in different communities, within which several approaches can be identified: 1) machine learning (algorithms that allow computers to improve their performance based on learning from pre-existing data); 2) document clustering (algorithms for unsupervised document organization and automated topic extraction); and 3) string matching (algorithms that match given strings within larger text). Here the aim was to automatically organize textual documents into hierarchical structures for subject browsing. The string-matching approach was tested using a controlled vocabulary (containing pre-selected and pre-defined authorized terms, each corresponding to only one concept). The results imply that an appropriate controlled vocabulary, with a sufficient number of entry terms designating classes, could in itself be a solution for automated classification. Then, if the same controlled vocabulary had an appropriate hierarchical structure, it would at the same time provide a good browsing structure for the collection of automatically classified documents.

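
The string-matching approach described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: the vocabulary, its class names, and the `classify` function are hypothetical, and the scoring (a plain count of entry-term matches per class) is a simplification, not the method evaluated in the paper.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary: each entry term designates one class.
VOCABULARY = {
    "bridges": "Civil engineering",
    "concrete": "Building materials",
}


def classify(text, vocabulary):
    """Rank classes by how often their entry terms occur in the text."""
    lowered = text.lower()
    counts = Counter()
    for term, class_name in vocabulary.items():
        # Match whole words only, so "bridges" does not match "abridges".
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[class_name] += hits
    return counts.most_common()


ranking = classify("Concrete bridges and suspension bridges", VOCABULARY)
print(ranking)
```

With more entry terms per class, the same lookup covers more documents, which is the effect the abstract attributes to a sufficiently rich controlled vocabulary.
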
Desale, S.K.; Kumbhar, R.: Research on automatic classification of documents in library environment : a literature review (2013) 0.00
```
2.0495258E-4 = product of:
  0.0047139092 = sum of:
    0.0047139092 = product of:
      0.0094278185 = sum of:
        0.0094278185 = weight(_text_:1 in 1071) [ClassicSimilarity], result of:
          0.0094278185 = score(doc=1071,freq=2.0), product of:
            0.057894554 = queryWeight, product of:
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.023567878 = queryNorm
            0.16284466 = fieldWeight in 1071, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.4565027 = idf(docFreq=10304, maxDocs=44218)
              0.046875 = fieldNorm(doc=1071)
      0.5 = coord(1/2)
  0.04347826 = coord(1/23)
```
Abstract

This paper aims to provide an overview of automatic classification research, which focuses on issues related to the automatic classification of documents in a library environment. The review covers literature published in mainstream library and information science studies. The review was done on literature published in both academic and professional LIS journals and other documents. This review reveals that basically three types of research are being done on automatic classification: 1) hierarchical classification using different library classification schemes, 2) text categorization and document categorization using different types of classifiers with or without using training documents, and 3) automatic bibliographic classification. Predominantly this research is directed towards solving problems of organization of digital documents in an online environment. However, very little research is devoted towards solving the problems of arrangement of physical documents.
