Search (3 results, page 1 of 1)

Schiminovich, S.: Automatic classification and retrieval of documents by means of a bibliographic pattern discovery algorithm (1971) 0.06

0.05646474 = product of:
  0.22585896 = sum of:
    0.15730032 = weight(_text_:storage in 4846) [ClassicSimilarity], result of:
      0.15730032 = score(doc=4846,freq=2.0), product of:
        0.1866346 = queryWeight, product of:
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.034252144 = queryNorm
        0.8428251 = fieldWeight in 4846, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.109375 = fieldNorm(doc=4846)
    0.06855863 = weight(_text_:retrieval in 4846) [ClassicSimilarity], result of:
      0.06855863 = score(doc=4846,freq=4.0), product of:
        0.10360982 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.034252144 = queryNorm
        0.6617001 = fieldWeight in 4846, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.109375 = fieldNorm(doc=4846)
  0.25 = coord(2/8)
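
The explanation tree above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function names `idf` and `term_score` are illustrative and not part of any library; `queryNorm` and `fieldNorm` are copied from the tree rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term: queryWeight * fieldWeight, as in the tree."""
    tf = math.sqrt(freq)                         # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm         # idf * queryNorm
    field_weight = tf * term_idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explanation for doc 4846 (assumed, not recomputed).
MAX_DOCS = 44218
QUERY_NORM = 0.034252144
FIELD_NORM = 0.109375

storage = term_score(2.0, 516, MAX_DOCS, QUERY_NORM, FIELD_NORM)
retrieval = term_score(4.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)

coord = 2 / 8                                    # 2 of 8 query terms matched
total = (storage + retrieval) * coord
print(f"storage={storage:.6f} retrieval={retrieval:.6f} total={total:.6f}")
```

Under these assumptions the result should agree with the listed 0.05646474 up to single-precision rounding. The same two functions apply to the other hits by swapping in their `freq`, `fieldNorm`, and `coord` values.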

Source: Information storage and retrieval. 6(1971), pp. 417-435

AlQenaei, Z.M.; Monarchi, D.E.: The use of learning techniques to analyze the results of a manual classification system (2016) 0.04

0.036950413 = product of:
  0.098534435 = sum of:
    0.05617869 = weight(_text_:storage in 2836) [ClassicSimilarity], result of:
      0.05617869 = score(doc=2836,freq=2.0), product of:
        0.1866346 = queryWeight, product of:
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.034252144 = queryNorm
        0.30100897 = fieldWeight in 2836, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2836)
    0.024485227 = weight(_text_:retrieval in 2836) [ClassicSimilarity], result of:
      0.024485227 = score(doc=2836,freq=4.0), product of:
        0.10360982 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.034252144 = queryNorm
        0.23632148 = fieldWeight in 2836, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2836)
    0.01787052 = weight(_text_:systems in 2836) [ClassicSimilarity], result of:
      0.01787052 = score(doc=2836,freq=2.0), product of:
        0.10526281 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.034252144 = queryNorm
        0.1697705 = fieldWeight in 2836, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2836)
  0.375 = coord(3/8)

Abstract: Classification is the process of assigning objects to pre-defined classes based on observations or characteristics of those objects, and there are many approaches to performing this task. The overall objective of this study is to demonstrate the use of two learning techniques to analyze the results of a manual classification system. Our sample consisted of 1,026 documents, from the ACM Computing Classification System, classified by their authors as belonging to one of the groups of the classification system: "H.3 Information Storage and Retrieval." A singular value decomposition of the documents' weighted term-frequency matrix was used to represent each document in a 50-dimensional vector space. The analysis of the representation using both supervised (decision tree) and unsupervised (clustering) techniques suggests that two pairs of the ACM classes are closely related to each other in the vector space. Class 1 (Content Analysis and Indexing) is closely related to Class 3 (Information Search and Retrieval), and Class 4 (Systems and Software) is closely related to Class 5 (Online Information Services). Further analysis was performed to test the diffusion of the words in the two classes using both cosine and Euclidean distance.
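
The abstract compares classes using both cosine and Euclidean distance. As a rough illustration of how the two measures differ, here is a minimal sketch in plain Python. The three vectors are hypothetical toy data, not values from the study, and the function names are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical document vectors in a small 3-dimensional space.
doc_a = [3.0, 4.0, 0.0]
doc_b = [6.0, 8.0, 0.0]   # same direction as doc_a, twice the length
doc_c = [0.0, 0.0, 5.0]   # orthogonal to doc_a

print(cosine_similarity(doc_a, doc_b), euclidean_distance(doc_a, doc_b))
print(cosine_similarity(doc_a, doc_c), euclidean_distance(doc_a, doc_c))
```

Cosine similarity ignores vector length, so `doc_a` and `doc_b` count as identical in direction even though they are a nonzero Euclidean distance apart. This is one reason a study might report both measures.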

Adamson, G.W.; Boreham, J.: The use of an association measure based on character structure to identify semantically related pairs of words and document titles (1974) 0.03

0.025722325 = product of:
  0.1028893 = sum of:
    0.07865016 = weight(_text_:storage in 398) [ClassicSimilarity], result of:
      0.07865016 = score(doc=398,freq=2.0), product of:
        0.1866346 = queryWeight, product of:
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.034252144 = queryNorm
        0.42141256 = fieldWeight in 398, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.0546875 = fieldNorm(doc=398)
    0.024239138 = weight(_text_:retrieval in 398) [ClassicSimilarity], result of:
      0.024239138 = score(doc=398,freq=2.0), product of:
        0.10360982 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.034252144 = queryNorm
        0.23394634 = fieldWeight in 398, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=398)
  0.25 = coord(2/8)

Source: Information storage and retrieval. 10(1974), pp. 253-260
