Search (3 results, page 1 of 1)

Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.11

0.106233686 = product of:
  0.21246737 = sum of:
    0.21246737 = sum of:
      0.1615221 = weight(_text_:word in 2673) [ClassicSimilarity], result of:
        0.1615221 = score(doc=2673,freq=4.0), product of:
          0.28165168 = queryWeight, product of:
            5.2432623 = idf(docFreq=634, maxDocs=44218)
            0.05371688 = queryNorm
          0.5734818 = fieldWeight in 2673, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            5.2432623 = idf(docFreq=634, maxDocs=44218)
            0.0546875 = fieldNorm(doc=2673)
      0.050945267 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
        0.050945267 = score(doc=2673,freq=2.0), product of:
          0.18810736 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05371688 = queryNorm
          0.2708308 = fieldWeight in 2673, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0546875 = fieldNorm(doc=2673)
  0.5 = coord(1/2)
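
The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, not Lucene code: the function name `term_score` and the variable names are hypothetical, and all constants are copied from the tree for doc 2673. It assumes only the relations the tree itself shows (queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, tf = sqrt(termFreq)).

```python
import math

def term_score(freq, idf, query_norm, field_norm):
    """Per-term score following the structure of the explain tree (illustrative helper)."""
    query_weight = idf * query_norm        # queryWeight = idf * queryNorm
    tf = math.sqrt(freq)                   # tf = sqrt(termFreq), e.g. 2.0 for freq=4.0
    field_weight = tf * idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Constants copied from the explain output for doc 2673.
QUERY_NORM = 0.05371688
FIELD_NORM = 0.0546875

word_score = term_score(4.0, 5.2432623, QUERY_NORM, FIELD_NORM)   # _text_:word
num22_score = term_score(2.0, 3.5018296, QUERY_NORM, FIELD_NORM)  # _text_:22

# The two term scores are summed, then scaled by the outer coord(1/2) factor.
total = (word_score + num22_score) * 0.5
print(word_score, num22_score, total)
```

Up to floating-point rounding, the three printed values should agree with the 0.1615221, 0.050945267 and 0.106233686 entries in the tree.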

Abstract: Examines techniques that discover features in sets of pre-categorized documents, such that similar documents can be found on the WWW. Examines techniques which will classify training examples with high accuracy, then explains why this is not necessarily useful. Describes a method for extracting word clusters from the raw document features. Results show that the clustering technique is successful in discovering word groups in personal Web pages which can be used to find similar information on the WWW
Date: 1. 8.1996 22:08:06

Cheng, K.-H.: Automatic identification for topics of electronic documents (1997) 0.06

0.05710669 = product of:
  0.11421338 = sum of:
    0.11421338 = product of:
      0.22842675 = sum of:
        0.22842675 = weight(_text_:word in 1811) [ClassicSimilarity], result of:
          0.22842675 = score(doc=1811,freq=8.0), product of:
            0.28165168 = queryWeight, product of:
              5.2432623 = idf(docFreq=634, maxDocs=44218)
              0.05371688 = queryNorm
            0.81102574 = fieldWeight in 1811, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.2432623 = idf(docFreq=634, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1811)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
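
The idf value in the tree above is consistent with the formula idf = 1 + ln(maxDocs / (docFreq + 1)). The sketch below checks that assumption against the numbers for doc 1811 and then applies the two nested coord(1/2) factors. The helper name `classic_idf` is hypothetical, and the formula is inferred from the printed values rather than taken from the page itself.

```python
import math

def classic_idf(doc_freq, max_docs):
    """idf as implied by the explain output: 1 + ln(maxDocs / (docFreq + 1)). Assumed formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

# Values copied from the explain output for doc 1811.
QUERY_NORM = 0.05371688
FIELD_NORM = 0.0546875

idf_word = classic_idf(doc_freq=634, max_docs=44218)   # tree shows 5.2432623
tf = math.sqrt(8.0)                                    # tree shows 2.828427 for freq=8.0

query_weight = idf_word * QUERY_NORM
field_weight = tf * idf_word * FIELD_NORM

# Two nested coord(1/2) factors appear in this tree, so the term score is halved twice.
final_score = query_weight * field_weight * 0.5 * 0.5
print(idf_word, final_score)
```

The same helper gives roughly 3.5018 for docFreq=3622, which matches the idf listed for the term `22` in the first result.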

Abstract: With the rapid rise in numbers of electronic documents on the Internet, how to effectively assign topics to documents has become an important issue. Current research in this area focuses on the behaviour of nouns in documents. Proposes, however, that nouns and verbs together contribute to the process of topic identification. Constructs a mathematical model taking into account the following factors: word importance, word frequency, word co-occurrence, and word distance. Preliminary experiments show that the performance of the proposed model is equivalent to that of a human being

Hirawa, M.: Role of keywords in the network searching era (1998) 0.03

0.032632396 = product of:
  0.06526479 = sum of:
    0.06526479 = product of:
      0.13052958 = sum of:
        0.13052958 = weight(_text_:word in 3446) [ClassicSimilarity], result of:
          0.13052958 = score(doc=3446,freq=2.0), product of:
            0.28165168 = queryWeight, product of:
              5.2432623 = idf(docFreq=634, maxDocs=44218)
              0.05371688 = queryNorm
            0.46344328 = fieldWeight in 3446, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.2432623 = idf(docFreq=634, maxDocs=44218)
              0.0625 = fieldNorm(doc=3446)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: A survey of Japanese OPACs available on the Internet was conducted relating to use of keywords for subject access. The findings suggest that present OPACs are not capable of storing subject-oriented information. Currently available keyword access derives from a merely title-based retrieval system. Contents data should be added to bibliographic records as an efficient way of providing subject access, and costings for this process should be estimated. Word standardisation issues must also be addressed
