Search (5 results, page 1 of 1)

  • × author_ss:"Yao, Y.Y."
  1. Lingras, P.J.; Yao, Y.Y.: Data mining using extensions of the rough set model (1998) 0.01
    0.006474727 = product of:
      0.016186817 = sum of:
        0.010661141 = weight(_text_:a in 2910) [ClassicSimilarity], result of:
          0.010661141 = score(doc=2910,freq=10.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.19940455 = fieldWeight in 2910, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2910)
        0.005525676 = product of:
          0.011051352 = sum of:
            0.011051352 = weight(_text_:information in 2910) [ClassicSimilarity], result of:
              0.011051352 = score(doc=2910,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13576832 = fieldWeight in 2910, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2910)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
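    The indented block above is a Lucene ClassicSimilarity "explain" tree: each leaf term weight is queryWeight × fieldWeight, partial sums are scaled by coord factors, and the product is the displayed score. A minimal Python sketch, assuming only the values shown in the tree, re-derives entry 1's score (the function name is illustrative; the same arithmetic applies to the other four trees below):

    ```python
    import math

    def term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
        """One leaf of the explain tree: score = queryWeight * fieldWeight."""
        idf = 1 + math.log(max_docs / (doc_freq + 1))  # Lucene classic idf
        tf = math.sqrt(freq)                           # tf = sqrt(termFreq)
        return (idf * query_norm) * (tf * idf * field_norm)

    QUERY_NORM, FIELD_NORM = 0.046368346, 0.0546875    # values from the tree
    w_a    = term_weight(10.0, 37942, 44218, QUERY_NORM, FIELD_NORM)
    w_info = term_weight(2.0, 20772, 44218, QUERY_NORM, FIELD_NORM)
    # '_text_:information' sits one nesting level deeper and carries its own
    # coord(1/2); the outer sum is then scaled by coord(2/5) = 0.4.
    score = (w_a + 0.5 * w_info) * (2 / 5)
    print(score)  # ~0.0064747, matching the displayed 0.006474727
    ```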
    
    Abstract
    Examines basic issues of data mining using the theory of rough sets, which is a recent proposal for generalizing classical set theory. The Pawlak rough set model is based on the concept of an equivalence relation. A generalized rough set model need not be based on equivalence relation axioms. The Pawlak rough set model has been used for deriving deterministic as well as probabilistic rules from a complete database. Demonstrates that a generalized rough set model can be used for generating rules from incomplete databases. These rules are based on plausibility functions proposed by Shafer. Discusses the importance of rule extraction from incomplete databases in data mining.
    Footnote
    Contribution to a special issue devoted to knowledge discovery and data mining
    Source
    Journal of the American Society for Information Science. 49(1998) no.5, S.415-422
    Type
    a
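    A minimal sketch of the Pawlak lower and upper approximations that entry 1's abstract builds on, assuming the equivalence relation is given as a partition of the universe; the objects and target concept are invented for illustration:

    ```python
    def approximations(blocks, target):
        """Pawlak rough set: blocks = equivalence classes partitioning the universe."""
        target = set(target)
        lower = [b for b in blocks if set(b) <= target]   # certainly inside
        upper = [b for b in blocks if set(b) & target]    # possibly inside
        flat = lambda bs: set().union(*bs) if bs else set()
        return flat(lower), flat(upper)

    # Toy example: six objects grouped by an attribute-based indiscernibility relation.
    blocks = [{1, 2}, {3, 4}, {5, 6}]
    lower, upper = approximations(blocks, {1, 2, 3})
    print(lower)  # {1, 2}        -- supports deterministic rules
    print(upper)  # {1, 2, 3, 4}  -- the boundary {3, 4} is where only
                  #                  probabilistic/plausible rules apply
    ```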
  2. Wong, S.K.M.; Yao, Y.Y.: ¬An information-theoretic measure of term specificity (1992) 0.01
    0.0063276635 = product of:
      0.015819158 = sum of:
        0.004767807 = weight(_text_:a in 4807) [ClassicSimilarity], result of:
          0.004767807 = score(doc=4807,freq=2.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.089176424 = fieldWeight in 4807, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4807)
        0.011051352 = product of:
          0.022102704 = sum of:
            0.022102704 = weight(_text_:information in 4807) [ClassicSimilarity], result of:
              0.022102704 = score(doc=4807,freq=8.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.27153665 = fieldWeight in 4807, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4807)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The inverse document frequency (IDF) and signal-noise ratio (S/N) approaches are term weighting schemes based on term specificity. However, the existing justifications for these methods are still somewhat inconclusive and sometimes even based on incompatible assumptions. Introduces an information-theoretic measure of term specificity. Shows that the IDF weighting scheme can be derived from the proposed approach by assuming that the frequency of occurrence of each index term is uniform within the set of documents containing the term. The information-theoretic interpretation of term specificity also establishes the relationship between the IDF and S/N methods.
    Source
    Journal of the American Society for Information Science. 43(1992) no.1, S.54-61
    Type
    a
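    A minimal sketch of the IDF weighting scheme analysed in entry 2's abstract, using one common form of the classic formula, idf(k) = log2(N / n_k); the toy corpus is invented, and the paper's information-theoretic derivation is only paraphrased in the closing comment:

    ```python
    import math

    docs = [
        {"rough", "sets", "rules"},
        {"rough", "model"},
        {"query", "model"},
        {"query", "feedback"},
    ]

    def idf(term, docs):
        n_k = sum(term in d for d in docs)  # documents containing the term
        return math.log2(len(docs) / n_k)

    print(idf("rough", docs))  # 1.0 -- term in half the documents
    print(idf("rules", docs))  # 2.0 -- rarer term, higher specificity
    # The information-theoretic reading: assuming a term occurs uniformly
    # across the documents containing it, IDF falls out as the information
    # gained from observing the term.
    ```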
  3. Wong, S.K.M.; Yao, Y.Y.: Query formulation in linear retrieval models (1990) 0.01
    0.0063011474 = product of:
      0.015752869 = sum of:
        0.009437811 = weight(_text_:a in 3571) [ClassicSimilarity], result of:
          0.009437811 = score(doc=3571,freq=6.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.17652355 = fieldWeight in 3571, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=3571)
        0.006315058 = product of:
          0.012630116 = sum of:
            0.012630116 = weight(_text_:information in 3571) [ClassicSimilarity], result of:
              0.012630116 = score(doc=3571,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.1551638 = fieldWeight in 3571, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3571)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The subject of query formulation is analysed within the framework of adaptive linear models. The study is based on the notions of user preference and an acceptable ranking strategy. A gradient descent algorithm is used to formulate the query vector by an inductive process. Presents a critical analysis of the existing relevance feedback and probabilistic approaches.
    Source
    Journal of the American Society for Information Science. 41(1990) no.5, S.334-341
    Type
    a
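    A minimal sketch of gradient-descent query formulation from pairwise preference judgements, in the spirit of the adaptive linear model in entry 3's abstract; the perceptron-style update, learning rate, and vectors are illustrative assumptions, not the paper's exact algorithm:

    ```python
    import numpy as np

    def formulate_query(doc_vecs, prefs, lr=0.1, epochs=50):
        """prefs: (i, j) pairs meaning the user prefers document i over j.
        Adjust q inductively until q.d_i > q.d_j for every stated preference."""
        q = np.zeros(doc_vecs.shape[1])
        for _ in range(epochs):
            for i, j in prefs:
                if q @ doc_vecs[i] <= q @ doc_vecs[j]:    # ranking violated
                    q += lr * (doc_vecs[i] - doc_vecs[j])  # move q toward d_i
        return q

    docs = np.array([[1.0, 0.2, 0.0],
                     [0.3, 0.9, 0.1],
                     [0.0, 0.1, 1.0]])
    q = formulate_query(docs, prefs=[(0, 1), (1, 2)])
    print(docs @ q)  # scores now rank doc0 > doc1 > doc2, an acceptable ranking
    ```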
  4. Wong, S.K.M.; Yao, Y.Y.; Salton, G.; Buckley, C.: Evaluation of an adaptive linear model (1991) 0.01
    0.0056083994 = product of:
      0.014020998 = sum of:
        0.00770594 = weight(_text_:a in 4836) [ClassicSimilarity], result of:
          0.00770594 = score(doc=4836,freq=4.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.14413087 = fieldWeight in 4836, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=4836)
        0.006315058 = product of:
          0.012630116 = sum of:
            0.012630116 = weight(_text_:information in 4836) [ClassicSimilarity], result of:
              0.012630116 = score(doc=4836,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.1551638 = fieldWeight in 4836, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4836)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Reports on the experimental evaluation of an adaptive linear model that constructs improved user query vectors from user preference judgements on a sample set of documents. The performance of this method is compared with that of the standard relevance feedback techniques. The experimental results seem to demonstrate the effectiveness of the adaptive method.
    Source
    Journal of the American Society for Information Science. 42(1991) no.10, S.723-730
    Type
    a
  5. Yao, Y.Y.: Measuring retrieval effectiveness based on user preference of documents (1995) 0.01
    0.005513504 = product of:
      0.01378376 = sum of:
        0.008258085 = weight(_text_:a in 1748) [ClassicSimilarity], result of:
          0.008258085 = score(doc=1748,freq=6.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.1544581 = fieldWeight in 1748, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1748)
        0.005525676 = product of:
          0.011051352 = sum of:
            0.011051352 = weight(_text_:information in 1748) [ClassicSimilarity], result of:
              0.011051352 = score(doc=1748,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13576832 = fieldWeight in 1748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1748)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The notion of user preference is adopted for the representation, interpretation, and measurement of the relevance or usefulness of documents. User judgements on documents may be formally described by a weak order (i.e. a user ranking) and measured using an ordinal scale. Within this framework, a new measure of system performance is suggested based on the distance between the user ranking and the system ranking. It uses only the relative order of documents and therefore conforms to the valid use of an ordinal scale for measuring relevance. It is also applicable to multilevel relevance judgements and ranked system output. The appropriateness of the proposed measure is demonstrated through an axiomatic approach. The inherent relationships between the new measure and many existing measures provide further supporting evidence.
    Source
    Journal of the American Society for Information Science. 46(1995) no.2, S.133-145
    Type
    a
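    A minimal sketch of a distance-based measure as described in entry 5's abstract: compare the system ranking against the user's weak order by counting discordant pairs; the normalization by the number of comparable pairs is an assumption, not necessarily the paper's exact measure:

    ```python
    from itertools import combinations

    def ranking_distance(user_level, system_rank):
        """user_level: doc -> relevance level (higher = preferred, ties allowed).
        system_rank: documents in system output order (best first)."""
        pos = {d: r for r, d in enumerate(system_rank)}
        discordant = comparable = 0
        for a, b in combinations(user_level, 2):
            if user_level[a] == user_level[b]:
                continue                   # ties carry no order information
            comparable += 1
            hi, lo = (a, b) if user_level[a] > user_level[b] else (b, a)
            if pos[hi] > pos[lo]:          # system put the worse document first
                discordant += 1
        return discordant / comparable if comparable else 0.0

    user = {"d1": 2, "d2": 2, "d3": 1, "d4": 0}  # multilevel judgements
    print(ranking_distance(user, ["d1", "d3", "d2", "d4"]))  # 0.2: 1 of 5 pairs swapped
    ```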