Search (5 results, page 1 of 1)

  • author_ss:"Wong, S.K.M."
  • type_ss:"a"
  1. Wong, S.K.M.: On modelling information retrieval with probabilistic inference (1995) 0.01
    
    Abstract
    Examines and extends the logical models of information retrieval in the context of probability theory, and applies these fundamental ideas to term weighting and relevance. Develops a unified framework for modelling the retrieval process with probabilistic inference, providing a common conceptual and mathematical basis for many retrieval models, such as Boolean, fuzzy-set, vector space, and conventional probabilistic models. Employs this framework to identify the underlying assumptions made by each model and analyses the inherent relationships between them. Although the treatment is primarily theoretical, practical methods for estimating the required probabilities are illustrated with simple examples.
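
    The unified framework itself is algebraic, but the flavour of probabilistic-inference ranking can be illustrated concretely: score each document by a smoothed estimate of P(q|d), the probability that the document generates the query. The sketch below is an illustrative stand-in, not the paper's own estimation method; the function names, the linear smoothing, and the mixture weight are all assumptions.

    ```python
    from collections import Counter

    def inference_score(query, doc, collection, lam=0.5):
        """Crude probabilistic-inference ranking: P(q|d) as a product of
        per-term probabilities, each a linear mixture of a within-document
        estimate and a collection-wide background estimate (illustrative)."""
        d, c = Counter(doc), Counter(collection)
        nd, nc = sum(d.values()), sum(c.values())
        score = 1.0
        for t in query:
            p_doc = d[t] / nd        # within-document estimate
            p_bg = c[t] / nc         # background (collection) estimate
            score *= (1 - lam) * p_doc + lam * p_bg
        return score

    docs = [["probabilistic", "inference", "retrieval"],
            ["boolean", "retrieval", "model"]]
    collection = [t for doc in docs for t in doc]
    ranked = sorted(docs, reverse=True,
                    key=lambda doc: inference_score(["probabilistic", "retrieval"],
                                                    doc, collection))
    ```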
    Source
    ACM Transactions on Information Systems. 13(1995) no.1, pp.38-68
    Type
    a
  2. Wong, S.K.M.; Butz, C.J.; Xiang, X.: Automated database schema design using mined data dependencies (1998) 0.01
    
    Abstract
    Data dependencies are used in database schema design to enforce the correctness of a database as well as to reduce redundant data. These dependencies are usually determined from the semantics of the attributes and are then enforced upon the relations. Describes a bottom-up procedure for discovering multivalued dependencies in observed data without knowing a priori the relationships among the attributes. The proposed algorithm is an application of a technique designed for learning conditional independencies in probabilistic reasoning. A prototype system for automated database schema design has been implemented, and experiments were carried out to demonstrate both the effectiveness and the efficiency of the method.
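
    To make "multivalued dependency" concrete: X ->-> Y holds in a relation when, for any two tuples that agree on X, exchanging their Y-values yields tuples that are also in the relation. The brute-force verifier below illustrates the property the mining procedure must detect in observed data; it is not the paper's bottom-up discovery algorithm, and all names are illustrative.

    ```python
    from itertools import product

    def satisfies_mvd(rows, X, Y):
        """Brute-force check that X ->-> Y holds in `rows` (dicts over the
        same attributes): for every pair of tuples agreeing on X, the tuple
        taking X and Y from the first and all other attributes from the
        second must also be present."""
        if not rows:
            return True
        Z = [a for a in rows[0] if a not in X and a not in Y]
        present = {tuple(sorted(r.items())) for r in rows}
        for t1, t2 in product(rows, repeat=2):
            if all(t1[a] == t2[a] for a in X):
                swap = {**{a: t1[a] for a in list(X) + list(Y)},
                        **{a: t2[a] for a in Z}}
                if tuple(sorted(swap.items())) not in present:
                    return False
        return True

    # course ->-> teacher: teachers and books vary independently per course
    rows = [{"course": "db", "teacher": "wong", "book": "ullman"},
            {"course": "db", "teacher": "wong", "book": "date"},
            {"course": "db", "teacher": "butz", "book": "ullman"},
            {"course": "db", "teacher": "butz", "book": "date"}]
    assert satisfies_mvd(rows, X=["course"], Y=["teacher"])
    ```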
    Footnote
    Contribution to a special issue devoted to knowledge discovery and data mining
    Source
    Journal of the American Society for Information Science. 49(1998) no.5, pp.455-470
    Type
    a
  3. Wong, S.K.M.; Yao, Y.Y.: An information-theoretic measure of term specificity (1992) 0.01
    
    Abstract
    The inverse document frequency (IDF) and signal-to-noise ratio (S/N) approaches are term weighting schemes based on term specificity. However, the existing justifications for these methods are still somewhat inconclusive and sometimes even based on incompatible assumptions. Introduces an information-theoretic measure of term specificity. Shows that the IDF weighting scheme can be derived from the proposed approach by assuming that the frequency of occurrence of each index term is uniform within the set of documents containing the term. The information-theoretic interpretation of term specificity also establishes the relationship between the IDF and S/N methods.
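
    The uniform-frequency derivation can be reproduced in a few lines. In the sketch below, specificity is taken as log N minus the entropy of a term's frequency distribution over the documents containing it; when that distribution is uniform over df documents the entropy equals log(df) and the measure collapses to IDF = log(N/df). The exact formula here is an illustration in the spirit of the abstract, not necessarily the paper's definition.

    ```python
    import math

    def idf(N, df):
        """Inverse document frequency: log(N / df)."""
        return math.log(N / df)

    def entropy_specificity(N, term_counts):
        """Illustrative information-theoretic specificity: log N minus the
        entropy of the term's within-document frequency distribution over
        the documents that contain it."""
        total = sum(term_counts)
        H = -sum((c / total) * math.log(c / total) for c in term_counts)
        return math.log(N) - H

    N = 1000
    print(idf(N, 10))                        # 4.605...
    print(entropy_specificity(N, [5] * 10))  # uniform over 10 docs -> equals IDF
    print(entropy_specificity(N, [41] + [1] * 9))  # skewed -> more specific
    ```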
    Source
    Journal of the American Society for Information Science. 43(1992) no.1, pp.54-61
    Type
    a
  4. Wong, S.K.M.; Yao, Y.Y.: Query formulation in linear retrieval models (1990) 0.01
    
    Abstract
    Analyses the subject of query formulation within the framework of adaptive linear models. The study is based on the notions of user preference and an acceptable ranking strategy: a gradient descent algorithm formulates the query vector by an inductive process. Presents a critical analysis of the existing relevance feedback and probabilistic approaches.
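
    A minimal sketch of such an inductive construction: whenever the current query ranks a less-preferred document at or above a preferred one, the query vector is moved toward the preferred document (a perceptron-style error-correction step, i.e. gradient descent on the ranking error). The learning rate, stopping rule, and dense-vector representation are assumptions for illustration, not the paper's specification.

    ```python
    import numpy as np

    def formulate_query(preference_pairs, dim, lr=1.0, max_epochs=100):
        """Build a query vector q with q . d1 > q . d2 for each user
        preference (d1 preferred to d2), by iterative error correction."""
        q = np.zeros(dim)
        for _ in range(max_epochs):
            violated = False
            for d1, d2 in preference_pairs:
                if q @ d1 <= q @ d2:      # preference not yet respected
                    q += lr * (d1 - d2)   # step toward the preferred document
                    violated = True
            if not violated:              # an acceptable ranking is reached
                break
        return q
    ```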
    Source
    Journal of the American Society for Information Science. 41(1990) no.5, pp.334-341
    Type
    a
  5. Wong, S.K.M.; Yao, Y.Y.; Salton, G.; Buckley, C.: Evaluation of an adaptive linear model (1991) 0.01
    
    Abstract
    Reports on the experimental evaluation of an adaptive linear model that constructs improved user query vectors from user preference judgements on a sample set of documents. The performance of this method is compared with that of the standard relevance feedback techniques. The experimental results seem to demonstrate the effectiveness of the adaptive method.
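
    For context, the "standard relevance feedback" baseline in such comparisons is typically a Rocchio-style update; the sketch below assumes that baseline, with conventional default coefficients rather than values taken from the paper.

    ```python
    import numpy as np

    def rocchio(q, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
        """Classic Rocchio feedback: move the query toward the centroid of
        judged-relevant documents and away from the non-relevant centroid."""
        q_new = alpha * np.asarray(q, dtype=float)
        if relevant:
            q_new += beta * np.mean(relevant, axis=0)
        if nonrelevant:
            q_new -= gamma * np.mean(nonrelevant, axis=0)
        return q_new
    ```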
    Source
    Journal of the American Society for Information Science. 42(1991) no.10, pp.723-730
    Type
    a