Search (8 results, page 1 of 1)

  • author_ss:"Hui, S.C."
  1. Goh, A.; Hui, S.C.: TES: a text extraction system (1996) 0.03
    0.026701616 = product of:
      0.053403232 = sum of:
        0.053403232 = sum of:
          0.012533023 = weight(_text_:a in 6599) [ClassicSimilarity], result of:
            0.012533023 = score(doc=6599,freq=16.0), product of:
              0.043477926 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.037706986 = queryNorm
              0.28826174 = fieldWeight in 6599, product of:
                4.0 = tf(freq=16.0), with freq of:
                  16.0 = termFreq=16.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0625 = fieldNorm(doc=6599)
          0.04087021 = weight(_text_:22 in 6599) [ClassicSimilarity], result of:
            0.04087021 = score(doc=6599,freq=2.0), product of:
              0.13204344 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.037706986 = queryNorm
              0.30952093 = fieldWeight in 6599, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=6599)
      0.5 = coord(1/2)
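
    The indented breakdown above is Lucene's explain() output for its classic TF-IDF similarity: each leaf term weight is queryWeight x fieldWeight, where queryWeight = idf x queryNorm and fieldWeight = tf x idf x fieldNorm, and the displayed document score multiplies the summed term weights by the coord factor (here 1/2). A minimal sketch that reproduces the two per-term weights shown for result 1, assuming the standard ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)):

      import math

      # Per-term weight under Lucene ClassicSimilarity (TF-IDF), assuming the
      # standard formulas; the figures are copied from the explanation for doc 6599.
      def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))
          tf = math.sqrt(freq)
          query_weight = idf * query_norm        # e.g. 1.153047 * 0.037706986
          field_weight = tf * idf * field_norm   # e.g. 4.0 * 1.153047 * 0.0625
          return query_weight * field_weight

      query_norm = 0.037706986
      w_a  = classic_term_score(16.0, 37942, 44218, query_norm, 0.0625)  # ~0.012533 ("a")
      w_22 = classic_term_score(2.0, 3622, 44218, query_norm, 0.0625)    # ~0.040870 ("22")
      print(w_a, w_22, 0.5 * (w_a + w_22))  # coord(1/2) yields the final ~0.02670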
    
    Abstract
    With the onset of the information explosion arising from digital libraries and access to a wealth of information through the Internet, the need to efficiently determine the relevance of a document becomes even more urgent. Describes a text extraction system (TES), which retrieves a set of sentences from a document to form an indicative abstract. Such an automated process enables information to be filtered more quickly. Discusses the combination of various text extraction techniques. Compares results with manually produced abstracts.
    Date
    26. 2.1997 10:22:43
    Type
    a
  2. He, Y.; Hui, S.C.: Mining a web database for author cocitation analysis (2002) 0.00
    0.0027415988 = product of:
      0.0054831975 = sum of:
        0.0054831975 = product of:
          0.010966395 = sum of:
            0.010966395 = weight(_text_:a in 2584) [ClassicSimilarity], result of:
              0.010966395 = score(doc=2584,freq=4.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.25222903 = fieldWeight in 2584, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2584)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  3. Foo, S.; Hui, S.C.; Lim, H.K.; Hui, L.: Automated thesaurus for enhanced Chinese text retrieval (2000) 0.00
    0.0019582848 = product of:
      0.0039165695 = sum of:
        0.0039165695 = product of:
          0.007833139 = sum of:
            0.007833139 = weight(_text_:a in 759) [ClassicSimilarity], result of:
              0.007833139 = score(doc=759,freq=16.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.18016359 = fieldWeight in 759, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=759)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Asian languages such as Japanese, Korean and, in particular, Chinese are beginning to gain popularity in the information retrieval (IR) domain. The quality of IR systems has traditionally been judged by the system's retrieval effectiveness which, in turn, is commonly measured by recall and precision. This paper proposes and describes a process for generating an automatic Chinese thesaurus that can be used to provide related terms to a user's queries to enhance retrieval effectiveness. In the absence of existing automatic Chinese thesauri, techniques used in English thesaurus generation have been evaluated and adapted to generate a Chinese equivalent. The automatic thesaurus is generated by computing the co-occurrence values between domain-specific terms found in a document collection. These co-occurrence values are in turn derived from the term and document frequencies of the terms. A set of experiments was subsequently carried out on a document test set to evaluate the applicability of the thesaurus. Results obtained from these experiments confirmed that such an automatically generated thesaurus is able to improve the retrieval effectiveness of a Chinese IR system.
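
    The abstract does not spell out the exact weighting behind the co-occurrence values, but a minimal sketch of a co-occurrence-based related-term lookup along these lines might look as follows; the TF-IDF weighting and cosine similarity are illustrative assumptions, and the documents are assumed to be already segmented into terms:

      from collections import defaultdict
      from math import log, sqrt

      def build_cooccurrence_thesaurus(docs):
          # docs: list of token lists (already segmented, e.g. Chinese terms).
          # Each term gets a TF-IDF-weighted document vector; "related" terms are
          # those with the most similar vectors. The paper's own weighting may differ.
          n_docs = len(docs)
          df = defaultdict(int)
          for doc in docs:
              for t in set(doc):
                  df[t] += 1
          vectors = defaultdict(dict)              # term -> {doc_id: weight}
          for d, doc in enumerate(docs):
              tf = defaultdict(int)
              for t in doc:
                  tf[t] += 1
              for t, f in tf.items():
                  vectors[t][d] = f * log(1.0 + n_docs / df[t])
          def related(term, k=5):
              v = vectors.get(term, {})
              norm_v = sqrt(sum(w * w for w in v.values()))
              scores = {}
              for other, u in vectors.items():
                  if other == term:
                      continue
                  dot = sum(w * u.get(d, 0.0) for d, w in v.items())
                  norm_u = sqrt(sum(w * w for w in u.values()))
                  if dot and norm_u:
                      scores[other] = dot / (norm_v * norm_u)
              return sorted(scores, key=scores.get, reverse=True)[:k]
          return related

    In a real run the token lists would come from a Chinese segmenter and be restricted to domain-specific terms, as the abstract describes.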
    Type
    a
  4. Goh, A.; Hui, S.C.; Chan, S.K.: A text extraction system for news reports (1996) 0.00
    0.0018318077 = product of:
      0.0036636153 = sum of:
        0.0036636153 = product of:
          0.0073272306 = sum of:
            0.0073272306 = weight(_text_:a in 6601) [ClassicSimilarity], result of:
              0.0073272306 = score(doc=6601,freq=14.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.1685276 = fieldWeight in 6601, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6601)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Describes the design and implementation of a text extraction tool, NEWS_EXT, which automatically produces summaries from news reports by extracting sentences to form indicative abstracts. Selection of sentences is based on sentence importance, measured by means of sentence scoring or simple linguistic analysis of sentence structure. Tests were conducted on 4 approaches for the functioning of the NEWS_EXT system: extraction by keyword frequency; extraction by title keywords; extraction by location; and extraction by indicative phrase. Reports the results of a study comparing the output of NEWS_EXT with manually produced extracts, using relevance as the criterion for effectiveness. 48 newspaper articles were assessed (The Straits Times, International Herald Tribune, Asian Wall Street Journal, and Financial Times). The evaluation was conducted in 2 stages: stage 1 involving abstracts produced manually by 2 human experts; stage 2 involving the generation of abstracts using NEWS_EXT. Results of each of the 4 approaches were compared with the human-produced abstracts; the title and location approaches were found to give the best results for both local and foreign news. Reports plans to refine and enhance NEWS_EXT and incorporate it as a module within a larger newspaper clipping system.
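
    As a rough illustration of the four approaches named above (the exact scoring, weights and indicative-phrase list used by NEWS_EXT are not given in the abstract, so those below are assumptions), an extractive abstracter of this kind can be sketched as:

      import re
      from collections import Counter

      STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "for", "on", "that"}

      def extract_summary(title, text, n_sentences=3):
          # Score each sentence by keyword frequency, title-keyword overlap,
          # location and indicative phrases, then return the top sentences in
          # their original order. Weights are illustrative, not from the paper.
          sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
          words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
          keyword_freq = Counter(words)
          title_words = set(re.findall(r"[a-z']+", title.lower())) - STOPWORDS
          indicative = ("in conclusion", "in summary", "results show", "this report")
          scored = []
          for i, s in enumerate(sentences):
              tokens = [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOPWORDS]
              score = sum(keyword_freq[w] for w in tokens)                  # keyword frequency
              score += 2 * sum(1 for w in tokens if w in title_words)       # title keywords
              score += 3 if i == 0 or i == len(sentences) - 1 else 0        # location
              score += 3 if any(p in s.lower() for p in indicative) else 0  # indicative phrase
              scored.append((score, i, s))
          top = sorted(sorted(scored, reverse=True)[:n_sentences], key=lambda x: x[1])
          return " ".join(s for _, _, s in top)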
    Type
    a
  5. He, Y.; Hui, S.C.: PubSearch : a Web citation-based retrieval system (2001) 0.00
    0.0016616598 = product of:
      0.0033233196 = sum of:
        0.0033233196 = product of:
          0.006646639 = sum of:
            0.006646639 = weight(_text_:a in 4806) [ClassicSimilarity], result of:
              0.006646639 = score(doc=4806,freq=8.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.15287387 = fieldWeight in 4806, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4806)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Many scientific publications are now available on the World Wide Web for researchers to share research findings. However, they tend to be poorly organised, making the search of relevant publications difficult and time-consuming. Most existing search engines are ineffective in searching these publications, as they do not index Web publications that normally appear in PDF (portable document format) or PostScript formats. Proposes a Web citation-based retrieval system, known as PubSearch, for the retrieval of Web publications. PubSearch indexes Web publications based on citation indices and stores them into a Web Citation Database. The Web Citation Database is then mined to support publication retrieval. Apart from supporting the traditional cited reference search, PubSearch also provides document clustering search and author clustering search. Document clustering groups related publications into clusters, while author clustering categorizes authors into different research areas based on author co-citation analysis.
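
    The author-clustering search rests on author co-citation analysis: two authors are considered related to the degree that later publications cite them together. A minimal sketch of the co-citation counting step only (the Web Citation Database and the clustering PubSearch runs on top of this matrix are not reproduced, and the sample data are invented):

      from collections import defaultdict
      from itertools import combinations

      def author_cocitation_counts(reference_lists):
          # reference_lists: for each citing paper, the set of authors it cites.
          # Returns a symmetric co-citation count matrix as nested dicts.
          counts = defaultdict(lambda: defaultdict(int))
          for cited_authors in reference_lists:
              for a, b in combinations(sorted(set(cited_authors)), 2):
                  counts[a][b] += 1
                  counts[b][a] += 1
          return counts

      # Invented sample data, not taken from the paper:
      papers = [{"Hui, S.C.", "He, Y.", "Salton, G."},
                {"Salton, G.", "He, Y."},
                {"Hui, S.C.", "He, Y."}]
      counts = author_cocitation_counts(papers)
      print(counts["He, Y."]["Hui, S.C."])  # 2: cited together by two papers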
    Type
    a
  6. Tho, Q.T.; Hui, S.C.; Fong, A.C.M.: A citation-based document retrieval system for finding research expertise (2007) 0.00
    0.0016616598 = product of:
      0.0033233196 = sum of:
        0.0033233196 = product of:
          0.006646639 = sum of:
            0.006646639 = weight(_text_:a in 956) [ClassicSimilarity], result of:
              0.006646639 = score(doc=956,freq=8.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.15287387 = fieldWeight in 956, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=956)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Current citation-based document retrieval systems generally offer only limited search facilities, such as author search. In order to facilitate more advanced search functions, we have developed a significantly improved system that employs two novel techniques: Context-based Cluster Analysis (CCA) and Context-based Ontology Generation frAmework (COGA). CCA aims to extract relevant information from clusters originally obtained from disparate clustering methods by building relationships between them. The built relationships are then represented as a formal context using the Formal Concept Analysis (FCA) technique. COGA aims to generate an ontology from the cluster relationships built by CCA. By combining these two techniques, we are able to perform ontology learning from a citation database using clustering results. We have implemented the improved system and have demonstrated its use for finding research domain expertise. We have also conducted a performance evaluation of the system, and the results are encouraging.
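
    The abstract does not give construction details for the formal context, but the FCA machinery it refers to is standard: a formal context is a binary incidence relation between objects and attributes, and a formal concept is a pair of sets closed under the two derivation operators. A small sketch of those operators, using documents as objects and cluster labels from two hypothetical clusterings as attributes (an assumption for illustration; CCA's actual relationship building is more involved):

      def derive_objects(context, attrs):
          # Objects possessing every attribute in attrs (the FCA prime operator).
          return {o for o, a in context.items() if attrs <= a}

      def derive_attrs(context, objs):
          # Attributes shared by every object in objs.
          if not objs:
              return set().union(*context.values()) if context else set()
          common = None
          for o in objs:
              common = set(context[o]) if common is None else common & context[o]
          return common

      # Hypothetical context: documents x cluster labels from two clusterings.
      context = {
          "doc1": {"kw:ontology", "cocit:A"},
          "doc2": {"kw:ontology", "cocit:A"},
          "doc3": {"kw:retrieval", "cocit:B"},
      }
      objs = derive_objects(context, {"kw:ontology"})
      attrs = derive_attrs(context, objs)
      print(objs, attrs)  # {doc1, doc2} and {kw:ontology, cocit:A} -- a formal concept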
    Type
    a
  7. Fang, L.; Tuan, L.A.; Hui, S.C.; Wu, L.: Syntactic based approach for grammar question retrieval (2018) 0.00
    0.0015481601 = product of:
      0.0030963202 = sum of:
        0.0030963202 = product of:
          0.0061926404 = sum of:
            0.0061926404 = weight(_text_:a in 5086) [ClassicSimilarity], result of:
              0.0061926404 = score(doc=5086,freq=10.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.14243183 = fieldWeight in 5086, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5086)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    With the popularity of online educational platforms, English learners can learn and practice no matter where they are and what they are doing. English grammar is one of the important components in learning English. Learning English grammar effectively requires students to practice questions that target focused grammar knowledge. In this paper, we study the novel problem of retrieving English grammar questions with a similar grammatical focus. Since grammatical-focus similarity differs from textual similarity or sentence syntactic similarity, existing approaches cannot be applied directly to our problem. To address this problem, we propose a syntactic-based approach for English grammar question retrieval which can effectively retrieve related grammar questions with a similar grammatical focus. In the proposed approach, we first introduce a new syntactic tree, namely the parse-key tree, to capture an English grammar question's grammatical focus. Next, we propose two kernel functions, namely the relaxed tree kernel and the part-of-speech order kernel, to compute the similarity between the parse-key trees of the query and of the grammar questions in the collection. The retrieved grammar questions are then ranked according to the similarity between the parse-key trees. In addition, if a query is submitted together with answer choices, conceptual similarity and textual similarity are also incorporated to further improve the retrieval accuracy. The performance results show that our proposed approach outperforms state-of-the-art methods based on statistical analysis and syntactic analysis.
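
    The relaxed tree kernel and the part-of-speech order kernel are specific to this paper and are not reproduced here; as a generic point of reference, the classic Collins-Duffy subset-tree kernel, which likewise scores two parse trees by the tree fragments they share, can be sketched as follows over nested-tuple trees (the example trees are invented):

      def production(node):
          # A node's label plus the labels of its children (its CFG production).
          return (node[0],) + tuple(c[0] if isinstance(c, tuple) else c for c in node[1:])

      def nodes(tree):
          # All internal nodes of a nested-tuple parse tree.
          out = [tree]
          for c in tree[1:]:
              if isinstance(c, tuple):
                  out.extend(nodes(c))
          return out

      def delta(n1, n2, lam=0.5):
          # Shared fragments rooted at n1 and n2, decayed by lam.
          if production(n1) != production(n2):
              return 0.0
          kids1 = [c for c in n1[1:] if isinstance(c, tuple)]
          kids2 = [c for c in n2[1:] if isinstance(c, tuple)]
          if not kids1:                      # matching pre-terminal
              return lam
          result = lam
          for a, b in zip(kids1, kids2):
              result *= 1.0 + delta(a, b, lam)
          return result

      def tree_kernel(t1, t2, lam=0.5):
          return sum(delta(a, b, lam) for a in nodes(t1) for b in nodes(t2))

      t1 = ("S", ("NP", ("PRP", "He")), ("VP", ("VBZ", "runs")))
      t2 = ("S", ("NP", ("PRP", "She")), ("VP", ("VBZ", "runs")))
      print(tree_kernel(t1, t2))  # larger when the trees share more structure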
    Type
    a
  8. Hui, S.C.; Lau, K.L.: An application of neural networks in document retrieval (1997) 0.00
    0.0011077732 = product of:
      0.0022155463 = sum of:
        0.0022155463 = product of:
          0.0044310926 = sum of:
            0.0044310926 = weight(_text_:a in 2287) [ClassicSimilarity], result of:
              0.0044310926 = score(doc=2287,freq=2.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.10191591 = fieldWeight in 2287, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2287)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a