Search (3 results, page 1 of 1)

  • author_ss:"Zhou, Y."
  1. Qin, J.; Zhou, Y.; Chau, M.; Chen, H.: Multilingual Web retrieval : an experiment in English-Chinese business intelligence (2006) 0.01
    0.007134348 = product of:
      0.01783587 = sum of:
        0.009010308 = weight(_text_:a in 5054) [ClassicSimilarity], result of:
          0.009010308 = score(doc=5054,freq=14.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.1685276 = fieldWeight in 5054, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5054)
        0.008825562 = product of:
          0.017651124 = sum of:
            0.017651124 = weight(_text_:information in 5054) [ClassicSimilarity], result of:
              0.017651124 = score(doc=5054,freq=10.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.21684799 = fieldWeight in 5054, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5054)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
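    The figures above are Lucene's ClassicSimilarity (TF-IDF) explanation: each matching term contributes queryWeight × fieldWeight, where queryWeight = idf × queryNorm and fieldWeight = sqrt(termFreq) × idf × fieldNorm, and the summed term scores are scaled by coord (matching clauses / total clauses). A minimal Python sketch that reproduces the numbers for the "_text_:a" clause of this record, using only the constants shown above:

      import math

      # Constants copied from the ClassicSimilarity explanation above
      idf = 1.153047          # 1 + ln(maxDocs / (docFreq + 1)) = 1 + ln(44218 / 37943)
      query_norm = 0.046368346
      field_norm = 0.0390625  # fieldNorm(doc=5054)
      term_freq = 14.0

      query_weight = idf * query_norm                            # ~0.053464882
      field_weight = math.sqrt(term_freq) * idf * field_norm     # ~0.1685276
      term_score = query_weight * field_weight                   # ~0.009010308

      # The document score applies coord(2/5): 2 of 5 query clauses matched
      term_scores = [0.009010308, 0.008825562]                   # the two clause scores above
      doc_score = sum(term_scores) * 2 / 5                       # ~0.007134348
      print(term_score, doc_score)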
    
    Abstract
    As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem. Cross-language information retrieval has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIR techniques for use in the business domain. A dictionary-based approach was adopted that combines phrasal translation, co-occurrence analysis, and pre- and post-translation query expansion. The portal was evaluated by domain experts using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-by-word translation. When used together, pre- and post-translation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.
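    The dictionary-based translation with co-occurrence disambiguation described above can be sketched roughly as follows. The toy bilingual dictionary, the co-occurrence counts, and the function name are illustrative assumptions for exposition, not the authors' actual portal code:

      from itertools import product

      # Toy English-Chinese dictionary: each query term maps to candidate translations
      # (illustrative placeholder entries, not the portal's real lexicon).
      DICTIONARY = {
          "business":     ["商业", "企业"],
          "intelligence": ["情报", "智能"],
      }

      # Toy co-occurrence counts of candidate translations in a target-language corpus;
      # the paper uses such statistics to pick coherent phrasal translations.
      COOCCURRENCE = {
          ("商业", "情报"): 120,
          ("商业", "智能"): 15,
          ("企业", "情报"): 40,
          ("企业", "智能"): 8,
      }

      def translate_query(terms):
          """Choose the combination of candidate translations whose pairwise
          co-occurrence score is highest (a stand-in for co-occurrence analysis)."""
          candidates = [DICTIONARY.get(t, [t]) for t in terms]
          best, best_score = None, -1
          for combo in product(*candidates):
              score = sum(COOCCURRENCE.get((a, b), 0)
                          for i, a in enumerate(combo) for b in combo[i + 1:])
              if score > best_score:
                  best, best_score = combo, score
          return list(best)

      print(translate_query(["business", "intelligence"]))   # ['商业', '情报']

    Pre- and post-translation query expansion would then add related terms to the query before and after this translation step, for example via pseudo-relevance feedback.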
    Footnote
    Contribution to a special topic section on multilingual information systems
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.5, pp.671-683
    Type
    a
  2. Chau, M.; Wong, C.H.; Zhou, Y.; Qin, J.; Chen, H.: Evaluating the use of search engine development tools in IT education (2010) 0.00
    0.004592163 = product of:
      0.011480408 = sum of:
        0.005898632 = weight(_text_:a in 3325) [ClassicSimilarity], result of:
          0.005898632 = score(doc=3325,freq=6.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.11032722 = fieldWeight in 3325, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3325)
        0.0055817757 = product of:
          0.011163551 = sum of:
            0.011163551 = weight(_text_:information in 3325) [ClassicSimilarity], result of:
              0.011163551 = score(doc=3325,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13714671 = fieldWeight in 3325, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3325)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    It is important for education in computer science and information systems to keep up to date with the latest developments in technology. With the rapid development of the Internet and the Web, many schools have included Internet-related technologies, such as Web search engines and e-commerce, as part of their curricula. Previous research has shown that it is effective to use search engine development tools to facilitate students' learning. However, the effectiveness of these tools in the classroom has not been evaluated. In this article, we review the design of three search engine development tools, SpidersRUs, Greenstone, and Alkaline, followed by an evaluation study that compared the three tools in the classroom. In the study, 33 students were divided into 13 groups, and each group used the three tools to develop three independent search engines in a class project. Our evaluation results showed that SpidersRUs performed better than the other two tools in terms of overall satisfaction and the level of knowledge gained when using the tools for a class project on Internet applications development.
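    The kind of small search engine such class projects produce (fetch a set of pages, build an index, answer keyword queries) can be illustrated with a toy inverted index; this sketch only shows the general workflow and is not code from SpidersRUs, Greenstone, or Alkaline:

      import re
      from collections import defaultdict

      # Toy "crawled" documents standing in for fetched web pages.
      PAGES = {
          "page1.html": "search engines index web pages for fast keyword retrieval",
          "page2.html": "students build a small search engine as a class project",
          "page3.html": "e-commerce sites rely on web search and recommendation",
      }

      def tokenize(text):
          return re.findall(r"[a-z0-9]+", text.lower())

      # Inverted index: term -> set of documents containing it
      index = defaultdict(set)
      for url, text in PAGES.items():
          for term in tokenize(text):
              index[term].add(url)

      def search(query):
          """Return documents containing every query term (boolean AND)."""
          terms = tokenize(query)
          results = index.get(terms[0], set()).copy() if terms else set()
          for term in terms[1:]:
              results &= index.get(term, set())
          return results

      print(search("web search"))   # {'page1.html', 'page3.html'}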
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.2, pp.288-299
    Type
    a
  3. Na, J.-C.; Sui, H.; Khoo, C.; Chan, S.; Zhou, Y.: Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews (2004) 0.00
    0.004303226 = product of:
      0.010758064 = sum of:
        0.0068111527 = weight(_text_:a in 2624) [ClassicSimilarity], result of:
          0.0068111527 = score(doc=2624,freq=8.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.12739488 = fieldWeight in 2624, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2624)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 2624) [ClassicSimilarity], result of:
              0.007893822 = score(doc=2624,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 2624, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2624)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This paper reports a study in automatic sentiment classification, i.e., automatically classifying documents as expressing positive or negative sentiments/opinions. The study investigates the effectiveness of using SVM (Support Vector Machine) and various text features to classify product reviews into recommended (positive sentiment) and not recommended (negative sentiment). Compared with traditional topical classification, it was hypothesized that syntactic and semantic processing of text would be more important for sentiment classification. In the first part of this study, several different approaches were investigated: unigrams (individual words), selected words (such as verbs, adjectives, and adverbs), and words labelled with part-of-speech tags. A sample of 1,800 product reviews was retrieved from Review Centre (www.reviewcentre.com) for the study; 1,200 reviews were used for training and 600 for testing. Using SVM, the baseline unigram approach obtained an accuracy rate of around 76%. The use of selected words obtained a marginally better result of 77.33%. Error analysis suggests various approaches for improving classification accuracy: use of negation phrases, making inferences from superficial words, and handling comments on parts. The second part of the study, which is in progress, investigates the use of negation phrases through simple linguistic processing to improve classification accuracy. This approach increased the accuracy rate to 79.33%.
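    A rough sketch of the unigram-plus-negation pipeline described above, using scikit-learn's LinearSVC as a stand-in for the paper's SVM setup; the tiny example reviews, the labels, and the NOT_-prefix negation rule are illustrative assumptions:

      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.pipeline import make_pipeline
      from sklearn.svm import LinearSVC

      NEGATIONS = {"not", "no", "never"}

      def mark_negation(text, window=3):
          """Simple linguistic processing: prefix the few words after a negation
          word with NOT_, so 'not good' and 'good' become distinct features."""
          out, counter = [], 0
          for tok in text.lower().split():
              if tok in NEGATIONS:
                  counter = window
                  out.append(tok)
              elif counter > 0:
                  out.append("NOT_" + tok)
                  counter -= 1
              else:
                  out.append(tok)
          return " ".join(out)

      # Tiny illustrative training set (the paper used 1,200 training and 600 test reviews)
      reviews = [
          "great product works well and battery lasts long",
          "excellent value highly recommended",
          "poor quality broke after a week",
          "not good at all would not recommend",
      ]
      labels = ["pos", "pos", "neg", "neg"]

      model = make_pipeline(CountVectorizer(), LinearSVC())
      model.fit([mark_negation(r) for r in reviews], labels)

      print(model.predict([mark_negation("poor quality not recommended")]))  # likely ['neg']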
    Source
    Knowledge organization and the global information society: Proceedings of the 8th International ISKO Conference 13-16 July 2004, London, UK. Ed.: I.C. McIlwaine
    Type
    a