Search (2 results, page 1 of 1)

  • × author_ss:"Chau, M."
  • × theme_ss:"Internet"
  • × year_i:[2000 TO 2010}
  1. Chau, M.; Fang, X.; Rittman, C.C.: Web searching in Chinese : a study of a search engine in Hong Kong (2007) 0.01
    0.012546628 = product of:
      0.037639882 = sum of:
        0.037639882 = weight(_text_:resources in 336) [ClassicSimilarity], result of:
          0.037639882 = score(doc=336,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.20165458 = fieldWeight in 336, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0390625 = fieldNorm(doc=336)
      0.33333334 = coord(1/3)
    
    Abstract
    The number of non-English resources has been increasing rapidly on the Web. Although many studies have been conducted on the query logs in search engines that are primarily English-based (e.g., Excite and AltaVista), only a few of them have studied the information-seeking behavior on the Web in non-English languages. In this article, we report the analysis of the search-query logs of a search engine that focused on Chinese. Three months of search-query logs of Timway, a search engine based in Hong Kong, were collected and analyzed. Metrics on sessions, queries, search topics, and character usage are reported. N-gram analysis also has been applied to perform character-based analysis. Our analysis suggests that some characteristics identified in the search log, such as search topics and the mean number of queries per sessions, are similar to those in English search engines; however, other characteristics, such as the use of operators in query formulation, are significantly different. The analysis also shows that only a very small number of unique Chinese characters are used in search queries. We believe the findings from this study have provided some insights into further research in non-English Web searching.
  2. Chen, H.; Chau, M.: Web mining : machine learning for Web applications (2003) 0.01
    0.0064184438 = product of:
      0.01925533 = sum of:
        0.01925533 = product of:
          0.03851066 = sum of:
            0.03851066 = weight(_text_:management in 4242) [ClassicSimilarity], result of:
              0.03851066 = score(doc=4242,freq=2.0), product of:
                0.17235184 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.051133685 = queryNorm
                0.22344214 = fieldWeight in 4242, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4242)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    With more than two billion pages created by millions of Web page authors and organizations, the World Wide Web is a tremendously rich knowledge base. The knowledge comes not only from the content of the pages themselves, but also from the unique characteristics of the Web, such as its hyperlink structure and its diversity of content and languages. Analysis of these characteristics often reveals interesting patterns and new knowledge. Such knowledge can be used to improve users' efficiency and effectiveness in searching for information an the Web, and also for applications unrelated to the Web, such as support for decision making or business management. The Web's size and its unstructured and dynamic content, as well as its multilingual nature, make the extraction of useful knowledge a challenging research problem. Furthermore, the Web generates a large amount of data in other formats that contain valuable information. For example, Web server logs' information about user access patterns can be used for information personalization or improving Web page design.