Search (34 results, page 1 of 2)

  • Filter: author_ss:"Chen, H."
  1. Chung, W.; Chen, H.: Browsing the underdeveloped Web : an experiment on the Arabic Medical Web Directory (2009) 0.04
    Abstract
    While the Web has grown significantly in recent years, some portions of the Web remain largely underdeveloped, as shown by a lack of high-quality content and functionality. An example is the Arabic Web, in which a lack of well-structured Web directories limits users' ability to browse for Arabic resources. In this research, we proposed an approach to building Web directories for the underdeveloped Web and developed a proof-of-concept prototype called the Arabic Medical Web Directory (AMedDir) that supports browsing of over 5,000 Arabic medical Web sites and pages organized in a hierarchical structure. We conducted an experiment involving Arab participants and found that the AMedDir significantly outperformed two benchmark Arabic Web directories in terms of browsing effectiveness, efficiency, information quality, and user satisfaction. Participants expressed a strong preference for the AMedDir and provided many positive comments. This research thus contributes to developing a useful Web directory for organizing the information in the Arabic medical domain and to a better understanding of how to support browsing on the underdeveloped Web.
    Date
    22. 3.2009 17:57:50
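The relevance figure at the end of each entry is a Lucene ClassicSimilarity score. A minimal sketch of how the 0.04 for entry 1 is assembled, using the tf-idf factors Lucene reported for this record (query-term frequencies, inverse document frequencies, query norm, and field-length norm); the structure is standard ClassicSimilarity scoring, nothing specific to these papers:

```python
import math

# Lucene ClassicSimilarity: score = coord * sum over matching terms of
#   (idf * queryNorm) * (sqrt(tf) * idf * fieldNorm)
QUERY_NORM = 0.05227703
FIELD_NORM = 0.046875          # encodes the length of the matched field
COORD = 0.5                    # coord(1/2) as reported: 1 of 2 query clauses matched

terms = {
    # term: (term frequency in the record, inverse document frequency)
    "research": (4.0, 2.8529835),
    "22":       (2.0, 3.5018296),
}

score = 0.0
for freq, idf in terms.values():
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    score += query_weight * field_weight

print(f"{COORD * score:.8f}")  # 0.04119421, the 0.04 shown above
```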
  2. Hu, D.; Kaza, S.; Chen, H.: Identifying significant facilitators of dark network evolution (2009) 0.03
    Abstract
    Social networks evolve over time with the addition and removal of nodes and links to survive and thrive in their environments. Previous studies have shown that the link-formation process in such networks is influenced by a set of facilitators. However, there have been few empirical evaluations to determine the important facilitators. In a research partnership with law enforcement agencies, we used dynamic social-network analysis methods to examine several plausible facilitators of co-offending relationships in a large-scale narcotics network consisting of individuals and vehicles. Multivariate Cox regression and a two-proportion z-test on cyclic and focal closures of the network showed that mutual acquaintance and vehicle affiliations were significant facilitators for the network under study. We also found that homophily with respect to age, race, and gender was not a good predictor of future link formation in these networks. Moreover, we examined the social causes and policy implications of the significance or insignificance of various facilitators, including common jails, on future co-offending. These findings provide important insights into the link-formation processes and the resilience of social networks. In addition, they can be used to aid in the prediction of future links. The methods described can also help in understanding the driving forces behind the formation and evolution of social networks facilitated by mobile and Web technologies.
    Date
    22. 3.2009 18:50:30
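A minimal sketch of the two-proportion z-test mentioned in the abstract above, comparing closure rates between dyads with and without a hypothesized facilitator; the counts are invented, since the study's law-enforcement data are not public:

```python
import math

def two_proportion_z(hits1, n1, hits2, n2):
    """Two-proportion z-test using a pooled proportion estimate."""
    p1, p2 = hits1 / n1, hits2 / n2
    pooled = (hits1 + hits2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Do dyads sharing a mutual acquaintance form co-offending links more often?
z = two_proportion_z(hits1=120, n1=1000,   # closures with a mutual acquaintance
                     hits2=40,  n2=1000)   # closures without one
print(f"z = {z:.2f}")                      # |z| > 1.96 -> significant at 5%
```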
  3. Dang, Y.; Zhang, Y.; Chen, H.; Hu, P.J.-H.; Brown, S.A.; Larson, C.: Arizona Literature Mapper : an integrated approach to monitor and analyze global bioterrorism research literature (2009) 0.02
    Abstract
    Biomedical research is critical to biodefense, which is drawing increasing attention from governments globally as well as from various research communities. The U.S. government has been closely monitoring and regulating biomedical research activities, particularly those studying or involving bioterrorism agents or diseases. Effective surveillance requires comprehensive understanding of extant biomedical research and timely detection of new developments or emerging trends. The rapid knowledge expansion, technical breakthroughs, and spiraling collaboration networks demand greater support for literature search and sharing, which cannot be effectively supported by conventional literature search mechanisms or systems. In this study, we propose an integrated approach that combines advanced techniques for content analysis, network analysis, and information visualization. We design and implement Arizona Literature Mapper, a Web-based portal that allows users to gain timely, comprehensive understanding of bioterrorism research, including leading scientists, research groups, and institutions, as well as insights about current mainstream interests or emerging trends. We conduct two user studies to evaluate Arizona Literature Mapper and include a well-known system for benchmarking purposes. According to our results, Arizona Literature Mapper is significantly more effective for supporting users' search of bioterrorism publications than PubMed. Users consider Arizona Literature Mapper more useful and easier to use than PubMed. Users are also more satisfied with Arizona Literature Mapper and show stronger intentions to use it in the future. Assessments of Arizona Literature Mapper's analysis functions are also positive, as our subjects consider them useful, easy to use, and satisfactory. Our results have important implications that are also discussed in the article.
  4. Dumais, S.; Chen, H.: Hierarchical classification of Web content (2000) 0.01
    Source
    Proceedings of the 23rd International ACM SIGIR Conference on Research and Development in Information Retrieval. Ed. by N.J. Belkin, P. Ingwersen and M.K. Leong
  5. Orwig, R.E.; Chen, H.; Nunamaker, J.F.: A graphical, self-organizing approach to classifying electronic meeting output (1997) 0.01
    Abstract
    Describes research in the application of a Kohonen Self-Organizing Map (SOM) to the problem of classifying electronic brainstorming output and an evaluation of the results. Describes an electronic meeting system and the classification problem that exists in the group problem-solving process. Surveys the literature concerning classification. Describes the application of the Kohonen SOM to the meeting-output classification problem. Describes an experiment that evaluated the classification performed by the Kohonen SOM by comparing it with classifications produced by a human expert and by a Hopfield neural network. Discusses conclusions and directions for future research.
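A minimal sketch of the Kohonen SOM training loop behind such a classifier: each brainstorming comment becomes a term vector, and comments that map to the same or neighbouring nodes form one output category. Data and parameters here are illustrative, not the paper's:

```python
import random

def train_som(vectors, map_size=10, epochs=50, lr0=0.5, radius0=3):
    """Train a 1-D self-organizing map over term vectors (lists of floats)."""
    dim = len(vectors[0])
    nodes = [[random.random() for _ in range(dim)] for _ in range(map_size)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                    # decaying learning rate
        radius = max(1, round(radius0 * (1 - epoch / epochs)))
        for v in vectors:
            # best-matching unit = node closest to the input vector
            bmu = min(range(map_size),
                      key=lambda i: sum((nodes[i][d] - v[d]) ** 2 for d in range(dim)))
            # pull the BMU and its neighbourhood toward the input
            for i in range(max(0, bmu - radius), min(map_size, bmu + radius + 1)):
                for d in range(dim):
                    nodes[i][d] += lr * (v[d] - nodes[i][d])
    return nodes

comments = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [0.0, 1.0, 0.9]]  # toy term vectors
som = train_som(comments, map_size=4, epochs=20)
```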
  6. Chung, W.; Chen, H.; Reid, E.: Business stakeholder analyzer : an experiment of classifying stakeholders on the Web (2009) 0.01
    Abstract
    As the Web is used increasingly to share and disseminate information, business analysts and managers are challenged to understand stakeholder relationships. Traditional stakeholder theories and frameworks employ a manual approach to analysis and do not scale up to accommodate the rapid growth of the Web. Unfortunately, existing business intelligence (BI) tools lack analysis capability, and research on BI systems is sparse. This research proposes a framework for designing BI systems to identify and to classify stakeholders on the Web, incorporating human knowledge and machine-learned information from Web pages. Based on the framework, we have developed a prototype called Business Stakeholder Analyzer (BSA) that helps managers and analysts to identify and to classify their stakeholders on the Web. Results from our experiment involving algorithm comparison, feature comparison, and a user study showed that the system achieved better within-class accuracies in widespread stakeholder types such as partner/sponsor/supplier and media/reviewer, and was more efficient than human classification. The student and practitioner subjects in our user study strongly agreed that such a system would save analysts' time and help to identify and classify stakeholders. This research contributes to a better understanding of how to integrate information technology with stakeholder theory, and enriches the knowledge base of BI system design.
  7. Huang, Z.; Chung, Z.W.; Chen, H.: A graph model for e-commerce recommender systems (2004) 0.01
    Abstract
    Information overload on the Web has created enormous challenges to customers selecting products for online purchases and to online businesses attempting to identify customers' preferences efficiently. Various recommender systems employing different data representations and recommendation methods are currently used to address these challenges. In this research, we developed a graph model that provides a generic data representation and can support different recommendation methods. To demonstrate its usefulness and flexibility, we developed three recommendation methods: direct retrieval, association mining, and high-degree association retrieval. We used a data set from an online bookstore as our research test-bed. Evaluation results showed that combining product content information and historical customer transaction information achieved more accurate predictions and relevant recommendations than using only collaborative information. However, comparisons among different methods showed that high-degree association retrieval did not perform significantly better than the association mining method or the direct retrieval method in our test-bed.
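A minimal sketch of association retrieval on such a graph: with customers and products as nodes and purchases as links, association retrieval amounts to counting connecting paths. Two-hop retrieval over an invented toy transaction set:

```python
from collections import defaultdict

# Bipartite customer-product graph, here just toy purchase sets.
purchases = {
    "alice": {"book_a", "book_b"},
    "bob":   {"book_a", "book_c"},
    "carol": {"book_b", "book_c"},
}

def recommend(target, purchases):
    """Rank unseen products by the number of two-hop paths from target's items."""
    scores = defaultdict(int)
    owned = purchases[target]
    for other, items in purchases.items():
        if other == target:
            continue
        overlap = len(owned & items)       # shared purchases = connecting paths
        for item in items - owned:
            scores[item] += overlap
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice", purchases))       # ['book_c'], reached via bob and carol
```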
  8. Chen, H.; Dhar, V.: Cognitive process as a basis for intelligent retrieval system design (1991) 0.01
    Abstract
    2 studies were conducted to investigate the cognitive processes involved in online document-based information retrieval. These studies led to the development of 5 computerised models of online document retrieval. These models were incorporated into the design of an 'intelligent' document-based retrieval system. Following a discussion of this system, discusses the broader implications of the research for the design of information retrieval systems.
  9. Chen, H.: Explaining and alleviating information management indeterminism : a knowledge-based framework (1994) 0.01
    Abstract
    Attempts to identify the nature and causes of information management indeterminism in an online research environment and proposes solutions for alleviating this indeterminism. Conducts two empirical studies of information management activities. The first identified the types and nature of information management indeterminism by evaluating archived text. The second focused on four sources of indeterminism: subject area knowledge, classification knowledge, system knowledge, and collaboration knowledge. Proposes a knowledge-based design for alleviating indeterminism, which contains a system-generated thesaurus and an inferencing engine.
  10. Chen, H.: Intelligence and security informatics : Introduction to the special topic issue (2005) 0.01
    Abstract
    Making the Nation Safer: The Role of Science and Technology in Countering Terrorism
    The commitment of the scientific, engineering, and health communities to helping the United States and the world respond to security challenges became evident after September 11, 2001. The U.S. National Research Council's report on "Making the Nation Safer: The Role of Science and Technology in Countering Terrorism" (National Research Council, 2002, p. 1) explains the context of such a new commitment: Terrorism is a serious threat to the security of the United States and indeed the world. The vulnerability of societies to terrorist attacks results in part from the proliferation of chemical, biological, and nuclear weapons of mass destruction, but it also is a consequence of the highly efficient and interconnected systems that we rely on for key services such as transportation, information, energy, and health care. The efficient functioning of these systems reflects great technological achievements of the past century, but interconnectedness within and across systems also means that infrastructures are vulnerable to local disruptions, which could lead to widespread or catastrophic failures. As terrorists seek to exploit these vulnerabilities, it is fitting that we harness the nation's exceptional scientific and technological capabilities to counter terrorist threats. A committee of 24 of the leading scientific, engineering, medical, and policy experts in the United States conducted the study described in the report. Eight panels were separately appointed and asked to provide input to the committee. The panels included: (a) biological sciences, (b) chemical issues, (c) nuclear and radiological issues, (d) information technology, (e) transportation, (f) energy facilities, cities, and fixed infrastructure, (g) behavioral, social, and institutional issues, and (h) systems analysis and systems engineering. The focus of the committee's work was to make the nation safer from emerging terrorist threats that sought to inflict catastrophic damage on the nation's people, its infrastructure, or its economy. The committee considered nine areas, each of which is discussed in a separate chapter in the report: nuclear and radiological materials, human and agricultural health systems, toxic chemicals and explosive materials, information technology, energy systems, transportation systems, cities and fixed infrastructure, the response of people to terrorism, and complex and interdependent systems. The chapter on information technology (IT) is particularly relevant to this special issue. The report recommends that "a strategic long-term research and development agenda should be established to address three primary counterterrorism-related areas in IT: information and network security, the IT needs of emergency responders, and information fusion and management" (National Research Council, 2002, pp. 11-12). The R&D in information and network security should include approaches and architectures for prevention, identification, and containment of cyber-intrusions and recovery from them. The R&D to address IT needs of emergency responders should include ensuring interoperability, maintaining and expanding communications capability during an emergency, communicating with the public during an emergency, and providing support for decision makers.
    The R&D in information fusion and management for the intelligence, law enforcement, and emergency response communities should include data mining, data integration, language technologies, and processing of image and audio data. Much of the research reported in this special issue is related to information fusion and management for homeland security.
  11. Carmel, E.; Crawford, S.; Chen, H.: Browsing in hypertext : a cognitive study (1992) 0.01
    Source
    IEEE transactions on systems, man and cybernetics. 22(1992) no.5, S.865-884
  12. Leroy, G.; Chen, H.: Genescene: an ontology-enhanced integration of linguistic and co-occurrence based relations in biomedical texts (2005) 0.01
    Date
    22. 7.2006 14:26:01
  13. Zheng, R.; Li, J.; Chen, H.; Huang, Z.: A framework for authorship identification of online messages : writing-style features and classification techniques (2006) 0.01
    Date
    22. 7.2006 16:14:37
  14. Qin, J.; Zhou, Y.; Chau, M.; Chen, H.: Multilingual Web retrieval : an experiment in English-Chinese business intelligence (2006) 0.01
    Abstract
    As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem. Cross-language information retrieval has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIR techniques for use in the business domain. A dictionary-based approach was adopted that combines phrasal translation, co-occurrence analysis, and pre- and posttranslation query expansion. The portal was evaluated by domain experts, using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-by-word translation. When used together, pre- and posttranslation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.
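A minimal sketch of the dictionary-based translation pipeline the abstract outlines: pre-translation expansion, phrasal lookup first, then word-by-word fallback. The tiny dictionary and expansion list are invented placeholders, and the co-occurrence-based sense selection used in the portal is omitted:

```python
# Toy English->Chinese resources; phrase entries take priority over words.
PHRASES = {"business intelligence": ["商业智能"]}
WORDS   = {"business": ["商业", "生意"], "intelligence": ["智能", "情报"]}
EXPAND  = {"business intelligence": ["competitive intelligence"]}

def translate_query(query):
    terms = [query] + EXPAND.get(query, [])     # pre-translation query expansion
    target = []
    for term in terms:
        if term in PHRASES:                     # phrasal translation first
            target += PHRASES[term]
        else:                                   # word-by-word fallback; untranslated
            for word in term.split():           # words pass through unchanged
                target += WORDS.get(word, [word])[:1]
    return target

print(translate_query("business intelligence"))  # ['商业智能', 'competitive', '智能']
```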
  15. Liu, X.; Kaza, S.; Zhang, P.; Chen, H.: Determining inventor status and its effect on knowledge diffusion : a study on nanotechnology literature from China, Russia, and India (2011) 0.01
    Abstract
    In an increasingly global research landscape, it is important to identify the most prolific researchers in various institutions and their influence on the diffusion of knowledge. Knowledge diffusion within institutions is influenced by not just the status of individual researchers but also the collaborative culture that determines status. There are various methods to measure individual status, but few studies have compared them or explored the possible effects of different cultures on the status measures. In this article, we examine knowledge diffusion within science and technology-oriented research organizations. Using social network analysis metrics to measure individual status in large-scale coauthorship networks, we studied an individual's impact on the recombination of knowledge to produce innovation in nanotechnology. Data from the most productive and high-impact institutions in China (Chinese Academy of Sciences), Russia (Russian Academy of Sciences), and India (Indian Institutes of Technology) were used. We found that boundary-spanning individuals influenced knowledge diffusion in all countries. However, our results also indicate that cultural and institutional differences may influence knowledge diffusion.
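A minimal sketch of measuring an author's status in a coauthorship network with a basic social-network metric (degree centrality); the study uses richer measures on large-scale data, and the byline list below is invented:

```python
from collections import defaultdict
from itertools import combinations

papers = [["li", "wang", "zhang"], ["li", "kumar"], ["wang", "ivanov"]]  # toy bylines

graph = defaultdict(set)
for byline in papers:
    for a, b in combinations(byline, 2):       # every coauthor pair is an edge
        graph[a].add(b)
        graph[b].add(a)

n = len(graph)
centrality = {a: len(nbrs) / (n - 1) for a, nbrs in graph.items()}
print(max(centrality, key=centrality.get))     # best-connected author ('li' here)
```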
  16. Chen, H.; Houston, A.L.; Sewell, R.R.; Schatz, B.R.: Internet browsing and searching : user evaluations of category map and concept space techniques (1998) 0.01
    Abstract
    The Internet provides an exceptional testbed for developing algorithms that can improve browsing and searching large information spaces. Browsing and searching tasks are susceptible to problems of information overload and vocabulary differences. Much of the current research is aimed at the development and refinement of algorithms to improve browsing and searching by addressing these problems. Our research was focused on discovering whether two of the algorithms our research group has developed, a Kohonen category map for browsing and an automatically generated concept space for searching, can help improve browsing and/or searching the Internet. Our results indicate that a Kohonen self-organizing map (SOM)-based algorithm can successfully categorize a large and eclectic Internet information space (the Entertainment subcategory of Yahoo!) into manageable sub-spaces that users can successfully navigate to locate a homepage of interest to them. The SOM algorithm worked best with browsing tasks that were very broad, and in which subjects skipped around between categories. Subjects especially liked the visual and graphical aspects of the map. Subjects who tried to do a directed search, and those who wanted to use the more familiar mental models (alphabetic or hierarchical organization) for browsing, found that the map did not work well. The results from the concept space experiment were especially encouraging. There were no significant differences among the precision measures for the set of documents identified by subject-suggested terms, thesaurus-suggested terms, and the combination of subject- and thesaurus-suggested terms. The recall measures indicated that the combination of subject- and thesaurus-suggested terms exhibited significantly better recall than subject-suggested terms alone. Furthermore, analysis of the homepages indicated that there was limited overlap between the homepages retrieved by the subject-suggested and thesaurus-suggested terms. Since the retrieved homepages for the most part were different, this suggests that a user can enhance a keyword-based search by using an automatically generated concept space. Subjects especially liked the level of control that they could exert over the search, and the fact that the terms suggested by the thesaurus were 'real' (i.e., originating in the homepages) and therefore guaranteed to have retrieval success.
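A minimal sketch of the recall/precision comparison behind the concept-space result: because subject-suggested and thesaurus-suggested terms retrieve largely different homepages, their union gains recall without losing precision. The sets are invented to mirror that pattern:

```python
def precision_recall(retrieved, relevant):
    hits = retrieved & relevant
    return len(hits) / len(retrieved), len(hits) / len(relevant)

relevant     = {"h1", "h2", "h3", "h4", "h5", "h6"}
by_subject   = {"h1", "h2", "x1"}              # subject-suggested terms
by_thesaurus = {"h3", "h4", "x2"}              # thesaurus-suggested terms
combined     = by_subject | by_thesaurus       # limited overlap -> higher recall

for name, res in [("subject", by_subject), ("thesaurus", by_thesaurus),
                  ("combined", combined)]:
    p, r = precision_recall(res, relevant)
    print(f"{name:9s} precision={p:.2f} recall={r:.2f}")
```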
  17. Chen, H.; Shankaranarayanan, G.; She, L.: A machine learning approach to inductive query by examples : an experiment using relevance feedback, ID3, genetic algorithms, and simulated annealing (1998) 0.01
    Abstract
    Information retrieval using probabilistic techniques has attracted significant attention on the part of researchers in information and computer science over the past few decades. In the 1980s, knowledge-based techniques also made an impressive contribution to 'intelligent' information retrieval and indexing. More recently, information science researchers have turned to other newer inductive learning techniques including symbolic learning, genetic algorithms, and simulated annealing. These newer techniques, which are grounded in diverse paradigms, have provided great opportunities for researchers to enhance the information processing and retrieval capabilities of current information systems. In this article, we first provide an overview of these newer techniques and their use in information retrieval research. In order to familiarize readers with the techniques, we present 3 promising methods: the symbolic ID3 algorithm, evolution-based genetic algorithms, and simulated annealing. We discuss their knowledge representations and algorithms in the unique context of information retrieval.
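A minimal sketch of one of the three methods, simulated annealing, applied to inductive query by examples: search over subsets of candidate query terms for one that best fits the user's relevance-feedback documents. The fitness function and term pool are invented stand-ins:

```python
import math
import random

def anneal(candidate_terms, fitness, steps=500, t0=1.0):
    """Simulated annealing over query-term subsets (frozensets)."""
    current = best = frozenset(random.sample(candidate_terms, 2))
    for step in range(1, steps + 1):
        temp = t0 / step                        # cooling schedule
        neighbour = current ^ {random.choice(candidate_terms)}  # flip one term
        if not neighbour:
            continue
        delta = fitness(neighbour) - fitness(current)
        # always accept improvements; accept worse moves with decaying probability
        if delta >= 0 or random.random() < math.exp(delta / temp):
            current = neighbour
            if fitness(current) > fitness(best):
                best = current
    return best

terms = ["retrieval", "genetic", "neural", "fuzzy"]
fit = lambda s: len({"retrieval", "genetic"} & s) - 0.1 * len(s)  # toy objective
print(sorted(anneal(terms, fit)))               # usually ['genetic', 'retrieval']
```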
  18. Zhu, B.; Chen, H.: Validating a geographical image retrieval system (2000) 0.01
    Abstract
    This paper summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. By using an image as its interface, the prototype system addresses a troublesome aspect of traditional retrieval models, which require users to have complete knowledge of the low-level features of an image. In addition, we describe an experiment that validates the system's performance against that of human subjects, in an effort to address the scarcity of research evaluating the performance of algorithms against that of human beings. The results of the experiment indicate that the system could do as well as human subjects in accomplishing the tasks of similarity analysis and image categorization. We also found that under some circumstances the texture features of an image are insufficient to represent a geographic image. We believe, however, that our image retrieval system provides a promising approach to integrating image processing techniques and information retrieval algorithms.
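Content-based retrieval of this kind reduces each image to a low-level feature vector and ranks by vector similarity. A minimal sketch using cosine similarity over invented texture-feature vectors; the prototype's actual features and similarity measure may differ:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# One invented texture-feature vector per stored image.
features = {"lake.png": [0.9, 0.1, 0.3], "city.png": [0.2, 0.8, 0.5]}
query = [0.85, 0.15, 0.25]                     # features of the query image

ranked = sorted(features, key=lambda k: cosine(features[k], query), reverse=True)
print(ranked)                                  # ['lake.png', 'city.png']
```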
  19. Chen, H.; Martinez, J.; Kirchhoff, A.; Ng, T.D.; Schatz, B.R.: Alleviating search uncertainty through concept associations : automatic indexing, co-occurrence analysis, and parallel computing (1998) 0.01
    Abstract
    In this article, we report research on an algorithmic approach to alleviating search uncertainty in a large information space. Grounded in object filtering, automatic indexing, and co-occurrence analysis, we performed a large-scale experiment using a parallel supercomputer (SGI Power Challenge) to analyze 400,000+ abstracts in an INSPEC computer engineering collection. Two system-generated thesauri, one based on a combined object filtering and automatic indexing method, and the other based on automatic indexing only, were compared with the human-generated INSPEC subject thesaurus. Our user evaluation revealed that the system-generated thesauri were better than the INSPEC thesaurus in 'concept recall', but in 'concept precision' the 3 thesauri were comparable. Our analysis also revealed that the terms suggested by the 3 thesauri were complementary and could be used to significantly increase 'variety' in search terms and thereby reduce search uncertainty.
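A minimal sketch of co-occurrence analysis for thesaurus generation as described above: terms that appear together in many abstracts become each other's suggested associations. The corpus is a toy stand-in, and the weighting is a simple conditional frequency rather than the paper's exact formula:

```python
from collections import Counter
from itertools import combinations

docs = [
    {"parallel", "computing", "supercomputer"},
    {"parallel", "computing", "indexing"},
    {"indexing", "thesaurus", "retrieval"},
]

term_counts, pair_counts = Counter(), Counter()
for terms in docs:
    term_counts.update(terms)
    pair_counts.update(map(frozenset, combinations(sorted(terms), 2)))

def associations(term, top=3):
    """Rank co-occurring terms by P(other | term), a simple association weight."""
    scored = {}
    for pair, n in pair_counts.items():
        if term in pair:
            (other,) = pair - {term}
            scored[other] = n / term_counts[term]
    return sorted(scored, key=scored.get, reverse=True)[:top]

print(associations("computing"))   # 'parallel' ranks first in this toy corpus
```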
  20. Chen, H.: Introduction to the JASIST special topic section on Web retrieval and mining : A machine learning perspective (2003) 0.01
    Abstract
    Research in information retrieval (IR) has advanced significantly in the past few decades. Many tasks, such as indexing and text categorization, can be performed automatically with minimal human effort. Machine learning has played an important role in such automation by learning various patterns such as document topics, text structures, and user interests from examples. In recent years, it has become increasingly difficult to search for useful information on the World Wide Web because of its large size and unstructured nature. Useful information and resources are often hidden in the Web. While machine learning has been successfully applied to traditional IR systems, it poses some new challenges to apply these algorithms to the Web due to its large size, link structure, diversity in content and languages, and dynamic nature. On the other hand, such characteristics of the Web also provide interesting patterns and knowledge that are not present in traditional information retrieval systems.