Search (14 results, page 1 of 1)

  • author_ss:"Wu, Y."
  1. Liu, J.; Wu, Y.; Zhou, L.: A hybrid method for abstracting newspaper articles (1999) 0.00
    0.0030255679 = product of:
      0.0060511357 = sum of:
        0.0060511357 = product of:
          0.012102271 = sum of:
            0.012102271 = weight(_text_:a in 4059) [ClassicSimilarity], result of:
              0.012102271 = score(doc=4059,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.22789092 = fieldWeight in 4059, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4059)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
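
     For readers unfamiliar with Lucene's ClassicSimilarity explain trees, the score above decomposes into a handful of TF-IDF factors; the same decomposition applies to every result below. A minimal Python sketch, reproducing the final value from the numbers listed in the tree (the idf formula in the comment is ClassicSimilarity's standard definition; everything else is read directly from the tree):

       import math

       # Values read from the explain tree for result 1 (term "a", doc 4059)
       freq = 10.0                   # termFreq within the matching field
       idf = 1.153047                # approx. 1 + ln(maxDocs / (docFreq + 1))
       query_norm = 0.046056706      # queryNorm, normalizes the query vector
       field_norm = 0.0625           # fieldNorm, length normalization of the field

       tf = math.sqrt(freq)                      # 3.1622777 = tf(freq=10.0)
       query_weight = idf * query_norm           # 0.053105544 = queryWeight
       field_weight = tf * idf * field_norm      # 0.22789092 = fieldWeight
       term_score = query_weight * field_weight  # 0.012102271

       # The two nested coord(1/2) factors halve the score twice
       print(term_score * 0.5 * 0.5)             # ~0.0030255679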
    
    Abstract
     This paper introduces a hybrid method for abstracting Chinese text. It integrates the statistical approach with language understanding. Some linguistic heuristics and segmentation are also incorporated into the abstracting process. The prototype system is of a multipurpose type catering for various users with different requirements. Initial responses show that the proposed method contributes much to the flexibility and accuracy of the automatic Chinese abstracting system. In practice, the present work provides a path to developing an intelligent Chinese system for automating the information
    Type
    a
  2. Wu, Y.; Lehman, A.; Dunaway, D.J.: Evaluations of a large topic map as a knowledge organization tool for supporting self-regulated learning (2015) 0.00
    0.0028703054 = product of:
      0.005740611 = sum of:
        0.005740611 = product of:
          0.011481222 = sum of:
            0.011481222 = weight(_text_:a in 2820) [ClassicSimilarity], result of:
              0.011481222 = score(doc=2820,freq=16.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.2161963 = fieldWeight in 2820, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2820)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     A large topic map was created to facilitate understanding of the impacts of the 2010 Gulf of Mexico Oil Spill Incident. The topic map has both a text and a graphical interface, which complement each other. A formative evaluation and two summative evaluations were conducted, as qualitative studies, to assess the usefulness and usability of the large topic maps for facilitating self-regulated learning. The topic maps were found useful for knowledge fusion and discovery, and can be useful when undertaking interdisciplinary and multidisciplinary research. Users reported some usability issues with the graphical topic map, including information overload and a cluttered display when a large number of topics and their associated topics are shown. The text topic map was found easier to use because it displays topics, relationships, and references in a linear view.
    Type
    a
  3. Allen, R.B.; Wu, Y.: Metrics for the scope of a collection (2005) 0.00
    0.0024857575 = product of:
      0.004971515 = sum of:
        0.004971515 = product of:
          0.00994303 = sum of:
            0.00994303 = weight(_text_:a in 4570) [ClassicSimilarity], result of:
              0.00994303 = score(doc=4570,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.18723148 = fieldWeight in 4570, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4570)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Some collections cover many topics, while others are narrowly focused on a limited number of topics. We introduce the concept of the "scope" of a collection of documents and we compare two ways of measuring it. These measures are based on the distances between documents. The first uses the overlap of words between pairs of documents. The second measure uses a novel method that calculates the semantic relatedness of pairs of words from the documents. Those values are combined to obtain an overall distance between the documents. The main validation for the measures compared Web pages categorized by Yahoo. Sets of pages sampled from broad categories were determined to have a higher scope than sets derived from subcategories. The measure was significant and confirmed the expected difference in scope. Finally, we discuss other measures related to scope.
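
     The abstract does not spell out the word-overlap distance; a minimal sketch of one plausible reading (mean pairwise Jaccard distance over the documents' word sets, a hypothetical simplification rather than the authors' exact formula):

       from itertools import combinations

       def jaccard_distance(a: set, b: set) -> float:
           # 1 - |A intersect B| / |A union B|: higher means less word overlap
           return 1.0 - len(a & b) / len(a | b)

       def collection_scope(docs: list[str]) -> float:
           # Mean pairwise distance: broad collections score higher than focused ones
           bags = [set(d.lower().split()) for d in docs]
           pairs = list(combinations(bags, 2))
           return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)

       print(collection_scope(["oil spill response plan",
                               "earthquake recovery taxonomy",
                               "oil spill cleanup response"]))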
    Type
    a
  4. Wu, Y.; Bai, R.: An event relationship model for knowledge organization and visualization (2017) 0.00
    0.0024857575 = product of:
      0.004971515 = sum of:
        0.004971515 = product of:
          0.00994303 = sum of:
            0.00994303 = weight(_text_:a in 3867) [ClassicSimilarity], result of:
              0.00994303 = score(doc=3867,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.18723148 = fieldWeight in 3867, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3867)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     An event is a specific occurrence involving participants, which is a typed, n-ary association of entities or other events, each identified as a participant in a specific semantic role in the event (Pyysalo et al. 2012; Linguistic Data Consortium 2005). Event types may vary across domains. Representing relationships between events can facilitate the understanding of knowledge in complex systems (such as economic systems, the human body, and social systems). In the simplest form, an event can be represented as Entity A <Relation> Entity B. This paper evaluates several knowledge organization and visualization models and tools, such as concept maps (Cmap), topic maps (Ontopia), network analysis models (Gephi), and ontology (Protégé), then proposes an event relationship model that aims to integrate the strengths of these models and can represent complex knowledge expressed in events and their relationships.
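
     As a rough illustration of the "Entity A <Relation> Entity B" form, a minimal sketch of how typed, n-ary events and their relationships might be modeled (class names and roles here are hypothetical, not the authors' schema):

       from dataclasses import dataclass, field

       @dataclass
       class Event:
           event_type: str                 # e.g., "oil-spill", "cleanup"
           participants: dict[str, str] = field(default_factory=dict)  # semantic role -> entity

       @dataclass
       class EventRelation:
           source: Event
           relation: str                   # e.g., "causes", "precedes"
           target: Event

       spill = Event("oil-spill", {"agent": "tanker", "location": "Gulf of Mexico"})
       cleanup = Event("cleanup", {"agent": "coast guard"})
       link = EventRelation(spill, "causes", cleanup)
       print(f"{link.source.event_type} <{link.relation}> {link.target.event_type}")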
    Type
    a
  5. Wu, Y.: Indexing historical, political cartoons for retrieval (2013) 0.00
    0.002269176 = product of:
      0.004538352 = sum of:
        0.004538352 = product of:
          0.009076704 = sum of:
            0.009076704 = weight(_text_:a in 1070) [ClassicSimilarity], result of:
              0.009076704 = score(doc=1070,freq=10.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1709182 = fieldWeight in 1070, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1070)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Previous literature indicates that political cartoons are difficult to index because they have a subjective nature, and indexers may fail to understand the content of a cartoon or may interpret its content subjectively. This study aims to investigate the indexability of historical, political cartoons and the variables that affect the indexing results. It proposes an indexing scheme for describing historical, political cartoons, and uses that scheme to conduct indexing experiments. Through the indexing experiments and statistical analysis, three variables that affect the indexing results are identified: indexers, indexing fields, and cartoons. There is a statistically significant difference in inter-indexer consistency across indexers, indexing fields, and cartoons. The paper argues that historical, political cartoons can be indexed if knowledgeable indexers are available and the context of the cartoons is provided. It also proposes a mediated, collaborative indexing approach to indexing such materials.
    Type
    a
  6. Yang, L.; Wu, Y.: Creating a taxonomy of earthquake disaster response and recovery for online earthquake information management (2019) 0.00
    0.0022374375 = product of:
      0.004474875 = sum of:
        0.004474875 = product of:
          0.00894975 = sum of:
            0.00894975 = weight(_text_:a in 5231) [ClassicSimilarity], result of:
              0.00894975 = score(doc=5231,freq=14.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1685276 = fieldWeight in 5231, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5231)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     The goal of this study is to develop a taxonomy of earthquake response and recovery using online information resources for organizing and sharing earthquake-related online information resources. A constructivist/interpretivist research paradigm was used in the study. A combination of top-down and bottom-up approaches was used to build the taxonomy. Facet analysis of disaster management, the timeframe of disaster management, and modular design were performed when designing the taxonomy. Two case studies were done to demonstrate the usefulness of the taxonomy for organizing and sharing information. The facet-based taxonomy can be used to organize online information for browsing and navigation. It can also be used to index and tag online information resources to support searching. It creates a common language for earthquake management stakeholders to share knowledge. The top three level categories of the taxonomy can be applied to the management of other types of disasters. The taxonomy has implications for earthquake online information management, knowledge management and disaster management. The approach can be used to build taxonomies for managing online information resources on other topics (including various types of time-sensitive disaster responses). We propose a common language for sharing information on disasters, which has great social relevance.
    Type
    a
  7. Kang, X.; Wu, Y.; Ren, W.: Toward action comprehension for searching : mining actionable intents in query entities (2020) 0.00
    0.0022374375 = product of:
      0.004474875 = sum of:
        0.004474875 = product of:
          0.00894975 = sum of:
            0.00894975 = weight(_text_:a in 5613) [ClassicSimilarity], result of:
              0.00894975 = score(doc=5613,freq=14.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1685276 = fieldWeight in 5613, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5613)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Understanding search engine users' intents has been a popular study in information retrieval, which directly affects the quality of retrieved information. One of the fundamental problems in this field is to find a connection between the entity in a query and the potential intents of the users, the latter of which would further reveal important information for facilitating the users' future actions. In this article, we present a novel research method for mining the actionable intents for search users, by generating a ranked list of the potentially most informative actions based on a massive pool of action samples. We compare different search strategies and their combinations for retrieving the action pool and develop three criteria for measuring the informativeness of the selected action samples, that is, the significance of an action sample within the pool, the representativeness of an action sample for the other candidate samples, and the diverseness of an action sample with respect to the selected actions. Our experiment, based on the Action Mining (AM) query entity data set from the Actionable Knowledge Graph (AKG) task at NTCIR-13, suggests that the proposed approach is effective in generating an informative and early-satisfying ranking of potential actions for search users.
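
     The three informativeness criteria suggest a greedy selection loop; the sketch below (the scoring functions, equal weighting, and cutoff are assumptions, not the authors' implementation) shows the general shape:

       def rank_actions(candidates, significance, similarity, k=10):
           """Greedily rank k actions, balancing significance within the pool,
           representativeness of the remaining candidates, and diversity
           with respect to the actions already selected."""
           selected, pool = [], list(candidates)
           while pool and len(selected) < k:
               def score(a):
                   rep = sum(similarity(a, b) for b in pool if b is not a)
                   rep /= max(len(pool) - 1, 1)
                   div = -max((similarity(a, s) for s in selected), default=0.0)
                   return significance(a) + rep + div
               best = max(pool, key=score)
               selected.append(best)
               pool.remove(best)
           return selected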
    Type
    a
  8. Wu, Y.: Organization of complex topics in comprehensive classification schemes : case studies of disaster and security (2023) 0.00
    0.0022374375 = product of:
      0.004474875 = sum of:
        0.004474875 = product of:
          0.00894975 = sum of:
            0.00894975 = weight(_text_:a in 1117) [ClassicSimilarity], result of:
              0.00894975 = score(doc=1117,freq=14.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.1685276 = fieldWeight in 1117, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1117)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     This research investigates how comprehensive classifications and home-grown classifications organize complex topics. Two comprehensive classifications and two home-grown taxonomies are used to examine two complex topics: disaster and security. The two comprehensive classifications are the Library of Congress Classification and the Classification Scheme for Chinese Libraries. The two home-grown taxonomies are AIRS 211 LA County Taxonomy of Human Services - Disaster Services, and the Human Security Taxonomy. It is found that a comprehensive classification may provide many subclasses of a complex topic, which are scattered in various classes. Occasionally the classification scheme may provide several small taxonomies that organize the terms of a subclass of the complex topic that are pulled from multiple classes. However, the comprehensive classification provides no organization of the major subclasses of the complex topic. The lack of organization of the major subclasses may prevent users from understanding the complex topic systematically, and so prevent them from selecting an appropriate classification term for it. Ideally a comprehensive classification should provide a high-level conceptual framework for the complex topic, or at least organize the major subclasses in a way that helps users understand the complex topic systematically.
    Type
    a
  9. Du, J.; Tang, X.; Wu, Y.: The effects of research level and article type on the differences between citation metrics and F1000 recommendations (2016) 0.00
    0.0020714647 = product of:
      0.0041429293 = sum of:
        0.0041429293 = product of:
          0.008285859 = sum of:
            0.008285859 = weight(_text_:a in 3228) [ClassicSimilarity], result of:
              0.008285859 = score(doc=3228,freq=12.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15602624 = fieldWeight in 3228, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3228)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     F1000 recommendations were assessed as a potential data source for research evaluation, but the reasons for differences between F1000 Article Factor (FFa) scores and citations remain unexplored. By linking recommendations for 28,254 publications in F1000 with citations in Scopus, we investigated the effect of research level (basic, clinical, mixed) and article type on the internal consistency of assessments based on citations and FFa scores. The research level has little impact on the differences between the 2 evaluation tools, while article type has a large effect. These 2 measures differ significantly for 2 groups: (a) nonprimary research or evidence-based research is more highly cited but not highly recommended, while (b) translational research or transformative research is more highly recommended but has fewer citations. This can be expected, since citation activity is usually practiced by academic authors, while the potential for scientific revolutions and the suitability for clinical practice of an article should be investigated from a practitioner's perspective. We conclude with a recommendation that the application of bibliometric approaches in research evaluation should consider the proportion of 3 types of publications: evidence-based research, transformative research, and translational research. The latter 2 types are more suitable for assessment through peer review.
    Type
    a
  10. Wu, Y.; Yang, L.: Construction and evaluation of an oil spill semantic relation taxonomy for supporting knowledge discovery (2015) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 2202) [ClassicSimilarity], result of:
              0.008118451 = score(doc=2202,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 2202, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2202)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     The paper presents the rationale, significance, method and procedure of building a taxonomy of semantic relations in the oil spill domain for supporting knowledge discovery through inference. Difficult problems during the development of the taxonomy are discussed and partial solutions are proposed. A preliminary functional evaluation of the taxonomy for supporting knowledge discovery was performed. Durability and expansibility of the taxonomy were evaluated by using the taxonomy to classify the terms in a biomedical relation ontology. The taxonomy was found to have full expansibility and a high degree of durability. The study proposes more research problems than solutions.
    Type
    a
  11. Li, J.; Zhang, P.; Song, D.; Wu, Y.: Understanding an enriched multidimensional user relevance model by analyzing query logs (2017) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 3961) [ClassicSimilarity], result of:
              0.008118451 = score(doc=3961,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 3961, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3961)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Modeling multidimensional relevance in information retrieval (IR) has attracted much attention in recent years. However, most existing studies are conducted through relatively small-scale user studies, which may not reflect a real-world and natural search scenario. In this article, we propose to study the multidimensional user relevance model (MURM) on large-scale query logs, which record users' various search behaviors (e.g., query reformulations, clicks, and dwell time) in natural search settings. We advance an existing MURM (including five dimensions: topicality, novelty, reliability, understandability, and scope) by adding two dimensions, interest and habit, which represent personalized relevance judgments on retrieved documents. Further, for each dimension in the enriched MURM, a set of computable features is formulated. By conducting extensive document-ranking experiments on Bing's query logs and TREC Session Track data, we systematically investigated the impact of each dimension on retrieval performance and gained a series of insightful findings that may benefit the design of future IR systems.
    Type
    a
  12. Wu, Y.; Liu, Y.; Tsai, Y.-H.R.; Yau, S.-T.: Investigating the role of eye movements and physiological signals in search satisfaction prediction using geometric analysis (2019) 0.00
    0.0014647468 = product of:
      0.0029294936 = sum of:
        0.0029294936 = product of:
          0.005858987 = sum of:
            0.005858987 = weight(_text_:a in 5382) [ClassicSimilarity], result of:
              0.005858987 = score(doc=5382,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.11032722 = fieldWeight in 5382, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5382)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Two general challenges faced by data analysis are the existence of noise and the extraction of meaningful information from collected data. In this study, we used a multiscale framework to reduce the effects caused by noise and to extract explainable geometric properties to characterize finite metric spaces. We conducted lab experiments that integrated the use of eye-tracking, electrodermal activity (EDA), and user logs to explore users' information-seeking behaviors on search engine result pages (SERPs). Experimental results of 1,590 search queries showed that the proposed strategies effectively predicted query-level user satisfaction using EDA and eye-tracking data. The bootstrap analysis showed that combining EDA and eye-tracking data with user behavior data extracted from user logs led to a significantly better linear model fit than using user behavior data alone. Furthermore, cross-user and cross-task validations showed that our methods can be generalized to different search engine users performing different preassigned tasks.
    Type
    a
  13. Xiao, C.; Zhou, F.; Wu, Y.: Predicting audience gender in online content-sharing social networks (2013) 0.00
    0.0011959607 = product of:
      0.0023919214 = sum of:
        0.0023919214 = product of:
          0.0047838427 = sum of:
            0.0047838427 = weight(_text_:a in 954) [ClassicSimilarity], result of:
              0.0047838427 = score(doc=954,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.090081796 = fieldWeight in 954, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=954)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Understanding the behavior and characteristics of web users is valuable when improving information dissemination, designing recommendation systems, and so on. In this work, we explore various methods of predicting the ratio of male viewers to female viewers on YouTube. First, we propose and examine two hypotheses relating to audience consistency and topic consistency. The former means that videos made by the same authors tend to have similar male-to-female audience ratios, whereas the latter means that videos with similar topics tend to have similar audience gender ratios. To predict the audience gender ratio before video publication, two features based on these two hypotheses and other features are used in multiple linear regression (MLR) and support vector regression (SVR). We find that these two features are the key indicators of audience gender, whereas other features, such as gender of the user and duration of the video, have limited relationships. Second, another method is explored to predict the audience gender ratio. Specifically, we use the early comments collected after video publication to predict the ratio via simple linear regression (SLR). The experiments indicate that this model can achieve better performance by using a few early comments. We also observe that the correlation between the number of early comments (cost) and the predictive accuracy (gain) follows the law of diminishing marginal utility. We build the functions of these elements via curve fitting to find the appropriate number of early comments (approximately 250) that can achieve maximum gain at minimum cost.
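
     The early-comment predictor described above reduces to a simple linear regression; a minimal sketch with synthetic numbers (not the study's data):

       import numpy as np

       # Male ratio among the first ~250 comments of five videos (toy values)
       early_male_ratio = np.array([0.42, 0.75, 0.58, 0.90, 0.33])
       # Observed male audience ratio of the same videos (toy values)
       audience_male_ratio = np.array([0.45, 0.71, 0.60, 0.86, 0.38])

       # Fit y = slope * x + intercept by least squares
       slope, intercept = np.polyfit(early_male_ratio, audience_male_ratio, 1)
       new_video_ratio = slope * 0.65 + intercept   # predict from early comments
       print(new_video_ratio)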
    Type
    a
  14. Wang, F.; Yang, J.; Wu, Y.: Non-synchronism in theoretical research of information science (2021) 0.00
    0.0011959607 = product of:
      0.0023919214 = sum of:
        0.0023919214 = product of:
          0.0047838427 = sum of:
            0.0047838427 = weight(_text_:a in 602) [ClassicSimilarity], result of:
              0.0047838427 = score(doc=602,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.090081796 = fieldWeight in 602, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=602)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Purpose: This paper aims to reveal the global non-synchronism that exists in the theoretical research of information science (IS) by analyzing and comparing the distribution of theory use, creation and borrowing in four representative journals from the USA, the UK and China.
     Design/methodology/approach: Quantitative content analysis is adopted as the research method. First, an analytical framework for non-synchronism in theoretical research of IS is constructed. Second, theories mentioned in the full texts of the research papers of the four journals are extracted according to a theory dictionary compiled beforehand. Third, the non-synchronism in the theoretical research of IS is analyzed.
     Findings: Non-synchronism exists in many aspects of the theoretical research of IS, between journals, subject areas and countries/regions. Theoretical underdevelopment still exists in some subject areas of IS. IS presents obvious interdisciplinary characteristics. The theoretical distance from IS to the social sciences is shorter than that to the natural sciences.
     Research limitations/implications: This study investigates the theoretical research of IS from the perspective of non-synchronism theory, reveals the theoretical distance from IS to other sciences, deepens the communication between different subject and regional sub-communities of IS and provides new evidence for the necessity of developing domestic theories and theorists of IS.
     Originality/value: This study introduces the theory of non-synchronism to IS research for the first time, investigates new advances in the theoretical research of IS and provides new quantitative evidence for understanding the interdisciplinary characteristics of IS and the necessity of better communication between sub-communities of IS.
    Type
    a