Search (10 results, page 1 of 1)

  • × author_ss:"Zhang, J."
  1. Zhang, J.; Zeng, M.L.: ¬A new similarity measure for subject hierarchical structures (2014) 0.02
    0.023821492 = product of:
      0.059553728 = sum of:
        0.024150565 = weight(_text_:it in 1778) [ClassicSimilarity], result of:
          0.024150565 = score(doc=1778,freq=2.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.15977642 = fieldWeight in 1778, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1778)
        0.035403162 = weight(_text_:22 in 1778) [ClassicSimilarity], result of:
          0.035403162 = score(doc=1778,freq=2.0), product of:
            0.18300882 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052260913 = queryNorm
            0.19345059 = fieldWeight in 1778, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1778)
      0.4 = coord(2/5)
    
    Abstract
    Purpose - The purpose of this paper is to introduce a new similarity method to gauge the differences between two subject hierarchical structures. Design/methodology/approach - In the proposed similarity measure, nodes on two hierarchical structures are projected onto a two-dimensional space, respectively, and both structural similarity and subject similarity of nodes are considered in the similarity between the two hierarchical structures. The extent to which the structural similarity impacts on the similarity can be controlled by adjusting a parameter. An experiment was conducted to evaluate soundness of the measure. Eight experts whose research interests were information retrieval and information organization participated in the study. Results from the new measure were compared with results from the experts. Findings - The evaluation shows strong correlations between the results from the new method and the results from the experts. It suggests that the similarity method achieved satisfactory results. Practical implications - Hierarchical structures that are found in subject directories, taxonomies, classification systems, and other classificatory structures play an extremely important role in information organization and information representation. Measuring the similarity between two subject hierarchical structures allows an accurate overarching understanding of the degree to which the two hierarchical structures are similar. Originality/value - Both structural similarity and subject similarity of nodes were considered in the proposed similarity method, and the extent to which the structural similarity impacts on the similarity can be adjusted. In addition, a new evaluation method for a hierarchical structure similarity was presented.
    Date
    8. 4.2015 16:22:13
  2. Zhang, J.; Korfhage, R.R.: ¬A distance and angle similarity measure method (1999) 0.01
    0.010929298 = product of:
      0.05464649 = sum of:
        0.05464649 = weight(_text_:it in 3915) [ClassicSimilarity], result of:
          0.05464649 = score(doc=3915,freq=4.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.36153275 = fieldWeight in 3915, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.0625 = fieldNorm(doc=3915)
      0.2 = coord(1/5)
    
    Abstract
    This article presents a distance and angle similarity measure. The integrated similarity measure takes the strenghts of both the distance and direction of measured documents into account. This article analyzes the features of the similarity measure by comparing it with the traditional distance-based similarity measure and the cosine measure, providing the iso-similarity contour, investigating the impacts of the parameters and variables on the new similarity measure. It also gives the further research issues on the topic
  3. Zhang, L.; Liu, Q.L.; Zhang, J.; Wang, H.F.; Pan, Y.; Yu, Y.: Semplore: an IR approach to scalable hybrid query of Semantic Web data (2007) 0.01
    0.009660226 = product of:
      0.04830113 = sum of:
        0.04830113 = weight(_text_:it in 231) [ClassicSimilarity], result of:
          0.04830113 = score(doc=231,freq=8.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.31955284 = fieldWeight in 231, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.0390625 = fieldNorm(doc=231)
      0.2 = coord(1/5)
    
    Abstract
    As an extension to the current Web, Semantic Web will not only contain structured data with machine understandable semantics but also textual information. While structured queries can be used to find information more precisely on the Semantic Web, keyword searches are still needed to help exploit textual information. It thus becomes very important that we can combine precise structured queries with imprecise keyword searches to have a hybrid query capability. In addition, due to the huge volume of information on the Semantic Web, the hybrid query must be processed in a very scalable way. In this paper, we define such a hybrid query capability that combines unary tree-shaped structured queries with keyword searches. We show how existing information retrieval (IR) index structures and functions can be reused to index semantic web data and its textual information, and how the hybrid query is evaluated on the index structure using IR engines in an efficient and scalable manner. We implemented this IR approach in an engine called Semplore. Comprehensive experiments on its performance show that it is a promising approach. It leads us to believe that it may be possible to evolve current web search engines to query and search the Semantic Web. Finally, we briefy describe how Semplore is used for searching Wikipedia and an IBM customer's product information.
  4. Li, D.; Luo, Z.; Ding, Y.; Tang, J.; Sun, G.G.-Z.; Dai, X.; Du, J.; Zhang, J.; Kong, S.: User-level microblogging recommendation incorporating social influence (2017) 0.01
    0.008366001 = product of:
      0.041830003 = sum of:
        0.041830003 = weight(_text_:it in 3426) [ClassicSimilarity], result of:
          0.041830003 = score(doc=3426,freq=6.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.27674085 = fieldWeight in 3426, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3426)
      0.2 = coord(1/5)
    
    Abstract
    With the information overload of user-generated content in microblogging, users find it extremely challenging to browse and find valuable information in their first attempt. In this paper we propose a microblogging recommendation algorithm, TSI-MR (Topic-Level Social Influence-based Microblogging Recommendation), which can significantly improve users' microblogging experiences. The main innovation of this proposed algorithm is that we consider social influences and their indirect structural relationships, which are largely based on social status theory, from the topic level. The primary advantage of this approach is that it can build an accurate description of latent relationships between two users with weak connections, which can improve the performance of the model; furthermore, it can solve sparsity problems of training data to a certain extent. The realization of the model is mainly based on Factor Graph. We also applied a distributed strategy to further improve the efficiency of the model. Finally, we use data from Tencent Weibo, one of the most popular microblogging services in China, to evaluate our methods. The results show that incorporating social influence can improve microblogging performance considerably, and outperform the baseline methods.
  5. Zhang, J.: Archival context, digital content, and the ethics of digital archival representation : the ethics of identification in digital library metadata (2012) 0.01
    0.006830811 = product of:
      0.034154054 = sum of:
        0.034154054 = weight(_text_:it in 419) [ClassicSimilarity], result of:
          0.034154054 = score(doc=419,freq=4.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.22595796 = fieldWeight in 419, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.0390625 = fieldNorm(doc=419)
      0.2 = coord(1/5)
    
    Abstract
    The findings of a recent study on digital archival representation raise some ethical concerns about how digital archival materials are organized, described, and made available for use on the Web. Archivists have a fundamental obligation to preserve and protect the authenticity and integrity of records in their holdings and, at the same time, have the responsibility to promote the use of records as a fundamental purpose of the keeping of archives (SAA 2005 Code of Ethics for Archivists V & VI). Is it an ethical practice that digital content in digital archives is deeply embedded in its contextual structure and generally underrepresented in digital archival systems? Similarly, is it ethical for archivists to detach digital items from their archival context in order to make them more "digital friendly" and more accessible to meet needs of some users? Do archivists have an obligation to bring the two representation systems together so that the context and content of digital archives can be better represented and archival materials "can be located and used by anyone, for any purpose, while still remaining authentic evidence of the work and life of the creator"? (Millar 2010, 157) This paper discusses the findings of the study and their ethical implications relating to digital archival description and representation.
  6. Zhang, J.; Zhai, S.; Stevenson, J.A.; Xia, L.: Optimization of the subject directory in a government agriculture department web portal (2016) 0.01
    0.006830811 = product of:
      0.034154054 = sum of:
        0.034154054 = weight(_text_:it in 3088) [ClassicSimilarity], result of:
          0.034154054 = score(doc=3088,freq=4.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.22595796 = fieldWeight in 3088, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3088)
      0.2 = coord(1/5)
    
    Abstract
    We investigated a subject directory in the US Agriculture Department-Economic Research Service portal. Parent-child relationships, related connections among the categories, and related connections among the subcategories in the subject directory were optimized using social network analysis. The optimization results were assessed by both density analysis and edge strength analysis methods. In addition, the results were evaluated by domain experts. From this study, it is recommended that four subcategories be switched from their original four categories into two different categories as a result of the parent-child relationship optimization.?It is also recommended that 132 subcategories be moved to 40 subcategories and that eight categories be moved to two categories as a result of the related connection optimization. The findings show that optimization boosted the densities of the optimized categories, and the recommended connections of both the related categories and subcategories were stronger than the existing connections of the related categories and subcategories. This paper provides visual displays of the optimization analysis as well as suggestions to enhance the subject directory of this portal.
  7. Zhang, J.; Jastram, I.: ¬A study of the metadata creation behavior of different user groups on the Internet (2006) 0.01
    0.006762158 = product of:
      0.03381079 = sum of:
        0.03381079 = weight(_text_:it in 982) [ClassicSimilarity], result of:
          0.03381079 = score(doc=982,freq=2.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.22368698 = fieldWeight in 982, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.0546875 = fieldNorm(doc=982)
      0.2 = coord(1/5)
    
    Abstract
    Metadata is designed to improve information organization and information retrieval effectiveness and efficiency on the Internet. The way web publishers respond to metadata and the way they use it when publishing their web pages, however, is still a mystery. The authors of this paper aim to solve this mystery by defining different professional publisher groups, examining the behaviors of these user groups, and identifying the characteristics of their metadata use. This study will enhance the current understanding of metadata application behavior and provide evidence useful to researchers, web publishers, and search engine designers.
  8. Hansen, D.L.; Khopkar, T.; Zhang, J.: Recommender systems and expert locators (2009) 0.01
    0.006762158 = product of:
      0.03381079 = sum of:
        0.03381079 = weight(_text_:it in 3867) [ClassicSimilarity], result of:
          0.03381079 = score(doc=3867,freq=2.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.22368698 = fieldWeight in 3867, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3867)
      0.2 = coord(1/5)
    
    Abstract
    This entry describes two important classes of systems that facilitate the sharing of recommendations and expertise. Recommender systems suggest items of potential interest to individuals who do not have personal experience with the items. Expert locator systems, an important subset of recommender systems, help find people with the appropriate skills, knowledge, or expertise to meet a particular need. Research related to each of these systems is relatively new and extremely active. The use of these systems is likely to continue increasing as more and more activity is implicitly captured online, making it possible to automatically identify experts, and capture preferences that can be used to recommend items.
  9. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.01
    0.005915656 = product of:
      0.02957828 = sum of:
        0.02957828 = weight(_text_:it in 1211) [ClassicSimilarity], result of:
          0.02957828 = score(doc=1211,freq=12.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.19568534 = fieldWeight in 1211, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
      0.2 = coord(1/5)
    
    Abstract
    In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks. Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have gained more intensive attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model and their variants are well established.
    From the user's perspective, however, it is still difficult to use current information retrieval systems. Users frequently have problems expressing their information needs and translating those needs into queries. This is partly due to the fact that information needs cannot be expressed appropriately in systems terms. It is not unusual for users to input search terms that are different from the index terms information systems use. Various methods have been proposed to help users choose search terms and articulate queries. One widely used approach is to incorporate into the information system a thesaurus-like component that represents both the important concepts in a particular subject area and the semantic relationships among those concepts. Unfortunately, the development and use of thesauri is not without its own problems. The thesaurus employed in a specific information system has often been developed for a general subject area and needs significant enhancement to be tailored to the information system where it is to be used. This thesaurus development process, if done manually, is both time consuming and labor intensive. Usage of a thesaurus in searching is complex and may raise barriers for the user. For illustration purposes, let us consider two scenarios of thesaurus usage. In the first scenario the user inputs a search term and the thesaurus then displays a matching set of related terms. Without an overview of the thesaurus - and without the ability to see the matching terms in the context of other terms - it may be difficult to assess the quality of the related terms in order to select the correct term. In the second scenario the user browses the whole thesaurus, which is organized as in an alphabetically ordered list. The problem with this approach is that the list may be long, and neither does it show users the global semantic relationship among all the listed terms.
  10. Zhang, J.; Yu, Q.; Zheng, F.; Long, C.; Lu, Z.; Duan, Z.: Comparing keywords plus of WOS and author keywords : a case study of patient adherence research (2016) 0.01
    0.005796136 = product of:
      0.028980678 = sum of:
        0.028980678 = weight(_text_:it in 2857) [ClassicSimilarity], result of:
          0.028980678 = score(doc=2857,freq=2.0), product of:
            0.15115225 = queryWeight, product of:
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.052260913 = queryNorm
            0.19173169 = fieldWeight in 2857, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.892262 = idf(docFreq=6664, maxDocs=44218)
              0.046875 = fieldNorm(doc=2857)
      0.2 = coord(1/5)
    
    Abstract
    Bibliometric analysis based on literature in the Web of Science (WOS) has become an increasingly popular method for visualizing the structure of scientific fields. Keywords Plus and Author Keywords are commonly selected as units of analysis, despite the limited research evidence demonstrating the effectiveness of Keywords Plus. This study was conceived to evaluate the efficacy of Keywords Plus as a parameter for capturing the content and scientific concepts presented in articles. Using scientific papers about patient adherence that were retrieved from WOS, a comparative assessment of Keywords Plus and Author Keywords was performed at the scientific field level and the document level, respectively. Our search yielded more Keywords Plus terms than Author Keywords, and the Keywords Plus terms were more broadly descriptive. Keywords Plus is as effective as Author Keywords in terms of bibliometric analysis investigating the knowledge structure of scientific fields, but it is less comprehensive in representing an article's content.