Search (34 results, page 1 of 2)

  • Filter: author_ss:"Zhang, J."
  1. Zhang, J.; An, L.; Tang, T.; Hong, Y.: Visual health subject directory analysis based on users' traversal activities (2009) 0.02
    0.016717333 = product of:
      0.033434667 = sum of:
        0.0194429 = weight(_text_:information in 3112) [ClassicSimilarity], result of:
          0.0194429 = score(doc=3112,freq=8.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.23274569 = fieldWeight in 3112, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3112)
        0.013991767 = product of:
          0.027983533 = sum of:
            0.027983533 = weight(_text_:technology in 3112) [ClassicSimilarity], result of:
              0.027983533 = score(doc=3112,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.19744103 = fieldWeight in 3112, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3112)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
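    The indented breakdown above is a Lucene ClassicSimilarity (TF-IDF) explain tree. A minimal Python sketch, using only values printed in the tree (queryNorm, fieldNorm, document frequencies), reproduces its arithmetic; the helper functions are illustrative, not Lucene API calls.

    import math

    def idf(doc_freq, max_docs):
        # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
        return 1.0 + math.log(max_docs / (doc_freq + 1))

    def tf(freq):
        # ClassicSimilarity tf: square root of the in-field term frequency
        return math.sqrt(freq)

    query_norm = 0.047586527      # queryNorm from the tree
    field_norm = 0.046875         # fieldNorm(doc=3112) from the tree

    idf_information = idf(20772, 44218)                     # ~1.7554779
    query_weight = idf_information * query_norm             # ~0.083537094
    field_weight = tf(8.0) * idf_information * field_norm   # ~0.23274569
    information_score = query_weight * field_weight         # ~0.0194429

    technology_score = 0.027983533 * 0.5                      # inner coord(1/2)
    doc_score = (information_score + technology_score) * 0.5  # outer coord(2/4)
    print(round(doc_score, 9))                                # ~0.016717333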
    
    Abstract
    Concerns about health issues cover a wide spectrum. Consumer health information, which has become more available on the Internet, plays an extremely important role in addressing these concerns. A subject directory, as an information organization and browsing mechanism, is widely used in consumer health-related Websites. In this study we employed the information visualization technique Self-Organizing Map (SOM), in combination with a new U-matrix algorithm, to analyze health subject clusters through a Web transaction log. An experimental study was conducted to test the proposed methods. The findings show that in the visual SOM display the clusters identified from the same cells based on path-length-1 outperformed both the clusters from the adjacent cells based on path-length-1 and the clusters from the same cells based on path-length-2. The U-matrix method successfully used different colors to distinguish unrelated subjects situated in adjacent cells of the SOM display. The findings of this study lead to a better understanding of health-related subject relationships from the users' traversal perspective.
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.10, S.1977-1994
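    The SOM-plus-U-matrix analysis in no. 1 can be sketched with the third-party MiniSom library; the feature matrix below is a made-up stand-in for the traversal features the authors extract from the Web transaction log, so this is an illustration of the technique, not the authors' implementation.

    import numpy as np
    from minisom import MiniSom   # pip install minisom

    # Hypothetical input: one row per health subject, columns are traversal
    # features derived from a transaction log (e.g., session co-visit counts).
    rng = np.random.default_rng(0)
    subject_vectors = rng.random((120, 16))

    som = MiniSom(10, 10, input_len=16, sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(subject_vectors, num_iteration=5000)

    # U-matrix: each cell's mean distance to its neighbouring cells.  High
    # values mark boundaries between subject clusters; low values mark
    # coherent regions, which is what the visual cluster reading relies on.
    u_matrix = som.distance_map()

    # Subjects mapped to the same best-matching cell correspond to the
    # "same cell" clusters discussed in the abstract.
    cells = [som.winner(v) for v in subject_vectors]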
  2. Zhuge, H.; Zhang, J.: Topological centrality and its e-Science applications (2010) 0.02
    0.016181652 = product of:
      0.032363303 = sum of:
        0.016039573 = weight(_text_:information in 3984) [ClassicSimilarity], result of:
          0.016039573 = score(doc=3984,freq=4.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.1920054 = fieldWeight in 3984, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3984)
        0.016323728 = product of:
          0.032647457 = sum of:
            0.032647457 = weight(_text_:technology in 3984) [ClassicSimilarity], result of:
              0.032647457 = score(doc=3984,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.23034787 = fieldWeight in 3984, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3984)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Network structure analysis plays an important role in characterizing complex systems. Unlike previous network centrality measures, the topological centrality measure proposed in this article reflects the topological positions of nodes and edges as well as the influence between nodes and edges in a general network. Experiments on different networks show the distinguishing features of topological centrality in comparison with degree centrality, closeness centrality, betweenness centrality, information centrality, and PageRank. The topological centrality measure is then applied to discover communities and to construct the backbone network. Its characteristics and significance are further demonstrated in e-Science applications.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.9, S.1824-1841
  3. Zhang, J.; Zeng, M.L.: A new similarity measure for subject hierarchical structures (2014) 0.02
    0.016160354 = product of:
      0.032320708 = sum of:
        0.016202414 = weight(_text_:information in 1778) [ClassicSimilarity], result of:
          0.016202414 = score(doc=1778,freq=8.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.19395474 = fieldWeight in 1778, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1778)
        0.016118294 = product of:
          0.032236587 = sum of:
            0.032236587 = weight(_text_:22 in 1778) [ClassicSimilarity], result of:
              0.032236587 = score(doc=1778,freq=2.0), product of:
                0.16663991 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.047586527 = queryNorm
                0.19345059 = fieldWeight in 1778, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1778)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose - The purpose of this paper is to introduce a new similarity method to gauge the differences between two subject hierarchical structures. Design/methodology/approach - In the proposed similarity measure, the nodes of the two hierarchical structures are each projected onto a two-dimensional space, and both the structural similarity and the subject similarity of nodes contribute to the similarity between the two hierarchical structures. The extent to which structural similarity affects the overall similarity can be controlled by adjusting a parameter. An experiment was conducted to evaluate the soundness of the measure. Eight experts whose research interests were information retrieval and information organization participated in the study. Results from the new measure were compared with results from the experts. Findings - The evaluation shows strong correlations between the results from the new method and the results from the experts, suggesting that the similarity method achieved satisfactory results. Practical implications - Hierarchical structures found in subject directories, taxonomies, classification systems, and other classificatory structures play an extremely important role in information organization and information representation. Measuring the similarity between two subject hierarchical structures allows an accurate overarching understanding of the degree to which the two hierarchical structures are similar. Originality/value - Both the structural similarity and the subject similarity of nodes are considered in the proposed similarity method, and the extent to which structural similarity affects the overall similarity can be adjusted. In addition, a new evaluation method for hierarchical structure similarity is presented.
    Date
    8. 4.2015 16:22:13
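    A schematic reading of the measure in no. 3: the overall similarity blends a structural component with a subject component, with a parameter steering how much structure contributes. The linear blend below is our assumption for illustration; the paper's projection of nodes onto two dimensions and its exact combination rule are not reproduced here.

    def combined_similarity(structural_sim, subject_sim, alpha=0.5):
        # alpha in [0, 1] controls how strongly structural similarity
        # influences the overall hierarchy similarity (assumed linear blend).
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        return alpha * structural_sim + (1.0 - alpha) * subject_sim

    # Two hierarchies with very similar layouts (0.9) but only moderately
    # overlapping subject labels (0.6), weighting structure at 30 percent:
    print(combined_similarity(0.9, 0.6, alpha=0.3))   # 0.69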
  4. Wolfram, D.; Zhang, J.: The influence of indexing practices and weighting algorithms on document spaces (2008) 0.01
    0.013869986 = product of:
      0.027739972 = sum of:
        0.013748205 = weight(_text_:information in 1963) [ClassicSimilarity], result of:
          0.013748205 = score(doc=1963,freq=4.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.16457605 = fieldWeight in 1963, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1963)
        0.013991767 = product of:
          0.027983533 = sum of:
            0.027983533 = weight(_text_:technology in 1963) [ClassicSimilarity], result of:
              0.027983533 = score(doc=1963,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.19744103 = fieldWeight in 1963, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1963)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Index modeling and computer simulation techniques are used to examine the influence of indexing frequency distributions, indexing exhaustivity distributions, and three weighting methods on hypothetical document spaces in a vector-based information retrieval (IR) system. The way documents are indexed plays an important role in retrieval. The authors demonstrate the influence of different indexing characteristics on document space density (DSD) changes and document space discriminative capacity for IR. Document environments that contain a relatively higher percentage of infrequently occurring terms provide lower density outcomes than do environments where a higher percentage of frequently occurring terms exists. Different indexing exhaustivity levels, however, have little influence on the document space densities. A weighting algorithm that favors higher weights for infrequently occurring terms results in the lowest overall document space densities, which allows documents to be more readily differentiated from one another. This in turn can positively influence IR. The authors also discuss the influence on outcomes using two methods of normalization of term weights (i.e., means and ranges) for the different weighting methods.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.1, S.3-11
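    A sketch of the kind of measurement no. 4 reports: how a weighting scheme changes document space density. Density is operationalized here as the mean cosine similarity of each document to the space centroid, which is one common formulation and may differ from the paper's exact definition; the Zipf-like term counts are synthetic.

    import numpy as np

    def space_density(doc_term):
        # Mean cosine similarity of every document vector to the centroid.
        norms = np.linalg.norm(doc_term, axis=1, keepdims=True)
        unit = doc_term / np.clip(norms, 1e-12, None)
        centroid = unit.mean(axis=0)
        centroid /= np.linalg.norm(centroid)
        return float((unit @ centroid).mean())

    def idf_weight(doc_term):
        # Weight raw counts so infrequently occurring terms count for more.
        n_docs = doc_term.shape[0]
        df = (doc_term > 0).sum(axis=0)
        idf = np.log(n_docs / np.maximum(df, 1)) + 1.0
        return doc_term * idf

    rng = np.random.default_rng(1)
    expected = 5.0 / np.arange(1, 501)                  # Zipf-like term frequencies
    raw = rng.poisson(expected, size=(200, 500)).astype(float)

    print("raw tf density   :", round(space_density(raw), 4))
    print("idf-weighted     :", round(space_density(idf_weight(raw)), 4))
    # On skewed term distributions the idf-weighted space is typically less
    # dense, mirroring the finding that favouring infrequent terms spreads
    # documents apart and improves their discriminability.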
  5. Wolfram, D.; Wang, P.; Zhang, J.: Identifying Web search session patterns using cluster analysis : a comparison of three search environments (2009) 0.01
    0.013869986 = product of:
      0.027739972 = sum of:
        0.013748205 = weight(_text_:information in 2796) [ClassicSimilarity], result of:
          0.013748205 = score(doc=2796,freq=4.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.16457605 = fieldWeight in 2796, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2796)
        0.013991767 = product of:
          0.027983533 = sum of:
            0.027983533 = weight(_text_:technology in 2796) [ClassicSimilarity], result of:
              0.027983533 = score(doc=2796,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.19744103 = fieldWeight in 2796, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2796)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Session characteristics taken from large transaction logs of three Web search environments (an academic Web site, a public search engine, and a consumer health information portal) were modeled using cluster analysis to determine whether coherent session groups emerged for each environment and whether the types of session groups are similar across the three environments. The analysis revealed three distinct clusters of session behaviors common to each environment: hit-and-run sessions on focused topics, relatively brief sessions on popular topics, and sustained sessions using obscure terms with greater query modification. The findings also revealed shifts in session characteristics over time for one of the datasets, away from hit-and-run sessions toward more popular search topics. A better understanding of session characteristics can help system designers develop more responsive systems that support search features catering to identifiable groups of searchers based on their search behaviors. For example, a system may identify struggling searchers whose session behaviors match those identified in the current study and provide context-sensitive help.
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.5, S.896-910
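    A sketch of the session-clustering setup described in no. 5, using k-means on standardized session features. The feature set and the synthetic data are assumptions for illustration; the study derives its features from real transaction logs and compares three environments.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(7)
    # Hypothetical per-session features: queries per session, mean terms per
    # query, query modifications, and session duration in seconds.
    sessions = rng.gamma(shape=2.0, scale=[1.5, 1.2, 0.8, 120.0], size=(1000, 4))

    features = StandardScaler().fit_transform(sessions)
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)

    # Centroids in the original units help label the groups, e.g. short
    # hit-and-run sessions versus longer sessions with more reformulation.
    for k in range(3):
        print(k, sessions[labels == k].mean(axis=0).round(2))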
  6. Geng, Q.; Townley, C.; Huang, K.; Zhang, J.: Comparative knowledge management : a pilot study of Chinese and American universities (2005) 0.01
    0.01383271 = product of:
      0.02766542 = sum of:
        0.011341691 = weight(_text_:information in 3876) [ClassicSimilarity], result of:
          0.011341691 = score(doc=3876,freq=2.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.13576832 = fieldWeight in 3876, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3876)
        0.016323728 = product of:
          0.032647457 = sum of:
            0.032647457 = weight(_text_:technology in 3876) [ClassicSimilarity], result of:
              0.032647457 = score(doc=3876,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.23034787 = fieldWeight in 3876, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3876)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.10, S.1031-1044
  7. Zhang, J.; Wolfram, D.; Wang, P.: Analysis of query keywords of sports-related queries using visualization and clustering (2009) 0.01
    0.012845755 = product of:
      0.02569151 = sum of:
        0.0140317045 = weight(_text_:information in 2947) [ClassicSimilarity], result of:
          0.0140317045 = score(doc=2947,freq=6.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.16796975 = fieldWeight in 2947, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2947)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 2947) [ClassicSimilarity], result of:
              0.02331961 = score(doc=2947,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 2947, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2947)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The authors investigated 11 sports-related query keywords extracted from a public search engine query log to better understand sports-related information seeking on the Internet. After the query log contents were cleaned and query data were parsed, popular sports-related keywords were identified, along with frequently co-occurring query terms associated with the identified keywords. Relationships among each sports-related focus keyword and its related keywords were characterized and grouped using multidimensional scaling (MDS) in combination with traditional hierarchical clustering methods. The two approaches were synthesized in a visual context by highlighting the results of the hierarchical clustering analysis in the visual MDS configuration. Important events, people, subjects, merchandise, and so on related to a sport were illustrated, and relationships among the sports were analyzed. A small-scale comparative study of sports searches with and without term assistance was conducted. Searches that used search term assistance by relying on previous query term relationships outperformed the searches without the search term assistance. The findings of this study provide insights into sports information seeking behavior on the Internet. The developed method also may be applied to other query log subject areas.
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.8, S.1550-1571
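    A sketch of the combination no. 7 describes: multidimensional scaling of term dissimilarities overlaid with a hierarchical clustering of the same terms. The co-occurrence counts are random placeholders; in the study they come from a public search engine query log.

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform
    from sklearn.manifold import MDS

    rng = np.random.default_rng(3)
    n_terms = 20
    counts = np.triu(rng.integers(1, 50, size=(n_terms, n_terms)), 1)
    co_occurrence = counts + counts.T                 # symmetric co-occurrence counts
    dissimilarity = 1.0 - co_occurrence / co_occurrence.max()
    np.fill_diagonal(dissimilarity, 0.0)

    # 2-D visual configuration of the terms
    coords = MDS(n_components=2, dissimilarity="precomputed",
                 random_state=0).fit_transform(dissimilarity)

    # Hierarchical clustering on the same dissimilarities
    tree = linkage(squareform(dissimilarity, checks=False), method="average")
    groups = fcluster(tree, t=4, criterion="maxclust")

    # Colouring the MDS scatter plot of `coords` by `groups` reproduces the
    # idea of highlighting cluster membership inside the visual configuration.
    print(groups)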
  8. Zhang, J.; Zhai, S.; Liu, H.; Stevenson, J.A.: Social network analysis on a topic-based navigation guidance system in a public health portal (2016) 0.01
    0.012845755 = product of:
      0.02569151 = sum of:
        0.0140317045 = weight(_text_:information in 2887) [ClassicSimilarity], result of:
          0.0140317045 = score(doc=2887,freq=6.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.16796975 = fieldWeight in 2887, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2887)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 2887) [ClassicSimilarity], result of:
              0.02331961 = score(doc=2887,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 2887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2887)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    We investigated a topic-based navigation guidance system in the World Health Organization portal, compared the link connection network and the semantic connection network derived from the guidance system, analyzed the characteristics of the two networks from the perspective of node centrality (in_closeness, out_closeness, betweenness, in_degree, and out_degree), and provided suggestions for optimizing and enhancing the topic-based navigation guidance system. A mixed research method combining social network analysis, clustering analysis, and inferential analysis was used. The clustering analysis results of the link connection network were quite different from those of the semantic connection network, and there were significant differences between the two networks in terms of density and centrality. Inferential analysis results show that there were no strong correlations between the centrality of a node and its topic information characteristics. Suggestions for enhancing the navigation guidance system are discussed in detail. Future research directions, such as applying the same research method to other similar public health portals, are also included.
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.5, S.1068-1088
    Theme
    Information Gateway
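    The centrality measures listed in no. 8 can be computed with networkx on a directed graph. The toy topic network below is made up; the study builds its networks from the WHO portal's navigation guidance system.

    import networkx as nx

    G = nx.DiGraph()
    G.add_edges_from([
        ("home", "diseases"), ("home", "vaccines"), ("diseases", "zika"),
        ("diseases", "malaria"), ("vaccines", "zika"), ("zika", "prevention"),
        ("malaria", "prevention"), ("prevention", "home"),
    ])

    centrality = {
        "in_degree": nx.in_degree_centrality(G),
        "out_degree": nx.out_degree_centrality(G),
        "betweenness": nx.betweenness_centrality(G),
        "in_closeness": nx.closeness_centrality(G),             # incoming distances
        "out_closeness": nx.closeness_centrality(G.reverse()),  # outgoing distances
    }

    for name, values in centrality.items():
        top = max(values, key=values.get)
        print(f"{name:13s} top node: {top} ({values[top]:.3f})")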
  9. Li, D.; Luo, Z.; Ding, Y.; Tang, J.; Sun, G.G.-Z.; Dai, X.; Du, J.; Zhang, J.; Kong, S.: User-level microblogging recommendation incorporating social influence (2017) 0.01
    0.012845755 = product of:
      0.02569151 = sum of:
        0.0140317045 = weight(_text_:information in 3426) [ClassicSimilarity], result of:
          0.0140317045 = score(doc=3426,freq=6.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.16796975 = fieldWeight in 3426, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3426)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 3426) [ClassicSimilarity], result of:
              0.02331961 = score(doc=3426,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 3426, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3426)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    With the information overload of user-generated content in microblogging, users find it extremely challenging to browse and find valuable information on their first attempt. In this paper we propose a microblogging recommendation algorithm, TSI-MR (Topic-Level Social Influence-based Microblogging Recommendation), which can significantly improve users' microblogging experiences. The main innovation of the proposed algorithm is that we consider social influences and their indirect structural relationships, largely based on social status theory, at the topic level. The primary advantage of this approach is that it can build an accurate description of latent relationships between two users with weak connections, which improves the performance of the model; furthermore, it can alleviate the sparsity of the training data to a certain extent. The model is realized mainly with a factor graph, and we applied a distributed strategy to further improve its efficiency. Finally, we use data from Tencent Weibo, one of the most popular microblogging services in China, to evaluate our methods. The results show that incorporating social influence improves microblogging recommendation performance considerably and outperforms the baseline methods.
    Source
    Journal of the Association for Information Science and Technology. 68(2017) no.3, S.553-568
  10. Zhang, J.; Yu, Q.; Zheng, F.; Long, C.; Lu, Z.; Duan, Z.: Comparing keywords plus of WOS and author keywords : a case study of patient adherence research (2016) 0.01
    0.011856608 = product of:
      0.023713216 = sum of:
        0.00972145 = weight(_text_:information in 2857) [ClassicSimilarity], result of:
          0.00972145 = score(doc=2857,freq=2.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.116372846 = fieldWeight in 2857, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2857)
        0.013991767 = product of:
          0.027983533 = sum of:
            0.027983533 = weight(_text_:technology in 2857) [ClassicSimilarity], result of:
              0.027983533 = score(doc=2857,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.19744103 = fieldWeight in 2857, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2857)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.4, S.967-972
  11. Zhang, J.; Wolfram, D.; Wang, P.; Hong, Y.; Gillis, R.: Visualization of health-subject analysis based on query term co-occurrences (2008) 0.01
    0.011558321 = product of:
      0.023116643 = sum of:
        0.011456838 = weight(_text_:information in 2376) [ClassicSimilarity], result of:
          0.011456838 = score(doc=2376,freq=4.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.13714671 = fieldWeight in 2376, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2376)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 2376) [ClassicSimilarity], result of:
              0.02331961 = score(doc=2376,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 2376, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2376)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    A multidimensional-scaling approach is used to analyze frequently used medical-topic terms in queries submitted to a Web-based consumer health information system. Based on a year-long transaction log file, five medical focus keywords (stomach, hip, stroke, depression, and cholesterol) and their co-occurring query terms are analyzed. An overlap-coefficient similarity measure and a conversion measure are used to calculate the proximity of terms to one another based on their co-occurrences in queries. The impact of the dimensionality of the visual configuration, the cutoff point of term co-occurrence for inclusion in the analysis, and the Minkowski metric power k on the stress value are discussed. A visual clustering of groups of terms based on the proximity within each focus-keyword group is also conducted. Term distributions within each visual configuration are characterized and are compared with formal medical vocabulary. This investigation reveals that there are significant differences between consumer health query-term usage and more formal medical terminology used by medical professionals when describing the same medical subject. Future directions are discussed.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.12, S.1933-1947
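    The overlap-coefficient similarity named in no. 11 divides the co-occurrence count of two terms by the smaller of their individual frequencies; a conversion measure then turns similarity into the distance input MDS expects. The counts below are hypothetical, and the simple 1 - similarity conversion is only one possible choice.

    def overlap_coefficient(co_occurrences, freq_a, freq_b):
        # Co-occurrence count normalised by the smaller term frequency.
        smaller = min(freq_a, freq_b)
        return 0.0 if smaller == 0 else co_occurrences / smaller

    def to_distance(similarity):
        # One simple conversion from similarity to dissimilarity.
        return 1.0 - similarity

    # Suppose "stroke" appears in 950 queries, "depression" in 1400,
    # and the two co-occur in 85 queries:
    sim = overlap_coefficient(85, 950, 1400)
    print(round(sim, 3), round(to_distance(sim), 3))   # 0.089 0.911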
  12. Liu, X.; Zhang, J.; Guo, C.: Full-text citation analysis : a new method to enhance scholarly networks (2013) 0.01
    0.011558321 = product of:
      0.023116643 = sum of:
        0.011456838 = weight(_text_:information in 1044) [ClassicSimilarity], result of:
          0.011456838 = score(doc=1044,freq=4.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.13714671 = fieldWeight in 1044, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1044)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 1044) [ClassicSimilarity], result of:
              0.02331961 = score(doc=1044,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 1044, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1044)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In this article, we use innovative full-text citation analysis along with supervised topic modeling and network-analysis algorithms to enhance classical bibliometric analysis and publication/author/venue ranking. By utilizing citation contexts extracted from a large number of full-text publications, each citation or publication is represented by a probability distribution over a set of predefined topics, where each topic is labeled by an author-contributed keyword. We then use the publication/citation topic distributions to generate a citation graph with vertex prior and edge transition probability distributions. The publication importance score for each given topic is calculated by PageRank with edge and vertex prior distributions. To evaluate this work, we sampled 104 topics (labeled with keywords) in review papers. The cited publications of each review paper are assumed to be "important publications" for the target topic (keyword), and we use these cited publications to validate our topic-ranking results and to compare different publication-ranking lists. Evaluation results show that full-text citation and publication content prior topic distributions, along with the classical PageRank algorithm, can significantly enhance bibliometric analysis and scientific publication ranking performance compared with term frequency-inverse document frequency (tf-idf), language model, BM25, PageRank, and PageRank + language model (p < .001) for academic information retrieval (IR) systems.
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.9, S.1852-1863
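    The ranking step in no. 12 runs PageRank over a citation graph with a vertex prior and weighted edge transitions. The sketch below uses networkx's personalized PageRank as a stand-in; the toy graph, topic prior, and edge weights are invented for illustration and do not come from the paper.

    import networkx as nx

    G = nx.DiGraph()
    # Edge weight ~ share of the citing paper's citation-context mass that the
    # cited paper receives for the chosen topic.
    G.add_weighted_edges_from([
        ("p1", "p2", 0.7), ("p1", "p3", 0.3),
        ("p2", "p3", 1.0), ("p4", "p2", 0.6), ("p4", "p1", 0.4),
    ])

    # Vertex prior: how strongly each publication's own content matches the topic.
    topic_prior = {"p1": 0.1, "p2": 0.5, "p3": 0.3, "p4": 0.1}

    ranking = nx.pagerank(G, alpha=0.85, personalization=topic_prior, weight="weight")
    for paper, score in sorted(ranking.items(), key=lambda kv: -kv[1]):
        print(paper, round(score, 3))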
  13. Li, D.; Tang, J.; Ding, Y.; Shuai, X.; Chambers, T.; Sun, G.; Luo, Z.; Zhang, J.: Topic-level opinion influence model (TOIM) : an investigation using tencent microblogging (2015) 0.01
    0.011558321 = product of:
      0.023116643 = sum of:
        0.011456838 = weight(_text_:information in 2345) [ClassicSimilarity], result of:
          0.011456838 = score(doc=2345,freq=4.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.13714671 = fieldWeight in 2345, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2345)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 2345) [ClassicSimilarity], result of:
              0.02331961 = score(doc=2345,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 2345, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2345)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Text mining has been widely used on multiple types of user-generated data to infer user opinion, but its application to microblogging is difficult because text messages are short and noisy, providing limited information about user opinion. Given that microblogging users communicate with each other to form a social network, we hypothesize that a user's opinion is influenced by his or her neighbors in the network. In this paper, we infer a user's opinion on a topic by combining two factors: the user's historical opinion about relevant topics and the opinion influence from his/her neighbors. We thus build a topic-level opinion influence model (TOIM) by integrating both the topic factor and the opinion influence factor into a unified probabilistic model. We evaluate our model on one of the largest microblogging sites in China, Tencent Weibo, and the experiments show that TOIM outperforms baseline methods in opinion inference accuracy. Moreover, incorporating indirect influence further improves inference recall and F1-measure. Finally, we demonstrate some useful applications of TOIM in analyzing users' behaviors on Tencent Weibo.
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.12, S.2657-2673
  14. Zhang, J.; Chen, Y.; Zhao, Y.; Wolfram, D.; Ma, F.: Public health and social media : a study of Zika virus-related posts on Yahoo! Answers (2020) 0.01
    0.011558321 = product of:
      0.023116643 = sum of:
        0.011456838 = weight(_text_:information in 5672) [ClassicSimilarity], result of:
          0.011456838 = score(doc=5672,freq=4.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.13714671 = fieldWeight in 5672, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5672)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 5672) [ClassicSimilarity], result of:
              0.02331961 = score(doc=5672,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 5672, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5672)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    This study investigates the content of questions and responses about the Zika virus on Yahoo! Answers as a recent example of how public concerns regarding an international health issue are reflected in social media. We investigate the contents of posts about the Zika virus on Yahoo! Answers, identify and reveal subject patterns about the Zika virus, and analyze the temporal changes of the revealed subject topics over four defined periods of the Zika virus outbreak. Multidimensional scaling analysis, temporal analysis, and inferential statistical analysis approaches were used in the study. The resulting two-layer Zika virus schema and the associated term connections and relationships are presented. The results indicate that consumers' concerns changed over the four defined periods. At the beginning of the outbreak, consumers paid more attention to basic information about the Zika virus and to prevention of and protection from the virus. During the later periods, consumers became more interested in the role that the government and health organizations played in the public health emergency.
    Source
    Journal of the Association for Information Science and Technology. 71(2020) no.3, S.282-299
  15. Zhang, J.; Wolfram, D.: Visualization of term discrimination analysis (2001) 0.01
    0.0098805055 = product of:
      0.019761011 = sum of:
        0.008101207 = weight(_text_:information in 5210) [ClassicSimilarity], result of:
          0.008101207 = score(doc=5210,freq=2.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.09697737 = fieldWeight in 5210, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5210)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 5210) [ClassicSimilarity], result of:
              0.02331961 = score(doc=5210,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 5210, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5210)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Source
    Journal of the American Society for Information Science and Technology. 52(2001) no.8, S.615-627
  16. Wolfram, D.; Zhang, J.: An investigation of the influence of indexing exhaustivity and term distributions on a document space (2002) 0.01
    0.0098805055 = product of:
      0.019761011 = sum of:
        0.008101207 = weight(_text_:information in 5238) [ClassicSimilarity], result of:
          0.008101207 = score(doc=5238,freq=2.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.09697737 = fieldWeight in 5238, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5238)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 5238) [ClassicSimilarity], result of:
              0.02331961 = score(doc=5238,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 5238, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5238)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.11, S.944-952
  17. Zhang, J.; Zhai, S.; Stevenson, J.A.; Xia, L.: Optimization of the subject directory in a government agriculture department web portal (2016) 0.01
    0.0098805055 = product of:
      0.019761011 = sum of:
        0.008101207 = weight(_text_:information in 3088) [ClassicSimilarity], result of:
          0.008101207 = score(doc=3088,freq=2.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.09697737 = fieldWeight in 3088, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3088)
        0.011659805 = product of:
          0.02331961 = sum of:
            0.02331961 = weight(_text_:technology in 3088) [ClassicSimilarity], result of:
              0.02331961 = score(doc=3088,freq=2.0), product of:
                0.1417311 = queryWeight, product of:
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.047586527 = queryNorm
                0.16453418 = fieldWeight in 3088, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.978387 = idf(docFreq=6114, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3088)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.9, S.2166-2180
  18. Zhang, J.: TOFIR: A tool of facilitating information retrieval : introduce a visual retrieval model (2001) 0.01
    0.008019786 = product of:
      0.032079145 = sum of:
        0.032079145 = weight(_text_:information in 7711) [ClassicSimilarity], result of:
          0.032079145 = score(doc=7711,freq=4.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.3840108 = fieldWeight in 7711, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.109375 = fieldNorm(doc=7711)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 37(2001) no.4, S.639-657
  19. Zhang, J.; Dimitroff, A.: Internet search engines' response to Metadata Dublin Core implementation (2005) 0.01
    0.0056708455 = product of:
      0.022683382 = sum of:
        0.022683382 = weight(_text_:information in 4652) [ClassicSimilarity], result of:
          0.022683382 = score(doc=4652,freq=2.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.27153665 = fieldWeight in 4652, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.109375 = fieldNorm(doc=4652)
      0.25 = coord(1/4)
    
    Source
    Journal of Information Science. 30(2005) no.4, S.310-
  20. Zhang, L.; Liu, Q.L.; Zhang, J.; Wang, H.F.; Pan, Y.; Yu, Y.: Semplore: an IR approach to scalable hybrid query of Semantic Web data (2007) 0.01
    0.005358445 = product of:
      0.02143378 = sum of:
        0.02143378 = weight(_text_:information in 231) [ClassicSimilarity], result of:
          0.02143378 = score(doc=231,freq=14.0), product of:
            0.083537094 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047586527 = queryNorm
            0.256578 = fieldWeight in 231, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=231)
      0.25 = coord(1/4)
    
    Abstract
    As an extension to the current Web, the Semantic Web will not only contain structured data with machine-understandable semantics but also textual information. While structured queries can be used to find information more precisely on the Semantic Web, keyword searches are still needed to help exploit textual information. It thus becomes very important that we can combine precise structured queries with imprecise keyword searches to have a hybrid query capability. In addition, due to the huge volume of information on the Semantic Web, the hybrid query must be processed in a very scalable way. In this paper, we define such a hybrid query capability that combines unary tree-shaped structured queries with keyword searches. We show how existing information retrieval (IR) index structures and functions can be reused to index Semantic Web data and its textual information, and how the hybrid query is evaluated on the index structure using IR engines in an efficient and scalable manner. We implemented this IR approach in an engine called Semplore. Comprehensive experiments on its performance show that it is a promising approach. It leads us to believe that it may be possible to evolve current Web search engines to query and search the Semantic Web. Finally, we briefly describe how Semplore is used for searching Wikipedia and an IBM customer's product information.