Search (36 results, page 1 of 2)

  • author_ss:"Zhang, J."
  1. Zhang, J.; Wolfram, D.; Wang, P.; Hong, Y.; Gillis, R.: Visualization of health-subject analysis based on query term co-occurrences (2008) 0.02
    0.015027763 = product of:
      0.041326348 = sum of:
        0.004782719 = weight(_text_:a in 2376) [ClassicSimilarity], result of:
          0.004782719 = score(doc=2376,freq=12.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.15602624 = fieldWeight in 2376, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2376)
        0.016092705 = weight(_text_:r in 2376) [ClassicSimilarity], result of:
          0.016092705 = score(doc=2376,freq=2.0), product of:
            0.088001914 = queryWeight, product of:
              3.3102584 = idf(docFreq=4387, maxDocs=44218)
              0.026584605 = queryNorm
            0.18286766 = fieldWeight in 2376, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3102584 = idf(docFreq=4387, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2376)
        0.0017360178 = weight(_text_:s in 2376) [ClassicSimilarity], result of:
          0.0017360178 = score(doc=2376,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.060061958 = fieldWeight in 2376, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2376)
        0.018714907 = weight(_text_:k in 2376) [ClassicSimilarity], result of:
          0.018714907 = score(doc=2376,freq=2.0), product of:
            0.09490114 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.026584605 = queryNorm
            0.19720423 = fieldWeight in 2376, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2376)
      0.36363637 = coord(4/11)
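The tree above is plain TF-IDF arithmetic. The following is a minimal Python sketch that reproduces the score of result 1, assuming the usual Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The values of queryNorm and fieldNorm are copied from the output rather than recomputed, and the function names are illustrative only.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.026584605   # copied from the explain output, not recomputed
FIELD_NORM = 0.0390625     # fieldNorm(doc=2376), also copied from the output


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """score = queryWeight * fieldWeight for a single matching term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (termFreq, docFreq) for the four matching terms a, r, s, k in doc 2376
terms = [(12.0, 37942), (2.0, 4387), (2.0, 40523), (2.0, 3384)]

total = sum(term_score(freq, df, FIELD_NORM) for freq, df in terms)
doc_score = total * (4 / 11)   # coord(4/11): 4 of 11 query clauses matched
print(round(doc_score, 6))
```

Lucene works in 32-bit floats, so the last digits can differ slightly from a double-precision recomputation.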
    
    Abstract
    A multidimensional-scaling approach is used to analyze frequently used medical-topic terms in queries submitted to a Web-based consumer health information system. Based on a year-long transaction log file, five medical focus keywords (stomach, hip, stroke, depression, and cholesterol) and their co-occurring query terms are analyzed. An overlap-coefficient similarity measure and a conversion measure are used to calculate the proximity of terms to one another based on their co-occurrences in queries. The impact of the dimensionality of the visual configuration, the cutoff point of term co-occurrence for inclusion in the analysis, and the Minkowski metric power k on the stress value are discussed. A visual clustering of groups of terms based on the proximity within each focus-keyword group is also conducted. Term distributions within each visual configuration are characterized and are compared with formal medical vocabulary. This investigation reveals that there are significant differences between consumer health query-term usage and more formal medical terminology used by medical professionals when describing the same medical subject. Future directions are discussed.
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.12, S.1933-1947
    Type
    a
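The abstract of result 1 mentions an overlap-coefficient similarity measure computed from term co-occurrences in queries. As a rough illustration only, the sketch below uses the standard overlap coefficient |A ∩ B| / min(|A|, |B|) over sets of query IDs. The query log is hypothetical, and the paper's exact formulation and its conversion measure are not given in the abstract.

```python
def overlap_coefficient(a, b):
    """Standard overlap coefficient of two sets; 0.0 if either is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Hypothetical query log (not taken from the study)
queries = [
    "stomach pain",
    "stomach ulcer",
    "stomach pain relief",
    "ulcer treatment",
]

# Map each term to the set of query IDs in which it occurs
postings = {}
for query_id, query in enumerate(queries):
    for term in query.split():
        postings.setdefault(term, set()).add(query_id)

stomach_pain = overlap_coefficient(postings["stomach"], postings["pain"])
stomach_ulcer = overlap_coefficient(postings["stomach"], postings["ulcer"])
print(stomach_pain, stomach_ulcer)
```

A similarity of this kind would typically be converted to a dissimilarity before being passed to multidimensional scaling; how the authors did so is not stated here.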
  2. Zhang, J.; An, L.; Tang, T.; Hong, Y.: Visual health subject directory analysis based on users' traversal activities (2009) 0.01
    0.009421353 = product of:
      0.03454496 = sum of:
        0.0057392623 = weight(_text_:a in 3112) [ClassicSimilarity], result of:
          0.0057392623 = score(doc=3112,freq=12.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.18723148 = fieldWeight in 3112, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=3112)
        0.0020832212 = weight(_text_:s in 3112) [ClassicSimilarity], result of:
          0.0020832212 = score(doc=3112,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.072074346 = fieldWeight in 3112, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=3112)
        0.026722478 = weight(_text_:u in 3112) [ClassicSimilarity], result of:
          0.026722478 = score(doc=3112,freq=4.0), product of:
            0.08704981 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.026584605 = queryNorm
            0.30697915 = fieldWeight in 3112, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.046875 = fieldNorm(doc=3112)
      0.27272728 = coord(3/11)
    
    Abstract
    Concerns about health issues cover a wide spectrum. Consumer health information, which has become more available on the Internet, plays an extremely important role in addressing these concerns. A subject directory as an information organization and browsing mechanism is widely used in consumer health-related Websites. In this study we employed the information visualization technique Self-Organizing Map (SOM) in combination with a new U-matrix algorithm to analyze health subject clusters through a Web transaction log. An experimental study was conducted to test the proposed methods. The findings show that the clusters identified from the same cells based on path-length-1 outperformed both the clusters from the adjacent cells based on path-length-1 and the clusters from the same cells based on path-length-2 in the visual SOM display. The U-matrix method successfully distinguished the irrelevant subjects situated in the adjacent cells with different colors in the SOM display. The findings of this study lead to a better understanding of the health-related subject relationship from the users' traversal perspective.
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.10, S.1977-1994
    Type
    a
  3. Geng, Q.; Townley, C.; Huang, K.; Zhang, J.: Comparative knowledge management : a pilot study of Chinese and American universities (2005) 0.01
    0.009099804 = product of:
      0.033365946 = sum of:
        0.0047346503 = weight(_text_:a in 3876) [ClassicSimilarity], result of:
          0.0047346503 = score(doc=3876,freq=6.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.1544581 = fieldWeight in 3876, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3876)
        0.0024304248 = weight(_text_:s in 3876) [ClassicSimilarity], result of:
          0.0024304248 = score(doc=3876,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.08408674 = fieldWeight in 3876, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3876)
        0.02620087 = weight(_text_:k in 3876) [ClassicSimilarity], result of:
          0.02620087 = score(doc=3876,freq=2.0), product of:
            0.09490114 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.026584605 = queryNorm
            0.27608594 = fieldWeight in 3876, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3876)
      0.27272728 = coord(3/11)
    
    Abstract
    Comparative study of knowledge management (KM) promises to lead to more effective knowledge use in all cultural environments. This pilot study compares KM priorities, needs, tools, and administrative structure components in large Chinese and American universities. General KM theory and literature related to KM in higher education are analyzed to develop the four components of the study. Comparative differences in KM practice at large Chinese and American universities are analyzed for each component. A correlation matrix reveals statistically significant co-variation among all but one of the study components. Four conclusions related to comparative KM and suggestions for future research are presented.
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.10, S.1031-1044
    Type
    a
  4. Gao, J.; Zhang, J.: Clustered SVD strategies in latent semantic indexing (2005) 0.01
    0.00772941 = product of:
      0.02834117 = sum of:
        0.003865826 = weight(_text_:a in 1166) [ClassicSimilarity], result of:
          0.003865826 = score(doc=1166,freq=4.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.12611452 = fieldWeight in 1166, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1166)
        0.0024304248 = weight(_text_:s in 1166) [ClassicSimilarity], result of:
          0.0024304248 = score(doc=1166,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.08408674 = fieldWeight in 1166, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1166)
        0.02204492 = weight(_text_:u in 1166) [ClassicSimilarity], result of:
          0.02204492 = score(doc=1166,freq=2.0), product of:
            0.08704981 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.026584605 = queryNorm
            0.25324488 = fieldWeight in 1166, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1166)
      0.27272728 = coord(3/11)
    
    Abstract
    The text retrieval method using the latent semantic indexing (LSI) technique with truncated singular value decomposition (SVD) has been intensively studied in recent years. The SVD reduces the noise contained in the original representation of the term-document matrix and improves the information retrieval accuracy. Recent studies indicate that SVD is mostly useful for small homogeneous data collections. For large inhomogeneous datasets, the performance of the SVD-based text retrieval technique may deteriorate. We propose to partition a large inhomogeneous dataset into several smaller ones with clustered structure, on which we apply the truncated SVD. Our experimental results show that the clustered SVD strategies may enhance the retrieval accuracy and reduce the computing and storage costs.
    Source
    Information processing and management. 41(2005) no.5, S.1051-1064
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
    Type
    a
  5. Zhang, L.; Liu, Q.L.; Zhang, J.; Wang, H.F.; Pan, Y.; Yu, Y.: Semplore: an IR approach to scalable hybrid query of Semantic Web data (2007) 0.01
    0.006768254 = product of:
      0.02481693 = sum of:
        0.0043660053 = weight(_text_:a in 231) [ClassicSimilarity], result of:
          0.0043660053 = score(doc=231,freq=10.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.14243183 = fieldWeight in 231, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=231)
        0.0017360178 = weight(_text_:s in 231) [ClassicSimilarity], result of:
          0.0017360178 = score(doc=231,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.060061958 = fieldWeight in 231, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=231)
        0.018714907 = weight(_text_:k in 231) [ClassicSimilarity], result of:
          0.018714907 = score(doc=231,freq=2.0), product of:
            0.09490114 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.026584605 = queryNorm
            0.19720423 = fieldWeight in 231, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=231)
      0.27272728 = coord(3/11)
    
    Abstract
    As an extension to the current Web, the Semantic Web will not only contain structured data with machine-understandable semantics but also textual information. While structured queries can be used to find information more precisely on the Semantic Web, keyword searches are still needed to help exploit textual information. It thus becomes very important that we can combine precise structured queries with imprecise keyword searches to have a hybrid query capability. In addition, due to the huge volume of information on the Semantic Web, the hybrid query must be processed in a very scalable way. In this paper, we define such a hybrid query capability that combines unary tree-shaped structured queries with keyword searches. We show how existing information retrieval (IR) index structures and functions can be reused to index semantic web data and its textual information, and how the hybrid query is evaluated on the index structure using IR engines in an efficient and scalable manner. We implemented this IR approach in an engine called Semplore. Comprehensive experiments on its performance show that it is a promising approach. It leads us to believe that it may be possible to evolve current web search engines to query and search the Semantic Web. Finally, we briefly describe how Semplore is used for searching Wikipedia and an IBM customer's product information.
    Pages
    S.652-665
    Source
    Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC'07/ASWC'07). Ed.: K. Aberer et al.
    Type
    a
  6. Chen, C.; Ibekwe-SanJuan, F.; Pinho, R.; Zhang, J.: The impact of the Sloan Digital Sky Survey on astronomical research : the role of culture, identity, and international collaboration (2008) 0.01
    0.006738554 = product of:
      0.02470803 = sum of:
        0.0033135647 = weight(_text_:a in 2275) [ClassicSimilarity], result of:
          0.0033135647 = score(doc=2275,freq=4.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.10809815 = fieldWeight in 2275, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2275)
        0.019311246 = weight(_text_:r in 2275) [ClassicSimilarity], result of:
          0.019311246 = score(doc=2275,freq=2.0), product of:
            0.088001914 = queryWeight, product of:
              3.3102584 = idf(docFreq=4387, maxDocs=44218)
              0.026584605 = queryNorm
            0.2194412 = fieldWeight in 2275, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3102584 = idf(docFreq=4387, maxDocs=44218)
              0.046875 = fieldNorm(doc=2275)
        0.0020832212 = weight(_text_:s in 2275) [ClassicSimilarity], result of:
          0.0020832212 = score(doc=2275,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.072074346 = fieldWeight in 2275, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=2275)
      0.27272728 = coord(3/11)
    
    Content
    We investigate the influence of culture and identity (geographic location) on the constitution of a specific research field. Using as a case study the Sloan Digital Sky Survey (SDSS) project in the field of astronomy, we analyzed texts from bibliographic records of publications along three cultural and geographic axes: US-only publications, non-US publications, and international collaboration. Using three text mining systems (CiteSpace, TermWatch and PEx), we were able to automatically identify the topics specific to each cultural and geographic region as well as isolate the core research topics common to all geographic zones. The results tended to show that US-only and non-US research in this field shared more commonalities with international collaboration than with one another, thus indicating that the former two (US-only and non-US) focused on rather distinct topics.
    Pages
    S.307-312
    Type
    a
  7. An, L.; Zhang, J.; Yu, C.: The visual subject analysis of library and information science journals with self-organizing map (2011) 0.01
    0.0058329445 = product of:
      0.021387463 = sum of:
        0.0039050733 = weight(_text_:a in 4613) [ClassicSimilarity], result of:
          0.0039050733 = score(doc=4613,freq=8.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.12739488 = fieldWeight in 4613, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4613)
        0.0017360178 = weight(_text_:s in 4613) [ClassicSimilarity], result of:
          0.0017360178 = score(doc=4613,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.060061958 = fieldWeight in 4613, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4613)
        0.015746372 = weight(_text_:u in 4613) [ClassicSimilarity], result of:
          0.015746372 = score(doc=4613,freq=2.0), product of:
            0.08704981 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.026584605 = queryNorm
            0.1808892 = fieldWeight in 4613, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4613)
      0.27272728 = coord(3/11)
    
    Abstract
    Academic journals play an important role in scientific communication. The effective organization of journals can help reveal the thematic contents of journals and thus make them more user-friendly. In this study, the Self-Organizing Map (SOM) technique was employed to visually analyze the 60 library and information science-related journals published from 2006 to 2008. The U-matrix by Ultsch (2003) was applied to categorize the journals into 19 clusters according to their subjects. Four journals were recommended to supplement library collections although they were not indexed by SCI/SSCI. A novel SOM display named Attribute Accumulation Matrix (AA-matrix) was proposed, and the results from this method show that they correlate significantly with the total occurrences of the subjects in the investigated journals. The AA-matrix was employed to identify the 86 salient subjects, which could be manually classified into 7 meaningful groups. A method of the Salient Attribute Projection was constructed to label the attribute characteristics of different clusters. Finally, the subject characteristics of the journals with high impact factors (IFs) were also addressed. The findings of this study can lead to a better understanding of the subject structure and characteristics of library/information-related journals.
    Source
    Knowledge organization. 38(2011) no.4, S.299-320
    Type
    a
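The abstract of result 7 applies a U-matrix to a trained Self-Organizing Map in order to find cluster boundaries. As an illustration only, the sketch below computes a common textbook form of the U-matrix, in which the height of each cell is the mean Euclidean distance between its weight vector and those of its 4-connected grid neighbours. The grid and weights are hypothetical, and this is not claimed to be the exact variant used in the study.

```python
import math


def u_matrix(weights):
    """Return the mean distance from each SOM cell to its grid neighbours.

    weights[r][c] is the weight vector of the cell at row r, column c.
    """
    rows, cols = len(weights), len(weights[0])
    heights = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            distances = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    distances.append(math.dist(weights[r][c], weights[nr][nc]))
            # A 1x1 map has no neighbours; leave its height at 0.0
            if distances:
                heights[r][c] = sum(distances) / len(distances)
    return heights


# Hypothetical 2x3 map: the right-hand column lies far from the other two
grid = [
    [(0.0, 0.0), (0.0, 1.0), (5.0, 1.0)],
    [(0.0, 0.0), (0.0, 1.0), (5.0, 1.0)],
]
heights = u_matrix(grid)
print(heights[0])
```

High values mark cells that sit on a boundary between clusters, which is what a colour-coded U-matrix display makes visible.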
  8. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.01
    0.005812889 = product of:
      0.015985444 = sum of:
        0.002266238 = product of:
          0.004532476 = sum of:
            0.004532476 = weight(_text_:h in 1211) [ClassicSimilarity], result of:
              0.004532476 = score(doc=1211,freq=2.0), product of:
                0.0660481 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.026584605 = queryNorm
                0.06862386 = fieldWeight in 1211, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=1211)
          0.5 = coord(1/2)
        0.004978012 = weight(_text_:a in 1211) [ClassicSimilarity], result of:
          0.004978012 = score(doc=1211,freq=52.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.16239727 = fieldWeight in 1211, product of:
              7.2111025 = tf(freq=52.0), with freq of:
                52.0 = termFreq=52.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
        8.680089E-4 = weight(_text_:s in 1211) [ClassicSimilarity], result of:
          8.680089E-4 = score(doc=1211,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.030030979 = fieldWeight in 1211, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
        0.007873186 = weight(_text_:u in 1211) [ClassicSimilarity], result of:
          0.007873186 = score(doc=1211,freq=2.0), product of:
            0.08704981 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.026584605 = queryNorm
            0.0904446 = fieldWeight in 1211, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
      0.36363637 = coord(4/11)
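This tree contains a nested Boolean group with its own coord factor. A small Python sketch of how the two levels combine is given below, using the clause scores printed above as inputs. The helper name `combine` is illustrative and not part of Lucene.

```python
def combine(clause_scores, matched, total):
    """Sum the matching clause scores, then scale by coord = matched / total."""
    return sum(clause_scores) * (matched / total)


# Inner group: only the "h" clause matched, one of two optional clauses
inner = combine([0.004532476], matched=1, total=2)

# Outer query: the inner group plus the "a", "s" and "u" clauses, 4 of 11 matched
outer = combine([inner, 0.004978012, 8.680089e-4, 0.007873186],
                matched=4, total=11)
print(round(inner, 9), round(outer, 9))
```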
    
    Abstract
    In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks. Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have gained more intensive attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model and their variants are well established.
    From the user's perspective, however, it is still difficult to use current information retrieval systems. Users frequently have problems expressing their information needs and translating those needs into queries. This is partly due to the fact that information needs cannot be expressed appropriately in system terms. It is not unusual for users to input search terms that are different from the index terms information systems use. Various methods have been proposed to help users choose search terms and articulate queries. One widely used approach is to incorporate into the information system a thesaurus-like component that represents both the important concepts in a particular subject area and the semantic relationships among those concepts. Unfortunately, the development and use of thesauri is not without its own problems. The thesaurus employed in a specific information system has often been developed for a general subject area and needs significant enhancement to be tailored to the information system where it is to be used. This thesaurus development process, if done manually, is both time consuming and labor intensive. Usage of a thesaurus in searching is complex and may raise barriers for the user. For illustration purposes, let us consider two scenarios of thesaurus usage. In the first scenario the user inputs a search term and the thesaurus then displays a matching set of related terms. Without an overview of the thesaurus - and without the ability to see the matching terms in the context of other terms - it may be difficult to assess the quality of the related terms in order to select the correct term. In the second scenario the user browses the whole thesaurus, which is organized as an alphabetically ordered list. The problem with this approach is that the list may be long, and it does not show users the global semantic relationship among all the listed terms.
    Nevertheless, because thesaurus use has been shown to improve retrieval, for our method we integrate functions in the search interface that permit users to explore built-in search vocabularies to improve retrieval from digital libraries. Our method automatically generates the terms and their semantic relationships representing relevant topics covered in a digital library. We call these generated terms the "concepts", and the generated terms and their semantic relationships we call the "concept space". Additionally, we used a visualization technique to display the concept space and allow users to interact with this space. The automatically generated term set is considered to be more representative of the subject area in a corpus than an "externally" imposed thesaurus, and our method has the potential of saving a significant amount of time and labor for those who have been manually creating thesauri as well. Information visualization is an emerging discipline and has developed very quickly in the last decade. With growing volumes of documents and associated complexities, information visualization has become increasingly important. Researchers have found information visualization to be an effective way to use and understand information while minimizing a user's cognitive load. Our work was based on an algorithmic approach of concept discovery and association. Concepts are discovered using an algorithm based on an automated thesaurus generation procedure. Subsequently, similarities among terms are computed using the cosine measure, and the associations among terms are established using a method known as max-min distance clustering. The concept space is then visualized in a spring embedding graph, which roughly shows the semantic relationships among concepts in a 2-D visual representation. The semantic space of the visualization is used as a medium for users to retrieve the desired documents.
    In the remainder of this article, we present our algorithmic approach of concept generation and clustering, followed by description of the visualization technique and interactive interface. The paper ends with key conclusions and discussions on future work.
    Content
    The JAVA applet is available at <http://ella.slis.indiana.edu/~junzhang/dlib/IV.html>. A prototype of this interface has been developed and is available at <http://ella.slis.indiana.edu/~junzhang/dlib/IV.html>. The D-Lib search interface is available at <http://www.dlib.org/Architext/AT-dlib2query.html>.
    Source
    D-Lib magazine. 8(2002) no.10, x S
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
    Type
    a
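The abstract of result 8 states that similarities among terms are computed with the cosine measure. A minimal sketch of that measure follows, assuming each term is represented by a vector of its weights across documents. The term vectors are hypothetical, and the clustering step (max-min distance) is not shown.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical term-by-document weights (three documents)
term_vectors = {
    "metadata": [2.0, 0.0, 1.0],
    "retrieval": [4.0, 0.0, 2.0],
    "archive": [0.0, 3.0, 0.0],
}

same_direction = cosine(term_vectors["metadata"], term_vectors["retrieval"])
orthogonal = cosine(term_vectors["metadata"], term_vectors["archive"])
print(same_direction, orthogonal)
```

Terms that occur in the same documents in the same proportions score close to 1, while terms that never co-occur score 0.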
  9. Zhang, J.; Zeng, M.L.: A new similarity measure for subject hierarchical structures (2014) 0.00
    0.0043381536 = product of:
      0.015906563 = sum of:
        0.0051659266 = weight(_text_:a in 1778) [ClassicSimilarity], result of:
          0.0051659266 = score(doc=1778,freq=14.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.1685276 = fieldWeight in 1778, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1778)
        0.0017360178 = weight(_text_:s in 1778) [ClassicSimilarity], result of:
          0.0017360178 = score(doc=1778,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.060061958 = fieldWeight in 1778, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1778)
        0.009004618 = product of:
          0.018009236 = sum of:
            0.018009236 = weight(_text_:22 in 1778) [ClassicSimilarity], result of:
              0.018009236 = score(doc=1778,freq=2.0), product of:
                0.09309476 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.026584605 = queryNorm
                0.19345059 = fieldWeight in 1778, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1778)
          0.5 = coord(1/2)
      0.27272728 = coord(3/11)
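    The score breakdown above for doc 1778 can be recomputed from its leaf values. The sketch below assumes the usual ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the figures printed in the listing. The function and variable names are illustrative, and the double-precision results can differ from the single-precision values shown in the last digits.

```python
import math


def idf(doc_freq, max_docs):
    """Inverse document frequency as it appears in the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """score = queryWeight * fieldWeight for a single query term."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Leaf values copied from the explain tree for doc 1778.
MAX_DOCS = 44218
QUERY_NORM = 0.026584605
FIELD_NORM = 0.0390625

score_a = term_score(14.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_s = term_score(2.0, 40523, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits inside a nested query, so it carries its own coord(1/2).
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 2)

# The outer coord(3/11) scales the sum: 3 of 11 query clauses matched.
total = (score_a + score_s + score_22) * (3 / 11)
```

    Under these assumptions, total comes out at roughly 0.00434, in line with the 0.0043381536 shown at the top of the breakdown.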
    
    Abstract
    Purpose - The purpose of this paper is to introduce a new similarity method to gauge the differences between two subject hierarchical structures. Design/methodology/approach - In the proposed similarity measure, the nodes of each of the two hierarchical structures are projected onto a two-dimensional space, and both the structural similarity and the subject similarity of nodes are considered in the similarity between the two hierarchical structures. The extent to which the structural similarity affects the overall similarity can be controlled by adjusting a parameter. An experiment was conducted to evaluate the soundness of the measure. Eight experts whose research interests were information retrieval and information organization participated in the study. Results from the new measure were compared with results from the experts. Findings - The evaluation shows strong correlations between the results from the new method and the results from the experts. It suggests that the similarity method achieved satisfactory results. Practical implications - Hierarchical structures that are found in subject directories, taxonomies, classification systems, and other classificatory structures play an extremely important role in information organization and information representation. Measuring the similarity between two subject hierarchical structures allows an accurate overarching understanding of the degree to which the two hierarchical structures are similar. Originality/value - Both the structural similarity and the subject similarity of nodes were considered in the proposed similarity method, and the extent to which the structural similarity affects the overall similarity can be adjusted. In addition, a new evaluation method for hierarchical structure similarity was presented.
    Date
    8. 4.2015 16:22:13
    Source
    Journal of documentation. 70(2014) no.3, S.364-391
    Type
    a
  10. Zhang, J.; Zhai, S.; Liu, H.; Stevenson, J.A.: Social network analysis on a topic-based navigation guidance system in a public health portal (2016) 0.00
    0.0032100803 = product of:
      0.011770294 = sum of:
        0.004532476 = product of:
          0.009064952 = sum of:
            0.009064952 = weight(_text_:h in 2887) [ClassicSimilarity], result of:
              0.009064952 = score(doc=2887,freq=2.0), product of:
                0.0660481 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.026584605 = queryNorm
                0.13724773 = fieldWeight in 2887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2887)
          0.5 = coord(1/2)
        0.004782719 = weight(_text_:a in 2887) [ClassicSimilarity], result of:
          0.004782719 = score(doc=2887,freq=12.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.15602624 = fieldWeight in 2887, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2887)
        0.0024550997 = weight(_text_:s in 2887) [ClassicSimilarity], result of:
          0.0024550997 = score(doc=2887,freq=4.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.08494043 = fieldWeight in 2887, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2887)
      0.27272728 = coord(3/11)
    
    Abstract
    We investigated a topic-based navigation guidance system in the World Health Organization portal, compared the link connection network and the semantic connection network derived from the guidance system, analyzed the characteristics of the two networks from the perspective of node centrality (in_closeness, out_closeness, betweenness, in_degree, and out_degree), and provided suggestions to optimize and enhance the topic-based navigation guidance system. A mixed research method that combines social network analysis, clustering analysis, and inferential analysis was used. The clustering analysis results of the link connection network were quite different from those of the semantic connection network. There were significant differences between the link connection network and the semantic connection network in terms of density and centrality. Inferential analysis results show that there were no strong correlations between the centrality of a node and its topic information characteristics. Suggestions for enhancing the navigation guidance system are discussed in detail. Future research directions, such as application of the research method presented in this study to other similar public health portals, are also included.
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.5, S.1068-1088
    Type
    a
  11. Zhuge, H.; Zhang, J.: Topological centrality and its e-Science applications (2010) 0.00
    0.0031389387 = product of:
      0.011509442 = sum of:
        0.006345466 = product of:
          0.012690932 = sum of:
            0.012690932 = weight(_text_:h in 3984) [ClassicSimilarity], result of:
              0.012690932 = score(doc=3984,freq=2.0), product of:
                0.0660481 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.026584605 = queryNorm
                0.19214681 = fieldWeight in 3984, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3984)
          0.5 = coord(1/2)
        0.0027335514 = weight(_text_:a in 3984) [ClassicSimilarity], result of:
          0.0027335514 = score(doc=3984,freq=2.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.089176424 = fieldWeight in 3984, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3984)
        0.0024304248 = weight(_text_:s in 3984) [ClassicSimilarity], result of:
          0.0024304248 = score(doc=3984,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.08408674 = fieldWeight in 3984, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3984)
      0.27272728 = coord(3/11)
    
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.9, S.1824-1841
    Type
    a
  12. Zhang, J.: TOFIR: A tool of facilitating information retrieval : introduce a visual retrieval model (2001) 0.00
    0.002605482 = product of:
      0.014330151 = sum of:
        0.0094693005 = weight(_text_:a in 7711) [ClassicSimilarity], result of:
          0.0094693005 = score(doc=7711,freq=6.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.3089162 = fieldWeight in 7711, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=7711)
        0.0048608496 = weight(_text_:s in 7711) [ClassicSimilarity], result of:
          0.0048608496 = score(doc=7711,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.16817348 = fieldWeight in 7711, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.109375 = fieldNorm(doc=7711)
      0.18181819 = coord(2/11)
    
    Source
    Information processing and management. 37(2001) no.4, S.639-657
    Type
    a
  13. Patrick, J.; Zhang, J.; Artola-Zubillaga, X.: ¬An architecture and query language for a federation of heterogeneous dictionary databases (2000) 0.00
    0.002289546 = product of:
      0.012592502 = sum of:
        0.007731652 = weight(_text_:a in 339) [ClassicSimilarity], result of:
          0.007731652 = score(doc=339,freq=4.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.25222903 = fieldWeight in 339, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=339)
        0.0048608496 = weight(_text_:s in 339) [ClassicSimilarity], result of:
          0.0048608496 = score(doc=339,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.16817348 = fieldWeight in 339, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.109375 = fieldNorm(doc=339)
      0.18181819 = coord(2/11)
    
    Source
    Computers and the humanities. 35(2000), S.393-407
    Type
    a
  14. Zhang, J.; Dimitroff, A.: Internet search engines' response to Metadata Dublin Core implementation (2005) 0.00
    0.002289546 = product of:
      0.012592502 = sum of:
        0.007731652 = weight(_text_:a in 4652) [ClassicSimilarity], result of:
          0.007731652 = score(doc=4652,freq=4.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.25222903 = fieldWeight in 4652, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=4652)
        0.0048608496 = weight(_text_:s in 4652) [ClassicSimilarity], result of:
          0.0048608496 = score(doc=4652,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.16817348 = fieldWeight in 4652, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.109375 = fieldNorm(doc=4652)
      0.18181819 = coord(2/11)
    
    Source
    Journal of information science. 30(2005) no.4, S.310-
    Type
    a
  15. Zhang, J.: ¬A representational analysis of relational information displays (1996) 0.00
    0.0020078386 = product of:
      0.011043112 = sum of:
        0.008265483 = weight(_text_:a in 6403) [ClassicSimilarity], result of:
          0.008265483 = score(doc=6403,freq=14.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.26964417 = fieldWeight in 6403, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=6403)
        0.0027776284 = weight(_text_:s in 6403) [ClassicSimilarity], result of:
          0.0027776284 = score(doc=6403,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.09609913 = fieldWeight in 6403, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=6403)
      0.18181819 = coord(2/11)
    
    Abstract
    Analyses graphic and tabular displays under a common, unified form: relational information displays (RIDs), which are displays that represent relations between dimensions. A representational taxonomy is developed that classifies all RIDs and serves as a framework for systematic studies of RIDs. Develops a taxonomy of RIDs which can classify the majority of dimension-based display tasks and analyzes the relation between representations of displays and structures of tasks in terms of a mapping principle.
    Source
    International journal of human-computer studies. 45(1996) no.1, S.59-74
    Type
    a
  16. Zhang, J.; Korfhage, R.R.: DARE: Distance and Angle Retrieval Environment : A tale of the two measures (1999) 0.00
    0.0017751341 = product of:
      0.009763237 = sum of:
        0.0069856085 = weight(_text_:a in 3916) [ClassicSimilarity], result of:
          0.0069856085 = score(doc=3916,freq=10.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.22789092 = fieldWeight in 3916, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=3916)
        0.0027776284 = weight(_text_:s in 3916) [ClassicSimilarity], result of:
          0.0027776284 = score(doc=3916,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.09609913 = fieldWeight in 3916, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=3916)
      0.18181819 = coord(2/11)
    
    Abstract
    This article presents a visualization tool for information retrieval. Some retrieval evaluation models are interpreted in the two-dimensional space comprising direction and distance. The two different similarity measures, angle and distance, are displayed in the visual space. A new retrieval means based on the visual retrieval tool, the controlling bar, is developed for a search.
    Source
    Journal of the American Society for Information Science. 50(1999) no.9, S.779-787
    Type
    a
  17. Zhang, J.; Dimitroff, A.: ¬The impact of webpage content characteristics on webpage visibility in search engine results : part I (2005) 0.00
    0.0017751341 = product of:
      0.009763237 = sum of:
        0.0069856085 = weight(_text_:a in 1032) [ClassicSimilarity], result of:
          0.0069856085 = score(doc=1032,freq=10.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.22789092 = fieldWeight in 1032, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=1032)
        0.0027776284 = weight(_text_:s in 1032) [ClassicSimilarity], result of:
          0.0027776284 = score(doc=1032,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.09609913 = fieldWeight in 1032, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=1032)
      0.18181819 = coord(2/11)
    
    Abstract
    Content characteristics of a webpage include factors such as keyword position in a webpage, keyword duplication, layout, and their combination. These factors may impact webpage visibility in a search engine. Four hypotheses are presented relating to the impact of selected content characteristics on webpage visibility in search engine results lists. Webpage visibility can be improved by increasing the frequency of keywords in the title, in the full-text and in both the title and full-text.
    Source
    Information processing and management. 41(2005) no.3, S.665-690
    Type
    a
  18. Zhang, J.; Dimitroff, A.: ¬The impact of metadata implementation on webpage visibility in search engine results : part II (2005) 0.00
    0.0017568587 = product of:
      0.009662722 = sum of:
        0.0072322977 = weight(_text_:a in 1027) [ClassicSimilarity], result of:
          0.0072322977 = score(doc=1027,freq=14.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.23593865 = fieldWeight in 1027, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1027)
        0.0024304248 = weight(_text_:s in 1027) [ClassicSimilarity], result of:
          0.0024304248 = score(doc=1027,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.08408674 = fieldWeight in 1027, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1027)
      0.18181819 = coord(2/11)
    
    Abstract
    This paper discusses the impact of metadata implementation in a webpage on its visibility performance in a search engine results list. Influential internal and external factors of metadata implementation were identified. How these factors affect webpage visibility in a search engine results list was examined in an experimental study. Findings suggest that metadata is a good mechanism to improve webpage visibility, that the metadata subject field plays a more important role than any other metadata field, and that keywords extracted from the webpage itself, particularly from the title or full text, are most effective. To maximize the effects, these keywords should come from both the title and the full text.
    Source
    Information processing and management. 41(2005) no.3, S.691-716
    Type
    a
  19. Zhang, J.; Dimitroff, A.: ¬The impact of metadata implementation on webpage visibility in search engine results : part II (2005) 0.00
    0.0017568587 = product of:
      0.009662722 = sum of:
        0.0072322977 = weight(_text_:a in 1033) [ClassicSimilarity], result of:
          0.0072322977 = score(doc=1033,freq=14.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.23593865 = fieldWeight in 1033, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1033)
        0.0024304248 = weight(_text_:s in 1033) [ClassicSimilarity], result of:
          0.0024304248 = score(doc=1033,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.08408674 = fieldWeight in 1033, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1033)
      0.18181819 = coord(2/11)
    
    Abstract
    This paper discusses the impact of metadata implementation in a webpage on its visibility performance in a search engine results list. Influential internal and external factors of metadata implementation were identified. How these factors affect webpage visibility in a search engine results list was examined in an experimental study. Findings suggest that metadata is a good mechanism to improve webpage visibility, that the metadata subject field plays a more important role than any other metadata field, and that keywords extracted from the webpage itself, particularly from the title or full text, are most effective. To maximize the effects, these keywords should come from both the title and the full text.
    Source
    Information processing and management. 41(2005) no.3, S.691-715
    Type
    a
  20. Zhang, J.; Nguyen, T.: WebStar: a visualization model for hyperlink structures (2005) 0.00
    0.0016567915 = product of:
      0.009112353 = sum of:
        0.0070291325 = weight(_text_:a in 1056) [ClassicSimilarity], result of:
          0.0070291325 = score(doc=1056,freq=18.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.22931081 = fieldWeight in 1056, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=1056)
        0.0020832212 = weight(_text_:s in 1056) [ClassicSimilarity], result of:
          0.0020832212 = score(doc=1056,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.072074346 = fieldWeight in 1056, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=1056)
      0.18181819 = coord(2/11)
    
    Abstract
    The authors introduce an information visualization model, WebStar, for hyperlink-based information systems. Hyperlinks within a hyperlink-based document can be visualized in a two-dimensional visual space. All links are projected within a display sphere in the visual space. The relationship between a specified central document and its hyperlinked documents is visually presented in the visual space. In addition, users are able to define a group of subjects and to observe relevance between each subject and all hyperlinked documents via movement of that subject around the display sphere center. WebStar allows users to dynamically change an interest center during navigation. A retrieval mechanism is developed to control retrieved results in the visual space. Impact of movement of a subject on the visual document distribution is analyzed. An ambiguity problem caused by projection is discussed. Potential applications of this visualization model in information retrieval are included. Future research directions on the topic are addressed.
    Source
    Information processing and management. 41(2005) no.4, S.1003-1018
    Type
    a