Search (36 results, page 1 of 2)

  • author_ss:"Zhang, J."
  1. Zhang, J.; Jastram, I.: A study of the metadata creation behavior of different user groups on the Internet (2006) 0.05
    0.050813206 = product of:
      0.10162641 = sum of:
        0.029222867 = weight(_text_:retrieval in 982) [ClassicSimilarity], result of:
          0.029222867 = score(doc=982,freq=2.0), product of:
            0.124912694 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.041294612 = queryNorm
            0.23394634 = fieldWeight in 982, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=982)
        0.042349376 = weight(_text_:use in 982) [ClassicSimilarity], result of:
          0.042349376 = score(doc=982,freq=4.0), product of:
            0.12644777 = queryWeight, product of:
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.041294612 = queryNorm
            0.33491597 = fieldWeight in 982, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0620887 = idf(docFreq=5623, maxDocs=44218)
              0.0546875 = fieldNorm(doc=982)
        0.019129815 = weight(_text_:of in 982) [ClassicSimilarity], result of:
          0.019129815 = score(doc=982,freq=12.0), product of:
            0.06457475 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.041294612 = queryNorm
            0.29624295 = fieldWeight in 982, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=982)
        0.010924355 = product of:
          0.02184871 = sum of:
            0.02184871 = weight(_text_:on in 982) [ClassicSimilarity], result of:
              0.02184871 = score(doc=982,freq=4.0), product of:
                0.090823986 = queryWeight, product of:
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.041294612 = queryNorm
                0.24056101 = fieldWeight in 982, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.199415 = idf(docFreq=13325, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=982)
          0.5 = coord(1/2)
      0.5 = coord(4/8)
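The explain tree above is standard Lucene ClassicSimilarity output, and each per-term contribution can be reproduced from the printed factors. A minimal sketch for the "retrieval" clause of result 1, using Lucene's documented idf formula `1 + ln(maxDocs / (docFreq + 1))` (the constants are copied from the tree; nothing else is assumed):

```python
import math

# Reproduce the weight(_text_:retrieval in 982) clause shown above.
freq = 2.0
tf = math.sqrt(freq)                      # tf(freq=2.0) = 1.4142135
idf = 1 + math.log(44218 / (5836 + 1))    # idf(docFreq=5836, maxDocs=44218)
query_norm = 0.041294612                  # queryNorm
field_norm = 0.0546875                    # fieldNorm(doc=982)

query_weight = idf * query_norm           # 0.124912694 = queryWeight
field_weight = tf * idf * field_norm      # 0.23394634  = fieldWeight
score = query_weight * field_weight       # 0.029222867 = score(doc=982)
```

The same arithmetic applies to the other clauses; the `coord(4/8)` factor then scales the sum by the fraction of query terms the document matched.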
    
    Abstract
    Metadata is designed to improve information organization and information retrieval effectiveness and efficiency on the Internet. The way web publishers respond to metadata and the way they use it when publishing their web pages, however, remain a mystery. The authors of this paper aim to solve this mystery by defining different professional publisher groups, examining the behaviors of these user groups, and identifying the characteristics of their metadata use. This study will enhance the current understanding of metadata application behavior and provide evidence useful to researchers, web publishers, and search engine designers.
  2. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.04
    
    Abstract
    In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks. Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have gained more intensive attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model and their variants are well established.
    From the user's perspective, however, it is still difficult to use current information retrieval systems. Users frequently have problems expressing their information needs and translating those needs into queries. This is partly because information needs cannot always be expressed appropriately in the system's terms. It is not unusual for users to input search terms that differ from the index terms information systems use. Various methods have been proposed to help users choose search terms and articulate queries. One widely used approach is to incorporate into the information system a thesaurus-like component that represents both the important concepts in a particular subject area and the semantic relationships among those concepts. Unfortunately, the development and use of thesauri is not without its own problems. The thesaurus employed in a specific information system has often been developed for a general subject area and needs significant enhancement to be tailored to the information system where it is to be used. This thesaurus development process, if done manually, is both time consuming and labor intensive. Using a thesaurus in searching is complex and may raise barriers for the user. For illustration purposes, let us consider two scenarios of thesaurus usage. In the first scenario the user inputs a search term and the thesaurus then displays a matching set of related terms. Without an overview of the thesaurus, and without the ability to see the matching terms in the context of other terms, it may be difficult to assess the quality of the related terms in order to select the correct term. In the second scenario the user browses the whole thesaurus, which is organized as an alphabetically ordered list. The problem with this approach is that the list may be long, and it does not show users the global semantic relationships among all the listed terms.
    Nevertheless, because thesaurus use has been shown to improve retrieval, our method integrates functions into the search interface that permit users to explore built-in search vocabularies to improve retrieval from digital libraries. Our method automatically generates the terms and their semantic relationships representing relevant topics covered in a digital library. We call these generated terms the "concepts", and the generated terms together with their semantic relationships the "concept space". Additionally, we used a visualization technique to display the concept space and allow users to interact with it. The automatically generated term set is considered to be more representative of the subject area of a corpus than an "externally" imposed thesaurus, and our method also has the potential of saving a significant amount of time and labor for those who have been creating thesauri manually. Information visualization is an emerging discipline that has developed very quickly over the last decade. With growing volumes of documents and their associated complexities, information visualization has become increasingly important. Researchers have found information visualization to be an effective way to use and understand information while minimizing a user's cognitive load. Our work is based on an algorithmic approach to concept discovery and association. Concepts are discovered using an algorithm based on an automated thesaurus generation procedure. Subsequently, similarities among terms are computed using the cosine measure, and the associations among terms are established using a method known as max-min distance clustering. The concept space is then visualized in a spring embedding graph, which roughly shows the semantic relationships among concepts in a 2-D visual representation. The semantic space of the visualization is used as a medium for users to retrieve the desired documents.
    In the remainder of this article, we present our algorithmic approach to concept generation and clustering, followed by a description of the visualization technique and interactive interface. The paper ends with key conclusions and a discussion of future work.
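The cosine measure and max-min distance clustering mentioned in the abstract can be sketched roughly as follows (an illustrative toy, not the authors' implementation; the function names are ours):

```python
import math

def cosine(u, v):
    """Cosine similarity between two term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def max_min_centers(vectors, k):
    """Pick k cluster seeds by max-min distance: each new seed is the
    vector whose smallest cosine distance to the seeds chosen so far
    is largest, so the seeds spread across the concept space."""
    centers = [0]                                  # start from vector 0
    while len(centers) < k:
        best, best_d = None, -1.0
        for i, v in enumerate(vectors):
            if i in centers:
                continue
            d = min(1.0 - cosine(v, vectors[c]) for c in centers)
            if d > best_d:
                best, best_d = i, d
        centers.append(best)
    return centers
```

The remaining terms would then be assigned to their nearest seed to form the clusters that feed the spring-embedding layout.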
    Content
    The Java applet and a prototype of this interface are available at <http://ella.slis.indiana.edu/~junzhang/dlib/IV.html>. The D-Lib search interface is available at <http://www.dlib.org/Architext/AT-dlib2query.html>.
    Theme
    Semantic environment in indexing and retrieval
  3. Zhang, J.; Korfhage, R.R.: DARE: Distance and Angle Retrieval Environment : A tale of the two measures (1999) 0.04
    
    Abstract
    This article presents a visualization tool for information retrieval. Some retrieval evaluation models are interpreted in a two-dimensional space comprising direction and distance. The two similarity measures, angle and distance, are displayed in the visual space. A new retrieval mechanism based on the visual retrieval tool, the controlling bar, is developed for searching.
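A document's position in such a distance/angle display can be sketched from a reference point as Euclidean distance on one axis and vector angle on the other (an illustrative toy; `dare_coordinates` and the axis scaling are our assumptions, not DARE's actual code):

```python
import math

def dare_coordinates(doc, ref):
    """Map a document vector to a 2-D display point relative to a
    reference point: Euclidean distance on one axis, angle in radians
    on the other."""
    dist = math.sqrt(sum((d - r) ** 2 for d, r in zip(doc, ref)))
    dot = sum(d * r for d, r in zip(doc, ref))
    norm = math.sqrt(sum(d * d for d in doc)) * math.sqrt(sum(r * r for r in ref))
    angle = math.acos(max(-1.0, min(1.0, dot / norm))) if norm else 0.0
    return dist, angle
```

Two documents with the same angle but different lengths land at different distances, which is exactly the distinction the two measures draw.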
    Source
    Journal of the American Society for Information Science. 50(1999) no.9, S.779-787
  4. Zhang, J.; Wolfram, D.: Visualization of term discrimination analysis (2001) 0.03
    
    Abstract
    Zhang and Wolfram compute the discrimination value for terms as the difference between the centroid value of all terms in the corpus and that value without the term in question, and suggest selection be made by comparing density changes with a visualization tool. The Distance Angle Retrieval Environment (DARE) visually projects a document or term space by presenting distance similarity on the X axis and angular similarity on the Y axis. Thus a document icon appearing close to the X axis would be relevant to reference points in terms of a distance similarity measure, while those close to the Y axis are relevant to reference points in terms of an angle-based measure. Using 450 Associated Press news reports indexed by 44 distinct terms, the removal of the term "Yeltsin" causes the cluster to fall on the Y axis, indicating a good discriminator. For an angular measure, say cosine, movement along the X axis to the left signals good discrimination, while movement to the right signals poor discrimination. A term density space could also be used. Most terms are shown to be indifferent discriminators. Different measures result in different choices of good and poor discriminators, as does the use of a term space rather than a document space. The visualization approach is clearly feasible, and provides some additional insights not found in the computation of a discrimination value.
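The density-based discrimination value described above can be sketched on a tiny binary term-document matrix (rows are documents, columns are terms; a toy illustration, not the authors' computation):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def space_density(docs):
    """Average cosine similarity of the documents to their centroid."""
    n, dims = len(docs), len(docs[0])
    centroid = [sum(d[j] for d in docs) / n for j in range(dims)]
    return sum(cosine(d, centroid) for d in docs) / n

def discrimination_value(docs, term):
    """Density without the term minus density with it: a positive value
    marks a good discriminator, since removing the term packs the
    document space more tightly together."""
    without = [[x for j, x in enumerate(d) if j != term] for d in docs]
    return space_density(without) - space_density(docs)
```

A term that appears in only some documents pulls them apart, so deleting it raises the density and yields a positive discrimination value.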
    Source
    Journal of the American Society for Information Science and technology. 52(2001) no.8, S.615-627
  5. Gao, J.; Zhang, J.: Clustered SVD strategies in latent semantic indexing (2005) 0.03
    
    Abstract
    The text retrieval method using latent semantic indexing (LSI) technique with truncated singular value decomposition (SVD) has been intensively studied in recent years. The SVD reduces the noise contained in the original representation of the term-document matrix and improves the information retrieval accuracy. Recent studies indicate that SVD is mostly useful for small homogeneous data collections. For large inhomogeneous datasets, the performance of the SVD based text retrieval technique may deteriorate. We propose to partition a large inhomogeneous dataset into several smaller ones with clustered structure, on which we apply the truncated SVD. Our experimental results show that the clustered SVD strategies may enhance the retrieval accuracy and reduce the computing and storage costs.
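The clustered-SVD strategy can be sketched with NumPy, assuming the document clusters are already given (`clustered_lsi` is an illustrative name, and `np.linalg.svd` stands in for a production truncated-SVD routine):

```python
import numpy as np

def clustered_lsi(A, clusters, k):
    """Apply a truncated SVD to each document cluster of the
    term-document matrix A (terms x docs) separately, instead of one
    SVD over the whole inhomogeneous collection."""
    factors = {}
    for cid, doc_ids in clusters.items():
        block = A[:, doc_ids]                        # cluster's sub-matrix
        U, s, Vt = np.linalg.svd(block, full_matrices=False)
        r = min(k, len(s))
        factors[cid] = (U[:, :r], s[:r], Vt[:r, :])  # rank-r factors
    return factors

# Toy run: 4 terms, 4 docs split into two clusters, rank-1 truncation.
factors = clustered_lsi(np.eye(4), {0: [0, 1], 1: [2, 3]}, k=1)
```

Queries are then projected into the reduced space of the cluster(s) they best match, which keeps each SVD small and relatively homogeneous.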
    Theme
    Semantic environment in indexing and retrieval
  6. Zhang, J.; Zeng, M.L.: A new similarity measure for subject hierarchical structures (2014) 0.03
    
    Abstract
    Purpose - The purpose of this paper is to introduce a new similarity method to gauge the differences between two subject hierarchical structures. Design/methodology/approach - In the proposed similarity measure, nodes on the two hierarchical structures are projected onto a two-dimensional space, and both the structural similarity and the subject similarity of nodes are considered in the similarity between the two hierarchical structures. The extent to which the structural similarity affects the overall similarity can be controlled by adjusting a parameter. An experiment was conducted to evaluate the soundness of the measure. Eight experts whose research interests were information retrieval and information organization participated in the study. Results from the new measure were compared with results from the experts. Findings - The evaluation shows strong correlations between the results from the new method and the results from the experts, suggesting that the similarity method achieved satisfactory results. Practical implications - Hierarchical structures found in subject directories, taxonomies, classification systems, and other classificatory structures play an extremely important role in information organization and information representation. Measuring the similarity between two subject hierarchical structures allows an accurate overarching understanding of the degree to which the two hierarchical structures are similar. Originality/value - Both the structural similarity and the subject similarity of nodes are considered in the proposed method, and the extent to which the structural similarity affects the overall similarity can be adjusted. In addition, a new evaluation method for hierarchical structure similarity was presented.
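A minimal sketch of the projection-and-blend idea: project each node onto 2-D coordinates (visiting order, depth) and let a parameter alpha weight the structural component. The formulas below are our own illustration, not the paper's measure:

```python
from itertools import count

def project_nodes(tree, depth=0, counter=None):
    """Project a subject hierarchy onto 2-D: y = node depth,
    x = left-to-right visiting order.  tree = (label, [children])."""
    if counter is None:
        counter = count()
    label, children = tree
    pos = {label: (next(counter), depth)}
    for child in children:
        pos.update(project_nodes(child, depth + 1, counter))
    return pos

def structure_similarity(tree_a, tree_b, alpha=0.5):
    """Blend a positional (structural) score over shared labels with a
    crude subject score (fraction of labels shared); alpha controls
    the structural component's weight."""
    pa, pb = project_nodes(tree_a), project_nodes(tree_b)
    shared = set(pa) & set(pb)
    if not shared:
        return 0.0
    structural = sum(
        1.0 / (1.0 + abs(pa[n][0] - pb[n][0]) + abs(pa[n][1] - pb[n][1]))
        for n in shared) / len(shared)
    subject = len(shared) / max(len(pa), len(pb))
    return alpha * structural + (1 - alpha) * subject
```

Identical hierarchies score 1.0, and moving a shared node to a different depth or sibling position lowers only the structural term, which alpha can dampen or amplify.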
    Date
    8. 4.2015 16:22:13
    Source
    Journal of documentation. 70(2014) no.3, S.364-391
  7. Patrick, J.; Zhang, J.; Artola-Zubillaga, X.: An architecture and query language for a federation of heterogeneous dictionary databases (2000) 0.03
    
    Source
    Computers and the humanities. 35(2000), S.393-407
  8. Wolfram, D.; Zhang, J.: The influence of indexing practices and weighting algorithms on document spaces (2008) 0.03
    
    Abstract
    Index modeling and computer simulation techniques are used to examine the influence of indexing frequency distributions, indexing exhaustivity distributions, and three weighting methods on hypothetical document spaces in a vector-based information retrieval (IR) system. The way documents are indexed plays an important role in retrieval. The authors demonstrate the influence of different indexing characteristics on document space density (DSD) changes and document space discriminative capacity for IR. Document environments that contain a relatively higher percentage of infrequently occurring terms provide lower density outcomes than do environments where a higher percentage of frequently occurring terms exists. Different indexing exhaustivity levels, however, have little influence on the document space densities. A weighting algorithm that favors higher weights for infrequently occurring terms results in the lowest overall document space densities, which allows documents to be more readily differentiated from one another. This in turn can positively influence IR. The authors also discuss the influence on outcomes using two methods of normalization of term weights (i.e., means and ranges) for the different weighting methods.
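The density effect of weighting can be illustrated on a tiny binary matrix: idf-style weights that favor infrequently occurring terms spread the documents apart and lower the space density (a toy sketch; the paper's simulations are far more elaborate):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def density(docs):
    """Document space density: mean cosine similarity to the centroid."""
    n, dims = len(docs), len(docs[0])
    centroid = [sum(d[j] for d in docs) / n for j in range(dims)]
    return sum(cosine(d, centroid) for d in docs) / n

def idf_weighted(docs):
    """Reweight binary term occurrences by idf = 1 + ln(N/df), boosting
    the infrequently occurring terms."""
    n, dims = len(docs), len(docs[0])
    df = [sum(1 for d in docs if d[j]) for j in range(dims)]
    return [[d[j] * (1 + math.log(n / df[j])) if df[j] else 0.0
             for j in range(dims)] for d in docs]

binary = [[1, 1, 0], [1, 0, 1], [1, 0, 0]]    # term 0 common, terms 1-2 rare
d_binary, d_idf = density(binary), density(idf_weighted(binary))
```

Here `d_idf` comes out lower than `d_binary`: boosting the rare terms differentiates the documents, which is the density drop the abstract associates with better discrimination.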
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.1, S.3-11
  9. Zhang, J.: TOFIR: A tool of facilitating information retrieval : introduce a visual retrieval model (2001) 0.02
    
  10. Liu, X.; Zhang, J.; Guo, C.: Full-text citation analysis : a new method to enhance scholarly networks (2013) 0.02
    Abstract
    In this article, we use innovative full-text citation analysis along with supervised topic modeling and network-analysis algorithms to enhance classical bibliometric analysis and publication/author/venue ranking. By utilizing citation contexts extracted from a large number of full-text publications, each citation or publication is represented by a probability distribution over a set of predefined topics, where each topic is labeled by an author-contributed keyword. We then use the publication/citation topic distributions to generate a citation graph with vertex prior and edge transition probability distributions. The publication importance score for each given topic is calculated by PageRank with edge and vertex prior distributions. To evaluate this work, we sampled 104 topics (labeled with keywords) in review papers. The cited publications of each review paper are assumed to be "important publications" for the target topic (keyword), and we use these cited publications to validate our topic-ranking results and to compare different publication-ranking lists. Evaluation results show that full-text citation and publication content prior topic distributions, along with the classical PageRank algorithm, can significantly enhance bibliometric analysis and scientific publication ranking performance, compared with term frequency-inverse document frequency (tf-idf), language model, BM25, PageRank, and PageRank + language model (p < .001), for academic information retrieval (IR) systems.
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.9, S.1852-1863
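The ranking step described above, PageRank with a topic-specific vertex prior and edge transition probabilities, follows the general shape of personalized PageRank. A sketch under the standard formulation, assuming a dense adjacency matrix as lists of lists; `topic_pagerank` and the damping value are illustrative assumptions, not the authors' implementation:

```python
def topic_pagerank(adj, vertex_prior, damping=0.85, tol=1e-10):
    """Power iteration for PageRank with a vertex prior used as the
    teleport distribution; edge weights are row-normalized into
    transition probabilities, and dangling nodes teleport via the prior."""
    n = len(adj)
    total = sum(vertex_prior)
    prior = [p / total for p in vertex_prior]
    out = [sum(row) for row in adj]          # out-degree (weighted)
    rank = [1.0 / n] * n
    while True:
        dangling = sum(r for r, o in zip(rank, out) if o == 0)
        new = []
        for j in range(n):
            flow = sum(rank[i] * adj[i][j] / out[i]
                       for i in range(n) if out[i] > 0)
            new.append(damping * (flow + dangling * prior[j])
                       + (1 - damping) * prior[j])
        if sum(abs(a - b) for a, b in zip(new, rank)) < tol:
            return new
        rank = new
```

Concentrating the prior on the vertices associated with a topic boosts their scores relative to a uniform prior, which is the mechanism the evaluation exploits.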
  11. Zhang, J.: Archival context, digital content, and the ethics of digital archival representation : the ethics of identification in digital library metadata (2012) 0.02
    Abstract
    The findings of a recent study on digital archival representation raise some ethical concerns about how digital archival materials are organized, described, and made available for use on the Web. Archivists have a fundamental obligation to preserve and protect the authenticity and integrity of records in their holdings and, at the same time, have the responsibility to promote the use of records as a fundamental purpose of the keeping of archives (SAA 2005 Code of Ethics for Archivists V & VI). Is it an ethical practice that digital content in digital archives is deeply embedded in its contextual structure and generally underrepresented in digital archival systems? Similarly, is it ethical for archivists to detach digital items from their archival context in order to make them more "digital friendly" and more accessible to meet the needs of some users? Do archivists have an obligation to bring the two representation systems together so that the context and content of digital archives can be better represented and archival materials "can be located and used by anyone, for any purpose, while still remaining authentic evidence of the work and life of the creator" (Millar 2010, 157)? This paper discusses the findings of the study and their ethical implications relating to digital archival description and representation.
    Content
    Beitrag aus einem Themenheft zu den Proceedings of the 2nd Milwaukee Conference on Ethics in Information Organization, June 15-16, 2012, School of Information Studies, University of Wisconsin-Milwaukee. Hope A. Olson, Conference Chair. Vgl.: http://www.ergon-verlag.de/isko_ko/downloads/ko_39_2012_5_d.pdf.
  12. Zhang, J.; Nguyen, T.: WebStar: a visualization model for hyperlink structures (2005) 0.02
    Abstract
    The authors introduce an information visualization model, WebStar, for hyperlink-based information systems. Hyperlinks within a hyperlink-based document can be visualized in a two-dimensional visual space. All links are projected within a display sphere in the visual space. The relationship between a specified central document and its hyperlinked documents is visually presented in the visual space. In addition, users are able to define a group of subjects and to observe the relevance between each subject and all hyperlinked documents via movement of that subject around the display sphere center. WebStar allows users to dynamically change an interest center during navigation. A retrieval mechanism is developed to control retrieved results in the visual space. The impact of moving a subject on the visual document distribution is analyzed, and an ambiguity problem caused by projection is discussed. Potential applications of this visualization model in information retrieval are presented, and future research directions on the topic are addressed.
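The abstract does not give WebStar's projection formula, so the following is only a hedged sketch of one way such a two-dimensional placement could work: radius tied to similarity with the central document, angle tied to similarity with a movable subject point. The function name and coordinate conventions are invented for illustration:

```python
import math

def project(sims_to_center, sims_to_subject):
    """Place each hyperlinked document inside a unit display circle.
    Radius shrinks as similarity to the central document grows;
    angle shrinks as similarity to a movable subject point grows.
    Both similarity lists hold values in [0, 1]."""
    points = []
    for sc, ss in zip(sims_to_center, sims_to_subject):
        r = 1.0 - sc                   # identical to the center -> origin
        theta = (1.0 - ss) * math.pi   # identical to the subject -> angle 0
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points
```

Moving the subject point then amounts to recomputing `sims_to_subject`, which redistributes the documents around the sphere center as the abstract describes.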
  13. Wolfram, D.; Zhang, J.: ¬An investigation of the influence of indexing exhaustivity and term distributions on a document space (2002) 0.02
    Abstract
    Wolfram and Zhang are interested in the effect of different levels of indexing exhaustivity, by which they mean the number of terms chosen, and of different index term distributions and term weighting methods on the resulting document cluster organization. The Distance Angle Retrieval Environment, DARE, which provides a two-dimensional display of retrieved documents, was used to represent the document clusters based upon a document's distance from the searcher's main interest, and upon the angle formed by the document, a point representing a minor interest, and the point representing the main interest. If the centroid and the origin of the document space are assigned as major and minor points, the average distance between documents and the centroid can be measured, providing an indication of cluster organization in the form of a size-normalized similarity measure. Using 500 records from NTIS and nine models created by intersecting low, observed, and high exhaustivity levels (based upon a negative binomial distribution) with shallow, observed, and steep term distributions (based upon a Zipf distribution), simulation runs were performed using inverse document frequency, inter-document term frequency, and inverse document frequency based upon both inter- and intra-document frequencies. Low exhaustivity and shallow distributions result in a denser document space and less effective retrieval. High exhaustivity and steeper distributions result in a more diffuse space.
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.11, S.944-952
  14. Li, D.; Luo, Z.; Ding, Y.; Tang, J.; Sun, G.G.-Z.; Dai, X.; Du, J.; Zhang, J.; Kong, S.: User-level microblogging recommendation incorporating social influence (2017) 0.02
    Abstract
    With the information overload of user-generated content in microblogging, users find it extremely challenging to browse and find valuable information on their first attempt. In this paper we propose a microblogging recommendation algorithm, TSI-MR (Topic-Level Social Influence-based Microblogging Recommendation), which can significantly improve users' microblogging experience. The main innovation of this proposed algorithm is that we consider social influences and their indirect structural relationships, which are largely based on social status theory, at the topic level. The primary advantage of this approach is that it can build an accurate description of latent relationships between two users with weak connections, which improves the performance of the model; furthermore, it can solve sparsity problems of the training data to a certain extent. The realization of the model is mainly based on a factor graph. We also applied a distributed strategy to further improve the efficiency of the model. Finally, we use data from Tencent Weibo, one of the most popular microblogging services in China, to evaluate our methods. The results show that incorporating social influence can improve microblogging performance considerably and outperforms the baseline methods.
    Source
    Journal of the Association for Information Science and Technology. 68(2017) no.3, S.553-568
  15. Zhang, L.; Liu, Q.L.; Zhang, J.; Wang, H.F.; Pan, Y.; Yu, Y.: Semplore: an IR approach to scalable hybrid query of Semantic Web data (2007) 0.02
    Abstract
    As an extension to the current Web, the Semantic Web will contain not only structured data with machine-understandable semantics but also textual information. While structured queries can be used to find information more precisely on the Semantic Web, keyword searches are still needed to help exploit textual information. It thus becomes very important that we can combine precise structured queries with imprecise keyword searches to have a hybrid query capability. In addition, due to the huge volume of information on the Semantic Web, the hybrid query must be processed in a very scalable way. In this paper, we define such a hybrid query capability that combines unary tree-shaped structured queries with keyword searches. We show how existing information retrieval (IR) index structures and functions can be reused to index Semantic Web data and its textual information, and how the hybrid query is evaluated on the index structure using IR engines in an efficient and scalable manner. We implemented this IR approach in an engine called Semplore. Comprehensive experiments on its performance show that it is a promising approach. It leads us to believe that it may be possible to evolve current Web search engines to query and search the Semantic Web. Finally, we briefly describe how Semplore is used for searching Wikipedia and an IBM customer's product information.
    Source
    Proceeding ISWC'07/ASWC'07 : Proceedings of the 6th international The semantic web and 2nd Asian conference on Asian semantic web conference. Ed.: K. Aberer et al
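The core idea in the Semplore entry, intersecting a precise structured predicate with an imprecise keyword search over an IR-style inverted index, can be sketched in a few lines. This is a generic illustration, not Semplore's index layout; the record shape `(id, attrs, text)` and the plain term-frequency ranking are assumptions:

```python
from collections import defaultdict

def build_index(docs):
    """Inverted index over the textual part of (id, attrs, text) records."""
    index = defaultdict(dict)
    for doc_id, _attrs, text in docs:
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index

def hybrid_query(docs, index, structured_pred, keywords):
    """Intersect a structured predicate over record attributes with a
    keyword search over the index; rank matches by summed term frequency."""
    allowed = {doc_id for doc_id, attrs, _ in docs if structured_pred(attrs)}
    scores = defaultdict(int)
    for kw in keywords:
        for doc_id, tf in index.get(kw.lower(), {}).items():
            if doc_id in allowed:
                scores[doc_id] += tf
    return sorted(scores, key=scores.get, reverse=True)
```

The structured side narrows the candidate set exactly; the keyword side supplies the fuzzy ranking over whatever survives the filter.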
  16. Hansen, D.L.; Khopkar, T.; Zhang, J.: Recommender systems and expert locators (2009) 0.01
    Abstract
    This entry describes two important classes of systems that facilitate the sharing of recommendations and expertise. Recommender systems suggest items of potential interest to individuals who do not have personal experience with the items. Expert locator systems, an important subset of recommender systems, help find people with the appropriate skills, knowledge, or expertise to meet a particular need. Research related to each of these systems is relatively new and extremely active. The use of these systems is likely to continue increasing as more and more activity is implicitly captured online, making it possible to automatically identify experts, and capture preferences that can be used to recommend items.
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
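The entry above surveys recommender systems in general terms; the classic mechanism behind many of them is user-based collaborative filtering. A minimal sketch under simple assumptions (integer ratings, similarity as the fraction of exactly agreeing shared ratings); `recommend`, `sim`, and the toy data are all invented for illustration:

```python
def recommend(ratings, target, k=2):
    """Score items the target user has not rated, using the k most
    similar users; similarity = share of exactly agreeing co-rated items."""
    def sim(u, v):
        shared = set(ratings[u]) & set(ratings[v])
        if not shared:
            return 0.0
        agree = sum(1.0 for i in shared if ratings[u][i] == ratings[v][i])
        return agree / len(shared)

    neighbors = sorted((u for u in ratings if u != target),
                       key=lambda u: sim(target, u), reverse=True)[:k]
    scores = {}
    for u in neighbors:
        w = sim(target, u)
        for item, r in ratings[u].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + w * r
    return sorted(scores, key=scores.get, reverse=True)
```

An expert locator follows the same pattern with people in place of items: rate past interactions, then surface the "experts" most endorsed by users similar to the asker.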
  17. Geng, Q.; Townley, C.; Huang, K.; Zhang, J.: Comparative knowledge management : a pilot study of Chinese and American universities (2005) 0.01
    Abstract
    Comparative study of knowledge management (KM) promises to lead to more effective knowledge use in all cultural environments. This pilot study compares KM priorities, needs, tools, and administrative structure components in large Chinese and American universities. General KM theory and literature related to KM in higher education are analyzed to develop the four components of the study. Comparative differences in KM practice at large Chinese and American universities are analyzed for each component. A correlation matrix reveals statistically significant co-variation among all but one of the study components. Four conclusions related to comparative KM and suggestions for future research are presented.
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.10, S.1031-1044
  18. Zhang, J.; Korfhage, R.R.: ¬A distance and angle similarity measure method (1999) 0.01
    Abstract
    This article presents a distance and angle similarity measure. The integrated similarity measure takes the strengths of both the distance and the direction of measured documents into account. The article analyzes the features of the similarity measure by comparing it with the traditional distance-based similarity measure and the cosine measure, providing the iso-similarity contour and investigating the impacts of the parameters and variables on the new similarity measure. It also outlines further research issues on the topic.
    Source
    Journal of the American Society for Information Science. 50(1999) no.9, S.772-778
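The abstract above does not reproduce the measure's formula, so the following is only a hedged sketch of the general idea: blend a distance component (how far apart two vectors are) with an angle component (cosine of the angle between them). The linear combination and the `alpha` trade-off parameter are assumptions, not the published measure:

```python
import math

def distance_angle_similarity(query, doc, alpha=0.5):
    """Blend a normalized Euclidean-distance component with a cosine
    (angle) component; alpha in [0, 1] trades off the two strengths."""
    dist_sim = 1.0 / (1.0 + math.dist(query, doc))   # 1.0 when vectors coincide
    dot = sum(q * d for q, d in zip(query, doc))
    nq = math.sqrt(sum(q * q for q in query))
    nd = math.sqrt(sum(d * d for d in doc))
    cos_sim = dot / (nq * nd) if nq and nd else 0.0
    return alpha * dist_sim + (1 - alpha) * cos_sim
```

Unlike the cosine measure alone, this hybrid still distinguishes two documents that point in the same direction but differ in magnitude, which is the stated motivation for integrating the two.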
  19. Wolfram, D.; Wang, P.; Zhang, J.: Identifying Web search session patterns using cluster analysis : a comparison of three search environments (2009) 0.01
    Abstract
    Session characteristics taken from large transaction logs of three Web search environments (academic Web site, public search engine, consumer health information portal) were modeled using cluster analysis to determine if coherent session groups emerged for each environment and whether the types of session groups are similar across the three environments. The analysis revealed three distinct clusters of session behaviors common to each environment: hit and run sessions on focused topics, relatively brief sessions on popular topics, and sustained sessions using obscure terms with greater query modification. The findings also revealed shifts in session characteristics over time for one of the datasets, away from hit and run sessions toward more popular search topics. A better understanding of session characteristics can help system designers to develop more responsive systems to support search features that cater to identifiable groups of searchers based on their search behaviors. For example, the system may identify struggling searchers based on session behaviors that match those identified in the current study to provide context sensitive help.
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.5, S.896-910
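The session-grouping step described above, clustering session feature vectors from transaction logs, can be sketched with a minimal k-means. This is a generic illustration, not the paper's procedure; the feature choice (e.g., queries per session, terms per query) and `seed` are assumptions:

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal k-means over session feature vectors
    (e.g., [queries per session, mean terms per query])."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        centers = [
            [sum(dim) / len(cl) for dim in zip(*cl)] if cl else centers[c]
            for c, cl in enumerate(clusters)
        ]
    return [min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points], centers
```

On features like these, short "hit and run" sessions and long query-modifying sessions fall into well-separated clusters, which is the kind of structure the study reports.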
  20. Zhang, J.; Dimitroff, A.: ¬The impact of webpage content characteristics on webpage visibility in search engine results : part I (2005) 0.01
    Abstract
    Content characteristics of a webpage include factors such as keyword position in a webpage, keyword duplication, layout, and their combination. These factors may affect webpage visibility in a search engine. Four hypotheses are presented relating to the impact of selected content characteristics on webpage visibility in search engine results lists. Webpage visibility can be improved by increasing the frequency of keywords in the title, in the full text, and in both the title and full text.
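The finding above, that keyword frequency in the title and in the full text both raise visibility, with placement mattering, can be illustrated with a toy scoring function. The weighting value and function name are assumptions for illustration only, not a model from the paper or any real search engine:

```python
def visibility_score(keyword, title, body, title_weight=3.0):
    """Toy illustration: count keyword occurrences in the title and in
    the body, weighting title occurrences more heavily."""
    kw = keyword.lower()
    title_hits = title.lower().split().count(kw)
    body_hits = body.lower().split().count(kw)
    return title_weight * title_hits + body_hits
```

Under any positive `title_weight`, a page carrying the keyword in both title and body outscores one carrying it in the body alone, which is the qualitative pattern the hypotheses test.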