Search (2 results, page 1 of 1)

  • × author_ss:"Zhang, J."
  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  1. Gao, J.; Zhang, J.: Clustered SVD strategies in latent semantic indexing (2005) 0.02
    0.015043489 = product of:
      0.045130465 = sum of:
        0.045130465 = weight(_text_:management in 1166) [ClassicSimilarity], result of:
          0.045130465 = score(doc=1166,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.2606825 = fieldWeight in 1166, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1166)
      0.33333334 = coord(1/3)
    
    Source
    Information processing and management. 41(2005) no.5, S.1051-1064
  2. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.00
    0.004691057 = product of:
      0.01407317 = sum of:
        0.01407317 = product of:
          0.02814634 = sum of:
            0.02814634 = weight(_text_:system in 1211) [ClassicSimilarity], result of:
              0.02814634 = score(doc=1211,freq=8.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.17398985 = fieldWeight in 1211, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=1211)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks. Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have gained more intensive attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model and their variants are well established.
    From the user's perspective, however, it is still difficult to use current information retrieval systems. Users frequently have problems expressing their information needs and translating those needs into queries. This is partly due to the fact that information needs cannot be expressed appropriately in systems terms. It is not unusual for users to input search terms that are different from the index terms information systems use. Various methods have been proposed to help users choose search terms and articulate queries. One widely used approach is to incorporate into the information system a thesaurus-like component that represents both the important concepts in a particular subject area and the semantic relationships among those concepts. Unfortunately, the development and use of thesauri is not without its own problems. The thesaurus employed in a specific information system has often been developed for a general subject area and needs significant enhancement to be tailored to the information system where it is to be used. This thesaurus development process, if done manually, is both time consuming and labor intensive. Usage of a thesaurus in searching is complex and may raise barriers for the user. For illustration purposes, let us consider two scenarios of thesaurus usage. In the first scenario the user inputs a search term and the thesaurus then displays a matching set of related terms. Without an overview of the thesaurus - and without the ability to see the matching terms in the context of other terms - it may be difficult to assess the quality of the related terms in order to select the correct term. In the second scenario the user browses the whole thesaurus, which is organized as in an alphabetically ordered list. The problem with this approach is that the list may be long, and neither does it show users the global semantic relationship among all the listed terms.

Types