Search (4 results, page 1 of 1)

  • author_ss:"Mostafa, J."
  • language_ss:"e"
  • type_ss:"a"
  • year_i:[2000 TO 2010}
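The relevance figure attached to each result below is Lucene's ClassicSimilarity "explain" output. As a minimal sketch, assuming Lucene's documented ClassicSimilarity formulas rather than this catalog's actual code, the per-term weights can be recomputed from the displayed components; the values below are copied from the "digital" clause of result 1.

```python
import math

def classic_term_weight(raw_freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one term's weight from the components ClassicSimilarity reports."""
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))   # idf(docFreq, maxDocs)
    tf = math.sqrt(raw_freq)                             # tf(freq)
    query_weight = idf * query_norm                      # queryWeight
    field_weight = tf * idf * field_norm                 # fieldWeight
    return query_weight * field_weight                   # weight(_text_:term)

# Values copied from the "digital" clause of result 1 (doc 3559):
w = classic_term_weight(raw_freq=2.0, doc_freq=2326, max_docs=44218,
                        query_norm=0.050121464, field_norm=0.046875)
print(f"{w:.8f}")   # ~0.05169820, matching the displayed weight
```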
  1. Mukhopadhyay, S.; Peng, S.; Raje, R.; Mostafa, J.; Palakal, M.: Distributed multi-agent information filtering : a comparative study (2005) 0.04
    0.04064859 = product of:
      0.08129718 = sum of:
        0.051698197 = weight(_text_:digital in 3559) [ClassicSimilarity], result of:
          0.051698197 = score(doc=3559,freq=2.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.26148933 = fieldWeight in 3559, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.046875 = fieldNorm(doc=3559)
        0.029598987 = product of:
          0.059197973 = sum of:
            0.059197973 = weight(_text_:project in 3559) [ClassicSimilarity], result of:
              0.059197973 = score(doc=3559,freq=2.0), product of:
                0.21156175 = queryWeight, product of:
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.050121464 = queryNorm
                0.27981415 = fieldWeight in 3559, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.220981 = idf(docFreq=1764, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3559)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
     Information filtering is a technique to identify, in large collections, information that is relevant according to some criteria (e.g., a user's personal interests, or a research project objective). As such, it is a key technology for providing efficient user services in any large-scale information infrastructure, e.g., digital libraries. To provide large-scale information filtering services, both computational and knowledge management issues need to be addressed. A centralized (single-agent) approach to information filtering suffers from serious drawbacks in terms of speed, accuracy, and economic considerations, and becomes unrealistic even for medium-scale applications. In this article, we discuss two distributed (multiagent) information filtering approaches, distributed with respect to knowledge or functionality, to overcome the limitations of single-agent centralized information filtering. Large-scale experimental studies involving the well-known TREC data set are also presented to illustrate the advantages of distributed filtering as well as to compare the different distributed approaches.
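For orientation, the sketch below is a minimal, hypothetical version of the centralized (single-agent) baseline the abstract contrasts with: a content-based filter that scores incoming documents against a user-profile term vector by cosine similarity. It is not the authors' multi-agent system, and the profile, documents, and threshold are invented for illustration; the paper's approaches distribute this kind of work by knowledge or by functionality.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def filter_stream(profile: Counter, documents, threshold=0.2):
    """Yield only the documents whose term vectors are close enough to the profile."""
    for doc_id, text in documents:
        if cosine(profile, Counter(text.lower().split())) >= threshold:
            yield doc_id

profile = Counter("distributed information filtering agents".split())
docs = [("d1", "Distributed agents for information filtering"),
        ("d2", "Gardening tips for spring")]
print(list(filter_stream(profile, docs)))   # ['d1']
```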
  2. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.03
    0.029829983 = product of:
      0.059659965 = sum of:
        0.043081827 = weight(_text_:digital in 1211) [ClassicSimilarity], result of:
          0.043081827 = score(doc=1211,freq=8.0), product of:
            0.19770671 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.050121464 = queryNorm
            0.21790776 = fieldWeight in 1211, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
        0.016578136 = weight(_text_:library in 1211) [ClassicSimilarity], result of:
          0.016578136 = score(doc=1211,freq=6.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.12579368 = fieldWeight in 1211, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
      0.5 = coord(2/4)
    
    Abstract
     In this article, we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks. Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have attracted increasing attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model, and their variants are well established.
     Nevertheless, because thesaurus use has been shown to improve retrieval, for our method we integrate functions in the search interface that permit users to explore built-in search vocabularies to improve retrieval from digital libraries. Our method automatically generates the terms and their semantic relationships representing relevant topics covered in a digital library. We call these generated terms the "concepts", and we call the generated terms together with their semantic relationships the "concept space". Additionally, we used a visualization technique to display the concept space and allow users to interact with this space. The automatically generated term set is considered to be more representative of the subject area in a corpus than an "externally" imposed thesaurus, and our method also has the potential to save a significant amount of time and labor for those who have been creating thesauri manually. Information visualization is an emerging discipline that has developed very quickly over the last decade. With growing volumes of documents and associated complexities, information visualization has become increasingly important. Researchers have found information visualization to be an effective way to use and understand information while minimizing a user's cognitive load. Our work was based on an algorithmic approach to concept discovery and association. Concepts are discovered using an algorithm based on an automated thesaurus generation procedure. Subsequently, similarities among terms are computed using the cosine measure, and the associations among terms are established using a method known as max-min distance clustering. The concept space is then visualized in a spring embedding graph, which roughly shows the semantic relationships among concepts in a 2-D visual representation. The semantic space of the visualization is used as a medium for users to retrieve the desired documents. In the remainder of this article, we present our algorithmic approach to concept generation and clustering, followed by a description of the visualization technique and interactive interface. The paper ends with key conclusions and a discussion of future work.
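The two computational steps the abstract names, cosine similarity between terms and max-min distance clustering, can be illustrated with a small sketch. The toy term-document matrix, the number of seeds k, and the nearest-seed assignment below are assumptions chosen for illustration; the authors' exact vocabulary-generation algorithm and parameters are not reproduced.

```python
import numpy as np

def cosine_sim(m):
    """Pairwise cosine similarity between the row vectors of m."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    unit = m / np.where(norms == 0, 1, norms)
    return unit @ unit.T

def max_min_seeds(dist, k):
    """Pick k seeds: start from an arbitrary term, then repeatedly take the
    term whose minimum distance to the already-chosen seeds is largest."""
    seeds = [0]
    while len(seeds) < k:
        min_d = dist[:, seeds].min(axis=1)
        seeds.append(int(min_d.argmax()))
    return seeds

# Toy term-by-document frequency matrix (rows = terms, columns = documents).
terms = ["library", "archive", "metadata", "graph", "layout"]
m = np.array([[3, 2, 0, 0],
              [2, 3, 0, 0],
              [1, 2, 1, 0],
              [0, 0, 3, 2],
              [0, 0, 2, 3]], dtype=float)

dist = 1.0 - cosine_sim(m)            # cosine distance between terms
seeds = max_min_seeds(dist, k=2)      # two concept-cluster seeds
clusters = dist[:, seeds].argmin(axis=1)
for term, cluster in zip(terms, clusters):
    print(term, "-> cluster", cluster)
```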
  3. Mongin, L.; Fu, Y.Y.; Mostafa, J.: Open Archives Data Service prototype and automated subject indexing using D-Lib archive content as a testbed (2003) 0.01
    0.008121594 = product of:
      0.032486375 = sum of:
        0.032486375 = weight(_text_:library in 1167) [ClassicSimilarity], result of:
          0.032486375 = score(doc=1167,freq=4.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.24650425 = fieldWeight in 1167, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.046875 = fieldNorm(doc=1167)
      0.25 = coord(1/4)
    
    Abstract
     The Indiana University School of Library and Information Science opened a new research laboratory in January 2003: the Indiana University School of Library and Information Science Information Processing Laboratory [IU IP Lab]. The purpose of the new laboratory is to facilitate collaboration between scientists in the department in the areas of information retrieval (IR) and information visualization (IV) research. The lab has several areas of focus, including grid and cluster computing and a standard Java-based software platform to support plug-and-play research datasets, a selection of standard IR modules, and standard IV algorithms. Future development includes software to enable researchers to contribute datasets, IR algorithms, and visualization algorithms to the standard environment. We decided early on to use OAI-PMH as a resource discovery tool because it is consistent with our mission.
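Since the prototype builds on OAI-PMH for resource discovery, the sketch below shows a minimal ListRecords harvest of the kind such a testbed relies on. Only standard OAI-PMH 2.0 request parameters and Dublin Core namespaces are used; the endpoint URL is a placeholder, not the actual base URL of the D-Lib testbed.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://example.org/oai"   # placeholder repository endpoint
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_record_titles(base_url=BASE_URL):
    """Issue an OAI-PMH ListRecords request and yield Dublin Core titles."""
    query = urllib.parse.urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    with urllib.request.urlopen(f"{base_url}?{query}") as resp:
        root = ET.fromstring(resp.read())
    for record in root.iterfind(".//oai:record", NS):
        title = record.find(".//dc:title", NS)
        if title is not None:
            yield title.text

if __name__ == "__main__":
    for title in list_record_titles():
        print(title)
```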
  4. Mostafa, J.: Document search interface design : background and introduction to special topic section (2004) 0.01
    0.0057428335 = product of:
      0.022971334 = sum of:
        0.022971334 = weight(_text_:library in 2503) [ClassicSimilarity], result of:
          0.022971334 = score(doc=2503,freq=2.0), product of:
            0.1317883 = queryWeight, product of:
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.050121464 = queryNorm
            0.17430481 = fieldWeight in 2503, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6293786 = idf(docFreq=8668, maxDocs=44218)
              0.046875 = fieldNorm(doc=2503)
      0.25 = coord(1/4)
    
    Abstract
     A library user searching for high-quality and authoritative information today is confronted with thousands of resources that cover a wide variety of topics. The heterogeneity factor alone can be a major obstacle for the user in selecting appropriate resources to search. Depending on the information need, the user may have to navigate among resources that are in different formats (bibliographic versus full-text), are stored in different media (text versus images), have different levels of coverage (news versus scholarly reports), or are published in different languages. Beyond the heterogeneity factor, the user faces specific challenges related to the search experience itself. These factors and their impact on searching can best be described using a four-phase framework, namely: formulation, action, presentation, and refinement (Shneiderman, Byrd, & Croft, 1998). Certain key functions for document search interfaces are described below in the context of these four phases. Following the description, highlights from the contributed papers are discussed.