Search (7 results, page 1 of 1)

  • author_ss:"Mostafa, J."
  1. Seki, K.; Mostafa, J.: Gene ontology annotation as text categorization : an empirical study (2008) 0.00
    0.0022137975 = product of:
      0.01771038 = sum of:
        0.01771038 = product of:
          0.05313114 = sum of:
            0.05313114 = weight(_text_:problem in 2123) [ClassicSimilarity], result of:
              0.05313114 = score(doc=2123,freq=6.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.4061259 = fieldWeight in 2123, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2123)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
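     The explain tree above decomposes the score under Lucene's ClassicSimilarity (tf-idf): fieldWeight = tf · idf · fieldNorm, queryWeight = idf · queryNorm, and the product is scaled by the two coord factors. A minimal sketch reproducing the first result's score from those values (function and variable names are ours, not Lucene API identifiers):

```python
import math

def classic_similarity_score(freq, idf, query_norm, field_norm,
                             coord_a, coord_b):
    """Recompute one leaf of a ClassicSimilarity explain tree."""
    tf = math.sqrt(freq)                  # tf(freq) = sqrt(termFreq)
    query_weight = idf * query_norm       # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    # coord factors penalize documents matching only part of the query
    return field_weight * query_weight * coord_a * coord_b

score = classic_similarity_score(
    freq=6.0, idf=4.244485, query_norm=0.030822188,
    field_norm=0.0390625, coord_a=1 / 3, coord_b=1 / 8)
print(score)
```

Multiplying out gives 0.0022137975, matching the top-level value of the tree.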
    
    Abstract
     Gene ontology (GO) consists of three structured controlled vocabularies, i.e., GO domains, developed for describing attributes of gene products, and its annotation is crucial to provide a common gateway to access different model organism databases. This paper explores an effective application of text categorization methods to this highly practical problem in biology. As a first step, we attempt to tackle the automatic GO annotation task posed in the Text Retrieval Conference (TREC) 2004 Genomics Track. Given a pair of genes and an article reference where the genes appear, the task simulates assigning GO domain codes. We approach the problem with careful consideration of the specialized terminology and pay special attention to various forms of gene synonyms, so as to exhaustively locate the occurrences of the target gene. We extract the words around the spotted gene occurrences and use them to represent the gene for GO domain code annotation. We regard the task as a text categorization problem and adopt a variant of kNN with supervised term weighting schemes, making our method among the top-performing systems in the TREC official evaluation. Furthermore, we investigate different feature selection policies in conjunction with the treatment of terms associated with negative instances. Our experiments reveal that round-robin feature space allocation with eliminating negative terms substantially improves performance as GO terms become more specific.
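     The general kNN text-categorization idea the abstract describes can be sketched as follows. This is a hypothetical illustration only, not the authors' exact kNN variant or their supervised term-weighting scheme: a gene's context words are labeled with the GO code voted by the k most similar training contexts under cosine similarity.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two sparse term-count vectors."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_classify(query_words, training, k=3):
    """training: list of (word_list, go_code) pairs; majority vote of k nearest."""
    q = Counter(query_words)
    ranked = sorted(training,
                    key=lambda ex: cosine(q, Counter(ex[0])),
                    reverse=True)
    votes = Counter(code for _, code in ranked[:k])
    return votes.most_common(1)[0][0]

train = [
    (["kinase", "phosphorylation", "signal"], "molecular_function"),
    (["membrane", "transport", "vesicle"], "cellular_component"),
    (["kinase", "cascade", "signal"], "molecular_function"),
]
print(knn_classify(["signal", "kinase", "pathway"], train, k=3))
```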
  2. Mostafa, J.; Dillon, A.: Design and evaluation of a user interface supporting multiple image query models (1996) 0.00
    0.0015337638 = product of:
      0.012270111 = sum of:
        0.012270111 = product of:
          0.03681033 = sum of:
            0.03681033 = weight(_text_:problem in 7432) [ClassicSimilarity], result of:
              0.03681033 = score(doc=7432,freq=2.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.28137225 = fieldWeight in 7432, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7432)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
     For effective access to images, the design of the database interface must be based on principles that match the actual querying needs of users. Analysis of this design problem reveals that the query language must support utilization of both visual and verbal clues. The ViewFinder interface, designed as a client to a database server, supports querying based on both types of clues. Presents details of ViewFinder design. Describes results of usability analysis performed on ViewFinder with a group of 18 users. High search success rates were achieved (greater than 80%) through both querying means (visual and verbal). Users generally used more verbal clues than visual clues in searches.
  3. Mostafa, J.: Bessere Suchmaschinen für das Web (2006) 0.00
    0.0013983113 = product of:
      0.011186491 = sum of:
        0.011186491 = product of:
          0.016779736 = sum of:
            0.0084277745 = weight(_text_:29 in 4871) [ClassicSimilarity], result of:
              0.0084277745 = score(doc=4871,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.07773064 = fieldWeight in 4871, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.015625 = fieldNorm(doc=4871)
            0.008351962 = weight(_text_:22 in 4871) [ClassicSimilarity], result of:
              0.008351962 = score(doc=4871,freq=2.0), product of:
                0.10793405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030822188 = queryNorm
                0.07738023 = fieldWeight in 4871, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.015625 = fieldNorm(doc=4871)
          0.6666667 = coord(2/3)
      0.125 = coord(1/8)
    
    Date
    31.12.1996 19:29:41
    22. 1.2006 18:34:49
  4. Mostafa, J.: Digital image representation and access (1994) 0.00
    0.0012290506 = product of:
      0.009832405 = sum of:
        0.009832405 = product of:
          0.029497212 = sum of:
            0.029497212 = weight(_text_:29 in 1102) [ClassicSimilarity], result of:
              0.029497212 = score(doc=1102,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.27205724 = fieldWeight in 1102, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1102)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Source
    Annual review of information science and technology. 29(1994), S.91-135
  5. Lam, W.; Mostafa, J.: Modeling user interest shift using a Bayesian approach (2001) 0.00
    0.0012290506 = product of:
      0.009832405 = sum of:
        0.009832405 = product of:
          0.029497212 = sum of:
            0.029497212 = weight(_text_:29 in 2658) [ClassicSimilarity], result of:
              0.029497212 = score(doc=2658,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.27205724 = fieldWeight in 2658, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2658)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    29. 9.2001 13:58:28
  6. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.00
    0.0011068988 = product of:
      0.00885519 = sum of:
        0.00885519 = product of:
          0.02656557 = sum of:
            0.02656557 = weight(_text_:problem in 1211) [ClassicSimilarity], result of:
              0.02656557 = score(doc=1211,freq=6.0), product of:
                0.13082431 = queryWeight, product of:
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.030822188 = queryNorm
                0.20306295 = fieldWeight in 1211, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.244485 = idf(docFreq=1723, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=1211)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
     In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks.
     Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have gained more intensive attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model and their variants are well established.
     From the user's perspective, however, it is still difficult to use current information retrieval systems. Users frequently have problems expressing their information needs and translating those needs into queries. This is partly due to the fact that information needs cannot be expressed appropriately in systems terms. It is not unusual for users to input search terms that are different from the index terms information systems use. Various methods have been proposed to help users choose search terms and articulate queries. One widely used approach is to incorporate into the information system a thesaurus-like component that represents both the important concepts in a particular subject area and the semantic relationships among those concepts. Unfortunately, the development and use of thesauri is not without its own problems. The thesaurus employed in a specific information system has often been developed for a general subject area and needs significant enhancement to be tailored to the information system where it is to be used. This thesaurus development process, if done manually, is both time consuming and labor intensive. Usage of a thesaurus in searching is complex and may raise barriers for the user. For illustration purposes, let us consider two scenarios of thesaurus usage. In the first scenario the user inputs a search term and the thesaurus then displays a matching set of related terms. Without an overview of the thesaurus - and without the ability to see the matching terms in the context of other terms - it may be difficult to assess the quality of the related terms in order to select the correct term. In the second scenario the user browses the whole thesaurus, which is organized as an alphabetically ordered list. The problem with this approach is that the list may be long, nor does it show users the global semantic relationship among all the listed terms.
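     The max-min distance technique mentioned in the abstract is commonly realized as farthest-first seed selection: repeatedly pick the point whose nearest already-chosen seed is farthest away. A minimal sketch under that assumption (the distance function, starting point, and stopping rule are ours, not necessarily the authors' exact method):

```python
def max_min_seeds(points, k, dist):
    """Select k well-separated seeds by the max-min (farthest-first) rule."""
    seeds = [points[0]]                 # start from an arbitrary point
    while len(seeds) < k:
        # pick the point whose nearest seed is farthest away
        nxt = max((p for p in points if p not in seeds),
                  key=lambda p: min(dist(p, s) for s in seeds))
        seeds.append(nxt)
    return seeds

def euclid(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

pts = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(max_min_seeds(pts, 3, euclid))
```

Each selected seed can then anchor one concept cluster, with remaining concepts assigned to their nearest seed.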
  7. Zhang, Y.; Wu, D.; Hagen, L.; Song, I.-Y.; Mostafa, J.; Oh, S.; Anderson, T.; Shah, C.; Bishop, B.W.; Hopfgartner, F.; Eckert, K.; Federer, L.; Saltz, J.S.: Data science curriculum in the iField (2023) 0.00
    8.7789324E-4 = product of:
      0.007023146 = sum of:
        0.007023146 = product of:
          0.021069437 = sum of:
            0.021069437 = weight(_text_:29 in 964) [ClassicSimilarity], result of:
              0.021069437 = score(doc=964,freq=2.0), product of:
                0.108422816 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030822188 = queryNorm
                0.19432661 = fieldWeight in 964, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=964)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    12. 5.2023 14:29:42