Search (2 results, page 1 of 1)

Chowdhury, A.; Mccabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.05

0.054811984 = product of:
  0.1461653 = sum of:
    0.0953384 = weight(_text_:storage in 1061) [ClassicSimilarity], result of:
      0.0953384 = score(doc=1061,freq=4.0), product of:
        0.1866346 = queryWeight, product of:
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.034252144 = queryNorm
        0.51082915 = fieldWeight in 1061, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.046875 = fieldNorm(doc=1061)
    0.029382274 = weight(_text_:retrieval in 1061) [ClassicSimilarity], result of:
      0.029382274 = score(doc=1061,freq=4.0), product of:
        0.10360982 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.034252144 = queryNorm
        0.2835858 = fieldWeight in 1061, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1061)
    0.021444622 = weight(_text_:systems in 1061) [ClassicSimilarity], result of:
      0.021444622 = score(doc=1061,freq=2.0), product of:
        0.10526281 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.034252144 = queryNorm
        0.2037246 = fieldWeight in 1061, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.046875 = fieldNorm(doc=1061)
  0.375 = coord(3/8)
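
The explain tree above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which are consistent with the values shown; queryNorm, fieldNorm and coord are copied from the output rather than derived. The names `idf`, `term_score` and `terms` are illustrative and not part of any library. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

# Constants copied from the explain output for doc 1061.
MAX_DOCS = 44218
QUERY_NORM = 0.034252144
FIELD_NORM = 0.046875
COORD = 3 / 8  # 3 of the 8 query clauses matched


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity form."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight


# (termFreq, docFreq) per matching term, read off the tree above.
terms = {
    "storage": (4.0, 516),
    "retrieval": (4.0, 5836),
    "systems": (2.0, 5561),
}

total = sum(term_score(freq, doc_freq) for freq, doc_freq in terms.values())
score = total * COORD

print(f"sum of term scores: {total:.7f}")
print(f"final score:        {score:.7f}")
```

Because idf enters both queryWeight and fieldWeight, each term contributes in proportion to idf squared. This is why the rare term "storage" (docFreq=516) dominates the sum even though "retrieval" occurs just as often in the record.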

Abstract: The object of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In this paper we evaluate the use of Part of Speech Tagging to improve the index storage overhead and general speed of the system with only a minimal reduction in precision/recall measurements. We tagged 500 MB of the Los Angeles Times 1989 and 1990 document collection provided by TREC for parts of speech. We then experimented to find the most relevant part of speech to index. We show that 90% of precision/recall is achieved with 40% of the document collection's terms. We also show that this is an improvement in overhead with only a 1% reduction in precision/recall.

Plotkin, R.C.; Schwartz, M.S.: Data modeling for news clip archive : a prototype solution (1997) 0.05

0.04700025 = product of:
  0.12533401 = sum of:
    0.067414425 = weight(_text_:storage in 1259) [ClassicSimilarity], result of:
      0.067414425 = score(doc=1259,freq=2.0), product of:
        0.1866346 = queryWeight, product of:
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.034252144 = queryNorm
        0.36121076 = fieldWeight in 1259, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.4488444 = idf(docFreq=516, maxDocs=44218)
          0.046875 = fieldNorm(doc=1259)
    0.020776404 = weight(_text_:retrieval in 1259) [ClassicSimilarity], result of:
      0.020776404 = score(doc=1259,freq=2.0), product of:
        0.10360982 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.034252144 = queryNorm
        0.20052543 = fieldWeight in 1259, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1259)
    0.037143175 = weight(_text_:systems in 1259) [ClassicSimilarity], result of:
      0.037143175 = score(doc=1259,freq=6.0), product of:
        0.10526281 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.034252144 = queryNorm
        0.35286134 = fieldWeight in 1259, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.046875 = fieldNorm(doc=1259)
  0.375 = coord(3/8)

Abstract: Film, videotape and multimedia archive systems must address the issues of editing, authoring and searching at the media (i.e., tape) or sub-media (i.e., scene) level, in addition to the traditional inventory management capabilities associated with the physical media. This paper describes a prototype of a database design for the storage, search and retrieval of multimedia and its related information. It also provides a process by which legacy data can be imported into this schema. The prototype is named the Continuous Media Index, or Comix, system. An implementation of such a digital library solution incorporates multimedia objects, hierarchical relationships and timecode in addition to traditional attribute data. Present video and multimedia archive systems are easily migrated to this architecture. Comix was implemented for a videotape archiving system. It was written for, and implemented using, IBM Digital Library version 1.0. A derivative of Comix is currently in development for customer-specific applications. The principles of the Comix design, as well as the importation methods, are not specific to the underlying systems used.
