Search (7 results, page 1 of 1)

Aqeel, S.U.; Beitzel, S.M.; Jensen, E.C.; Grossman, D.; Frieder, O.: On the development of name search techniques for Arabic (2006) 0.04

0.03754639 = product of:
  0.07509278 = sum of:
    0.07509278 = sum of:
      0.03267146 = weight(_text_:systems in 5289) [ClassicSimilarity], result of:
        0.03267146 = score(doc=5289,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.2037246 = fieldWeight in 5289, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.046875 = fieldNorm(doc=5289)
      0.042421322 = weight(_text_:22 in 5289) [ClassicSimilarity], result of:
        0.042421322 = score(doc=5289,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.23214069 = fieldWeight in 5289, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=5289)
  0.5 = coord(1/2)
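
The arithmetic in the score tree above can be reproduced by hand. The following is a minimal sketch, not the catalogue's actual code: the function names are made up for illustration, and the `idf` formula (`1 + ln(maxDocs / (docFreq + 1))`) and the square-root `tf` are assumptions chosen because they match the numbers printed in the tree for `[ClassicSimilarity]`. `queryNorm`, `fieldNorm`, and `coord` are simply copied from the tree rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed formula; it reproduces idf(docFreq=5561, maxDocs=44218) = 3.0731742
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Assumed formula; tf(freq=2.0) = 1.4142135 in the tree
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as laid out in the tree
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm              # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the tree for doc 5289
QUERY_NORM = 0.052184064
MAX_DOCS = 44218
FIELD_NORM = 0.046875

systems = term_score(2.0, 5561, MAX_DOCS, QUERY_NORM, FIELD_NORM)
twenty_two = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The two term weights are summed, then scaled by coord(1/2) = 0.5
total = (systems + twenty_two) * 0.5

print(f"systems={systems:.8f} 22={twenty_two:.8f} total={total:.8f}")
```

The printed values should agree with the tree to about six decimal places; small differences in the last digits are expected because the tree was presumably computed in single precision. The other records nest a second `coord(1/2)` factor, which would be applied as one more multiplication by 0.5.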

Abstract: The need for effective identity matching systems has led to extensive research in the area of name search. For the most part, such work has been limited to English and other Latin-based languages. Consequently, algorithms such as Soundex and n-gram matching are of limited utility for languages such as Arabic, which has vastly different morphologic features that rely heavily on phonetic information. The dearth of work in this field is partly caused by the lack of standardized test data. Consequently, we have built a collection of 7,939 Arabic names, along with 50 training queries and 111 test queries. We use this collection to evaluate a variety of algorithms, including a derivative of Soundex tailored to Arabic (ASOUNDEX), measuring effectiveness by using standard information retrieval measures. Our results show an improvement of 70% over existing approaches.
Date: 22. 7.2006 17:20:20

Lundquist, C.; Frieder, O.; Holmes, D.O.; Grossman, D.: A parallel relational database management system approach to relevance feedback in information retrieval (1999) 0.02

0.021210661 = product of:
  0.042421322 = sum of:
    0.042421322 = product of:
      0.084842645 = sum of:
        0.084842645 = weight(_text_:22 in 4303) [ClassicSimilarity], result of:
          0.084842645 = score(doc=4303,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.46428138 = fieldWeight in 4303, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=4303)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 17. 1.2000 12:22:18

Yee, W.G.; Nguyen, L.T.; Frieder, O.: A view of the data on P2P file-sharing systems (2009) 0.02

0.018862877 = product of:
  0.037725754 = sum of:
    0.037725754 = product of:
      0.07545151 = sum of:
        0.07545151 = weight(_text_:systems in 3118) [ClassicSimilarity], result of:
          0.07545151 = score(doc=3118,freq=6.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.4704818 = fieldWeight in 3118, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=3118)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Peer-to-peer (P2P) file sharing is a leading Internet application. Millions of users use P2P file-sharing systems daily to search for and download files, accounting for a large portion of Internet traffic. Due to their scale, it is important to fully understand how these systems work. We analyze user queries and shared files collected on the Gnutella system, draw some conclusions on the nature of the application, and propose some research problems.

Ruocco, A.S.; Frieder, O.: Clustering and classification of large document bases in a parallel environment (1997) 0.02
```
0.016505018 = product of:
  0.033010036 = sum of:
    0.033010036 = product of:
      0.06602007 = sum of:
        0.06602007 = weight(_text_:systems in 1661) [ClassicSimilarity], result of:
          0.06602007 = score(doc=1661,freq=6.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.41167158 = fieldWeight in 1661, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1661)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Proposes the use of parallel computing systems to overcome the computationally intense clustering process. Examines 2 operations: clustering a document set and classifying the document set. Uses a subset of the TIPSTER corpus, specifically, articles from the Wall Street Journal. Document set classification was performed without the large storage requirements for ancillary data matrices. The time performance of the parallel systems was an improvement over sequential system times, and produced the same clustering and classification scheme. Results show near-linear speedup in higher threshold clustering applications.

Fox, K.L.; Frieder, O.; Knepper, M.M.; Snowberg, E.J.: SENTINEL: a multiple engine information retrieval and visualization system (1999) 0.01
```
0.009529176 = product of:
  0.019058352 = sum of:
    0.019058352 = product of:
      0.038116705 = sum of:
        0.038116705 = weight(_text_:systems in 3547) [ClassicSimilarity], result of:
          0.038116705 = score(doc=3547,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.23767869 = fieldWeight in 3547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3547)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

We describe a prototype Information Retrieval system, SENTINEL, under development at Harris Corporation's Information Systems Division. SENTINEL is a fusion of multiple information retrieval technologies, integrating n-grams, a vector space model, and a neural network training rule. One of the primary advantages of SENTINEL is its 3-dimensional visualization capability that is based fully upon the mathematical representation of information with SENTINEL. The 3-dimensional visualization capability provides users with an intuitive understanding, with relevance/query refinement techniques that can be better utilized, resulting in higher retrieval precision.

Grossman, D.A.; Frieder, O.: Information retrieval : algorithms and heuristics (2004) 0.01

0.007700737 = product of:
  0.015401474 = sum of:
    0.015401474 = product of:
      0.030802948 = sum of:
        0.030802948 = weight(_text_:systems in 1486) [ClassicSimilarity], result of:
          0.030802948 = score(doc=1486,freq=4.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.19207339 = fieldWeight in 1486, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03125 = fieldNorm(doc=1486)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

LCSH: Information storage and retrieval systems
Subject: Information storage and retrieval systems

Aljlayl, M.; Frieder, O.; Grossman, D.: On bidirectional English-Arabic search (2002) 0.01
```
0.0068065543 = product of:
  0.013613109 = sum of:
    0.013613109 = product of:
      0.027226217 = sum of:
        0.027226217 = weight(_text_:systems in 5227) [ClassicSimilarity], result of:
          0.027226217 = score(doc=5227,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.1697705 = fieldWeight in 5227, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5227)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Aljlayl, Frieder, and Grossman review machine translation of query methodologies and apply them to English-Arabic/Arabic-English Cross-Language Information Retrieval. In the dictionary method, replacement of each term with all possible equivalents in the target language results in considerable ambiguity, while taking the first term in the dictionary list reduces the ambiguity but may fail to capture the meaning. A Two-Phase method takes all possible equivalents and translates them back, retaining only those that generate the original term. It results in an average query length of six terms in TREC7 and 12 in TREC9. Arabic to English translations consistently performed below the original English queries, and the Two-Phase method consistently performed at the highest level and significantly better than the Every-Match method. Machine translation using other techniques is economical for queries but not likely so for documents. Using ALKAFI, a commercial translation system from Arabic to English and the Al-Mutarjim Al-Arabey system for English to Arabic, nearly 60% of monolingual retrievals were generated going from Arabic to English. Smaller numbers of terms in the source query improve performance, and these systems require syntactically well-formed queries for good performance.
