Search (9 results, page 1 of 1)

  • author_ss:"Zobel, J."
  1. Uitdenbogerd, A.L.; Zobel, J.: An architecture for effective music information retrieval (2004) 0.01
    0.005757545 = product of:
      0.02303018 = sum of:
        0.02303018 = weight(_text_:information in 3055) [ClassicSimilarity], result of:
          0.02303018 = score(doc=3055,freq=10.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.2602176 = fieldWeight in 3055, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3055)
      0.25 = coord(1/4)
    
    Abstract
    We have explored methods for music information retrieval for polyphonic music stored in the MIDI format. These methods use a query, expressed as a series of notes that are intended to represent a melody or theme, to identify similar pieces. Our work has shown that a three-phase architecture is appropriate for this task in which the first phase is melody extraction, the second is standardization, and the third is query-to-melody matching. We have investigated and systematically compared algorithms for each of these phases. To ensure that our results are robust, we have applied methodologies that are derived from text information retrieval: We developed test collections and compared different ways of acquiring test queries and relevance judgments. In this article we review this program of work, compare it to other approaches to music information retrieval, and identify outstanding issues.
    Source
    Journal of the American Society for Information Science and Technology. 55(2004) no.12, S.1053-1057
  2. Shokouhi, M.; Zobel, J.; Tahaghoghi, S.; Scholer, F.: Using query logs to establish vocabularies in distributed information retrieval (2007) 0.01
    0.005149705 = product of:
      0.02059882 = sum of:
        0.02059882 = weight(_text_:information in 901) [ClassicSimilarity], result of:
          0.02059882 = score(doc=901,freq=8.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.23274569 = fieldWeight in 901, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=901)
      0.25 = coord(1/4)
    
    Abstract
    Users of search engines express their needs as queries, typically consisting of a small number of terms. The resulting search engine query logs are valuable resources that can be used to predict how people interact with the search system. In this paper, we introduce two novel applications of query logs, in the context of distributed information retrieval. First, we use query log terms to guide sampling from uncooperative distributed collections. We show that while our sampling strategy is at least as efficient as current methods, it consistently performs better. Second, we propose and evaluate a pruning strategy that uses query log information to eliminate terms. Our experiments show that our proposed pruning method maintains the accuracy achieved by complete indexes, while decreasing the index size by up to 60%. While such pruning may not always be desirable in practice, it provides a useful benchmark against which other pruning strategies can be measured.
    Source
    Information processing and management. 43(2007) no.1, S.169-180
  3. Hoad, T.C.; Zobel, J.: Methods for identifying versioned and plagiarized documents (2003) 0.00
    0.0030344925 = product of:
      0.01213797 = sum of:
        0.01213797 = weight(_text_:information in 5159) [ClassicSimilarity], result of:
          0.01213797 = score(doc=5159,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.13714671 = fieldWeight in 5159, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5159)
      0.25 = coord(1/4)
    
    Abstract
    Hoad and Zobel term documents that originate from the same source, whether versions or plagiarisms, co-derivatives. Co-derivatives are normally identified either by fingerprinting, a technique that uses hashing to generate surrogates in the form of integer strings derived from substrings of text for comparison purposes, or by ranking with a similarity measure as in information retrieval. Hoad and Zobel derive several variants of what they term an identity measure for use in a ranking technique; it rewards documents with similar numbers of occurrences of words and penalizes those with dissimilar numbers. They then review fingerprinting strategies and characterize them by the substring size utilized (granularity), the character of the hashing function, the size of the document fingerprint (resolution), and the substring selection strategy. Their experiments used two measures: highest false match (HFM), the highest percentage score given to an incorrect result, and separation, the difference between the lowest correct result and HFM. These were applied to two collections, one of 3,300 documents and the other of 80,000 with 53 query documents. The new identity measure demonstrates superior performance to the alternatives. The anchor strategy was the only fingerprinting strategy able to identify all of the similar documents identified by humans. The key parameter in fingerprinting appears to be granularity, with three to five words producing the best results.
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.3, S.203-215
  4. Bell, T.C.; Moffat, A.; Nevill-Manning, C.G.; Witten, I.H.; Zobel, J.: Data compression in full-text retrieval systems (1993) 0.00
    0.0030039945 = product of:
      0.012015978 = sum of:
        0.012015978 = weight(_text_:information in 5643) [ClassicSimilarity], result of:
          0.012015978 = score(doc=5643,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.13576832 = fieldWeight in 5643, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5643)
      0.25 = coord(1/4)
    
    Source
    Journal of the American Society for Information Science. 44(1993) no.9, S.508-531
  5. Persin, M.; Zobel, J.; Sacks-Davis, R.: Filtered document retrieval with frequency-sorted indexes (1996) 0.00
    0.0030039945 = product of:
      0.012015978 = sum of:
        0.012015978 = weight(_text_:information in 6758) [ClassicSimilarity], result of:
          0.012015978 = score(doc=6758,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.13576832 = fieldWeight in 6758, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6758)
      0.25 = coord(1/4)
    
    Source
    Journal of the American Society for Information Science. 47(1996) no.10, S.749-764
  6. Heinz, S.; Zobel, J.: Efficient single-pass index construction for text databases (2003) 0.00
    0.0030039945 = product of:
      0.012015978 = sum of:
        0.012015978 = weight(_text_:information in 1678) [ClassicSimilarity], result of:
          0.012015978 = score(doc=1678,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.13576832 = fieldWeight in 1678, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1678)
      0.25 = coord(1/4)
    
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.8, S.713-729
  7. Kaszkiel, M.; Zobel, J.: Effective ranking with arbitrary passages (2001) 0.00
    0.0025748524 = product of:
      0.01029941 = sum of:
        0.01029941 = weight(_text_:information in 5764) [ClassicSimilarity], result of:
          0.01029941 = score(doc=5764,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.116372846 = fieldWeight in 5764, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5764)
      0.25 = coord(1/4)
    
    Source
    Journal of the American Society for Information Science and Technology. 52(2001) no.4, S.344-364
  8. Moffat, A.; Zobel, J.: Self-indexing inverted files for fast text retrieval (1996) 0.00
    0.0025748524 = product of:
      0.01029941 = sum of:
        0.01029941 = weight(_text_:information in 9) [ClassicSimilarity], result of:
          0.01029941 = score(doc=9,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.116372846 = fieldWeight in 9, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=9)
      0.25 = coord(1/4)
    
    Source
    ACM transactions on information systems. 14(1996) no.4, S.349-379
  9. Hawking, D.; Zobel, J.: Does topic metadata help with Web search? (2007) 0.00
    0.0021457102 = product of:
      0.008582841 = sum of:
        0.008582841 = weight(_text_:information in 204) [ClassicSimilarity], result of:
          0.008582841 = score(doc=204,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.09697737 = fieldWeight in 204, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=204)
      0.25 = coord(1/4)
    
    Source
    Journal of the American Society for Information Science and Technology. 58(2007) no.5, S.613-628
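
The score breakdowns listed under each result follow the same pattern and can be reproduced from the values they print. The sketch below is a minimal illustration, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function name and parameter names are hypothetical and are not part of any library API. It uses Python double precision, whereas the listing appears to use single-precision floats, so the last digits may differ slightly.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, field_norm,
                             query_norm, matched_clauses, total_clauses):
    """Recompute one explain tree from the listing (hypothetical helper)."""
    tf = math.sqrt(freq)                              # e.g. 3.1622777 for freq=10
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # e.g. 1.7554779
    query_weight = idf * query_norm                   # "queryWeight" line
    field_weight = tf * idf * field_norm              # "fieldWeight" line
    coord = matched_clauses / total_clauses           # "coord(1/4)" line
    return query_weight * field_weight * coord


# Values copied from result 1 (doc=3055) in the listing above.
score = classic_similarity_score(
    freq=10.0, doc_freq=20772, max_docs=44218,
    field_norm=0.046875, query_norm=0.050415643,
    matched_clauses=1, total_clauses=4,
)
print(f"{score:.5f}")
```

With the inputs for doc=3055 the result agrees with the listed 0.005757545 to within floating-point rounding. Because idf and queryNorm are identical across all nine results, the ranking here is driven only by term frequency and fieldNorm.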