Search (5 results, page 1 of 1)

  • author_ss:"Bodoff, D."
  1. Bodoff, D.: Test theory for evaluating reliability of IR test collections (2008) 0.03
    0.032625798 = product of:
      0.065251596 = sum of:
        0.065251596 = product of:
          0.13050319 = sum of:
            0.13050319 = weight(_text_:theory in 2085) [ClassicSimilarity], result of:
              0.13050319 = score(doc=2085,freq=14.0), product of:
                0.21471956 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.05163523 = queryNorm
                0.6077844 = fieldWeight in 2085, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2085)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
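
    The breakdown above is standard Lucene ClassicSimilarity (TF-IDF) explain output. As a minimal sketch of how the 0.03 total arises (our reconstruction: the constants are copied from the tree; the combining product is Lucene's documented tf x idf x norms formula), the following Python reproduces it:

      import math

      # Constants copied from the explain tree for doc 2085, term "theory"
      freq = 14.0              # termFreq
      idf = 4.1583924          # ln(44218 / (1878 + 1)) + 1
      query_norm = 0.05163523  # queryNorm
      field_norm = 0.0390625   # fieldNorm, a quantized 1/sqrt(field length)
      coord = 0.5 * 0.5        # coord(1/2) applied at two levels

      tf = math.sqrt(freq)                  # 3.7416575
      query_weight = idf * query_norm       # 0.21471956 = queryWeight
      field_weight = tf * idf * field_norm  # 0.6077844  = fieldWeight
      score = query_weight * field_weight * coord
      print(f"{score:.9f}")                 # ~0.032625798, the total shown above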
    
    Abstract
    Classical test theory offers theoretically derived reliability measures such as Cronbach's alpha, which can be applied to measure the reliability of a set of information retrieval test results. The theory also supports item analysis, which identifies queries that hamper the test's reliability and may be candidates for refinement or removal. A generalization of classical test theory, called Generalizability Theory, provides an even richer set of tools. It allows us to estimate the reliability of a test as a function of the number of queries, assessors (relevance judges), and other aspects of the test's design. One novel aspect of Generalizability Theory is that it allows this estimation of reliability even before the test collection exists, based purely on the numbers of queries and assessors that it will contain. These calculations can help test designers in advance by allowing them to compare the reliability of test designs with various numbers of queries and relevance assessors, and to spend their limited budgets on a design that maximizes reliability. Empirical analysis shows that, in cases for which our data are representative, having more queries is more helpful for reliability than having more assessors. It also suggests that, where appropriate, reliability may be improved by a per-document performance measure rather than a document-set-based one. The theory also clarifies the implicit debate in the IR literature regarding the nature of error in relevance judgments.
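
    As a hedged illustration of the measures the abstract names, the Python sketch below is our own construction, not code or data from the paper: Cronbach's alpha computed from a hypothetical systems-by-queries score matrix, plus a relative G coefficient for an assumed fully crossed systems x queries x assessors design, which projects reliability for planned numbers of queries and assessors before any collection exists. The AP matrix and variance components are invented for illustration.

      import numpy as np

      def cronbach_alpha(scores: np.ndarray) -> float:
          """Cronbach's alpha for a subjects-by-items score matrix.

          Here rows are retrieval systems ("subjects") and columns are
          queries ("items"); cells hold a per-query effectiveness score.
          """
          k = scores.shape[1]
          item_vars = scores.var(axis=0, ddof=1)      # per-query variance
          total_var = scores.sum(axis=1).var(ddof=1)  # variance of row totals
          return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

      def g_coefficient(var_s, var_sq, var_sa, var_res, n_q, n_a):
          """Relative G coefficient for a crossed systems x queries x
          assessors design: reliability as a function of n_q and n_a,
          which lets a designer compare candidate test designs before
          the collection exists."""
          rel_error = var_sq / n_q + var_sa / n_a + var_res / (n_q * n_a)
          return var_s / (var_s + rel_error)

      # Hypothetical 4 systems x 5 queries average-precision matrix
      ap = np.array([
          [0.31, 0.42, 0.18, 0.55, 0.27],
          [0.28, 0.39, 0.22, 0.48, 0.25],
          [0.45, 0.51, 0.30, 0.62, 0.41],
          [0.12, 0.20, 0.09, 0.33, 0.15],
      ])
      print(cronbach_alpha(ap))  # ~0.99 for this toy matrix

      # With an assumed dominant system x query component, spending a
      # fixed budget (n_q * n_a = 100) on queries beats assessors:
      print(g_coefficient(0.02, 0.04, 0.001, 0.01, n_q=50, n_a=2))  # ~0.935
      print(g_coefficient(0.02, 0.04, 0.001, 0.01, n_q=25, n_a=4))  # ~0.911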
  2. Bodoff, D.: Emergence of terminological conventions as a searcher-indexer coordination game (2009) 0.02
    0.020927068 = product of:
      0.041854136 = sum of:
        0.041854136 = product of:
          0.08370827 = sum of:
            0.08370827 = weight(_text_:theory in 3299) [ClassicSimilarity], result of:
              0.08370827 = score(doc=3299,freq=4.0), product of:
                0.21471956 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.05163523 = queryNorm
                0.3898493 = fieldWeight in 3299, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3299)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In the traditional model of information retrieval, searchers and indexers choose query and index terms, respectively, and these term choices are ultimately compared in a matching process. One of the main challenges in information science and information retrieval is that searchers and indexers often do not choose the same term even though the item is relevant to the need, whereas at other times they do choose the same term even though it is not relevant. But if both searchers and indexers have the opportunity to review feedback data showing the success or failure of their previous term choices, then there exists an evolutionary force that, all else being equal, leads to helpful convergence of searchers' and indexers' term usage when the information is relevant, and to helpful divergence when it is not. Based on learning theory, and on new theory presented here, it is possible to predict which terms will emerge as the terminological conventions used by groups of searchers and by the indexers of relevant and nonrelevant information items.
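
    A minimal simulation sketch of the dynamic described above (our own construction, not the paper's model): both players pick a term for the same concept, are rewarded when their choices match on a relevant item, and reinforce successful choices Roth-Erev style. One term reliably emerges as the convention.

      import random

      TERMS = ["car", "automobile"]  # hypothetical competing terms

      def choose(propensity):
          """Sample a term with probability proportional to its propensity."""
          r = random.uniform(0.0, sum(propensity.values()))
          for term, weight in propensity.items():
              r -= weight
              if r <= 0.0:
                  return term
          return term  # guard against floating-point rounding

      searcher = {t: 1.0 for t in TERMS}
      indexer = {t: 1.0 for t in TERMS}

      for _ in range(2000):
          q, d = choose(searcher), choose(indexer)
          if q == d:              # match on a relevant item: positive feedback
              searcher[q] += 1.0  # both sides reinforce the term that worked
              indexer[d] += 1.0

      print(searcher, indexer)    # one term's propensity dominates: a convention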
  3. Bodoff, D.: A re-unification of two competing models for document retrieval (1999) 0.02
    0.01726395 = product of:
      0.0345279 = sum of:
        0.0345279 = product of:
          0.0690558 = sum of:
            0.0690558 = weight(_text_:theory in 2951) [ClassicSimilarity], result of:
              0.0690558 = score(doc=2951,freq=2.0), product of:
                0.21471956 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.05163523 = queryNorm
                0.32160926 = fieldWeight in 2951, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2951)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Two competing approaches to probabilistic document retrieval were first identified by Robertson, Maron and Cooper (1982). The difficulty of unifying those approaches was introduced as the problem of reconciling query-focused with document-focused retrieval, and an approach towards unification was offered that rests on a re-conceptualization of the meaning of term weight estimates. In this work, we propose a new unified model. The unification problem is re-framed as resulting from a lack of theory regarding the relationship between two sorts of data, absolute and relative. The new unified model is valid even under traditional interpretations of term estimates.
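
    To make the query-focused/document-focused contrast concrete (our notation, a standard reading of Robertson, Maron and Cooper's two models, not the paper's own formalism), the two approaches estimate different conditionals from different populations:

      % Document-focused (Maron-Kuhns lineage): fix a document d and
      % estimate its relevance probability across the population of queries.
      \hat{p}_1(d) \;\approx\; P(R \mid D = d), \quad \text{estimated over } q \sim Q
      % Query-focused (Robertson-Sparck Jones lineage): fix a query q and
      % estimate across the population of documents.
      \hat{p}_2(q) \;\approx\; P(R \mid Q = q), \quad \text{estimated over } d \sim D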
  4. Bodoff, D.; Wong, S.P.-S.: Documents and queries as random variables : history and implications (2006) 0.01
    0.014797671 = product of:
      0.029595342 = sum of:
        0.029595342 = product of:
          0.059190683 = sum of:
            0.059190683 = weight(_text_:theory in 193) [ClassicSimilarity], result of:
              0.059190683 = score(doc=193,freq=2.0), product of:
                0.21471956 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.05163523 = queryNorm
                0.27566507 = fieldWeight in 193, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.046875 = fieldNorm(doc=193)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The view of documents and/or queries as random variables is gaining importance in the theory of information retrieval. We argue that traditional probabilistic models consider documents and queries as random variables, but that newer models, such as language modeling and our unified model, take this one step further. The additional step is called error in predictors. Such models consider that we do not observe the document and query random variables that are modeled to predict relevance probabilistically; rather, there are additional random variables, which are the observed documents and queries. We discuss some important implications of this idea for parameter estimation, relevance prediction, and even test-collection construction. By clarifying the positions of various probabilistic models on this question, and by presenting in one place many of its implications, this article aims to deepen our common understanding of the theories behind traditional probabilistic models, and to strengthen the theoretical basis for further development of more recent approaches such as language modeling.
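
    One compact way to write the "error in predictors" idea (our hedged formalization, not an equation from the paper): relevance is modeled on latent document and query variables (d, q), while only noisy realizations (\tilde{d}, \tilde{q}) are observed, so relevance prediction marginalizes over the latent variables:

      P(R \mid \tilde{d}, \tilde{q})
        \;=\; \int P(R \mid d, q)\, p(d, q \mid \tilde{d}, \tilde{q})\, \mathrm{d}d\, \mathrm{d}q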
  5. Bodoff, D.; Raban, D.: Question types and intermediary elicitations (2016) 0.01
    0.008744827 = product of:
      0.017489653 = sum of:
        0.017489653 = product of:
          0.034979306 = sum of:
            0.034979306 = weight(_text_:22 in 2638) [ClassicSimilarity], result of:
              0.034979306 = score(doc=2638,freq=2.0), product of:
                0.18081778 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05163523 = queryNorm
                0.19345059 = fieldWeight in 2638, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2638)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2016 11:58:25