Search (13 results, page 1 of 1)

  • author_ss:"Bodoff, D."
  1. Bodoff, D.; Raban, D.: Question types and intermediary elicitations (2016) 0.02
    0.016163789 = product of:
      0.032327577 = sum of:
        0.032327577 = sum of:
          0.0067836978 = weight(_text_:a in 2638) [ClassicSimilarity], result of:
            0.0067836978 = score(doc=2638,freq=12.0), product of:
              0.043477926 = queryWeight, product of:
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.037706986 = queryNorm
              0.15602624 = fieldWeight in 2638, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                1.153047 = idf(docFreq=37942, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2638)
          0.02554388 = weight(_text_:22 in 2638) [ClassicSimilarity], result of:
            0.02554388 = score(doc=2638,freq=2.0), product of:
              0.13204344 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.037706986 = queryNorm
              0.19345059 = fieldWeight in 2638, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2638)
      0.5 = coord(1/2)
    
    Abstract
    In the context of online question-answering services, an intermediary clarifies the user's needs by eliciting additional information. This research proposes that these elicitations will depend on the type of question. In particular, this research explores the relationship between three constructs: question types, elicitations, and the fee that is paid for the answer. These relationships are explored for a few different question typologies, including a new kind of question type that we call Identity. It is found that the kinds of clarifications that intermediaries elicit depend on the type of question in systematic ways. A practical implication is that interactive question-answering services, whether human or automated, can be steered to focus attention on the kinds of clarification that are evidently most needed for that question type. Further, it is found that certain question types, as well as the number of elicitations, are associated with higher fees. This means that it may be possible to define a pricing structure for question-answering services based on objective and predictable characteristics of the question, which would help to establish a rational market for this type of information service. The newly introduced Identity question type was found to be especially reliable in predicting elicitations and fees.
    Date
    22.1.2016 11:58:25
    Type
    a
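
    The ClassicSimilarity breakdown shown for this entry can be checked by hand: for each matching term, fieldWeight = sqrt(freq) * idf * fieldNorm and queryWeight = idf * queryNorm; the term scores are summed and scaled by the coord() factor. Below is a minimal sketch in Python that reproduces the figures using only the values displayed in the explanation above; the same arithmetic underlies the score breakdowns of the remaining entries.

      import math

      # Values copied from the explain output for doc 2638 above.
      query_norm = 0.037706986
      field_norm = 0.0390625   # length normalization for this field
      coord = 0.5              # coord(1/2): one of two top-level clauses matched

      def term_score(freq, idf):
          tf = math.sqrt(freq)                    # 3.4641016 for freq=12, 1.4142135 for freq=2
          field_weight = tf * idf * field_norm    # e.g. 0.15602624 for _text_:a
          query_weight = idf * query_norm         # e.g. 0.043477926 for _text_:a
          return query_weight * field_weight

      score_a  = term_score(12.0, 1.153047)       # ~0.0067837  (weight(_text_:a ...))
      score_22 = term_score(2.0, 3.5018296)       # ~0.0255439  (weight(_text_:22 ...))

      print((score_a + score_22) * coord)         # ~0.016163789, the document's final score
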
  2. Bodoff, D.: Relevance for browsing, relevance for searching (2006) 0.00
    0.0025645308 = product of:
      0.0051290616 = sum of:
        0.0051290616 = product of:
          0.010258123 = sum of:
            0.010258123 = weight(_text_:a in 4909) [ClassicSimilarity], result of:
              0.010258123 = score(doc=4909,freq=14.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.23593865 = fieldWeight in 4909, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4909)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The concept of relevance has received a great deal of theoretical attention. Separately, the relationship between focused search and browsing has also received extensive theoretical attention. This article aims to integrate these two literatures with a model and an empirical study that relate relevance in focused searching to relevance in browsing. Some factors affect both kinds of relevance in the same direction; others affect them in different ways. In our empirical study, we find that the latter factors dominate, so that there is actually a negative correlation between the probability of a document's relevance to a browsing user and its probability of relevance to a focused searcher.
    Type
    a
  3. Bodoff, D.: A re-unification of two competing models for document retrieval (1999) 0.00
    0.002374294 = product of:
      0.004748588 = sum of:
        0.004748588 = product of:
          0.009497176 = sum of:
            0.009497176 = weight(_text_:a in 2951) [ClassicSimilarity], result of:
              0.009497176 = score(doc=2951,freq=12.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.21843673 = fieldWeight in 2951, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2951)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Two competing approaches to document retrieval were first identified by Robertson, Maron and Cooper (1982) for probabilistic retrieval. The difficulty of unifying those approaches was introduced as a problem of reconciling query-focused with document-focused retrieval, and an approach towards unification was offered. That approach rests on a re-conceptualization of the meaning of term weight estimates. In this work, we propose a new unified model. The unification problem is re-framed as resulting from a lack of theory regarding the relationship to two sorts of data, absolute and relative. This new unified model is valid even for traditional interpretations of term estimates.
    Type
    a
  4. Bodoff, D.; Kambil, A.: Partial coordination : II. A preliminary evaluation and failure analysis (1998) 0.00
    0.0021981692 = product of:
      0.0043963385 = sum of:
        0.0043963385 = product of:
          0.008792677 = sum of:
            0.008792677 = weight(_text_:a in 2323) [ClassicSimilarity], result of:
              0.008792677 = score(doc=2323,freq=14.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.20223314 = fieldWeight in 2323, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2323)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Partial coordination is a new method for cataloging documents for subject access. It is especially designed to enhance the precision of document searches in online environments. This article reports a preliminary evaluation of partial coordination that shows promising results compared with full-text retrieval. We also report the difficulties in empirically evaluating the effectiveness of automatic full-text retrieval in contrast to mixed methods such as partial coordination, which combine human cataloging with computerized retrieval. Based on our study, we propose that research in this area will substantially benefit from a common framework for failure analysis and a common data set. This will allow information retrieval researchers adapting 'library style' cataloging to large electronic document collections, as well as those developing automated or mixed methods, to directly compare their proposals for indexing and retrieval. This article concludes by suggesting guidelines for constructing such a testbed.
    Type
    a
  5. Bodoff, D.; Enache, D.; Kambil, A.; Simon, G.; Yukhimets, A.: A unified maximum likelihood approach to document retrieval (2001) 0.00
    0.002035109 = product of:
      0.004070218 = sum of:
        0.004070218 = product of:
          0.008140436 = sum of:
            0.008140436 = weight(_text_:a in 174) [ClassicSimilarity], result of:
              0.008140436 = score(doc=174,freq=12.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.18723148 = fieldWeight in 174, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=174)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Empirical work shows significant benefits from using relevance feedback data to improve information retrieval (IR) performance. Still, one fundamental difficulty has limited the ability to fully exploit this valuable data. The problem is that it is not clear whether the relevance feedback data should be used to train the system about what the users really mean, or about what the documents really mean. In this paper, we resolve the question using a maximum likelihood framework. We show how all the available data can be used to simultaneously estimate both documents and queries in proportions that are optimal in a maximum likelihood sense. The resulting algorithm is directly applicable to many approaches to IR, and the unified framework can help explain previously reported results as well as guide the search for new methods that utilize feedback data in IR.
    Type
    a
  6. Bodoff, D.: Test theory for evaluating reliability of IR test collections (2008) 0.00
    0.0019582848 = product of:
      0.0039165695 = sum of:
        0.0039165695 = product of:
          0.007833139 = sum of:
            0.007833139 = weight(_text_:a in 2085) [ClassicSimilarity], result of:
              0.007833139 = score(doc=2085,freq=16.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.18016359 = fieldWeight in 2085, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2085)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Classical test theory offers theoretically derived reliability measures such as Cronbach's alpha, which can be applied to measure the reliability of a set of Information Retrieval test results. The theory also supports item analysis, which identifies queries that are hampering the test's reliability, and which may be candidates for refinement or removal. A generalization of Classical Test Theory, called Generalizability Theory, provides an even richer set of tools. It allows us to estimate the reliability of a test as a function of the number of queries, assessors (relevance judges), and other aspects of the test's design. One novel aspect of Generalizability Theory is that it allows this estimation of reliability even before the test collection exists, based purely on the numbers of queries and assessors that it will contain. These calculations can help test designers in advance, by allowing them to compare the reliability of test designs with various numbers of queries and relevance assessors, and to spend their limited budgets on a design that maximizes reliability. Empirical analysis shows that in cases for which our data is representative, having more queries is more helpful for reliability than having more assessors. It also suggests that reliability may be improved with a per-document performance measure, as opposed to a document-set based performance measure, where appropriate. The theory also clarifies the implicit debate in IR literature regarding the nature of error in relevance judgments.
    Type
    a
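
    The abstract above names Cronbach's alpha as a reliability measure for a set of IR test results. Below is a minimal, hypothetical sketch of that calculation; the score matrix and the systems-as-subjects / queries-as-items arrangement are illustrative assumptions, not data from the paper. The formula used is alpha = k/(k-1) * (1 - sum of per-query variances / variance of per-system totals).

      import numpy as np

      def cronbach_alpha(scores: np.ndarray) -> float:
          # scores: rows = retrieval systems, columns = queries (e.g. average precision)
          k = scores.shape[1]                          # number of queries (items)
          item_vars = scores.var(axis=0, ddof=1)       # variance of each query's column
          total_var = scores.sum(axis=1).var(ddof=1)   # variance of per-system totals
          return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

      # Hypothetical scores: 4 systems evaluated on 5 queries.
      scores = np.array([
          [0.31, 0.42, 0.25, 0.38, 0.29],
          [0.28, 0.40, 0.22, 0.35, 0.27],
          [0.45, 0.55, 0.38, 0.50, 0.41],
          [0.20, 0.30, 0.18, 0.26, 0.21],
      ])
      print(cronbach_alpha(scores))   # values nearer 1.0 indicate a more reliable query set
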
  7. Bodoff, D.; Wu, B.; Wong, K.Y.M.: Relevance data for language models using maximum likelihood (2003) 0.00
    0.001938603 = product of:
      0.003877206 = sum of:
        0.003877206 = product of:
          0.007754412 = sum of:
            0.007754412 = weight(_text_:a in 1822) [ClassicSimilarity], result of:
              0.007754412 = score(doc=1822,freq=8.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.17835285 = fieldWeight in 1822, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1822)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We present a preliminary empirical test of a maximum likelihood approach to using relevance data for training information retrieval (IR) parameters. Similar to language models, our method uses explicitly hypothesized distributions for documents and queries, but we add to this an explicitly hypothesized distribution for relevance judgments. The method unifies document-oriented and query-oriented views. Performance is better than the Rocchio heuristic for document and/or query modification. The maximum likelihood methodology also motivates a heuristic estimate of the MLE optimization. The method can be used to test competing hypotheses regarding the processes of authors' term selection, searchers' term selection, and assessors' relevancy judgments.
    Type
    a
  8. Bodoff, D.; Raban, D.: User models as revealed in web-based research services (2012) 0.00
    0.0018577921 = product of:
      0.0037155843 = sum of:
        0.0037155843 = product of:
          0.0074311686 = sum of:
            0.0074311686 = weight(_text_:a in 76) [ClassicSimilarity], result of:
              0.0074311686 = score(doc=76,freq=10.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.1709182 = fieldWeight in 76, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=76)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The user-centered approach to information retrieval emphasizes the importance of a user model in determining what information will be most useful to a particular user, given their context. Mediated search provides an opportunity to elaborate on this idea, as an intermediary's elicitations reveal what aspects of the user model they think are worth inquiring about. However, empirical evidence is divided over whether intermediaries actually work to develop a broadly conceived user model. Our research revisits the issue in a web research services setting, whose characteristics are expected to result in more thorough user modeling on the part of intermediaries. Our empirical study confirms that intermediaries engage in rich user modeling. While intermediaries behave differently across settings, our interpretation is that the underlying user model characteristics that intermediaries inquire about in our setting are applicable to other settings as well.
    Type
    a
  9. Bodoff, D.; Robertson, S.: A new unified probabilistic model (2004) 0.00
    0.0016616598 = product of:
      0.0033233196 = sum of:
        0.0033233196 = product of:
          0.006646639 = sum of:
            0.006646639 = weight(_text_:a in 2129) [ClassicSimilarity], result of:
              0.006646639 = score(doc=2129,freq=8.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.15287387 = fieldWeight in 2129, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2129)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper proposes a new unified probabilistic model. Two previous models, Robertson et al.'s "Model 0" and "Model 3," each have strengths and weaknesses. The strength of Model 0, not found in Model 3, is that it does not require relevance data about the particular document or query, and, related to that, its probability estimates are straightforward. The strength of Model 3, not found in Model 0, is that it can utilize feedback information about the particular document and query in question. In this paper we introduce a new unified probabilistic model that combines these strengths: the expression of its probabilities is straightforward; it does not require that data be available for the particular document or query in question; but it can utilize such specific data if it is available. The model is one way to resolve the difficulty of combining two marginal views in probabilistic retrieval.
    Type
    a
  10. Bodoff, D.: Emergence of terminological conventions as a searcher-indexer coordination game (2009) 0.00
    0.0014390396 = product of:
      0.0028780792 = sum of:
        0.0028780792 = product of:
          0.0057561584 = sum of:
            0.0057561584 = weight(_text_:a in 3299) [ClassicSimilarity], result of:
              0.0057561584 = score(doc=3299,freq=6.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.13239266 = fieldWeight in 3299, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3299)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In the traditional model of information retrieval, searchers and indexers choose query and index terms, respectively, and these term choices are ultimately compared in a matching process. One of the main challenges in information science and information retrieval is that searchers and indexers often do not choose the same term even though the item is relevant to the need, whereas at other times they do choose the same term even though it is not relevant. But if both searchers and indexers have the opportunity to review feedback data showing the success or failure of their previous term choices, then there exists an evolutionary force that, all else being equal, will lead to helpful convergence in searchers' and indexers' term usage when the information is relevant, and helpful divergence of term usage when it is not. Based on learning theory, and new theory presented here, it is possible to predict which terms will emerge as the terminological conventions that are used by groups of searchers and the indexers of relevant and nonrelevant information items.
    Type
    a
  11. Bodoff, D.; Richter-Levin, Y.: Viewpoints in indexing term assignment (2020) 0.00
    0.0014390396 = product of:
      0.0028780792 = sum of:
        0.0028780792 = product of:
          0.0057561584 = sum of:
            0.0057561584 = weight(_text_:a in 5765) [ClassicSimilarity], result of:
              0.0057561584 = score(doc=5765,freq=6.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.13239266 = fieldWeight in 5765, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5765)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The literature on assigned indexing considers three possible viewpoints (the author's viewpoint as evidenced in the title, the users' viewpoint, and the indexer's viewpoint) and asks whether and which of those views should be reflected in an indexer's choice of terms to assign to an item. We study this question empirically, as opposed to normatively. Based on the literature that discusses whose viewpoints should be reflected, we construct a research model that includes those same three viewpoints as factors that might be influencing term assignment in actual practice. In the unique study design that we employ, the records of term assignments made by identified indexers in academic libraries are cross-referenced with the results of a survey that those same indexers completed on political views. Our results indicate that in our setting, variance in term assignment was best explained by indexers' personal political views.
    Type
    a
  12. Bodoff, D.; Kambil, A.: Partial coordination : I. The best of pre-coordination and post-coordination (1998) 0.00
    0.0011749709 = product of:
      0.0023499418 = sum of:
        0.0023499418 = product of:
          0.0046998835 = sum of:
            0.0046998835 = weight(_text_:a in 2322) [ClassicSimilarity], result of:
              0.0046998835 = score(doc=2322,freq=4.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.10809815 = fieldWeight in 2322, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2322)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
  13. Bodoff, D.; Wong, S.P.-S.: Documents and queries as random variables : history and implications (2006) 0.00
    8.308299E-4 = product of:
      0.0016616598 = sum of:
        0.0016616598 = product of:
          0.0033233196 = sum of:
            0.0033233196 = weight(_text_:a in 193) [ClassicSimilarity], result of:
              0.0033233196 = score(doc=193,freq=2.0), product of:
                0.043477926 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.037706986 = queryNorm
                0.07643694 = fieldWeight in 193, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=193)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a