Search (8 results, page 1 of 1)

  Filter: author_ss:"White, H.D."
  1. White, H.D.: Relevance in theory (2009) 0.02
    0.01579685 = product of:
      0.047390547 = sum of:
        0.047390547 = product of:
          0.07108582 = sum of:
            0.047201812 = weight(_text_:theory in 3872) [ClassicSimilarity], result of:
              0.047201812 = score(doc=3872,freq=4.0), product of:
                0.18161562 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.04367448 = queryNorm
                0.25989953 = fieldWeight in 3872, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3872)
            0.023884008 = weight(_text_:29 in 3872) [ClassicSimilarity], result of:
              0.023884008 = score(doc=3872,freq=2.0), product of:
                0.15363316 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04367448 = queryNorm
                0.15546128 = fieldWeight in 3872, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3872)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
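    The breakdown above is Lucene's ClassicSimilarity (TF-IDF) explain output. As a sanity check, the minimal Python sketch below reproduces result 1's score from the quantities shown; the idf formula 1 + ln(maxDocs/(docFreq+1)) is ClassicSimilarity's documented definition, while queryNorm (0.04367448) is copied from the output rather than recomputed.

      import math

      QUERY_NORM = 0.04367448  # taken directly from the explain output above

      def idf(doc_freq, max_docs=44218):
          # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, field_norm):
          # queryWeight * fieldWeight, exactly as the tree multiplies them
          i = idf(doc_freq)                                # 4.1583924 for "theory"
          query_weight = i * QUERY_NORM                    # 0.18161562
          field_weight = math.sqrt(freq) * i * field_norm  # tf = sqrt(freq)
          return query_weight * field_weight

      theory = term_score(freq=4.0, doc_freq=1878, field_norm=0.03125)  # 0.04720181
      t29 = term_score(freq=2.0, doc_freq=3565, field_norm=0.03125)     # 0.02388401

      # coord(2/3): two of three clauses matched; coord(1/3): outer clause
      score = (theory + t29) * (2.0 / 3.0) * (1.0 / 3.0)
      print(round(score, 8))  # ~0.01579685, shown above as 0.02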
    
    Abstract
    Relevance is the central concept in information science because of its salience in designing and evaluating literature-based answering systems. It is salient when users seek information through human intermediaries, such as reference librarians, but becomes even more so when systems are automated and users must navigate them on their own. Designers of classic precomputer systems of the nineteenth and twentieth centuries appear to have been no less concerned with relevance than the information scientists of today. The concept has, however, proved difficult to define and operationalize. A common belief is that it is a relation between a user's request for information and the documents the system retrieves in response. Documents might be considered retrieval-worthy because they: 1) constitute evidence for or against a claim; 2) answer a question; or 3) simply match the request in topic. In practice, literature-based answering makes use of term-matching technology, and most evaluation of relevance has involved topical match as the primary criterion for acceptability. The standard table for evaluating the relation of retrieved documents to a request has only the values "relevant" and "not relevant," yet many analysts hold that relevance admits of degrees. Moreover, many analysts hold that users decide relevance on more dimensions than topical match. Who then can validly judge relevance? Is it only the person who put the request and who can evaluate a document on multiple dimensions? Or can surrogate judges perform this function on the basis of topicality? Such questions arise in a longstanding debate on whether relevance is objective or subjective. One proposal has been to reframe the debate in terms of relevance theory (imported from linguistic pragmatics), which makes relevance increase with a document's valuable cognitive effects and decrease with the effort needed to process it. This notion allows degree of topical match to contribute to relevance but allows other considerations to contribute as well. Since both cognitive effects and processing effort will differ across users, they can be taken as subjective, but users' decisions can also be objectively evaluated if the logic behind them is made explicit. Relevance seems problematical because the considerations that lead people to accept documents in literature searches, or to use them later in contexts such as citation, are seldom fully revealed. Once they are revealed, relevance may be seen as not only multidimensional and dynamic, but also understandable.
    Date
    27. 8.2011 14:29:23
  2. Buzydlowski, J.W.; White, H.D.; Lin, X.: Term Co-occurrence Analysis as an Interface for Digital Libraries (2002) 0.01
    0.013665395 = product of:
      0.040996183 = sum of:
        0.040996183 = product of:
          0.122988544 = sum of:
            0.122988544 = weight(_text_:22 in 1339) [ClassicSimilarity], result of:
              0.122988544 = score(doc=1339,freq=6.0), product of:
                0.15294059 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04367448 = queryNorm
                0.804159 = fieldWeight in 1339, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1339)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    22. 2.2003 17:25:39
    22. 2.2003 18:16:22
  3. White, H.D.: Authors as citers over time (2001) 0.01
    0.012724607 = product of:
      0.03817382 = sum of:
        0.03817382 = product of:
          0.05726073 = sum of:
            0.033376724 = weight(_text_:theory in 5581) [ClassicSimilarity], result of:
              0.033376724 = score(doc=5581,freq=2.0), product of:
                0.18161562 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.04367448 = queryNorm
                0.18377672 = fieldWeight in 5581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5581)
            0.023884008 = weight(_text_:29 in 5581) [ClassicSimilarity], result of:
              0.023884008 = score(doc=5581,freq=2.0), product of:
                0.15363316 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04367448 = queryNorm
                0.15546128 = fieldWeight in 5581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5581)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    This study explores the tendency of authors to recite themselves and others in multiple works over time, using the insights gained to build citation theory. The set of all authors whom an author cites is defined as that author's citation identity. The study explains how to retrieve citation identities from the Institute for Scientific Information's files on Dialog and how to deal with idiosyncrasies of these files. As the author's oeuvre grows, the identity takes the form of a core-and-scatter distribution that may be divided into authors cited only once (unicitations) and authors cited at least twice (recitations). The latter group, especially those recited most frequently, are interpretable as symbols of a citer's main substantive concerns. As illustrated by the top recitees of eight information scientists, identities are intelligible, individualized, and wide-ranging. They are ego-centered without being egotistical. They are often affected by social ties between citers and citees, but the universal motivator seems to be the perceived relevance of the citees' works. Citing styles in identities differ: "scientific-paper style" authors recite heavily, adding to core; "bibliographic-essay style" authors are heavy on unicitations, adding to scatter; "literature-review style" authors do both at once. Identities distill aspects of citers' intellectual lives, such as orienting figures, interdisciplinary interests, bidisciplinary careers, and conduct in controversies. They can also be related to past schemes for classifying citations in categories such as positive-negative and perfunctory-organic; indeed, one author's frequent recitation of another, whether positive or negative, may be the readiest indicator of an organic relation between them. The shape of the core-and-scatter distribution of names in identities can be explained by the principle of least effort. Citers economize on effort by frequently reciting only a relatively small core of names in their identities. They also economize by frequent use of perfunctory citations, which require relatively little context, and infrequent use of negative citations, which require contexts more laborious to set up.
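    The core-and-scatter split defined in this abstract is straightforward to operationalize. A minimal sketch, assuming the citer's reference lists have already been reduced to a flat list of cited-author names (the names below are merely illustrative):

      from collections import Counter

      def citation_identity(cited_authors):
          # Partition a citer's cited authors into recitations (cited at least
          # twice: the interpretable core) and unicitations (cited once: scatter).
          counts = Counter(cited_authors)
          recitations = {a: n for a, n in counts.items() if n >= 2}
          unicitations = {a: n for a, n in counts.items() if n == 1}
          return recitations, unicitations

      cited = ["Garfield E", "Small H", "Garfield E", "Price DJ", "Small H", "Kuhn TS"]
      core, scatter = citation_identity(cited)
      print(core)     # {'Garfield E': 2, 'Small H': 2} -- the citer's main concerns
      print(scatter)  # {'Price DJ': 1, 'Kuhn TS': 1}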
    Date
    29. 9.2001 13:58:38
  4. White, H.D.: Combining bibliometrics, information retrieval, and relevance theory : part 1: first examples of a synthesis (2007) 0.01
    0.0092713125 = product of:
      0.027813938 = sum of:
        0.027813938 = product of:
          0.08344181 = sum of:
            0.08344181 = weight(_text_:theory in 436) [ClassicSimilarity], result of:
              0.08344181 = score(doc=436,freq=8.0), product of:
                0.18161562 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.04367448 = queryNorm
                0.4594418 = fieldWeight in 436, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=436)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    In Sperber and Wilson's relevance theory (RT), the ratio Cognitive Effects/Processing Effort defines the relevance of a communication. The tf*idf formula from information retrieval is used to operationalize this ratio for any item co-occurring with a user-supplied seed term in bibliometric distributions. The tf weight of the item predicts its effect on the user in the context of the seed term, and its idf weight predicts the user's processing effort in relating the item to the seed term. The idf measure, also known as statistical specificity, is shown to have unsuspected applications in quantifying interrelated concepts such as topical and nontopical relevance, levels of user expertise, and levels of authority. A new kind of visualization, the pennant diagram, illustrates these claims. The bibliometric distributions visualized are the works cocited with a seed work (Moby Dick), the authors cocited with a seed author (White HD, for maximum interpretability), and the books and articles cocited with a seed article (S.P. Harter's "Psychological Relevance and Information Science," which introduced RT to information scientists in 1992). Pennant diagrams use bibliometric data and information retrieval techniques on the system side to mimic a relevance-theoretic model of cognition on the user side. Relevance theory may thus influence the design of new visual information retrieval interfaces. Generally, when information retrieval and bibliometrics are interpreted in light of RT, the implications are rich: A single sociocognitive theory may serve to integrate research on literature-based systems with research on their users, areas now largely separate.
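    The operationalization described in this abstract reduces to two weights per co-occurring item. A toy sketch under invented counts (no claim to White's exact weighting variants; pennant_weights is a hypothetical helper): tf stands in for the item's predicted cognitive effects in the seed's context, idf for the processing effort of relating item to seed.

      import math

      def pennant_weights(cooc_with_seed, item_doc_freq, n_docs):
          # tf weight: predicted effects of the item in the seed term's context
          tf = 1.0 + math.log(cooc_with_seed)
          # idf weight: statistical specificity, read as processing effort
          idf = math.log(n_docs / item_doc_freq)
          return tf, idf

      # Invented items cocited with a seed work in a 10,000-record file:
      items = {"A": (120, 300), "B": (15, 40), "C": (90, 5000)}
      for name, (cooc, df) in items.items():
          tf, idf = pennant_weights(cooc, df, n_docs=10_000)
          print(f"{name}: tf={tf:.2f} (effects), idf={idf:.2f} (effort)")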
  5. White, H.D.: Combining bibliometrics, information retrieval, and relevance theory : part 2: some implications for information science (2007) 0.01
    0.006555808 = product of:
      0.019667422 = sum of:
        0.019667422 = product of:
          0.059002265 = sum of:
            0.059002265 = weight(_text_:theory in 437) [ClassicSimilarity], result of:
              0.059002265 = score(doc=437,freq=4.0), product of:
                0.18161562 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.04367448 = queryNorm
                0.3248744 = fieldWeight in 437, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=437)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    When bibliometric data are converted to term frequency (tf) and inverse document frequency (idf) values, plotted as pennant diagrams, and interpreted according to Sperber and Wilson's relevance theory (RT), the results evoke major variables of information science (IS). These include topicality, in the sense of intercohesion and intercoherence among texts; cognitive effects of texts in response to people's questions; people's levels of expertise as a precondition for cognitive effects; processing effort as textual or other messages are received; specificity of terms as it affects processing effort; relevance, defined in RT as the effects/effort ratio; and authority of texts and their authors. While such concerns figure automatically in dialogues between people, they become problematic when people create or use or judge literature-based information systems. The difficulty of achieving worthwhile cognitive effects and acceptable processing effort in human-system dialogues explains why relevance is the central concern of IS. Moreover, since relevant communication with both systems and unfamiliar people is uncertain, speakers tend to seek cognitive effects that cost them the least effort. Yet hearers need greater effort, often greater specificity, from speakers if their responses are to be highly relevant in their turn. This theme of mismatch manifests itself in vague reference questions, underdeveloped online searches, uncreative judging in retrieval evaluation trials, and perfunctory indexing. Another effect of least effort is a bias toward topical relevance over other kinds. RT can explain these outcomes as well as more adaptive ones. Pennant diagrams, applied here to a literature search and a Bradford-style journal analysis, can model them. Given RT and the right context, bibliometrics may predict psychometrics.
  6. White, H.D.: Relevance theory and distributions of judgments in document retrieval (2017) 0.01
    0.006555808 = product of:
      0.019667422 = sum of:
        0.019667422 = product of:
          0.059002265 = sum of:
            0.059002265 = weight(_text_:theory in 5099) [ClassicSimilarity], result of:
              0.059002265 = score(doc=5099,freq=4.0), product of:
                0.18161562 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.04367448 = queryNorm
                0.3248744 = fieldWeight in 5099, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5099)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    This article extends relevance theory (RT) from linguistic pragmatics into information retrieval. Using more than 50 retrieval experiments from the literature as examples, it applies RT to explain the frequency distributions of documents on relevance scales with three or more points. The scale points, which judges in experiments must consider in addition to queries and documents, are communications from researchers. In RT, the relevance of a communication varies directly with its cognitive effects and inversely with the effort of processing it. Researchers define and/or label the scale points to measure the cognitive effects of documents on judges. However, they apparently assume that all scale points as presented are equally easy for judges to process. Yet the notion that points cost variable effort explains fairly well the frequency distributions of judgments across them. By hypothesis, points that cost more effort are chosen by judges less frequently. Effort varies with the vagueness or strictness of scale-point labels and definitions. It is shown that vague scales tend to produce U- or V-shaped distributions, while strict scales tend to produce right-skewed distributions. These results reinforce the paper's more general argument that RT clarifies the concept of relevance in the dialogues of retrieval evaluation.
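    The effort hypothesis here lends itself to a toy simulation (effort figures entirely invented): if judges choose scale points with frequency inversely proportional to the processing effort each point costs, a vague scale whose middle points cost most yields a U-shape, while a strict scale whose points cost progressively more yields a right-skew.

      def judgment_distribution(effort_per_point):
          # Hypothesized choice frequency ~ 1 / processing effort, normalized
          weights = [1.0 / e for e in effort_per_point]
          total = sum(weights)
          return [round(w / total, 2) for w in weights]

      # Vague 5-point scale: endpoints easy, middle costly -> U-shaped
      print(judgment_distribution([1, 4, 6, 4, 1]))    # [0.38, 0.09, 0.06, 0.09, 0.38]
      # Strict 5-point scale: each higher point costs more -> right-skewed
      print(judgment_distribution([1, 2, 4, 8, 16]))   # [0.52, 0.26, 0.13, 0.06, 0.03]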
  7. White, H.D.: Author cocitation analysis and Pearson's r (2003) 0.00
    0.0046356563 = product of:
      0.013906969 = sum of:
        0.013906969 = product of:
          0.041720904 = sum of:
            0.041720904 = weight(_text_:theory in 2119) [ClassicSimilarity], result of:
              0.041720904 = score(doc=2119,freq=2.0), product of:
                0.18161562 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.04367448 = queryNorm
                0.2297209 = fieldWeight in 2119, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2119)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    In their article "Requirements for a cocitation similarity measure, with special reference to Pearson's correlation coefficient," Ahlgren, Jarneving, and Rousseau fault traditional author cocitation analysis (ACA) for using Pearson's r as a measure of similarity between authors because it fails two tests of stability of measurement. The instabilities arise when rs are recalculated after a first coherent group of authors has been augmented by a second coherent group with whom the first has little or no cocitation. However, AJ&R neither cluster nor map their data to demonstrate how fluctuations in rs will mislead the analyst, and the problem they pose is remote from both theory and practice in traditional ACA. By entering their own rs into multidimensional scaling and clustering routines, I show that, despite r's fluctuations, clusters based on it are much the same for the combined groups as for the separate groups. The combined groups when mapped appear as polarized clumps of points in two-dimensional space, confirming that differences between the groups have become much more important than differences within the groups, an accurate portrayal of what has happened to the data. Moreover, r produces clusters and maps very like those based on other coefficients that AJ&R mention as possible replacements, such as a cosine similarity measure or a chi-square dissimilarity measure. Thus, r performs well enough for the purposes of ACA. Accordingly, I argue that qualitative information revealing why authors are cocited is more important than the cautions proposed in the AJ&R critique. I include notes on topics such as handling the diagonal in author cocitation matrices, lognormalizing data, and testing r for significance.
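    White's check is easy to reproduce in outline. A minimal sketch with an invented cocitation matrix standing in for the AJ&R data: correlate authors' cocitation profiles with Pearson's r and with the cosine measure, and compare. The diagonal is filled with each author's highest off-diagonal count, one common treatment of the diagonal issue the notes above address.

      import numpy as np

      # Invented symmetric author-cocitation matrix: two coherent groups
      # (authors 0-2 and 3-5) with little cross-group cocitation.
      C = np.array([
          [40, 40, 35,  1,  0,  2],
          [40, 40, 30,  0,  1,  0],
          [35, 30, 35,  2,  0,  1],
          [ 1,  0,  2, 25, 25, 20],
          [ 0,  1,  0, 25, 30, 30],
          [ 2,  0,  1, 20, 30, 30],
      ], dtype=float)

      r = np.corrcoef(C)  # Pearson's r between row profiles

      norms = np.linalg.norm(C, axis=1, keepdims=True)
      cos = (C @ C.T) / (norms @ norms.T)  # cosine, a proposed replacement

      # Both coefficients tell the same story: high within-group and low (or
      # negative) cross-group values, so clusters and maps based on either
      # coefficient separate the two clumps the same way.
      print(np.round(r, 2))
      print(np.round(cos, 2))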
  8. White, H.D.: Pathfinder networks and author cocitation analysis : a remapping of paradigmatic information scientists (2003) 0.00
    0.0033172236 = product of:
      0.009951671 = sum of:
        0.009951671 = product of:
          0.029855011 = sum of:
            0.029855011 = weight(_text_:29 in 1459) [ClassicSimilarity], result of:
              0.029855011 = score(doc=1459,freq=2.0), product of:
                0.15363316 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04367448 = queryNorm
                0.19432661 = fieldWeight in 1459, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1459)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 3.2003 19:55:24