Search (5 results, page 1 of 1)

  • author_ss:"Shaw, W.M."
  1. Shaw, W.M.: Subject and citation indexing : pt.1: the clustering structure of composite representations in the cystic fibrosis document collection (1991) 0.00
    0.0026134925 = product of:
      0.005226985 = sum of:
        0.005226985 = product of:
          0.01045397 = sum of:
            0.01045397 = weight(_text_:a in 4841) [ClassicSimilarity], result of:
              0.01045397 = score(doc=4841,freq=10.0), product of:
                0.06116359 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.053045183 = queryNorm
                0.1709182 = fieldWeight in 4841, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4841)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The presence of clustering structure in the cystic fibrosis (CF) document collection is evaluated as a function of the exhaustivity of 5 composite representations. The composite representations are constructed from 2 subject descriptions, based on MeSH and subheadings, and 2 citation indexes, based on the complete set of references and a comprehensive set of citations to each document. Experimental results reveal that observable evidence of clustering structure diminishes as the exhaustivity of each representation is decreased. The representation composed of references and citations shows less evidence of clustering structure at the exhaustive level but more uniform evidence of clustering structure over a wide range of exhaustivity levels than composite representations that include subject descriptions. The structures imposed on the CF document collection by all composite representations satisfy the necessary condition for a meaningful clustering outcome.
    Type
    a
  2. Tang, R.; Shaw, W.M.; Vevea, J.L.: Towards the identification of the optimal number of relevance categories (1999) 0.00
    0.0023857814 = product of:
      0.0047715628 = sum of:
        0.0047715628 = product of:
          0.0095431255 = sum of:
            0.0095431255 = weight(_text_:a in 3060) [ClassicSimilarity], result of:
              0.0095431255 = score(doc=3060,freq=12.0), product of:
                0.06116359 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.053045183 = queryNorm
                0.15602624 = fieldWeight in 3060, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3060)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this paper, we are concerned with participants' confidence in their judgements of the relevance of bibliographic records to particular research questions. We describe an empirical investigation of the association between judges' confidence and the number of categories for a relevance rating scale. Participants rated the relevance of bibliographic records, and recorded their confidence in the relevance ratings. We hypothesize that confidence in relevance judgements is a function of the number of relevance categories that are available in the rating scale. We consider scales ranging from 2 to 11 points, and define the optimal scale as the one for which participants express a maximum level of confidence. A pilot study finds no optimal number of points (because confidence continues to improve slightly through the 11-point scale); nevertheless, the study shows little added benefit associated with scales that have more than six points. On the basis of the findings in that study, we adjusted our experimental procedures and found, in our principal study, that the optimal scale for maximizing confidence in relevance judgements has approximately seven points. We also present exploratory results involving gender effects, and the comparison of scales that have an odd number of points (for which a neutral judgement is possible) with scales that have an even number of points.
    Type
    a
  3. Shaw, W.M.: Subject and citation indexing : pt.2: the optimal, cluster-based retrieval performance of composite representations (1991) 0.00
    0.0022038904 = product of:
      0.0044077807 = sum of:
        0.0044077807 = product of:
          0.008815561 = sum of:
            0.008815561 = weight(_text_:a in 4842) [ClassicSimilarity], result of:
              0.008815561 = score(doc=4842,freq=4.0), product of:
                0.06116359 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.053045183 = queryNorm
                0.14413087 = fieldWeight in 4842, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4842)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Continuation of pt.1: experimental retrieval results are presented as a function of the exhaustivity and similarity of the composite representations and reveal consistent patterns from which optimal performance levels can be identified. The optimal performance values provide an assessment of the absolute capacity of each composite representation to associate documents relevant to different queries in single-link hierarchies. The effectiveness of the exhaustive representation composed of references and citations is materially superior to the effectiveness of exhaustive composite representations that include subject descriptions.
    Type
    a
  4. Shaw, W.M.; Burgin, R.; Howell, P.: Performance standards and evaluations in IR test collections : vector-space and other retrieval models (1997) 0.00
    0.0020244026 = product of:
      0.004048805 = sum of:
        0.004048805 = product of:
          0.00809761 = sum of:
            0.00809761 = weight(_text_:a in 7259) [ClassicSimilarity], result of:
              0.00809761 = score(doc=7259,freq=6.0), product of:
                0.06116359 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.053045183 = queryNorm
                0.13239266 = fieldWeight in 7259, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7259)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Computes low performance standards for each query and for the group of queries in 13 traditional and 4 TREC test collections. Predicted by the hypergeometric distribution, the standards represent the highest level of retrieval effectiveness attributable to chance. Compares operational levels of performance for vector-space, ad-hoc-feature-based, probabilistic, and other retrieval models to the standards. The effectiveness of these techniques in small, traditional test collections can be explained by retrieving a few more relevant documents for most queries than expected by chance. The effectiveness of retrieval techniques in the larger TREC test collections can only be explained by retrieving many more relevant documents for most queries than expected by chance. The discrepancy between deviations from chance in traditional and TREC test collections is due to a decrease in performance standards for large test collections, not to an increase in operational performance. The next generation of information retrieval systems would be enhanced by abandoning uninformative performance summaries and focusing on effectiveness and improvements in effectiveness of individual queries.
    Type
    a
  5. Shaw, W.M.; Burgin, R.; Howell, P.: Performance standards and evaluations in IR test collections : cluster-based retrieval models (1997) 0.00
    0.0013635876 = product of:
      0.0027271751 = sum of:
        0.0027271751 = product of:
          0.0054543503 = sum of:
            0.0054543503 = weight(_text_:a in 7256) [ClassicSimilarity], result of:
              0.0054543503 = score(doc=7256,freq=2.0), product of:
                0.06116359 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.053045183 = queryNorm
                0.089176424 = fieldWeight in 7256, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7256)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Type
    a
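
The score breakdowns listed under each result follow the usual TF-IDF pattern of Lucene's ClassicSimilarity: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and the final score is queryWeight × fieldWeight scaled by the two coord(1/2) factors. The following is a minimal Python sketch that recomputes a score from the numbers shown in the listing. The function names `classic_idf` and `classic_score` are hypothetical and chosen for illustration only, and the values of queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq, max_docs):
    """idf as shown in the explain output: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm,
                  coords=(0.5, 0.5)):
    """Recompute one result's score from the values in its explain tree.

    query_norm and field_norm are taken as given from the listing;
    coords holds the coord(1/2) factors applied at the two outer levels.
    """
    tf = math.sqrt(freq)                  # tf(freq) = sqrt(termFreq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm       # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    score = query_weight * field_weight
    for c in coords:
        score *= c
    return score


# Values copied from result 1 (doc=4841) in the listing above.
score_doc_4841 = classic_score(
    freq=10.0, doc_freq=37942, max_docs=44218,
    query_norm=0.053045183, field_norm=0.046875,
)
# Compare with the 0.0026134925 shown for result 1; small differences
# are expected because Lucene computes these values in single precision.
print(score_doc_4841)
```

Because idf and queryNorm are identical for all five results, the ranking here is driven entirely by tf × fieldNorm, which is why result 3 (freq=4, fieldNorm=0.0625) outranks result 4 (freq=6, fieldNorm=0.046875).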