Search (2 results, page 1 of 1)

Wolfram, D.; Zhang, J.: An investigation of the influence of indexing exhaustivity and term distributions on a document space (2002) 0.00
```
0.0028047764 = product of:
  0.005609553 = sum of:
    0.005609553 = product of:
      0.011219106 = sum of:
        0.011219106 = weight(_text_:a in 5238) [ClassicSimilarity], result of:
          0.011219106 = score(doc=5238,freq=22.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.21126054 = fieldWeight in 5238, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5238)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
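The score tree above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and are not part of any library. The inputs are taken directly from the explain output for doc 5238.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute one term's contribution as shown in the explain tree."""
    tf = math.sqrt(freq)                    # tf(freq=22.0) -> about 4.690416
    idf = classic_idf(doc_freq, max_docs)   # idf(docFreq=37942, maxDocs=44218)
    query_weight = idf * query_norm         # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm    # fieldWeight = tf * idf * fieldNorm
    score = query_weight * field_weight
    for c in coords:                        # each coord(1/2) factor halves the score
        score *= c
    return score

# Values copied from the explain output for doc 5238
score_5238 = classic_score(
    freq=22.0, doc_freq=37942, max_docs=44218,
    query_norm=0.046056706, field_norm=0.0390625,
    coords=(0.5, 0.5),
)
print(score_5238)
```

Lucene computes these values in single precision, so the result should agree with the listed 0.0028047764 only to about six or seven significant digits. Calling the same function with `freq=18.0` covers the second record, which differs only in term frequency.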
Abstract

Wolfram and Zhang are interested in the effect of different indexing exhaustivity, by which they mean the number of terms chosen, and of different index term distributions and different term weighting methods on the resulting document cluster organization. The Distance Angle Retrieval Environment (DARE), which provides a two-dimensional display of retrieved documents, was used to represent the document clusters based upon a document's distance from the searcher's main interest, and on the angle formed by the document, a point representing a minor interest, and the point representing the main interest. If the centroid and the origin of the document space are assigned as major and minor points, the average distance between documents and the centroid can be measured, providing an indication of cluster organization in the form of a size-normalized similarity measure. Using 500 records from NTIS and nine models created by intersecting low, observed, and high exhaustivity levels (based upon a negative binomial distribution) with shallow, observed, and steep term distributions (based upon a Zipf distribution), simulation runs were performed using inverse document frequency, inter-document term frequency, and inverse document frequency based upon both inter- and intra-document frequencies. Low exhaustivity and shallow distributions result in a denser document space and less effective retrieval. High exhaustivity and steeper distributions result in a more diffuse space.

Type

a
Olson, H.A.; Wolfram, D.: Syntagmatic relationships and indexing consistency on a larger scale (2008) 0.00
```
0.0025370158 = product of:
  0.0050740317 = sum of:
    0.0050740317 = product of:
      0.010148063 = sum of:
        0.010148063 = weight(_text_:a in 2214) [ClassicSimilarity], result of:
          0.010148063 = score(doc=2214,freq=18.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.19109234 = fieldWeight in 2214, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2214)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Purpose - The purpose of this article is to examine interindexer consistency on a larger scale than other studies have done to determine if group consensus is reached by larger numbers of indexers and what, if any, relationships emerge between assigned terms. Design/methodology/approach - In total, 64 MLIS students were recruited to assign up to five terms to a document. The authors applied basic data modeling and the exploratory statistical techniques of multi-dimensional scaling (MDS) and hierarchical cluster analysis to determine whether relationships exist in indexing consistency and the co-occurrence of assigned terms. Findings - Consistency in the assignment of indexing terms to a document follows an inverse shape, although it is not strictly power law-based, unlike many other social phenomena. The exploratory techniques revealed that groups of terms clustered together. The resulting term co-occurrence relationships were largely syntagmatic. Research limitations/implications - The results are based on the indexing of one article by non-expert indexers and are, thus, not generalizable. Based on the study findings, along with the growing popularity of folksonomies and the apparent authority of communally developed information resources, communally developed indexes based on group consensus may have merit. Originality/value - Consistency in the assignment of indexing terms has been studied primarily on a small scale. Few studies have examined indexing on a larger scale with more than a handful of indexers. Recognition of the differences in indexing assignment has implications for the development of public information systems, especially those that do not use a controlled vocabulary and those tagged by end-users. In such cases, multiple access points that accommodate the different ways that users interpret content are needed so that searchers may be guided to relevant content despite using different terminology.

Type

a
