Search (23 results, page 1 of 2)

  • × theme_ss:"Indexierungsstudien"
  1. Subrahmanyam, B.: Library of Congress Classification numbers : issues of consistency and their implications for union catalogs (2006) 0.04
    0.035437185 = product of:
      0.053155776 = sum of:
        0.035974823 = weight(_text_:based in 5784) [ClassicSimilarity], result of:
          0.035974823 = score(doc=5784,freq=4.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.23539014 = fieldWeight in 5784, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5784)
        0.017180953 = product of:
          0.034361906 = sum of:
            0.034361906 = weight(_text_:22 in 5784) [ClassicSimilarity], result of:
              0.034361906 = score(doc=5784,freq=2.0), product of:
                0.17762627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050723847 = queryNorm
                0.19345059 = fieldWeight in 5784, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5784)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This study examined Library of Congress Classification (LCC)-based class numbers assigned to a representative sample of 200 titles in 52 American library systems to determine the level of consistency within and across those systems. The results showed that under the condition that a library system has a title, the probability of that title having the same LCC-based class number across library systems is greater than 85 percent. An examination of 121 titles displaying variations in class numbers among library systems showed certain titles (for example, multi-foci titles, titles in series, bibliographies, and fiction) lend themselves to alternate class numbers. Others were assigned variant numbers either due to latitude in the schedules or for reasons that cannot be pinpointed. With increasing dependence on copy cataloging, the size of such variations may continue to decrease. As the preferred class number with its alternates represents a title more fully than just the preferred class number, this paper argues for continued use of alternates by library systems and for finding a method to link alternate class numbers to preferred class numbers for enriched subject access through local and union catalogs.
    Date
    10. 9.2000 17:38:22
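The score breakdown for result 1 can be reproduced by hand. The sketch below assumes the usual ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and copies queryNorm, fieldNorm, and the coord factors straight from the explain output rather than deriving them. The function and variable names are illustrative only and are not part of any Lucene API.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as ClassicSimilarity is assumed to compute it.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 5784.
MAX_DOCS = 44218
QUERY_NORM = 0.050723847
FIELD_NORM = 0.0390625

based = term_score(4.0, 5906, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 clauses matched: coord(1/2).
twenty_two = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# 2 of 3 top-level clauses matched: coord(2/3).
total = (based + twenty_two) * (2.0 / 3.0)

print(f"based={based:.9f} 22={twenty_two:.9f} total={total:.9f}")
```

Small differences in the last digits are expected, since Lucene works in single-precision floats while Python uses doubles.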
  2. White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014) 0.03
    0.028412666 = product of:
      0.042618997 = sum of:
        0.025438042 = weight(_text_:based in 1781) [ClassicSimilarity], result of:
          0.025438042 = score(doc=1781,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.16644597 = fieldWeight in 1781, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1781)
        0.017180953 = product of:
          0.034361906 = sum of:
            0.034361906 = weight(_text_:22 in 1781) [ClassicSimilarity], result of:
              0.034361906 = score(doc=1781,freq=2.0), product of:
                0.17762627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050723847 = queryNorm
                0.19345059 = fieldWeight in 1781, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1781)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE. Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results. Findings - Results suggest that HIVE improves inter-indexing consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges. Research limitations/implications - Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system. Originality/value - This paper showcases several features used in the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next generation machine-aided indexing (MAI) tools.
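The two consistency measures named in the abstract above are commonly stated as follows: Hooper's measure is the number of terms both indexers assigned divided by the number of distinct terms either assigned, and Rolling's measure is twice the number of shared terms divided by the total number of terms the two indexers assigned. The sketch below assumes those forms. The term sets are hypothetical and are not data from the study.

```python
def hooper(terms_a, terms_b):
    # Shared terms divided by all distinct terms used by either indexer.
    a, b = set(terms_a), set(terms_b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def rolling(terms_a, terms_b):
    # Twice the shared terms divided by the sum of both indexers' term counts.
    a, b = set(terms_a), set(terms_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0

# Hypothetical keyword assignments for one abstract.
indexer_1 = {"indexing", "consistency", "thesauri", "metadata"}
indexer_2 = {"indexing", "consistency", "vocabularies"}

print(hooper(indexer_1, indexer_2))   # 2 shared / 5 distinct
print(rolling(indexer_1, indexer_2))  # 2 * 2 shared / (4 + 3) assigned
```

Rolling's value is never lower than Hooper's for the same pair of indexers, which is one reason studies often report both.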
  3. Wolfram, D.; Zhang, J.: ¬An investigation of the influence of indexing exhaustivity and term distributions on a document space (2002) 0.02
    0.016958695 = product of:
      0.050876085 = sum of:
        0.050876085 = weight(_text_:based in 5238) [ClassicSimilarity], result of:
          0.050876085 = score(doc=5238,freq=8.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.33289194 = fieldWeight in 5238, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5238)
      0.33333334 = coord(1/3)
    
    Abstract
    Wolfram and Zhang are interested in the effect of different indexing exhaustivity, by which they mean the number of terms chosen, and of different index term distributions and different term weighting methods on the resulting document cluster organization. The Distance Angle Retrieval Environment, DARE, which provides a two-dimensional display of retrieved documents, was used to represent the document clusters based upon a document's distance from the searcher's main interest, and on the angle formed by the document, a point representing a minor interest, and the point representing the main interest. If the centroid and the origin of the document space are assigned as major and minor points, the average distance between documents and the centroid can be measured, providing an indication of cluster organization in the form of a size-normalized similarity measure. Using 500 records from NTIS and nine models created by intersecting low, observed, and high exhaustivity levels (based upon a negative binomial distribution) with shallow, observed, and steep term distributions (based upon a Zipf distribution), simulation runs were performed using inverse document frequency, inter-document term frequency, and inverse document frequency based upon both inter- and intra-document frequencies. Low exhaustivity and shallow distributions result in a more dense document space and less effective retrieval. High exhaustivity and steeper distributions result in a more diffuse space.
  4. Olson, H.A.; Wolfram, D.: Syntagmatic relationships and indexing consistency on a larger scale (2008) 0.02
    0.016958695 = product of:
      0.050876085 = sum of:
        0.050876085 = weight(_text_:based in 2214) [ClassicSimilarity], result of:
          0.050876085 = score(doc=2214,freq=8.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.33289194 = fieldWeight in 2214, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2214)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - The purpose of this article is to examine interindexer consistency on a larger scale than other studies have done to determine if group consensus is reached by larger numbers of indexers and what, if any, relationships emerge between assigned terms. Design/methodology/approach - In total, 64 MLIS students were recruited to assign up to five terms to a document. The authors applied basic data modeling and the exploratory statistical techniques of multi-dimensional scaling (MDS) and hierarchical cluster analysis to determine whether relationships exist in indexing consistency and the cooccurrence of assigned terms. Findings - Consistency in the assignment of indexing terms to a document follows an inverse shape, although it is not strictly power law-based unlike many other social phenomena. The exploratory techniques revealed that groups of terms clustered together. The resulting term cooccurrence relationships were largely syntagmatic. Research limitations/implications - The results are based on the indexing of one article by non-expert indexers and are, thus, not generalizable. Based on the study findings, along with the growing popularity of folksonomies and the apparent authority of communally developed information resources, communally developed indexes based on group consensus may have merit. Originality/value - Consistency in the assignment of indexing terms has been studied primarily on a small scale. Few studies have examined indexing on a larger scale with more than a handful of indexers. Recognition of the differences in indexing assignment has implications for the development of public information systems, especially those that do not use a controlled vocabulary and those tagged by end-users. In such cases, multiple access points that accommodate the different ways that users interpret content are needed so that searchers may be guided to relevant content despite using different terminology.
  5. Hersh, W.R.; Hickam, D.H.: ¬A comparison of two methods for indexing and retrieval from a full-text medical database (1992) 0.02
    0.016788252 = product of:
      0.050364755 = sum of:
        0.050364755 = weight(_text_:based in 4526) [ClassicSimilarity], result of:
          0.050364755 = score(doc=4526,freq=4.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.3295462 = fieldWeight in 4526, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4526)
      0.33333334 = coord(1/3)
    
    Abstract
    Reports results of a study of 2 information retrieval systems on a 2,000-document full-text medical database. The first system, SAPHIRE, features concept-based automatic indexing and statistical retrieval techniques, while the second system, SWORD, features traditional word-based Boolean techniques. 16 medical students at Oregon Health Sciences Univ. each performed 10 searches and their results, recorded in terms of recall and precision, showed nearly equal performance for both systems. SAPHIRE was also compared with a version of SWORD modified to use automatic indexing and ranked retrieval. Using batch input of queries, the latter method performed slightly better.
  6. Soergel, D.: Indexing and retrieval performance : the logical evidence (1994) 0.02
    0.016788252 = product of:
      0.050364755 = sum of:
        0.050364755 = weight(_text_:based in 579) [ClassicSimilarity], result of:
          0.050364755 = score(doc=579,freq=4.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.3295462 = fieldWeight in 579, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0546875 = fieldNorm(doc=579)
      0.33333334 = coord(1/3)
    
    Abstract
    This article presents a logical analysis of the characteristics of indexing and their effects on retrieval performance. It establishes the ability to ask the questions one needs to ask as the foundation of performance evaluation, and recall and discrimination as the basic quantitative performance measures for binary noninteractive retrieval systems. It then defines the characteristics of indexing that affect retrieval - namely, indexing devices, viewpoint-based and importance-based indexing exhaustivity, indexing specificity, indexing correctness, and indexing consistency - and examines in detail their effects on retrieval. It concludes that retrieval performance depends chiefly on the match between indexing and the requirements of the individual query and on the adaptation of the query formulation to the characteristics of the retrieval system, and that the ensuing complexity must be considered in the design and testing of retrieval systems.
  7. Larson, R.R.: Experiments in automatic Library of Congress Classification (1992) 0.01
    0.01438993 = product of:
      0.04316979 = sum of:
        0.04316979 = weight(_text_:based in 1054) [ClassicSimilarity], result of:
          0.04316979 = score(doc=1054,freq=4.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.28246817 = fieldWeight in 1054, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=1054)
      0.33333334 = coord(1/3)
    
    Abstract
    This article presents the results of research into the automatic selection of Library of Congress Classification numbers based on the titles and subject headings in MARC records. The method used in this study was based on partial match retrieval techniques using various elements of new records (i.e., those to be classified) as "queries", and a test database of classification clusters generated from previously classified MARC records. Sixty individual methods for automatic classification were tested on a set of 283 new records, using all combinations of four different partial match methods, five query types, and three representations of search terms. The results indicate that if the best method for a particular case can be determined, then up to 86% of the new records may be correctly classified. The single method with the best accuracy was able to select the correct classification for about 46% of the new records.
  8. Bade, D.: ¬The creation and persistence of misinformation in shared library catalogs : language and subject knowledge in a technological era (2002) 0.01
    0.014174874 = product of:
      0.02126231 = sum of:
        0.014389929 = weight(_text_:based in 1858) [ClassicSimilarity], result of:
          0.014389929 = score(doc=1858,freq=4.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.09415606 = fieldWeight in 1858, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.015625 = fieldNorm(doc=1858)
        0.006872381 = product of:
          0.013744762 = sum of:
            0.013744762 = weight(_text_:22 in 1858) [ClassicSimilarity], result of:
              0.013744762 = score(doc=1858,freq=2.0), product of:
                0.17762627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050723847 = queryNorm
                0.07738023 = fieldWeight in 1858, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.015625 = fieldNorm(doc=1858)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    22. 9.1997 19:16:05
    Footnote
    Bade begins his discussion of errors in subject analysis by summarizing the contents of seven records containing what he considers to be egregious errors. The examples were drawn only from items that he has encountered in the course of his work. Five of the seven records were full-level ("I" level) records for Eastern European materials created between 1996 and 2000 in the OCLC WorldCat database. The final two examples were taken from records created by Bade himself over an unspecified period of time. Although he is to be commended for examining the actual items cataloged and for examining mostly items that he claims to have adequate linguistic and subject expertise to evaluate reliably, Bade's methodology has major flaws. First and foremost, the number of examples provided is completely inadequate to draw any conclusions about the extent of the problem. Although an in-depth qualitative analysis of a small number of records might have yielded some valuable insight into factors that contribute to errors in subject analysis, Bade provides no information about the circumstances under which the live OCLC records he critiques were created. Instead, he offers simplistic explanations for the errors based solely on his own assumptions. He supplements his analysis of examples with an extremely brief survey of other studies regarding errors in subject analysis, which consists primarily of criticism of work done by Sheila Intner. In the end, it is impossible to draw any reliable conclusions about the nature or extent of errors in subject analysis found in records in shared bibliographic databases based on Bade's analysis. In the final third of the essay, Bade finally reveals his true concern: the deintellectualization of cataloging. It would strengthen the essay tremendously to present this as the primary premise from the very beginning, as this section offers glimpses of a compelling argument.
    Bade laments, "Many librarians simply do not see cataloging as an intellectual activity requiring an educated mind" (p. 20). Commenting on recent trends in copy cataloging practice, he declares, "The disaster of our time is that this work is being done more and more by people who can neither evaluate nor correct imported errors and often are forbidden from even thinking about it" (p. 26). Bade argues that the most valuable content found in catalog records is the intellectual content contributed by knowledgeable catalogers, and he asserts that to perform intellectually demanding tasks such as subject analysis reliably and effectively, catalogers must have the linguistic and subject knowledge required to gain at least a rudimentary understanding of the materials that they describe. He contends that requiring catalogers to quickly dispense with materials in unfamiliar languages and subjects clearly undermines their ability to perform the intellectual work of cataloging and leads to an increasing number of errors in the bibliographic records contributed to shared databases.
  9. Cleverdon, C.W.: ASLIB Cranfield Research Project : Report on the first stage of an investigation into the comparative efficiency of indexing systems (1960) 0.01
    0.013744762 = product of:
      0.041234285 = sum of:
        0.041234285 = product of:
          0.08246857 = sum of:
            0.08246857 = weight(_text_:22 in 6158) [ClassicSimilarity], result of:
              0.08246857 = score(doc=6158,freq=2.0), product of:
                0.17762627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050723847 = queryNorm
                0.46428138 = fieldWeight in 6158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6158)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Footnote
    Review in: College and research libraries 22(1961) no.3, p.228 (G. Jahoda)
  10. Saarti, J.: Consistency of subject indexing of novels by public library professionals and patrons (2002) 0.01
    0.013566956 = product of:
      0.040700868 = sum of:
        0.040700868 = weight(_text_:based in 4473) [ClassicSimilarity], result of:
          0.040700868 = score(doc=4473,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.26631355 = fieldWeight in 4473, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0625 = fieldNorm(doc=4473)
      0.33333334 = coord(1/3)
    
    Abstract
    The paper discusses the consistency of fiction indexing of library professionals and patrons based on an empirical test. Indexing was carried out with a Finnish fictional thesaurus and all of the test persons indexed the same five novels. The consistency of indexing was determined to be low; several reasons are postulated. Also an algorithm for typified indexing of fiction is given as well as some suggestions for the development of fiction information retrieval systems and content representation.
  11. Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.01
    0.011991608 = product of:
      0.035974823 = sum of:
        0.035974823 = weight(_text_:based in 4152) [ClassicSimilarity], result of:
          0.035974823 = score(doc=4152,freq=4.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.23539014 = fieldWeight in 4152, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4152)
      0.33333334 = coord(1/3)
    
    Abstract
    A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary - the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 7,5% of the uniqueness words - were compared and uniqueness factors developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subsets of terms produced by the exact and centroid discrimination value were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid). An end-user evaluation showed that the generated terms were largely distinct and had values of word precision which were consistent with values of the search precision.
  12. Mann, T.: 'Cataloging must change!' and indexer consistency studies : misreading the evidence at our peril (1997) 0.01
    0.010175217 = product of:
      0.03052565 = sum of:
        0.03052565 = weight(_text_:based in 492) [ClassicSimilarity], result of:
          0.03052565 = score(doc=492,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.19973516 = fieldWeight in 492, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=492)
      0.33333334 = coord(1/3)
    
    Abstract
    An earlier article ('Cataloging must change' by D. Gregor and C. Mandel in: Library journal 116(1991) no.6, p.42-47) has popularized the belief that there is low consistency (only 10-20% agreement) among subject cataloguers in assigning LCSH. Because of this alleged lack of consistency, the article suggests, cataloguers 'can be more accepting in variations in subject choices' in copy cataloguing. Argues that this inference is based on a serious misreading of previous studies of indexer consistency. The 10-20% figure actually derives from studies of people trying to guess the same natural language key words, precisely in the absence of vocabulary control mechanisms such as thesauri or LCSH. Concludes that sources cited fail to support their conclusion and some directly contradict it. Raises the concern that a naive acceptance by the library profession of the 10-20% claim can only have negative consequences for the quality of subject cataloguing created and accepted throughout the country.
  13. Burgin, R.: ¬The effect of indexing exhaustivity on retrieval performance (1991) 0.01
    0.010175217 = product of:
      0.03052565 = sum of:
        0.03052565 = weight(_text_:based in 5262) [ClassicSimilarity], result of:
          0.03052565 = score(doc=5262,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.19973516 = fieldWeight in 5262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=5262)
      0.33333334 = coord(1/3)
    
    Abstract
    The study was based on the collection examined by W.H. Shaw (Inf. proc. man. 26(1990) no.6, p.693-703, 705-718), a test collection of 1239 articles, indexed with the term cystic fibrosis; and 100 queries with 3 sets of relevance evaluations from subject experts. The effect of variations in indexing exhaustivity on retrieval performance in a vector space retrieval system was investigated by using a term weight threshold to construct different document representations for a test collection. Retrieval results showed that retrieval performance, as measured by the mean optimal measure for all queries at a term weight threshold, was highest at the most exhaustive representation, and decreased slightly as terms were eliminated and the indexing representation became less exhaustive. The findings suggest that the vector space model is more robust against variations in indexing exhaustivity than is the single-link clustering model.
  14. Rodriguez Bravo, B.: ¬The visibility of women in indexing languages (2006) 0.01
    0.010175217 = product of:
      0.03052565 = sum of:
        0.03052565 = weight(_text_:based in 263) [ClassicSimilarity], result of:
          0.03052565 = score(doc=263,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.19973516 = fieldWeight in 263, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=263)
      0.33333334 = coord(1/3)
    
    Abstract
    This article analyses how gender matters are handled in indexing languages. The examples chosen were the Library of Congress Subject Headings (LCSH), the UNESCO Thesaurus (UT) and the European Women's Thesaurus (EWT). The study is based on an analysis of the entries Man/Men and Woman/Women, their subdivisions and the established relationships appearing under these entries. Other headings or descriptors are also listed when they allude to men or women but the gender sense occupies only second or third place in the entry, in the shape of an adjective or a second noun. A lack of symmetry in the treatment of gender is noted, with recommendations being made for equal status for men and women, which should, however, avoid unnecessary enumerations.
  15. Bodoff, D.; Richter-Levin, Y.: Viewpoints in indexing term assignment (2020) 0.01
    0.010175217 = product of:
      0.03052565 = sum of:
        0.03052565 = weight(_text_:based in 5765) [ClassicSimilarity], result of:
          0.03052565 = score(doc=5765,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.19973516 = fieldWeight in 5765, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.046875 = fieldNorm(doc=5765)
      0.33333334 = coord(1/3)
    
    Abstract
    The literature on assigned indexing considers three possible viewpoints - the author's viewpoint as evidenced in the title, the users' viewpoint, and the indexer's viewpoint - and asks whether and which of those views should be reflected in an indexer's choice of terms to assign to an item. We study this question empirically, as opposed to normatively. Based on the literature that discusses whose viewpoints should be reflected, we construct a research model that includes those same three viewpoints as factors that might be influencing term assignment in actual practice. In the unique study design that we employ, the records of term assignments made by identified indexers in academic libraries are cross-referenced with the results of a survey that those same indexers completed on political views. Our results indicate that in our setting, variance in term assignment was best explained by indexers' personal political views.
  16. Veenema, F.: To index or not to index (1996) 0.01
    0.009163175 = product of:
      0.027489524 = sum of:
        0.027489524 = product of:
          0.05497905 = sum of:
            0.05497905 = weight(_text_:22 in 7247) [ClassicSimilarity], result of:
              0.05497905 = score(doc=7247,freq=2.0), product of:
                0.17762627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050723847 = queryNorm
                0.30952093 = fieldWeight in 7247, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7247)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Canadian journal of information and library science. 21(1996) no.2, S.1-22
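
The score tree for entry 16 (doc 7247) can be retraced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and not part of Lucene, and queryNorm, fieldNorm and the coord factors are copied from the listing rather than derived. Python uses double precision where Lucene uses 32-bit floats, so the last digits may differ slightly.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as ClassicSimilarity defines it.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
    # weight = queryWeight * fieldWeight, as in the explain tree.
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                    # 1.4142135 for freq=2.0
    query_weight = idf * query_norm         # 0.17762627 in the listing
    field_weight = tf * idf * field_norm    # 0.30952093 in the listing
    return query_weight * field_weight

# Values copied from the explain tree of doc 7247 (term "22").
weight = classic_term_weight(
    freq=2.0, doc_freq=3622, max_docs=44218,
    query_norm=0.050723847, field_norm=0.0625,
)

# Apply the two coord factors shown in the tree: coord(1/2), then coord(1/3).
score = weight * 0.5 * (1.0 / 3.0)

print(f"term weight: {weight:.8f}")
print(f"final score: {score:.9f}")
```

The same function applies to the other entries by substituting their freq, docFreq and fieldNorm values; entries that match only the term "based" carry a single coord(1/3) factor.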
  17. Rowley, J.: ¬The controlled versus natural indexing languages debate revisited : a perspective on information retrieval practice and research (1994) 0.01
    0.008479347 = product of:
      0.025438042 = sum of:
        0.025438042 = weight(_text_:based in 7151) [ClassicSimilarity], result of:
          0.025438042 = score(doc=7151,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.16644597 = fieldWeight in 7151, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
      0.33333334 = coord(1/3)
    
    Abstract
    This article revisits the debate concerning controlled and natural indexing languages, as used in searching the databases of the online hosts, in-house information retrieval systems, online public access catalogues and databases stored on CD-ROM. The debate was first formulated in the early days of information retrieval more than a century ago but, despite significant advances in technology, remains unresolved. The article divides the history of the debate into four eras. Era one was characterised by the introduction of controlled vocabulary. Era two focused on comparisons between different indexing languages in order to assess which was best. Era three saw a number of case studies of limited generalisability and a general recognition that the best search performance can be achieved by the parallel use of the two types of indexing languages. The emphasis in Era four has been on the development of end-user-based systems, including online public access catalogues and databases on CD-ROM. Recent developments in the use of expert systems techniques to support the representation of meaning may lead to systems which offer significant support to the user in end-user searching. In the meantime, however, information retrieval in practice involves a mixture of natural and controlled indexing languages used to search a wide variety of different kinds of databases.
  18. Chen, X.: ¬The influence of existing consistency measures on the relationship between indexing consistency and exhaustivity (2008) 0.01
    0.008479347 = product of:
      0.025438042 = sum of:
        0.025438042 = weight(_text_:based in 2502) [ClassicSimilarity], result of:
          0.025438042 = score(doc=2502,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.16644597 = fieldWeight in 2502, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2502)
      0.33333334 = coord(1/3)
    
    Content
    Consistency studies have discussed the relationship between indexing consistency and exhaustivity, and it is commonly accepted that higher exhaustivity results in lower indexing consistency. However, this issue has been oversimplified, and previous studies contain significant misinterpretations. The aim of this study is to investigate the relationship between consistency and exhaustivity based on a large sample and to analyse the misinterpretations in earlier studies. A sample of 3,307 monographs, i.e. 6,614 records, was drawn from two Chinese bibliographic catalogues. Indexing consistency was measured using two formulae which were popular in previous indexing consistency studies. A relatively high level of consistency was found (64.21% according to the first formula, 70.71% according to the second). Regarding the relationship between consistency and exhaustivity, it was found that when two indexers had identical exhaustivity, indexing consistency was substantially high. On the contrary, when they had different levels of exhaustivity, consistency was significantly low. This was inevitable with the use of the two formulae. Moreover, a detailed discussion was conducted to analyse the misinterpretations in previous studies.
  19. Lu, K.; Mao, J.; Li, G.: Toward effective automated weighted subject indexing : a comparison of different approaches in different environments (2018) 0.01
    0.008479347 = product of:
      0.025438042 = sum of:
        0.025438042 = weight(_text_:based in 4292) [ClassicSimilarity], result of:
          0.025438042 = score(doc=4292,freq=2.0), product of:
            0.15283063 = queryWeight, product of:
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.050723847 = queryNorm
            0.16644597 = fieldWeight in 4292, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0129938 = idf(docFreq=5906, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4292)
      0.33333334 = coord(1/3)
    
    Abstract
    Subject indexing plays an important role in supporting subject access to information resources. Current subject indexing systems do not make adequate distinctions on the importance of assigned subject descriptors. Assigning numeric weights to subject descriptors to distinguish their importance to the documents can strengthen the role of subject metadata. Automated methods are more cost-effective. This study compares different automated weighting methods in different environments. Two evaluation methods were used to assess the performance. Experiments on three datasets in the biomedical domain suggest the performance of different weighting methods depends on whether it is an abstract or full text environment. Mutual information with bag-of-words representation shows the best average performance in the full text environment, while cosine with bag-of-words representation is the best in an abstract environment. The cosine measure has relatively consistent and robust performance. A direct weighting method, IDF (Inverse Document Frequency), can produce quick and reasonable estimates of the weights. Bag-of-words representation generally outperforms the concept-based representation. Further improvement in performance can be obtained by using the learning-to-rank method to integrate different weighting methods. This study follows up Lu and Mao (Journal of the Association for Information Science and Technology, 66, 1776-1784, 2015), in which an automated weighted subject indexing method was proposed and validated. The findings from this study contribute to more effective weighted subject indexing.
  20. Booth, A.: How consistent is MEDLINE indexing? (1990) 0.01
    0.008017778 = product of:
      0.024053333 = sum of:
        0.024053333 = product of:
          0.048106667 = sum of:
            0.048106667 = weight(_text_:22 in 3510) [ClassicSimilarity], result of:
              0.048106667 = score(doc=3510,freq=2.0), product of:
                0.17762627 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050723847 = queryNorm
                0.2708308 = fieldWeight in 3510, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3510)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Health libraries review. 7(1990) no.1, S.22-26