Search (68 results, page 1 of 4)

  • language_ss:"e"
  • theme_ss:"Indexierungsstudien"
  • type_ss:"a"
  1. Iivonen, M.; Kivimäki, K.: Common entities and missing properties : similarities and differences in the indexing of concepts (1998) 0.00
    0.0016998062 = product of:
      0.015298256 = sum of:
        0.009308225 = weight(_text_:in in 3074) [ClassicSimilarity], result of:
          0.009308225 = score(doc=3074,freq=24.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.3123684 = fieldWeight in 3074, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=3074)
        0.005990031 = product of:
          0.017970093 = sum of:
            0.017970093 = weight(_text_:29 in 3074) [ClassicSimilarity], result of:
              0.017970093 = score(doc=3074,freq=2.0), product of:
                0.077061385 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.021906832 = queryNorm
                0.23319192 = fieldWeight in 3074, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3074)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
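
The score explanation above can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas that the printed numbers are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function and variable names are illustrative only; queryNorm, fieldNorm and the coord factors are copied from the listing rather than derived.

```python
import math

# Values copied from the explanation for doc 3074 (result 1).
QUERY_NORM = 0.021906832
MAX_DOCS = 44218
FIELD_NORM = 0.046875


def tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def idf(doc_freq, max_docs):
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# Term "in": freq=24, docFreq=30841.
in_score = term_score(24.0, 30841, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Term "29": freq=2, docFreq=3565, scaled by the inner coord(1/3).
score_29 = term_score(2.0, 3565, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 3)

# Sum of the matching clauses, scaled by the outer coord(2/18).
total = (in_score + score_29) * (2 / 18)

print(f"in: {in_score:.9f}  29: {score_29:.9f}  total: {total:.10f}")
```

Under these assumptions the total should come out close to the 0.0016998062 shown for this record; small differences in the last digits are expected because the listing was computed in single precision.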
    
    Abstract
    The selection and representation of concepts in indexing of the same documents in 2 databases of library and information studies are considered. The authors compare the indexing of 49 documents in KINF and LISA. They focus on the types of concepts presented in indexing, the degree of concept consistency in indexing, and similarities and differences in the indexing of concepts. The largest group of indexed concepts in both databases was the category of entities, while concepts belonging to the category of properties were almost missing in both databases. The second largest group of indexed concepts in KINF was the category of activities and in LISA the category of dimensions. Although the concept consistency between KINF and LISA remained rather low and was only 34%, there were approximately 2.2 concepts per document which were indexed from the same documents in both databases. These common concepts belonged mostly to the category of entities.
    Date
    24. 2.1999 21:29:51
  2. Veenema, F.: To index or not to index (1996) 0.00
    0.0015689273 = product of:
      0.014120346 = sum of:
        0.0062054833 = weight(_text_:in in 7247) [ClassicSimilarity], result of:
          0.0062054833 = score(doc=7247,freq=6.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.2082456 = fieldWeight in 7247, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=7247)
        0.007914863 = product of:
          0.023744587 = sum of:
            0.023744587 = weight(_text_:22 in 7247) [ClassicSimilarity], result of:
              0.023744587 = score(doc=7247,freq=2.0), product of:
                0.076713994 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.021906832 = queryNorm
                0.30952093 = fieldWeight in 7247, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7247)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    Describes an experiment comparing the performance of automatic full-text indexing software for personal computers with the human intellectual assignment of indexing terms in each document in a collection. Considers the times required to index the document, to retrieve documents satisfying 5 typical foreseen information needs, and the recall and precision ratios of searching. The software used is the QuickFinder facility in WordPerfect 6.1 for Windows.
    Source
    Canadian journal of information and library science. 21(1996) no.2, S.1-22
  3. Neshat, N.; Horri, A.: ¬A study of subject indexing consistency between the National Library of Iran and Humanities Libraries in the area of Iranian studies (2006) 0.00
    0.0015483715 = product of:
      0.013935343 = sum of:
        0.0070098387 = weight(_text_:in in 230) [ClassicSimilarity], result of:
          0.0070098387 = score(doc=230,freq=10.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.23523843 = fieldWeight in 230, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=230)
        0.0069255047 = product of:
          0.020776514 = sum of:
            0.020776514 = weight(_text_:22 in 230) [ClassicSimilarity], result of:
              0.020776514 = score(doc=230,freq=2.0), product of:
                0.076713994 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.021906832 = queryNorm
                0.2708308 = fieldWeight in 230, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=230)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    This study represents an attempt to compare indexing consistency between the catalogers of the National Library of Iran (NLI) on one side and 12 major academic and special libraries located in Tehran on the other. The research findings indicate that in 75% of the libraries the subject inconsistency values are 60% to 85%. In terms of subject classes, the consistency values are 10% to 35.2%, the mean of which is 22.5%. Moreover, the findings show that whenever the number of assigned terms increases, the probability of consistency decreases. This confirms Markey's findings in 1984.
    Date
    4. 1.2007 10:22:26
  4. Hudon, M.: Conceptual compatibility in controlled language tools used to index and access the content of moving image collections (2004) 0.00
    0.0013968822 = product of:
      0.01257194 = sum of:
        0.0065819086 = weight(_text_:in in 2655) [ClassicSimilarity], result of:
          0.0065819086 = score(doc=2655,freq=12.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.22087781 = fieldWeight in 2655, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=2655)
        0.005990031 = product of:
          0.017970093 = sum of:
            0.017970093 = weight(_text_:29 in 2655) [ClassicSimilarity], result of:
              0.017970093 = score(doc=2655,freq=2.0), product of:
                0.077061385 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.021906832 = queryNorm
                0.23319192 = fieldWeight in 2655, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2655)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    Five controlled vocabularies currently used for content representation in collections of non-art moving images were examined to determine their level of conceptual compatibility. Methods borrowed from previous research in the area of indexing language compatibility were used. Quantitative data and qualitative observations allowed us to estimate more precisely and realistically the actual degree of conceptual redundancy in these indexing languages. It was found that the conceptual overlap is high enough to justify the pursuit of research and development work on a common basic indexing and access language that could be used to name objects, events, categories of persons, and relations most frequently depicted in non-art moving image collections.
    Date
    29. 8.2004 16:17:19
    Series
    Advances in knowledge organization; vol.9
  5. Taniguchi, S.: Recording evidence in bibliographic records and descriptive metadata (2005) 0.00
    0.0013908951 = product of:
      0.012518056 = sum of:
        0.0065819086 = weight(_text_:in in 3565) [ClassicSimilarity], result of:
          0.0065819086 = score(doc=3565,freq=12.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.22087781 = fieldWeight in 3565, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=3565)
        0.0059361467 = product of:
          0.01780844 = sum of:
            0.01780844 = weight(_text_:22 in 3565) [ClassicSimilarity], result of:
              0.01780844 = score(doc=3565,freq=2.0), product of:
                0.076713994 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.021906832 = queryNorm
                0.23214069 = fieldWeight in 3565, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3565)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    In this article recording evidence for data values in addition to the values themselves in bibliographic records and descriptive metadata is proposed, with the aim of improving the expressiveness and reliability of those records and metadata. Recorded evidence indicates why and how data values are recorded for elements. Recording the history of changes in data values is also proposed, with the aim of reinforcing recorded evidence. First, evidence that can be recorded is categorized into classes: identifiers of rules or tasks, action descriptions of them, and input and output data of them. Dates of recording values and evidence are an additional class. Then, the relative usefulness of evidence classes and also levels (i.e., the record, data element, or data value level) to which an individual evidence class is applied, is examined. Second, examples that can be viewed as recorded evidence in existing bibliographic records and current cataloging rules are shown. Third, some examples of bibliographic records and descriptive metadata with notes of evidence are demonstrated. Fourth, ways of using recorded evidence are addressed.
    Date
    18. 6.2005 13:16:22
  6. Ansari, M.: Matching between assigned descriptors and title keywords in medical theses (2005) 0.00
    0.001341411 = product of:
      0.012072699 = sum of:
        0.0070810067 = weight(_text_:in in 4739) [ClassicSimilarity], result of:
          0.0070810067 = score(doc=4739,freq=20.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.2376267 = fieldWeight in 4739, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4739)
        0.0049916925 = product of:
          0.0149750775 = sum of:
            0.0149750775 = weight(_text_:29 in 4739) [ClassicSimilarity], result of:
              0.0149750775 = score(doc=4739,freq=2.0), product of:
                0.077061385 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.021906832 = queryNorm
                0.19432661 = fieldWeight in 4739, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4739)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    Purpose - To examine the degree of exact and partial match between the assigned descriptors and title keywords of medical theses written in Farsi and submitted for a PhD degree. Design/methodology/approach - A sample population of 506 theses in Pediatrics, Gynecology, Cardiology and Psychiatry was randomly picked out of a total of 909 indexed in the Indexing Department of the Central Library of the Iran University of Medical Science and Health Care Services. The results obtained are compared with those reported for other documents written in Farsi and English. Where applicable, the influence of the foreign language and its structure is commented on. Findings - It is shown that the degree of match between the assigned descriptors and the title keywords is greater than 70 per cent, equaling those reported for Farsi books and the Michigan University Library catalogue in the USA. It is also shown that the frequency of the match has increased since 1982, indicating that the authors have become more attentive in their choice of title. Research limitations/implications - Detailed analysis of results, however, shows significant differences between the degree of exact match amongst the four categories, with psychiatry theses that use more common terms showing highest exact match findings (50 per cent). Originality/value - This paper highlights the need for a closer collaboration with medical institutions for definition of approved terms and their incorporation in indexation in order to improve findings in various medical categories.
    Date
    3.12.2005 19:38:29
  7. Leininger, K.: Interindexer consistency in PsycINFO (2000) 0.00
    0.0011766955 = product of:
      0.010590259 = sum of:
        0.0046541123 = weight(_text_:in in 2552) [ClassicSimilarity], result of:
          0.0046541123 = score(doc=2552,freq=6.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.1561842 = fieldWeight in 2552, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=2552)
        0.0059361467 = product of:
          0.01780844 = sum of:
            0.01780844 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
              0.01780844 = score(doc=2552,freq=2.0), product of:
                0.076713994 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.021906832 = queryNorm
                0.23214069 = fieldWeight in 2552, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2552)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    Reports results of a study to examine interindexer consistency (the degree to which indexers, when assigning terms to a chosen record, will choose the same terms to reflect that record) in the PsycINFO database using 60 records that were inadvertently processed twice between 1996 and 1998. Five aspects of interindexer consistency were analysed. Two methods were used to calculate interindexer consistency: one posited by Hooper (1965) and the other by Rolling (1981). Aspects analysed were: checktag consistency (66.24% using Hooper's calculation and 77.17% using Rolling's); major-to-all term consistency (49.31% and 62.59% respectively); overall indexing consistency (49.02% and 63.32%); classification code consistency (44.17% and 45.00%); and major-to-major term consistency (43.24% and 56.09%). The average consistency across all categories was 50.4% using Hooper's method and 60.83% using Rolling's. Although comparison with previous studies is difficult due to methodological variations in the overall study of indexing consistency and the specific characteristics of the database, results generally support previous findings when trends and similar studies are analysed.
    Date
    9. 2.1997 18:44:22
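
The two consistency measures named in this abstract can be illustrated with a small calculation. This is a minimal sketch, assuming the forms in which the measures are usually stated: Hooper's measure as the number of terms both indexers assigned divided by the number of distinct terms either assigned, and Rolling's measure as twice the number of common terms divided by the total number of terms assigned by both. The function names and the sample term sets are hypothetical and are not taken from the study.

```python
def hooper_consistency(terms_a, terms_b):
    """Hooper: common terms / distinct terms used by either indexer."""
    a, b = set(terms_a), set(terms_b)
    distinct = len(a | b)
    if distinct == 0:
        return 0.0  # convention chosen here for two empty term sets
    return len(a & b) / distinct


def rolling_consistency(terms_a, terms_b):
    """Rolling: 2 * common terms / (terms by indexer A + terms by indexer B)."""
    a, b = set(terms_a), set(terms_b)
    assigned = len(a) + len(b)
    if assigned == 0:
        return 0.0  # same convention as above
    return 2 * len(a & b) / assigned


# Hypothetical term assignments for one record indexed twice.
indexer_a = {"indexing", "consistency", "thesauri", "psychology"}
indexer_b = {"indexing", "consistency", "databases"}

h = hooper_consistency(indexer_a, indexer_b)   # 2 common / 5 distinct
r = rolling_consistency(indexer_a, indexer_b)  # 2 * 2 common / 7 assigned
print(f"Hooper: {h:.2%}  Rolling: {r:.2%}")
```

For any single pair of term sets, Rolling's value is at least as large as Hooper's, which is in line with the higher Rolling percentages reported in the abstract.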
  8. Booth, A.: How consistent is MEDLINE indexing? (1990) 0.00
    0.0011178222 = product of:
      0.0100604 = sum of:
        0.0031348949 = weight(_text_:in in 3510) [ClassicSimilarity], result of:
          0.0031348949 = score(doc=3510,freq=2.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.10520181 = fieldWeight in 3510, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3510)
        0.0069255047 = product of:
          0.020776514 = sum of:
            0.020776514 = weight(_text_:22 in 3510) [ClassicSimilarity], result of:
              0.020776514 = score(doc=3510,freq=2.0), product of:
                0.076713994 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.021906832 = queryNorm
                0.2708308 = fieldWeight in 3510, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3510)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    A known-item search for abstracts to previously retrieved references revealed that 2 documents from the same annual volume had been indexed twice. Working from the premise that the whole volume may have been double-indexed, a search strategy was devised that limited the journal code to the year in question. 57 references were retrieved, comprising 28 pairs of duplicates plus a citation for the whole volume. Author, title, source and descriptors were requested off-line and the citations were paired with their duplicates. The 4 categories of descriptors-major descriptors, minor descriptors, subheadings and check-tags-were compared for depth and consistency of indexing and lessons that might be learnt from the study are discussed.
    Source
    Health libraries review. 7(1990) no.1, S.22-26
  9. Lee, D.H.; Schleyer, T.: Social tagging is no substitute for controlled indexing : a comparison of Medical Subject Headings and CiteULike tags assigned to 231,388 papers (2012) 0.00
    0.0010522349 = product of:
      0.0094701145 = sum of:
        0.0044784215 = weight(_text_:in in 383) [ClassicSimilarity], result of:
          0.0044784215 = score(doc=383,freq=8.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.15028831 = fieldWeight in 383, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=383)
        0.0049916925 = product of:
          0.0149750775 = sum of:
            0.0149750775 = weight(_text_:29 in 383) [ClassicSimilarity], result of:
              0.0149750775 = score(doc=383,freq=2.0), product of:
                0.077061385 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.021906832 = queryNorm
                0.19432661 = fieldWeight in 383, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=383)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    Social tagging and controlled indexing both facilitate access to information resources. Given the increasing popularity of social tagging and the limitations of controlled indexing (primarily cost and scalability), it is reasonable to investigate to what degree social tagging could substitute for controlled indexing. In this study, we compared CiteULike tags to Medical Subject Headings (MeSH) terms for 231,388 citations indexed in MEDLINE. In addition to descriptive analyses of the data sets, we present a paper-by-paper analysis of tags and MeSH terms: the number of common annotations, Jaccard similarity, and coverage ratio. In the analysis, we apply three increasingly progressive levels of text processing, ranging from normalization to stemming, to reduce the impact of lexical differences. Annotations of our corpus consisted of over 76,968 distinct tags and 21,129 distinct MeSH terms. The top 20 tags/MeSH terms showed little direct overlap. On a paper-by-paper basis, the number of common annotations ranged from 0.29 to 0.5 and the Jaccard similarity from 2.12% to 3.3% using increased levels of text processing. At most, 77,834 citations (33.6%) shared at least one annotation. Our results show that CiteULike tags and MeSH terms are quite distinct lexically, reflecting different viewpoints/processes between social tagging and controlled indexing.
    Date
    26. 8.2012 14:29:37
  10. Subrahmanyam, B.: Library of Congress Classification numbers : issues of consistency and their implications for union catalogs (2006) 0.00
    0.0010472457 = product of:
      0.009425211 = sum of:
        0.0044784215 = weight(_text_:in in 5784) [ClassicSimilarity], result of:
          0.0044784215 = score(doc=5784,freq=8.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.15028831 = fieldWeight in 5784, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5784)
        0.0049467892 = product of:
          0.014840367 = sum of:
            0.014840367 = weight(_text_:22 in 5784) [ClassicSimilarity], result of:
              0.014840367 = score(doc=5784,freq=2.0), product of:
                0.076713994 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.021906832 = queryNorm
                0.19345059 = fieldWeight in 5784, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5784)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    This study examined Library of Congress Classification (LCC)-based class numbers assigned to a representative sample of 200 titles in 52 American library systems to determine the level of consistency within and across those systems. The results showed that under the condition that a library system has a title, the probability of that title having the same LCC-based class number across library systems is greater than 85 percent. An examination of 121 titles displaying variations in class numbers among library systems showed certain titles (for example, multi-foci titles, titles in series, bibliographies, and fiction) lend themselves to alternate class numbers. Others were assigned variant numbers either due to latitude in the schedules or for reasons that cannot be pinpointed. With increasing dependence on copy cataloging, the size of such variations may continue to decrease. As the preferred class number with its alternates represents a title more fully than just the preferred class number, this paper argues for continued use of alternates by library systems and for finding a method to link alternate class numbers to preferred class numbers for enriched subject access through local and union catalogs.
    Date
    10. 9.2000 17:38:22
  11. Shoham, S.; Kedar, R.: ¬The subject cataloging of monographs with the use of keywords (2001) 0.00
    0.0010408289 = product of:
      0.009367459 = sum of:
        0.0053741056 = weight(_text_:in in 5442) [ClassicSimilarity], result of:
          0.0053741056 = score(doc=5442,freq=18.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.18034597 = fieldWeight in 5442, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.03125 = fieldNorm(doc=5442)
        0.003993354 = product of:
          0.011980061 = sum of:
            0.011980061 = weight(_text_:29 in 5442) [ClassicSimilarity], result of:
              0.011980061 = score(doc=5442,freq=2.0), product of:
                0.077061385 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.021906832 = queryNorm
                0.15546128 = fieldWeight in 5442, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5442)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Content
    The overall objective of this study was to examine the implementation of a different approach to the expression of the subject content of monographs in the cataloging record, i.e., the use of a post-coordinate thesaurus of keywords, using inter-indexer consistency testing and in-depth analysis of mistakes in indexing. A sample of 50 non-fiction monographs was subject cataloged by 16 library science students (non-experienced indexers) using the new Hebrew Thesaurus of Indexing Terms (1996). The 800 indexing records of the non-experienced indexers were compared to the "correct indexing records" (prepared by a panel of three experienced indexers). Indexing consistency was measured using two different formulas used in previous inter-indexer studies. A medium level of inter-indexer consistency was found. In the analysis of mistakes, it was found that the most frequent mistake was the assignment of indexing terms to minor subject matter (i.e., subjects that were less than 20% of the content of the book). Among possible explanations offered for these findings are: sparseness of scope notes in the thesaurus, the priority given by Israeli public libraries to Hebrew language materials in the development of their non-fiction collection, and the size of the output of the Israeli publishing industry of non-fiction materials in Hebrew. The results of the consistency tests and the mistakes analysis were also examined in light of several factors: (1) the number of indexing terms assigned; (2) the length of the monographs (number of pages); and (3) subject area of each monograph. The same examinations were carried out for the subject cataloging records prepared by the Israeli Center for Libraries (ICL) for these monographs.
    Source
    Cataloging and classification quarterly. 33(2001) no.2, S.29-54
  12. White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014) 0.00
    9.805796E-4 = product of:
      0.008825216 = sum of:
        0.003878427 = weight(_text_:in in 1781) [ClassicSimilarity], result of:
          0.003878427 = score(doc=1781,freq=6.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.1301535 = fieldWeight in 1781, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1781)
        0.0049467892 = product of:
          0.014840367 = sum of:
            0.014840367 = weight(_text_:22 in 1781) [ClassicSimilarity], result of:
              0.014840367 = score(doc=1781,freq=2.0), product of:
                0.076713994 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.021906832 = queryNorm
                0.19345059 = fieldWeight in 1781, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1781)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE. Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results. Findings - Results suggest that HIVE improves inter-indexing consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges. Research limitations/implications - Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system. Originality/value - This paper showcases several features used in the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next generation machine-aided indexing (MAI) tools.
  13. Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.00
    9.0649055E-4 = product of:
      0.008158415 = sum of:
        0.0031667221 = weight(_text_:in in 4152) [ClassicSimilarity], result of:
          0.0031667221 = score(doc=4152,freq=4.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.10626988 = fieldWeight in 4152, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4152)
        0.0049916925 = product of:
          0.0149750775 = sum of:
            0.0149750775 = weight(_text_:29 in 4152) [ClassicSimilarity], result of:
              0.0149750775 = score(doc=4152,freq=2.0), product of:
                0.077061385 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.021906832 = queryNorm
                0.19432661 = fieldWeight in 4152, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4152)
          0.33333334 = coord(1/3)
      0.11111111 = coord(2/18)
    
    Abstract
    A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary: the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 7,5% of the unique words - were compared and uniqueness factors developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subsets of terms produced by the exact and centroid discrimination value were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid). An end-user evaluation showed that the generated terms were largely distinct and had values of word precision which were consistent with values of the search precision.
    Date
    23.11.1995 11:29:46
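    The score breakdown listed for this entry (doc 4152) can be re-derived from the figures in the explain tree. The sketch below assumes the usual ClassicSimilarity arithmetic, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the values shown; the helper names are hypothetical, and small differences from the listed numbers are expected because the listing uses single-precision floats.

```python
import math

# Values copied from the explain tree for doc 4152.
QUERY_NORM = 0.021906832
MAX_DOCS = 44218
FIELD_NORM = 0.0390625


def idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces 1.3602545 and 3.5176873 above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq):
    tf = math.sqrt(freq)                       # tf(freq)
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM       # queryWeight
    field_weight = tf * term_idf * FIELD_NORM  # fieldWeight
    return query_weight * field_weight


score_in = term_score(4.0, 30841)              # weight(_text_:in)
score_29 = term_score(2.0, 3565) * (1 / 3)     # weight(_text_:29) * coord(1/3)
total = (score_in + score_29) * (2 / 18)       # sum * coord(2/18)

print(f"in: {score_in:.7f}  29: {score_29:.7f}  total: {total:.7e}")
```

    The printed total should agree with the listed 9.0649055E-4 to within rounding.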
  14. Braam, R.R.; Bruil, J.: Quality of indexing information : authors' views on indexing of their articles in chemical abstracts online CA-file (1992) 0.00
    4.9510814E-4 = product of:
      0.008911947 = sum of:
        0.008911947 = weight(_text_:in in 2638) [ClassicSimilarity], result of:
          0.008911947 = score(doc=2638,freq=22.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.29906997 = fieldWeight in 2638, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=2638)
      0.055555556 = coord(1/18)
    
    Abstract
    Studies the quality of subject indexing by Chemical Abstracts Indexing Service by confronting authors with the particular indexing terms attributed to their articles, for 270 articles published in 54 journals, 5 articles out of each journal. Responses (80%) indicate the superior quality of keywords, both as content descriptors and as retrieval tools. Author judgements on these 2 different aspects do not always converge, however. CAS's indexing policy to cover only 'new' aspects is reflected in authors' judgements that index lists are somewhat incomplete, in particular in the case of thesaurus terms (index headings). The large effort expended by CAS in maintaining and using a subject thesaurus, in order to select valid index headings, as compared to quick and cheap keyword postings, does not lead to clearly superior quality of thesaurus terms for document description or for retrieval. Some 20% of papers were not placed in the 'proper' CA main section, according to authors. As concerns the use of indexing data by third parties, in bibliometrics, users should be aware of the indexing policies behind the data, in order to prevent invalid interpretations.
  15. Peset, F.; Garzón-Farinós, F.; González, L.M.; García-Massó, X.; Ferrer-Sapena, A.; Toca-Herrera, J.L.; Sánchez-Pérez, E.A.: Survival analysis of author keywords : an application to the library and information sciences area (2020) 0.00
    4.3093634E-4 = product of:
      0.007756854 = sum of:
        0.007756854 = weight(_text_:in in 5774) [ClassicSimilarity], result of:
          0.007756854 = score(doc=5774,freq=24.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.260307 = fieldWeight in 5774, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5774)
      0.055555556 = coord(1/18)
    
    Abstract
    Our purpose is to adapt a statistical method for the analysis of discrete numerical series to the keywords appearing in scientific articles of a given area. As an example, we apply our methodological approach to the study of the keywords in the Library and Information Sciences (LIS) area. Our objective is to detect the new author keywords that appear in a fixed knowledge area in the period of 1 year in order to quantify the probabilities of survival for 10 years as a function of the impact of the journals where they appeared. Many of the new keywords appearing in the LIS field are ephemeral. Actually, more than half are never used again. In general, the terms most commonly used in the LIS area come from other areas. The average survival time of these keywords is approximately 3 years, being slightly higher in the case of words that were published in journals classified in the second quartile of the area. We believe that measuring the appearance and disappearance of terms will allow understanding some relevant aspects of the evolution of a discipline, providing in this way a new bibliometric approach.
  16. Boyce, B.R.; McLain, J.P.: Entry point depth and online search using a controlled vocabulary (1989) 0.00
    4.266052E-4 = product of:
      0.0076788934 = sum of:
        0.0076788934 = weight(_text_:in in 2287) [ClassicSimilarity], result of:
          0.0076788934 = score(doc=2287,freq=12.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.2576908 = fieldWeight in 2287, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2287)
      0.055555556 = coord(1/18)
    
    Abstract
    The depth of indexing, the number of terms assigned on average to each document in a retrieval system as entry points, has a significant effect on the standard retrieval performance measures in modern commercial retrieval systems, just as it did in previous experimental work. Tests on the effect of basic index search, as opposed to controlled vocabulary search, in these real systems are quite different from traditional comparisons of free text searching with controlled vocabulary searching. In modern commercial systems the controlled vocabulary serves as a precision device, since the structure of the default for unqualified search terms in these systems requires that it do so.
  17. Morris, L.R.: ¬The frequency of use of Library of Congress Classification numbers and Dewey Decimal Classification numbers in the MARC file in the field of library science (1991) 0.00
    4.266052E-4 = product of:
      0.0076788934 = sum of:
        0.0076788934 = weight(_text_:in in 2308) [ClassicSimilarity], result of:
          0.0076788934 = score(doc=2308,freq=12.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.2576908 = fieldWeight in 2308, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2308)
      0.055555556 = coord(1/18)
    
    Abstract
    The LCC and DDC systems were devised and updated by librarians who had and have no access to the eventual frequency of use of each number in those classification systems. 80% of the monographs in a MARC file of over 1,000,000 records are classified into 20% of the classification numbers in the field of library science and only 20% of the monographs are classified into 80% of the classification numbers in the field of library science. Classification of monographs could be made easier and performed more accurately if many of the little-used and unused numbers were eliminated and many of the most crowded numbers were expanded. A number of examples are included.
  18. Cleverdon, C.W.: ¬The Cranfield tests on index language devices (1967) 0.00
    4.2222967E-4 = product of:
      0.007600134 = sum of:
        0.007600134 = weight(_text_:in in 1957) [ClassicSimilarity], result of:
          0.007600134 = score(doc=1957,freq=4.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.25504774 = fieldWeight in 1957, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.09375 = fieldNorm(doc=1957)
      0.055555556 = coord(1/18)
    
    Footnote
    Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp. 47-58.
  19. Bodoff, D.; Richter-Levin, Y.: Viewpoints in indexing term assignment (2020) 0.00
    4.2222967E-4 = product of:
      0.007600134 = sum of:
        0.007600134 = weight(_text_:in in 5765) [ClassicSimilarity], result of:
          0.007600134 = score(doc=5765,freq=16.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.25504774 = fieldWeight in 5765, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=5765)
      0.055555556 = coord(1/18)
    
    Abstract
    The literature on assigned indexing considers three possible viewpoints (the author's viewpoint as evidenced in the title, the users' viewpoint, and the indexer's viewpoint) and asks whether and which of those views should be reflected in an indexer's choice of terms to assign to an item. We study this question empirically, as opposed to normatively. Based on the literature that discusses whose viewpoints should be reflected, we construct a research model that includes those same three viewpoints as factors that might be influencing term assignment in actual practice. In the unique study design that we employ, the records of term assignments made by identified indexers in academic libraries are cross-referenced with the results of a survey that those same indexers completed on political views. Our results indicate that in our setting, variance in term assignment was best explained by indexers' personal political views.
  20. Qin, J.: Semantic similarities between a keyword database and a controlled vocabulary database : an investigation in the antibiotic resistance literature (2000) 0.00
    4.125901E-4 = product of:
      0.007426622 = sum of:
        0.007426622 = weight(_text_:in in 4386) [ClassicSimilarity], result of:
          0.007426622 = score(doc=4386,freq=22.0), product of:
            0.029798867 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021906832 = queryNorm
            0.24922498 = fieldWeight in 4386, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4386)
      0.055555556 = coord(1/18)
    
    Abstract
    The 'KeyWords Plus' in the Science Citation Index database represents an approach to combining citation and semantic indexing in describing the document content. This paper explores the similarities or dissimilarities between citation-semantic and analytic indexing. The dataset consisted of over 400 matching records in the SCI and MEDLINE databases on antibiotic resistance in pneumonia. The degree of similarity in indexing terms was found to vary on a scale from completely different to completely identical with various levels in between. The within-document similarity in the 2 databases was measured by a variation on the Jaccard coefficient - the Inclusion Index. The average inclusion coefficient was 0.4134 for SCI and 0.3371 for MEDLINE. The 20 terms occurring most frequently in each database were identified. The 2 groups of terms shared the same terms that constitute the 'intellectual base' for the subject. Conceptual similarity was analyzed through scatterplots of matching and nonmatching terms vs. partially identical and broader/narrower terms. The study also found that the two databases differed in assigning terms in various semantic categories. Implications of this research and further studies are suggested.

Authors