Search (90 results, page 2 of 5)

  • × theme_ss:"Indexierungsstudien"
  • × type_ss:"a"
  1. Iivonen, M.; Kivimäki, K.: Common entities and missing properties : similarities and differences in the indexing of concepts (1998) 0.00
    7.2368426E-4 = product of:
      0.010855263 = sum of:
        0.009165013 = weight(_text_:in in 3074) [ClassicSimilarity], result of:
          0.009165013 = score(doc=3074,freq=24.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.3123684 = fieldWeight in 3074, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=3074)
        0.0016902501 = weight(_text_:s in 3074) [ClassicSimilarity], result of:
          0.0016902501 = score(doc=3074,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.072074346 = fieldWeight in 3074, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=3074)
      0.06666667 = coord(2/30)
    
    Abstract
    The selection and representation of concepts in indexing of the same documents in 2 databases of library and information studies are considered. The authors compare the indexing of 49 documents in KINF and LISA. They focus on the types of concepts presented in indexing, the degree of concept consistency in indexing, and similarities and differences in the indexing of concepts. The largest group of indexed concepts in both databases was the category of entities while concepts belonging to the category of properties were almost missing in both databases. The second largest group of indexed concepts in KINF was the category of activities and in LISA the category of dimensions. Although the concept consistency between KINF and LISA remained rather low and was only 34%, there were approximately 2.2 concepts per document which were indexed from the same documents in both databases. These common concepts belonged mostly to the category of entities
    Source
    Knowledge organization. 25(1998) no.3, S.90-102
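The score breakdown shown for the first result (doc=3074) can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of Lucene, and all numeric inputs are copied from the explain output above.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    tf = math.sqrt(freq)                        # tf(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # queryWeight
    field_weight = tf * term_idf * field_norm   # fieldWeight
    return query_weight * field_weight

# Values taken from the explain output for doc=3074
MAX_DOCS = 44218
QUERY_NORM = 0.021569785
FIELD_NORM = 0.046875

score_in = term_score(24.0, 30841, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # term "in"
score_s = term_score(2.0, 40523, MAX_DOCS, QUERY_NORM, FIELD_NORM)    # term "s"

# coord(2/30): 2 of the 30 query clauses matched this document
total = (score_in + score_s) * (2 / 30)
print(f"in={score_in:.9f} s={score_s:.9f} total={total:.7e}")
```

Under these assumptions the result agrees with the listed 7.2368426E-4 up to single-precision rounding. It also shows why every score on this page is displayed as 0.00: only two very common terms matched, and the coord factor scales their sum down by 2/30.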
  2. Braam, R.R.; Bruil, J.: Quality of indexing information : authors' views on indexing of their articles in chemical abstracts online CA-file (1992) 0.00
    6.9767213E-4 = product of:
      0.010465082 = sum of:
        0.008774832 = weight(_text_:in in 2638) [ClassicSimilarity], result of:
          0.008774832 = score(doc=2638,freq=22.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.29906997 = fieldWeight in 2638, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=2638)
        0.0016902501 = weight(_text_:s in 2638) [ClassicSimilarity], result of:
          0.0016902501 = score(doc=2638,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.072074346 = fieldWeight in 2638, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=2638)
      0.06666667 = coord(2/30)
    
    Abstract
    Studies the quality of subject indexing by Chemical Abstracts Indexing Service by confronting authors with the particular indexing terms attributed to their computer, for 270 articles published in 54 journals, 5 articles out of each journal. Responses (80%) indicate the superior quality of keywords, both as content descriptors and as retrieval tools. Author judgements on these 2 different aspects do not always converge, however. CAS's indexing policy to cover only 'new' aspects is reflected in authors' judgements that index lists are somewhat incomplete, in particular in the case of thesaurus terms (index headings). The large effort expended by CAS in maintaining and using a subject thesaurus, in order to select valid index headings, as compared to quick and cheap keyword postings, does not lead to clearly superior quality of thesaurus terms for document description nor in retrieval. Some 20% of papers were not placed in 'proper' CA main section, according to authors. As concerns the use of indexing data by third parties, in bibliometrics, users should be aware of the indexing policies behind the data, in order to prevent invalid interpretations
    Source
    Journal of information science. 18(1992) no.5, S.399-408
  3. Gregor, D.; Mandel, C.: Cataloging must change! (1991) 0.00
    6.7147816E-4 = product of:
      0.010072172 = sum of:
        0.0052914224 = weight(_text_:in in 1999) [ClassicSimilarity], result of:
          0.0052914224 = score(doc=1999,freq=2.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.18034597 = fieldWeight in 1999, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.09375 = fieldNorm(doc=1999)
        0.00478075 = weight(_text_:s in 1999) [ClassicSimilarity], result of:
          0.00478075 = score(doc=1999,freq=4.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.20385705 = fieldWeight in 1999, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.09375 = fieldNorm(doc=1999)
      0.06666667 = coord(2/30)
    
    Footnote
    See also the response by T. Mann in: Cataloging and classification quarterly 23(1997) nos.3/4, S.3-45
    Source
    Library journal. 116(1991) no.6, S.42-47
  4. Boyce, B.R.; McLain, J.P.: Entry point depth and online search using a controlled vocabulary (1989) 0.00
    6.355139E-4 = product of:
      0.009532709 = sum of:
        0.00756075 = weight(_text_:in in 2287) [ClassicSimilarity], result of:
          0.00756075 = score(doc=2287,freq=12.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.2576908 = fieldWeight in 2287, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2287)
        0.0019719584 = weight(_text_:s in 2287) [ClassicSimilarity], result of:
          0.0019719584 = score(doc=2287,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.08408674 = fieldWeight in 2287, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2287)
      0.06666667 = coord(2/30)
    
    Abstract
    The depth of indexing, the number of terms assigned on average to each document in a retrieval system as entry points, has a significant effect on the standard retrieval performance measures in modern commercial retrieval systems, just as it did in previous experimental work. Tests on the effect of basic index search, as opposed to controlled vocabulary search, in these real systems are quite different from traditional comparisons of free text searching with controlled vocabulary searching. In modern commercial systems the controlled vocabulary serves as a precision device, since the structure of the default for unqualified search terms in these systems requires that it do so.
    Source
    Journal of the American Society for Information Science. 40(1989), S.273-276
  5. Morris, L.R.: The frequency of use of Library of Congress Classification numbers and Dewey Decimal Classification numbers in the MARC file in the field of library science (1991) 0.00
    6.355139E-4 = product of:
      0.009532709 = sum of:
        0.00756075 = weight(_text_:in in 2308) [ClassicSimilarity], result of:
          0.00756075 = score(doc=2308,freq=12.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.2576908 = fieldWeight in 2308, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2308)
        0.0019719584 = weight(_text_:s in 2308) [ClassicSimilarity], result of:
          0.0019719584 = score(doc=2308,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.08408674 = fieldWeight in 2308, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2308)
      0.06666667 = coord(2/30)
    
    Abstract
    The LCC and DDC systems were devised and updated by librarians who had and have no access to the eventual frequency of use of each number in those classification systems. 80% of the monographs in a MARC file of over 1,000,000 records are classified into 20% of the classification numbers in the field of library science and only 20% of the monographs are classified into 80% of the classification numbers in the field of library science. Classification of monographs could be made easier and performed more accurately if many of the little used and unused numbers were eliminated and many of the most crowded numbers were expanded. A number of examples are included
    Source
    Technical services quarterly. 8(1991) no.1, S.37-49
  6. Chan, L.M.: Alphabetical arrangement and subject collocation in Library of Congress Subject Headings (1977) 0.00
    6.2059314E-4 = product of:
      0.009308897 = sum of:
        0.00705523 = weight(_text_:in in 2268) [ClassicSimilarity], result of:
          0.00705523 = score(doc=2268,freq=8.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.24046129 = fieldWeight in 2268, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=2268)
        0.002253667 = weight(_text_:s in 2268) [ClassicSimilarity], result of:
          0.002253667 = score(doc=2268,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.09609913 = fieldWeight in 2268, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=2268)
      0.06666667 = coord(2/30)
    
    Abstract
    Beginning with Cutter, theorists of subject headings have conceded that certain elements of systematic arrangement in the dictionary catalog are inevitable; yet the fact that no specific guidelines have ever been developed for the determination of the extent to which subject collocation at the expense of specific and direct entry should be allowed has resulted in the many irregularities and inconsistencies now existing in the LCSH
    Source
    Library resources and technical services. 21(1977), S.156-169
  7. Evedove, P.R. Dal; Evedove Tartarotti, R.C. Dal; Lopes Fujita, M.S.: Verbal protocols in Brazilian information science : a perspective from indexing studies (2018) 0.00
    6.2059314E-4 = product of:
      0.009308897 = sum of:
        0.00705523 = weight(_text_:in in 4783) [ClassicSimilarity], result of:
          0.00705523 = score(doc=4783,freq=8.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.24046129 = fieldWeight in 4783, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=4783)
        0.002253667 = weight(_text_:s in 4783) [ClassicSimilarity], result of:
          0.002253667 = score(doc=4783,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.09609913 = fieldWeight in 4783, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=4783)
      0.06666667 = coord(2/30)
    
    Pages
    S.475-482
    Series
    Advances in knowledge organization; vol.16
    Source
    Challenges and opportunities for knowledge organization in the digital age: proceedings of the Fifteenth International ISKO Conference, 9-11 July 2018, Porto, Portugal / organized by: International Society for Knowledge Organization (ISKO), ISKO Spain and Portugal Chapter, University of Porto - Faculty of Arts and Humanities, Research Centre in Communication, Information and Digital Culture (CIC.digital) - Porto. Eds.: F. Ribeiro and M.E. Cerveira
  8. Bodoff, D.; Richter-Levin, Y.: Viewpoints in indexing term assignment (2020) 0.00
    6.115635E-4 = product of:
      0.009173452 = sum of:
        0.007483202 = weight(_text_:in in 5765) [ClassicSimilarity], result of:
          0.007483202 = score(doc=5765,freq=16.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.25504774 = fieldWeight in 5765, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=5765)
        0.0016902501 = weight(_text_:s in 5765) [ClassicSimilarity], result of:
          0.0016902501 = score(doc=5765,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.072074346 = fieldWeight in 5765, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=5765)
      0.06666667 = coord(2/30)
    
    Abstract
    The literature on assigned indexing considers three possible viewpoints (the author's viewpoint as evidenced in the title, the users' viewpoint, and the indexer's viewpoint) and asks whether and which of those views should be reflected in an indexer's choice of terms to assign to an item. We study this question empirically, as opposed to normatively. Based on the literature that discusses whose viewpoints should be reflected, we construct a research model that includes those same three viewpoints as factors that might be influencing term assignment in actual practice. In the unique study design that we employ, the records of term assignments made by identified indexers in academic libraries are cross-referenced with the results of a survey that those same indexers completed on political views. Our results indicate that in our setting, variance in term assignment was best explained by indexers' personal political views.
    Source
    Journal of the Association for Information Science and Technology. 71(2020) no.4, S.450-461
  9. Broxis, P.F.: ASSIA social science information service (1989) 0.00
    6.0353905E-4 = product of:
      0.009053085 = sum of:
        0.006236001 = weight(_text_:in in 1511) [ClassicSimilarity], result of:
          0.006236001 = score(doc=1511,freq=4.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.21253976 = fieldWeight in 1511, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.078125 = fieldNorm(doc=1511)
        0.0028170836 = weight(_text_:s in 1511) [ClassicSimilarity], result of:
          0.0028170836 = score(doc=1511,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.120123915 = fieldWeight in 1511, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.078125 = fieldNorm(doc=1511)
      0.06666667 = coord(2/30)
    
    Abstract
    ASSIA (Applied Social Science Index and Abstracts) started in 1987 as a bimonthly indexing and abstracting service in the society field, aimed at practitioners as well as sociologists. Considers the following aspects of the service: arrangement of ASSIA; journal coverage; indexing approach; services for subscribers; and who are the users?
    Source
    Outlook on research libraries. 11(1989) no.2, S.3-8
  10. Biagetti, M.T.: Indexing and scientific research needs (2006) 0.00
    5.974731E-4 = product of:
      0.008962097 = sum of:
        0.0061733257 = weight(_text_:in in 235) [ClassicSimilarity], result of:
          0.0061733257 = score(doc=235,freq=8.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.21040362 = fieldWeight in 235, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=235)
        0.0027887707 = weight(_text_:s in 235) [ClassicSimilarity], result of:
          0.0027887707 = score(doc=235,freq=4.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.118916616 = fieldWeight in 235, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=235)
      0.06666667 = coord(2/30)
    
    Abstract
    The paper examines the main problems of semantic indexing, taking into consideration the connection with the needs of scientific research, in particular in the field of Social Sciences. The multi-modal indexing approach, which allows researchers to find documents according to different dimensions of research, is described. Request-oriented indexing and the pragmatic approach are also discussed and, finally, the possibility of adopting C. S. Peirce's theory of abduction as a fundamental principle in indexing is outlined.
    Pages
    S.241-246
    Series
    Advances in knowledge organization; vol.10
  11. Krovetz, R.; Croft, W.B.: Lexical ambiguity and information retrieval (1992) 0.00
    5.9159653E-4 = product of:
      0.008873948 = sum of:
        0.006901989 = weight(_text_:in in 4028) [ClassicSimilarity], result of:
          0.006901989 = score(doc=4028,freq=10.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.23523843 = fieldWeight in 4028, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4028)
        0.0019719584 = weight(_text_:s in 4028) [ClassicSimilarity], result of:
          0.0019719584 = score(doc=4028,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.08408674 = fieldWeight in 4028, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4028)
      0.06666667 = coord(2/30)
    
    Abstract
    Reports on an analysis of lexical ambiguity in information retrieval text collections and on experiments to determine the utility of word meanings for separating relevant from nonrelevant documents. Results show that there is considerable ambiguity even in a specialised database. Word senses provide a significant separation between relevant and nonrelevant documents, but several factors contribute to determining whether disambiguation will improve performance; for example, resolving lexical ambiguity was found to have little impact on retrieval effectiveness for documents that have many words in common with the query. Discusses other uses of word sense disambiguation in an information retrieval context
    Source
    ACM transactions on information systems. 10(1992) no.2, S.115-141
  12. Tseng, Y.-H.: Keyword extraction techniques and relevance feedback (1997) 0.00
    5.9159653E-4 = product of:
      0.008873948 = sum of:
        0.006901989 = weight(_text_:in in 1830) [ClassicSimilarity], result of:
          0.006901989 = score(doc=1830,freq=10.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.23523843 = fieldWeight in 1830, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1830)
        0.0019719584 = weight(_text_:s in 1830) [ClassicSimilarity], result of:
          0.0019719584 = score(doc=1830,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.08408674 = fieldWeight in 1830, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1830)
      0.06666667 = coord(2/30)
    
    Abstract
    Automatic keyword extraction is an important and fundamental technology in advanced information retrieval systems. Briefly compares several major keyword extraction methods, lists their advantages and disadvantages, and reports recent research progress in Taiwan. Also describes the application of a keyword extraction algorithm in an information retrieval system for relevance feedback. Preliminary analysis shows that the error rate of extracting relevant keywords is 18%, and that the precision rate is over 50%. The main disadvantage of this approach is that the extraction results depend on the retrieval results, which in turn depend on the data held by the database. Apart from collecting more data, this problem can be alleviated by the application of a thesaurus constructed by the same keyword extraction algorithm
    Footnote
    [In Chinese]
    Source
    Bulletin of the Library Association of China. 1997, no.59, Dec., S.59-64
  13. Mann, T.: 'Cataloging must change!' and indexer consistency studies : misreading the evidence at our peril (1997) 0.00
    5.914012E-4 = product of:
      0.008871017 = sum of:
        0.0064806426 = weight(_text_:in in 492) [ClassicSimilarity], result of:
          0.0064806426 = score(doc=492,freq=12.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.22087781 = fieldWeight in 492, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=492)
        0.002390375 = weight(_text_:s in 492) [ClassicSimilarity], result of:
          0.002390375 = score(doc=492,freq=4.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.101928525 = fieldWeight in 492, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=492)
      0.06666667 = coord(2/30)
    
    Abstract
    An earlier article ('Cataloging must change' by D. Gregor and C. Mandel in: Library journal 116(1991) no.6, S.42-47) has popularized the belief that there is low consistency (only 10-20% agreement) among subject cataloguers in assigning LCSH. Because of this alleged lack of consistency, the article suggests, cataloguers 'can be more accepting in variations in subject choices' in copy cataloguing. Argues that this inference is based on a serious misreading of previous studies of indexer consistency. The 10-20% figure actually derives from studies of people trying to guess the same natural language key words, precisely in the absence of vocabulary control mechanisms such as thesauri or LCSH. Concludes that the sources cited fail to support their conclusion and some directly contradict it. Raises the concern that a naive acceptance by the library profession of the 10-20% claim can only have negative consequences for the quality of subject cataloguing created and accepted throughout the country
    Source
    Cataloging and classification quarterly. 23(1997) nos.3/4, S.3-45
  14. Qin, J.: Semantic similarities between a keyword database and a controlled vocabulary database : an investigation in the antibiotic resistance literature (2000) 0.00
    5.813935E-4 = product of:
      0.008720902 = sum of:
        0.0073123598 = weight(_text_:in in 4386) [ClassicSimilarity], result of:
          0.0073123598 = score(doc=4386,freq=22.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.24922498 = fieldWeight in 4386, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4386)
        0.0014085418 = weight(_text_:s in 4386) [ClassicSimilarity], result of:
          0.0014085418 = score(doc=4386,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.060061958 = fieldWeight in 4386, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4386)
      0.06666667 = coord(2/30)
    
    Abstract
    The 'KeyWords Plus' in the Science Citation Index database represents an approach to combining citation and semantic indexing in describing the document content. This paper explores the similarities or dissimilarities between citation-semantic and analytic indexing. The dataset consisted of over 400 matching records in the SCI and MEDLINE databases on antibiotic resistance in pneumonia. The degree of similarity in indexing terms was found to vary on a scale from completely different to completely identical with various levels in between. The within-document similarity in the 2 databases was measured by a variation on the Jaccard coefficient - the Inclusion Index. The average inclusion coefficient was 0.4134 for SCI and 0.3371 for MEDLINE. The 20 terms occurring most frequently in each database were identified. The 2 groups of terms shared the same terms that consist of the 'intellectual base' for the subject. Conceptual similarity was analyzed through scatterplots of matching and nonmatching terms vs. partially identical and broader/narrower terms. The study also found that both databases differed in assigning terms in various semantic categories. Implications of this research and further studies are suggested
    Source
    Journal of the American Society for Information Science. 51(2000) no.2, S.166-180
  15. Lu, K.; Mao, J.; Li, G.: Toward effective automated weighted subject indexing : a comparison of different approaches in different environments (2018) 0.00
    5.7375047E-4 = product of:
      0.008606257 = sum of:
        0.006614278 = weight(_text_:in in 4292) [ClassicSimilarity], result of:
          0.006614278 = score(doc=4292,freq=18.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.22543246 = fieldWeight in 4292, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4292)
        0.001991979 = weight(_text_:s in 4292) [ClassicSimilarity], result of:
          0.001991979 = score(doc=4292,freq=4.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.08494043 = fieldWeight in 4292, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4292)
      0.06666667 = coord(2/30)
    
    Abstract
    Subject indexing plays an important role in supporting subject access to information resources. Current subject indexing systems do not make adequate distinctions on the importance of assigned subject descriptors. Assigning numeric weights to subject descriptors to distinguish their importance to the documents can strengthen the role of subject metadata. Automated methods are more cost-effective. This study compares different automated weighting methods in different environments. Two evaluation methods were used to assess the performance. Experiments on three datasets in the biomedical domain suggest the performance of different weighting methods depends on whether it is an abstract or full text environment. Mutual information with bag-of-words representation shows the best average performance in the full text environment, while cosine with bag-of-words representation is the best in an abstract environment. The cosine measure has relatively consistent and robust performance. A direct weighting method, IDF (Inverse Document Frequency), can produce quick and reasonable estimates of the weights. Bag-of-words representation generally outperforms the concept-based representation. Further improvement in performance can be obtained by using the learning-to-rank method to integrate different weighting methods. This study follows up Lu and Mao (Journal of the Association for Information Science and Technology, 66, 1776-1784, 2015), in which an automated weighted subject indexing method was proposed and validated. The findings from this study contribute to more effective weighted subject indexing.
    Footnote
    Cf. the erratum in JASIST 69(2018) no.7, S.956.
    Source
    Journal of the Association for Information Science and Technology. 69(2018) no.1, S.121-133
  16. Rowley, J.: ¬The controlled versus natural indexing languages debate revisited : a perspective on information retrieval practice and research (1994) 0.00
    5.58707E-4 = product of:
      0.008380604 = sum of:
        0.006972062 = weight(_text_:in in 7151) [ClassicSimilarity], result of:
          0.006972062 = score(doc=7151,freq=20.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.2376267 = fieldWeight in 7151, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
        0.0014085418 = weight(_text_:s in 7151) [ClassicSimilarity], result of:
          0.0014085418 = score(doc=7151,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.060061958 = fieldWeight in 7151, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
      0.06666667 = coord(2/30)
    
    Abstract
    This article revisits the debate concerning controlled and natural indexing languages, as used in searching the databases of the online hosts, in-house information retrieval systems, online public access catalogues and databases stored on CD-ROM. The debate was first formulated in the early days of information retrieval more than a century ago but, despite significant advances in technology, remains unresolved. The article divides the history of the debate into four eras. Era one was characterised by the introduction of controlled vocabulary. Era two focused on comparisons between different indexing languages in order to assess which was best. Era three saw a number of case studies of limited generalisability and a general recognition that the best search performance can be achieved by the parallel use of the two types of indexing languages. The emphasis in Era four has been on the development of end-user-based systems, including online public access catalogues and databases on CD-ROM. Recent developments in the use of expert systems techniques to support the representation of meaning may lead to systems which offer significant support to the user in end-user searching. In the meantime, however, information retrieval in practice involves a mixture of natural and controlled indexing languages used to search a wide variety of different kinds of databases
    Source
    Journal of information science. 20(1994) no.2, S.108-119
  17. Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.00
    5.58707E-4 = product of:
      0.008380604 = sum of:
        0.006972062 = weight(_text_:in in 4214) [ClassicSimilarity], result of:
          0.006972062 = score(doc=4214,freq=20.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.2376267 = fieldWeight in 4214, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4214)
        0.0014085418 = weight(_text_:s in 4214) [ClassicSimilarity], result of:
          0.0014085418 = score(doc=4214,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.060061958 = fieldWeight in 4214, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4214)
      0.06666667 = coord(2/30)
    
    Abstract
    An important stage in the process of retrieval of objects from a hypertext database is the creation of a set of internodal links that are intended to represent the relationships existing between objects; this operation is often undertaken manually, just as index terms are often manually assigned to documents in a conventional retrieval system. In an earlier article (1994), the results were published of a study in which several different sets of links were inserted, each by a different person, between the paragraphs of each of a number of full-text documents. These results showed little similarity between the link-sets, a finding that was comparable with those of studies of inter-indexer consistency, which suggest that there is generally only a low level of agreement between the sets of index terms assigned to a document by different indexers. In this article, a description is provided of an investigation into the nature of the relationship existing between (i) the levels of inter-linker consistency obtaining among the group of hypertext databases used in our earlier experiments, and (ii) the levels of effectiveness of a number of searches carried out in those databases. An account is given of the implementation of the searches and of the methods used in the calculation of numerical values expressing their effectiveness. Analysis of the results of a comparison between recorded levels of consistency and those of effectiveness does not allow us to draw conclusions about the consistency - effectiveness relationship that are equivalent to those drawn in comparable studies of inter-indexer consistency
    Source
    Journal of the American Society for Information Science. 47(1996) no.4, S.287-300
  18. Ansari, M.: Matching between assigned descriptors and title keywords in medical theses (2005) 0.00
    5.58707E-4 = product of:
      0.008380604 = sum of:
        0.006972062 = weight(_text_:in in 4739) [ClassicSimilarity], result of:
          0.006972062 = score(doc=4739,freq=20.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.2376267 = fieldWeight in 4739, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4739)
        0.0014085418 = weight(_text_:s in 4739) [ClassicSimilarity], result of:
          0.0014085418 = score(doc=4739,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.060061958 = fieldWeight in 4739, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4739)
      0.06666667 = coord(2/30)
    
    Abstract
    Purpose - To examine the degree of exact and partial match between the assigned descriptors and title keywords of medical theses written in Farsi and submitted for a PhD degree. Design/methodology/approach - A sample population of 506 theses in Pediatrics, Gynecology, Cardiology and Psychiatry was randomly picked out of a total of 909 indexed in the Indexing Department of the Central Library of the Iran University of Medical Science and Health Care Services. The results obtained are compared with those reported for other documents written in Farsi and English. Where applicable, the influence of the foreign language and its structure is commented on. Findings - It is shown that the degree of match between the assigned descriptors and the title keywords is greater than 70 per cent, equaling those reported for Farsi books and Michigan University Library catalogue in USA. It is also shown that the frequency of the match has increased since 1982, indicating that the authors have become more attentive in their choice of title. Research limitations/implications - Detailed analysis of results, however, shows significant differences between the degree of exact match amongst the four categories, with psychiatry theses that use more common terms showing highest exact match findings (50 per cent). Originality/value - This paper highlights the need for a closer collaboration with medical institutions for definition of approved terms and their incorporation in indexation in order to improve findings in various medical categories.
    Source
    Library review. 54(2005) no.7, S.410-414
  19. Reich, P.; Biever, E.J.: Indexing consistency : The input/output function of thesauri (1991) 0.00
    5.575784E-4 = product of:
      0.008363675 = sum of:
        0.006110009 = weight(_text_:in in 2258) [ClassicSimilarity], result of:
          0.006110009 = score(doc=2258,freq=6.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.2082456 = fieldWeight in 2258, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=2258)
        0.002253667 = weight(_text_:s in 2258) [ClassicSimilarity], result of:
          0.002253667 = score(doc=2258,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.09609913 = fieldWeight in 2258, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=2258)
      0.06666667 = coord(2/30)
    
    Abstract
    This study measures inter-indexer consistency as determined by the number of identical terms assigned to the same document by two different indexing organizations using the same thesaurus as a source for the entry vocabulary. The authors derive consistency figures of 24 percent and 45 percent for two samples. Factors in the consistency failures include variations in indexing depth, differences in choice of concepts for indexing, different indexing policies, and a highly specific indexing vocabulary. Results indicate that broad search strategies are often necessary for adequate search yields.
    Source
    College and research libraries. 52(1991) no.4, S.336-342
  20. Haanen, E.: Specificiteit en consistentie : een kwantitatief onderzoek naar trefwoordtoekenning door UBA en UBN (1991) 0.00
    5.575784E-4 = product of:
      0.008363675 = sum of:
        0.006110009 = weight(_text_:in in 4778) [ClassicSimilarity], result of:
          0.006110009 = score(doc=4778,freq=6.0), product of:
            0.029340398 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.021569785 = queryNorm
            0.2082456 = fieldWeight in 4778, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.0625 = fieldNorm(doc=4778)
        0.002253667 = weight(_text_:s in 4778) [ClassicSimilarity], result of:
          0.002253667 = score(doc=4778,freq=2.0), product of:
            0.023451481 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.021569785 = queryNorm
            0.09609913 = fieldWeight in 4778, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0625 = fieldNorm(doc=4778)
      0.06666667 = coord(2/30)
    
    Abstract
    Online public access catalogues enable users to undertake subject searching by classification schedules, natural language, or controlled language terminology. In practice the first method is little used. Controlled language systems require indexers to index specifically and consistently. A comparative survey was made of indexing practices at Amsterdam and Nijmegen university libraries. On average Amsterdam assigned each document 3.5 index terms against 1.8 at Nijmegen. This discrepancy in indexing policy is the result of long-standing practices in each institution. Nijmegen has failed to utilise the advantages offered by online catalogues
    Source
    Open. 23(1991) no.2, S.45-49
