Search (74 results, page 1 of 4)

Veenema, F.: To index or not to index (1996) 0.07

0.06896008 = product of:
  0.10344011 = sum of:
    0.014795236 = weight(_text_:in in 7247) [ClassicSimilarity], result of:
      0.014795236 = score(doc=7247,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.2082456 = fieldWeight in 7247, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=7247)
    0.08864488 = sum of:
      0.032032568 = weight(_text_:science in 7247) [ClassicSimilarity], result of:
        0.032032568 = score(doc=7247,freq=2.0), product of:
          0.1375819 = queryWeight, product of:
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.052230705 = queryNorm
          0.23282544 = fieldWeight in 7247, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.0625 = fieldNorm(doc=7247)
      0.056612313 = weight(_text_:22 in 7247) [ClassicSimilarity], result of:
        0.056612313 = score(doc=7247,freq=2.0), product of:
          0.18290302 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052230705 = queryNorm
          0.30952093 = fieldWeight in 7247, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0625 = fieldNorm(doc=7247)
  0.6666667 = coord(2/3)

Abstract: Describes an experiment comparing the performance of automatic full-text indexing software for personal computers with the human intellectual assignment of indexing terms to each document in a collection. Considers the time required to index the documents and to retrieve documents satisfying 5 typical foreseen information needs, as well as the recall and precision ratios of searching. The software used is the QuickFinder facility in WordPerfect 6.1 for Windows.
Source: Canadian journal of information and library science. 21(1996) no.2, S.1-22
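
Note on the score breakdown: the explain tree for this record follows the TF-IDF arithmetic of Lucene's ClassicSimilarity, so its figures can be re-derived from the printed inputs. The sketch below is a minimal reproduction for doc 7247, not Lucene code. The function names are illustrative assumptions, and queryNorm and fieldNorm are copied from the output rather than computed. Small differences in the last digits are expected because Lucene works in 32-bit floats.

```python
import math

MAX_DOCS = 44218          # maxDocs, copied from the explain output
QUERY_NORM = 0.052230705  # queryNorm, copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """score = queryWeight * fieldWeight for a single term in a single document."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# (termFreq, docFreq) for each matching term in doc 7247
terms = {"in": (6.0, 30841), "science": (2.0, 8627), "22": (2.0, 3622)}
field_norm = 0.0625

parts = {t: term_score(f, df, field_norm) for t, (f, df) in terms.items()}
total = sum(parts.values()) * (2 / 3)  # coord(2/3): 2 of 3 top-level clauses matched

for term, value in parts.items():
    print(f"{term:8s} {value:.9f}")
print(f"total    {total:.8f}")
```

The three per-term values should agree with the weight(...) lines above, and the total with the record's top-level score, up to float32 rounding.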

Taniguchi, S.: Recording evidence in bibliographic records and descriptive metadata (2005) 0.05

0.054784253 = product of:
  0.08217638 = sum of:
    0.015692718 = weight(_text_:in in 3565) [ClassicSimilarity], result of:
      0.015692718 = score(doc=3565,freq=12.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.22087781 = fieldWeight in 3565, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3565)
    0.06648366 = sum of:
      0.024024425 = weight(_text_:science in 3565) [ClassicSimilarity], result of:
        0.024024425 = score(doc=3565,freq=2.0), product of:
          0.1375819 = queryWeight, product of:
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.052230705 = queryNorm
          0.17461908 = fieldWeight in 3565, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.046875 = fieldNorm(doc=3565)
      0.042459235 = weight(_text_:22 in 3565) [ClassicSimilarity], result of:
        0.042459235 = score(doc=3565,freq=2.0), product of:
          0.18290302 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052230705 = queryNorm
          0.23214069 = fieldWeight in 3565, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=3565)
  0.6666667 = coord(2/3)

Abstract: In this article recording evidence for data values in addition to the values themselves in bibliographic records and descriptive metadata is proposed, with the aim of improving the expressiveness and reliability of those records and metadata. Recorded evidence indicates why and how data values are recorded for elements. Recording the history of changes in data values is also proposed, with the aim of reinforcing recorded evidence. First, evidence that can be recorded is categorized into classes: identifiers of rules or tasks, action descriptions of them, and input and output data of them. Dates of recording values and evidence are an additional class. Then, the relative usefulness of evidence classes and also levels (i.e., the record, data element, or data value level) to which an individual evidence class is applied, is examined. Second, examples that can be viewed as recorded evidence in existing bibliographic records and current cataloging rules are shown. Third, some examples of bibliographic records and descriptive metadata with notes of evidence are demonstrated. Fourth, ways of using recorded evidence are addressed.
Date: 18. 6.2005 13:16:22
Source: Journal of the American Society for Information Science and Technology. 56(2005) no.8, S.872-882

Leininger, K.: Interindexer consistency in PsycINFO (2000) 0.05

0.05172006 = product of:
  0.07758009 = sum of:
    0.011096427 = weight(_text_:in in 2552) [ClassicSimilarity], result of:
      0.011096427 = score(doc=2552,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1561842 = fieldWeight in 2552, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2552)
    0.06648366 = sum of:
      0.024024425 = weight(_text_:science in 2552) [ClassicSimilarity], result of:
        0.024024425 = score(doc=2552,freq=2.0), product of:
          0.1375819 = queryWeight, product of:
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.052230705 = queryNorm
          0.17461908 = fieldWeight in 2552, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.6341193 = idf(docFreq=8627, maxDocs=44218)
            0.046875 = fieldNorm(doc=2552)
      0.042459235 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
        0.042459235 = score(doc=2552,freq=2.0), product of:
          0.18290302 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052230705 = queryNorm
          0.23214069 = fieldWeight in 2552, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2552)
  0.6666667 = coord(2/3)

Abstract: Reports results of a study to examine interindexer consistency (the degree to which indexers, when assigning terms to a chosen record, will choose the same terms to reflect that record) in the PsycINFO database using 60 records that were inadvertently processed twice between 1996 and 1998. Five aspects of interindexer consistency were analysed. Two methods were used to calculate interindexer consistency: one posited by Hooper (1965) and the other by Rollin (1981). Aspects analysed were: checktag consistency (66.24% using Hooper's calculation and 77.17% using Rollin's); major-to-all term consistency (49.31% and 62.59% respectively); overall indexing consistency (49.02% and 63.32%); classification code consistency (44.17% and 45.00%); and major-to-major term consistency (43.24% and 56.09%). The average consistency across all categories was 50.4% using Hooper's method and 60.83% using Rollin's. Although comparison with previous studies is difficult due to methodological variations in the overall study of indexing consistency and the specific characteristics of the database, results generally support previous findings when trends and similar studies are analysed.
Date: 9. 2.1997 18:44:22
Source: Journal of librarianship and information science. 32(2000) no.1, S.4-8
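
Note on the two consistency measures: the abstract above names the measures but does not state their formulas. The sketch below uses the forms in which they are commonly cited, which is an assumption about what the study computed: Hooper's measure as shared terms divided by all distinct terms used by either indexer, and the second measure (attributed to Rollin in the abstract) as twice the shared terms divided by the total number of terms assigned by both. The function names and the example term sets are hypothetical.

```python
def hooper_consistency(terms_a, terms_b):
    """Shared terms / distinct terms used by either indexer (commonly cited Hooper form)."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty term sets are scored as 0.0
    return len(a & b) / len(union)


def rollin_consistency(terms_a, terms_b):
    """2 * shared terms / (terms by indexer A + terms by indexer B)."""
    a, b = set(terms_a), set(terms_b)
    total = len(a) + len(b)
    if total == 0:
        return 0.0  # assumption: two empty term sets are scored as 0.0
    return 2 * len(a & b) / total


# Hypothetical term assignments for one record indexed twice
indexer_1 = {"memory", "aging", "cognition", "recall"}
indexer_2 = {"aging", "cognition", "recall", "dementia", "attention"}

print(f"Hooper: {hooper_consistency(indexer_1, indexer_2):.4f}")
print(f"Rollin: {rollin_consistency(indexer_1, indexer_2):.4f}")
```

Under these definitions the second measure can never be lower than the first for the same pair of term sets, which is consistent with the abstract reporting higher percentages for Rollin's calculation in every category.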

Broxis, P.F.: ASSIA social science information service (1989) 0.03

0.028942255 = product of:
  0.043413382 = sum of:
    0.015100324 = weight(_text_:in in 1511) [ClassicSimilarity], result of:
      0.015100324 = score(doc=1511,freq=4.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.21253976 = fieldWeight in 1511, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.078125 = fieldNorm(doc=1511)
    0.028313057 = product of:
      0.056626115 = sum of:
        0.056626115 = weight(_text_:science in 1511) [ClassicSimilarity], result of:
          0.056626115 = score(doc=1511,freq=4.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.41158113 = fieldWeight in 1511, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.078125 = fieldNorm(doc=1511)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: ASSIA (Applied Social Science Index and Abstracts) started in 1987 as a bimonthly indexing and abstracting service in the society field, aimed at practitioners as well as sociologists. Considers the following aspects of the service: arrangement of ASSIA; journal coverage; indexing approach; services for subscribers; and who are the users?
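
Note on the nested coord factor: in this record only one of the two terms in the inner clause matched ("science", not "22"), so the explain tree applies coord(1/2) to the inner sum before the outer coord(2/3). The sketch below re-derives the score for doc 1511 under that reading. It assumes the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the helper name is illustrative, and queryNorm and fieldNorm are copied from the output.

```python
import math

MAX_DOCS = 44218          # maxDocs, copied from the explain output
QUERY_NORM = 0.052230705  # queryNorm, copied from the explain output
FIELD_NORM = 0.078125     # fieldNorm(doc=1511), copied from the explain output


def term_score(freq, doc_freq, field_norm):
    """queryWeight * fieldWeight for one term, ClassicSimilarity style."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)


in_score = term_score(4.0, 30841, FIELD_NORM)
science_score = term_score(4.0, 8627, FIELD_NORM)

# Inner clause: 1 of its 2 terms matched, so its sum is scaled by coord(1/2).
inner_clause = science_score * (1 / 2)

# Outer query: 2 of 3 top-level clauses matched, so the sum is scaled by coord(2/3).
doc_1511_score = (in_score + inner_clause) * (2 / 3)

print(f"in           {in_score:.9f}")
print(f"science      {science_score:.9f}")
print(f"inner clause {inner_clause:.9f}")
print(f"total        {doc_1511_score:.9f}")
```

This is why a record with a higher raw "science" weight than the first three results still ranks below them: the unmatched term halves the inner clause's contribution.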

Morris, L.R.: The frequency of use of Library of Congress Classification numbers and Dewey Decimal Classification numbers in the MARC file in the field of library science (1991) 0.03

0.028387709 = product of:
  0.042581562 = sum of:
    0.01830817 = weight(_text_:in in 2308) [ClassicSimilarity], result of:
      0.01830817 = score(doc=2308,freq=12.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.2576908 = fieldWeight in 2308, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2308)
    0.024273392 = product of:
      0.048546784 = sum of:
        0.048546784 = weight(_text_:science in 2308) [ClassicSimilarity], result of:
          0.048546784 = score(doc=2308,freq=6.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.35285735 = fieldWeight in 2308, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2308)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: The LCC and DDC systems were devised and updated by librarians who had and have no access to the eventual frequency of use of each number in those classification systems. 80% of the monographs in a MARC file of over 1,000,000 records are classified into 20% of the classification numbers in the field of library science and only 20% of the monographs are classified into 80% of the classification numbers in the field of library science. Classification of monographs could be made easier and performed more accurately if many of the little used and unused numbers were eliminated and many of the most crowded numbers were expanded. A number of examples are included.

Neshat, N.; Horri, A.: A study of subject indexing consistency between the National Library of Iran and Humanities Libraries in the area of Iranian studies (2006) 0.03

0.027653923 = product of:
  0.041480884 = sum of:
    0.016712997 = weight(_text_:in in 230) [ClassicSimilarity], result of:
      0.016712997 = score(doc=230,freq=10.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.23523843 = fieldWeight in 230, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=230)
    0.024767887 = product of:
      0.049535774 = sum of:
        0.049535774 = weight(_text_:22 in 230) [ClassicSimilarity], result of:
          0.049535774 = score(doc=230,freq=2.0), product of:
            0.18290302 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052230705 = queryNorm
            0.2708308 = fieldWeight in 230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=230)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: This study represents an attempt to compare indexing consistency between the catalogers of the National Library of Iran (NLI) on one side and 12 major academic and special libraries located in Tehran on the other. The research findings indicate that in 75% of the libraries the subject inconsistency values are 60% to 85%. In terms of subject classes, the consistency values are 10% to 35.2%, the mean of which is 22.5%. Moreover, the findings show that whenever the number of assigned terms increases, the probability of consistency decreases. This confirms Markey's findings in 1984.
Date: 4. 1.2007 10:22:26

Qin, J.: Semantic similarities between a keyword database and a controlled vocabulary database : an investigation in the antibiotic resistance literature (2000) 0.02
```
0.023363223 = product of:
  0.035044834 = sum of:
    0.0177067 = weight(_text_:in in 4386) [ClassicSimilarity], result of:
      0.0177067 = score(doc=4386,freq=22.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.24922498 = fieldWeight in 4386, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4386)
    0.017338136 = product of:
      0.034676272 = sum of:
        0.034676272 = weight(_text_:science in 4386) [ClassicSimilarity], result of:
          0.034676272 = score(doc=4386,freq=6.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.25204095 = fieldWeight in 4386, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4386)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract: The 'KeyWords Plus' in the Science Citation Index database represents an approach to combining citation and semantic indexing in describing the document content. This paper explores the similarities or dissimilarities between citation-semantic and analytic indexing. The dataset consisted of over 400 matching records in the SCI and MEDLINE databases on antibiotic resistance in pneumonia. The degree of similarity in indexing terms was found to vary on a scale from completely different to completely identical with various levels in between. The within-document similarity in the 2 databases was measured by a variation on the Jaccard coefficient, the Inclusion Index. The average inclusion coefficient was 0.4134 for SCI and 0.3371 for MEDLINE. The 20 terms occurring most frequently in each database were identified. The 2 groups of terms shared the same terms that constitute the 'intellectual base' for the subject. Conceptual similarity was analyzed through scatterplots of matching and nonmatching terms vs. partially identical and broader/narrower terms. The study also found that both databases differed in assigning terms in various semantic categories. Implications of this research and further studies are suggested.
Object: Science Citation Index
Source: Journal of the American Society for Information Science. 51(2000) no.2, S.166-180

Gil-Leiva, I.; Alonso-Arroyo, A.: Keywords given by authors of scientific articles in database descriptors (2007) 0.02
```
0.023363223 = product of:
  0.035044834 = sum of:
    0.0177067 = weight(_text_:in in 211) [ClassicSimilarity], result of:
      0.0177067 = score(doc=211,freq=22.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.24922498 = fieldWeight in 211, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=211)
    0.017338136 = product of:
      0.034676272 = sum of:
        0.034676272 = weight(_text_:science in 211) [ClassicSimilarity], result of:
          0.034676272 = score(doc=211,freq=6.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.25204095 = fieldWeight in 211, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=211)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract: In this article, the authors analyze the keywords given by authors of scientific articles and the descriptors assigned to the articles to ascertain the presence of the keywords in the descriptors. Six-hundred forty INSPEC (Information Service for Physics, Engineering, and Computing), CAB (Current Agriculture Bibliography) abstracts, ISTA (Information Science and Technology Abstracts), and LISA (Library and Information Science Abstracts) database records were consulted. After detailed comparisons, it was found that keywords provided by authors have an important presence in the database descriptors studied; nearly 25% of all the keywords appeared in exactly the same form as descriptors, with another 21%, though normalized, still detected in the descriptors. This means that almost 46% of keywords appear in the descriptors, either as such or after normalization. Elsewhere, three distinct indexing policies appear, one represented by INSPEC and LISA (indexers seem to have freedom to assign the descriptors they deem necessary); another is represented by CAB (no record has fewer than four descriptors and, in general, a large number of descriptors is employed). In contrast, in ISTA, a certain institutional code exists towards economy in indexing because 84% of records contain only four descriptors.
Source: Journal of the American Society for Information Science and Technology. 58(2007) no.8, S.1175-1187

Braam, R.R.; Bruil, J.: Quality of indexing information : authors' views on indexing of their articles in chemical abstracts online CA-file (1992) 0.02
```
0.022173502 = product of:
  0.033260252 = sum of:
    0.02124804 = weight(_text_:in in 2638) [ClassicSimilarity], result of:
      0.02124804 = score(doc=2638,freq=22.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.29906997 = fieldWeight in 2638, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2638)
    0.012012213 = product of:
      0.024024425 = sum of:
        0.024024425 = weight(_text_:science in 2638) [ClassicSimilarity], result of:
          0.024024425 = score(doc=2638,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.17461908 = fieldWeight in 2638, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=2638)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract: Studies the quality of subject indexing by Chemical Abstracts Indexing Service by confronting authors with the particular indexing terms attributed to their computer, for 270 articles published in 54 journals, 5 articles out of each journal. Responses (80%) indicate the superior quality of keywords, both as content descriptors and as retrieval tools. Author judgements on these 2 different aspects do not always converge, however. CAS's indexing policy to cover only 'new' aspects is reflected in authors' judgements that index lists are somewhat incomplete, in particular in the case of thesaurus terms (index headings). The large effort expended by CAS in maintaining and using a subject thesaurus, in order to select valid index headings, as compared to quick and cheap keyword postings, does not lead to clearly superior quality of thesaurus terms for document description nor in retrieval. Some 20% of papers were not placed in the 'proper' CA main section, according to authors. As concerns the use of indexing data by third parties, in bibliometrics, users should be aware of the indexing policies behind the data, in order to prevent invalid interpretations.
Source: Journal of information science. 18(1992) no.5, S.399-408

Evedove, P.R. Dal; Evedove Tartarotti, R.C. Dal; Lopes Fujita, M.S.: Verbal protocols in Brazilian information science : a perspective from indexing studies (2018) 0.02

0.022066902 = product of:
  0.03310035 = sum of:
    0.017084066 = weight(_text_:in in 4783) [ClassicSimilarity], result of:
      0.017084066 = score(doc=4783,freq=8.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.24046129 = fieldWeight in 4783, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=4783)
    0.016016284 = product of:
      0.032032568 = sum of:
        0.032032568 = weight(_text_:science in 4783) [ClassicSimilarity], result of:
          0.032032568 = score(doc=4783,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.23282544 = fieldWeight in 4783, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0625 = fieldNorm(doc=4783)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Series: Advances in knowledge organization; vol.16
Source: Challenges and opportunities for knowledge organization in the digital age: proceedings of the Fifteenth International ISKO Conference, 9-11 July 2018, Porto, Portugal / organized by: International Society for Knowledge Organization (ISKO), ISKO Spain and Portugal Chapter, University of Porto - Faculty of Arts and Humanities, Research Centre in Communication, Information and Digital Culture (CIC.digital) - Porto. Eds.: F. Ribeiro and M.E. Cerveira

Boyce, B.R.; McLain, J.P.: Entry point depth and online search using a controlled vocabulary (1989) 0.02

0.021548279 = product of:
  0.032322418 = sum of:
    0.01830817 = weight(_text_:in in 2287) [ClassicSimilarity], result of:
      0.01830817 = score(doc=2287,freq=12.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.2576908 = fieldWeight in 2287, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2287)
    0.014014249 = product of:
      0.028028497 = sum of:
        0.028028497 = weight(_text_:science in 2287) [ClassicSimilarity], result of:
          0.028028497 = score(doc=2287,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.20372227 = fieldWeight in 2287, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2287)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: The depth of indexing, the number of terms assigned on average to each document in a retrieval system as entry points, has a significant effect on the standard retrieval performance measures in modern commercial retrieval systems, just as it did in previous experimental work. Tests on the effect of basic index search, as opposed to controlled vocabulary search, in these real systems are quite different from traditional comparisons of free text searching with controlled vocabulary searching. In modern commercial systems the controlled vocabulary serves as a precision device, since the structure of the default for unqualified search terms in these systems requires that it do so.
Source: Journal of the American Society for Information Science. 40(1989), S.273-276

Booth, A.: How consistent is MEDLINE indexing? (1990) 0.02

0.021494776 = product of:
  0.032242164 = sum of:
    0.0074742786 = weight(_text_:in in 3510) [ClassicSimilarity], result of:
      0.0074742786 = score(doc=3510,freq=2.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.10520181 = fieldWeight in 3510, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3510)
    0.024767887 = product of:
      0.049535774 = sum of:
        0.049535774 = weight(_text_:22 in 3510) [ClassicSimilarity], result of:
          0.049535774 = score(doc=3510,freq=2.0), product of:
            0.18290302 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052230705 = queryNorm
            0.2708308 = fieldWeight in 3510, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3510)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: A known-item search for abstracts to previously retrieved references revealed that 2 documents from the same annual volume had been indexed twice. Working from the premise that the whole volume may have been double-indexed, a search strategy was devised that limited the journal code to the year in question. 57 references were retrieved, comprising 28 pairs of duplicates plus a citation for the whole volume. Author, title, source and descriptors were requested off-line and the citations were paired with their duplicates. The 4 categories of descriptors (major descriptors, minor descriptors, subheadings and check-tags) were compared for depth and consistency of indexing and lessons that might be learnt from the study are discussed.
Source: Health libraries review. 7(1990) no.1, S.22-26

Lu, K.; Mao, J.; Li, G.: Toward effective automated weighted subject indexing : a comparison of different approaches in different environments (2018) 0.02
```
0.020115227 = product of:
  0.03017284 = sum of:
    0.016016312 = weight(_text_:in in 4292) [ClassicSimilarity], result of:
      0.016016312 = score(doc=4292,freq=18.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.22543246 = fieldWeight in 4292, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4292)
    0.014156529 = product of:
      0.028313057 = sum of:
        0.028313057 = weight(_text_:science in 4292) [ClassicSimilarity], result of:
          0.028313057 = score(doc=4292,freq=4.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.20579056 = fieldWeight in 4292, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4292)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract: Subject indexing plays an important role in supporting subject access to information resources. Current subject indexing systems do not make adequate distinctions on the importance of assigned subject descriptors. Assigning numeric weights to subject descriptors to distinguish their importance to the documents can strengthen the role of subject metadata. Automated methods are more cost-effective. This study compares different automated weighting methods in different environments. Two evaluation methods were used to assess the performance. Experiments on three datasets in the biomedical domain suggest the performance of different weighting methods depends on whether it is an abstract or full text environment. Mutual information with bag-of-words representation shows the best average performance in the full text environment, while cosine with bag-of-words representation is the best in an abstract environment. The cosine measure has relatively consistent and robust performance. A direct weighting method, IDF (Inverse Document Frequency), can produce quick and reasonable estimates of the weights. Bag-of-words representation generally outperforms the concept-based representation. Further improvement in performance can be obtained by using the learning-to-rank method to integrate different weighting methods. This study follows up Lu and Mao (Journal of the Association for Information Science and Technology, 66, 1776-1784, 2015), in which an automated weighted subject indexing method was proposed and validated. The findings from this study contribute to more effective weighted subject indexing.
Footnote: Cf. the erratum in JASIST 69(2018) no.7, S.956.
Source: Journal of the Association for Information Science and Technology. 69(2018) no.1, S.121-133

Bodoff, D.; Richter-Levin, Y.: Viewpoints in indexing term assignment (2020) 0.02

0.020088403 = product of:
  0.030132603 = sum of:
    0.01812039 = weight(_text_:in in 5765) [ClassicSimilarity], result of:
      0.01812039 = score(doc=5765,freq=16.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.25504774 = fieldWeight in 5765, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=5765)
    0.012012213 = product of:
      0.024024425 = sum of:
        0.024024425 = weight(_text_:science in 5765) [ClassicSimilarity], result of:
          0.024024425 = score(doc=5765,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.17461908 = fieldWeight in 5765, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.046875 = fieldNorm(doc=5765)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: The literature on assigned indexing considers three possible viewpoints (the author's viewpoint as evidenced in the title, the users' viewpoint, and the indexer's viewpoint) and asks whether and which of those views should be reflected in an indexer's choice of terms to assign to an item. We study this question empirically, as opposed to normatively. Based on the literature that discusses whose viewpoints should be reflected, we construct a research model that includes those same three viewpoints as factors that might be influencing term assignment in actual practice. In the unique study design that we employ, the records of term assignments made by identified indexers in academic libraries are cross-referenced with the results of a survey that those same indexers completed on political views. Our results indicate that in our setting, variance in term assignment was best explained by indexers' personal political views.
Source: Journal of the Association for Information Science and Technology. 71(2020) no.4, S.450-461
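
The score explanation for this record (doc=5765) can be re-derived by hand. The sketch below assumes the usual Lucene ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the numbers in the listing. The function and constant names are hypothetical. queryNorm, fieldNorm and the coord factors are copied from the listing rather than recomputed, since they depend on the full query and on index-time field lengths.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as it appears in the listing: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single matching term."""
    tf = math.sqrt(freq)                    # tf(freq=16.0) -> 4.0
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm         # "queryWeight" line
    field_weight = tf * idf * field_norm    # "fieldWeight" line
    return query_weight * field_weight

# Values copied from the explanation for doc=5765.
MAX_DOCS = 44218
QUERY_NORM = 0.052230705
FIELD_NORM = 0.046875

score_in = term_score(16.0, 30841, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_science = term_score(2.0, 8627, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The nested clause matched 1 of 2 terms (coord 1/2);
# the outer query matched 2 of 3 clauses (coord 2/3).
total = (score_in + score_science * 0.5) * (2.0 / 3.0)
print(total)
```

Run as written, the result agrees with the listed 0.020088403 to about six decimal places. The small remainder comes from Lucene working in single precision.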

Peset, F.; Garzón-Farinós, F.; González, L.M.; García-Massó, X.; Ferrer-Sapena, A.; Toca-Herrera, J.L.; Sánchez-Pérez, E.A.: Survival analysis of author keywords : an application to the library and information sciences area (2020) 0.02
```
0.019002816 = product of:
  0.028504223 = sum of:
    0.018494045 = weight(_text_:in in 5774) [ClassicSimilarity], result of:
      0.018494045 = score(doc=5774,freq=24.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.260307 = fieldWeight in 5774, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5774)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 5774) [ClassicSimilarity], result of:
          0.020020355 = score(doc=5774,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 5774, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5774)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Our purpose is to adapt a statistical method for the analysis of discrete numerical series to the keywords appearing in scientific articles of a given area. As an example, we apply our methodological approach to the study of the keywords in the Library and Information Sciences (LIS) area. Our objective is to detect the new author keywords that appear in a fixed knowledge area in the period of 1 year in order to quantify the probabilities of survival for 10 years as a function of the impact of the journals where they appeared. Many of the new keywords appearing in the LIS field are ephemeral. Actually, more than half are never used again. In general, the terms most commonly used in the LIS area come from other areas. The average survival time of these keywords is approximately 3 years, being slightly higher in the case of words that were published in journals classified in the second quartile of the area. We believe that measuring the appearance and disappearance of terms will allow understanding some relevant aspects of the evolution of a discipline, providing in this way a new bibliometric approach.

Source

Journal of the Association for Information Science and Technology. 71(2020) no.4, S.462-473
Subrahmanyam, B.: Library of Congress Classification numbers : issues of consistency and their implications for union catalogs (2006) 0.02
```
0.018912595 = product of:
  0.02836889 = sum of:
    0.010677542 = weight(_text_:in in 5784) [ClassicSimilarity], result of:
      0.010677542 = score(doc=5784,freq=8.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.15028831 = fieldWeight in 5784, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5784)
    0.017691348 = product of:
      0.035382695 = sum of:
        0.035382695 = weight(_text_:22 in 5784) [ClassicSimilarity], result of:
          0.035382695 = score(doc=5784,freq=2.0), product of:
            0.18290302 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052230705 = queryNorm
            0.19345059 = fieldWeight in 5784, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5784)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

This study examined Library of Congress Classification (LCC)-based class numbers assigned to a representative sample of 200 titles in 52 American library systems to determine the level of consistency within and across those systems. The results showed that under the condition that a library system has a title, the probability of that title having the same LCC-based class number across library systems is greater than 85 percent. An examination of 121 titles displaying variations in class numbers among library systems showed certain titles (for example, multi-foci titles, titles in series, bibliographies, and fiction) lend themselves to alternate class numbers. Others were assigned variant numbers either due to latitude in the schedules or for reasons that cannot be pinpointed. With increasing dependence on copy cataloging, the size of such variations may continue to decrease. As the preferred class number with its alternates represents a title more fully than just the preferred class number, this paper argues for continued use of alternates by library systems and for finding a method to link alternate class numbers to preferred class numbers for enriched subject access through local and union catalogs.

Date

10. 9.2000 17:38:22

Tonta, Y.: A study of indexing consistency between Library of Congress and British Library catalogers (1991) 0.02

0.01873103 = product of:
  0.028096544 = sum of:
    0.01208026 = weight(_text_:in in 2277) [ClassicSimilarity], result of:
      0.01208026 = score(doc=2277,freq=4.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.17003182 = fieldWeight in 2277, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=2277)
    0.016016284 = product of:
      0.032032568 = sum of:
        0.032032568 = weight(_text_:science in 2277) [ClassicSimilarity], result of:
          0.032032568 = score(doc=2277,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.23282544 = fieldWeight in 2277, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0625 = fieldNorm(doc=2277)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: Indexing consistency between Library of Congress and British Library catalogers using the LCSH is compared. 82 titles published in 1987 in the field of library and information science were identified for comparison, and for each title its LC subject headings, assigned by both LC and BL catalogers, were compared. By applying Hooper's 'consistency of a pair' equation, the average indexing consistency value was calculated for the 82 titles. The average indexing consistency value between LC and BL catalogers is 16% for exact matches, and 36% for partial matches.
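
Hooper's 'consistency of a pair' measure is commonly stated as A / (A + M + N), where A is the number of terms both indexers assigned and M and N are the numbers of terms assigned by only one of them. A minimal sketch for exact matches follows. The function name and the sample headings are hypothetical and not taken from the study, and returning 0.0 for two empty sets is an arbitrary choice made here.

```python
def hooper_consistency(terms_a, terms_b) -> float:
    """Hooper's consistency of a pair: A / (A + M + N), exact matches only."""
    a, b = set(terms_a), set(terms_b)
    agreed = len(a & b)      # A: terms assigned by both indexers
    only_a = len(a - b)      # M: terms assigned only by the first indexer
    only_b = len(b - a)      # N: terms assigned only by the second indexer
    total = agreed + only_a + only_b
    if total == 0:
        return 0.0           # assumption: no terms at all counts as no agreement
    return agreed / total

# Hypothetical headings for one title, for illustration only.
lc_headings = {"Indexing", "Subject headings", "Cataloging"}
bl_headings = {"Indexing", "Subject cataloging"}

print(hooper_consistency(lc_headings, bl_headings))  # 1 shared of 4 distinct -> 0.25
```

Averaging this value over all titles in a sample gives a figure comparable in kind to the exact-match percentage reported in the abstract. Partial matching would need an extra rule for what counts as a partial match, which the abstract does not specify.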

David, C.; Giroux, L.; Bertrand-Gastaldy, S.; Lanteigne, D.: Indexing as problem solving : a cognitive approach to consistency (1995) 0.02

0.01873103 = product of:
  0.028096544 = sum of:
    0.01208026 = weight(_text_:in in 3833) [ClassicSimilarity], result of:
      0.01208026 = score(doc=3833,freq=4.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.17003182 = fieldWeight in 3833, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=3833)
    0.016016284 = product of:
      0.032032568 = sum of:
        0.032032568 = weight(_text_:science in 3833) [ClassicSimilarity], result of:
          0.032032568 = score(doc=3833,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.23282544 = fieldWeight in 3833, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0625 = fieldNorm(doc=3833)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: Presents results of an experiment in which 8 indexers (4 beginners and 4 experts) were asked to index the same 4 documents with 2 different thesauri. The 3 kinds of verbal reports provide complementary data on strategic behaviour. It is of prime importance to consider the indexing task as an ill-defined problem, where the solution is partly defined by the indexer.
Source: Forging new partnerships in information: converging technologies. Proceedings of the 58th Annual Meeting of the American Society for Information Science, ASIS'95, Chicago, IL, 9-12 October 1995. Ed.: T. Kinney

White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014) 0.02
```
0.017958915 = product of:
  0.026938371 = sum of:
    0.009247023 = weight(_text_:in in 1781) [ClassicSimilarity], result of:
      0.009247023 = score(doc=1781,freq=6.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.1301535 = fieldWeight in 1781, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1781)
    0.017691348 = product of:
      0.035382695 = sum of:
        0.035382695 = weight(_text_:22 in 1781) [ClassicSimilarity], result of:
          0.035382695 = score(doc=1781,freq=2.0), product of:
            0.18290302 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052230705 = queryNorm
            0.19345059 = fieldWeight in 1781, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1781)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE. Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results. Findings - Results suggest that HIVE improves inter-indexer consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges. Research limitations/implications - Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system. Originality/value - This paper showcases several features used in the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next generation machine-aided indexing (MAI) tools.
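
The two measures named in this abstract differ only in how they normalise the number of shared terms. Rolling's measure is usually given as 2A / (B + C), where A is the number of shared terms and B and C are the sizes of the two indexers' term sets. Hooper's is A divided by the number of distinct terms used by either indexer. The sketch below is illustrative only: the function names and keyword sets are hypothetical, and empty inputs are treated as zero consistency by assumption.

```python
def rolling_consistency(terms_a, terms_b) -> float:
    """Rolling's measure: 2A / (B + C), with A shared terms, B and C set sizes."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0  # assumption: no terms at all counts as no agreement
    return 2 * len(a & b) / (len(a) + len(b))

def hooper_consistency(terms_a, terms_b) -> float:
    """Hooper's measure: shared terms over all distinct terms used by either indexer."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical keyword assignments by two indexers for the same abstract.
indexer_1 = {"metadata", "ontologies", "vocabulary control"}
indexer_2 = {"metadata", "ontologies", "semantic web", "linked data"}

r = rolling_consistency(indexer_1, indexer_2)  # 2*2 / (3 + 4)
h = hooper_consistency(indexer_1, indexer_2)   # 2 / 5
print(r, h)
```

For any pair of term sets the two values are linked by R = 2H / (1 + H), so Rolling's figure is never lower than Hooper's for the same pair. Percentage changes reported under the two measures are therefore not directly interchangeable.
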
Rowley, J.: The controlled versus natural indexing languages debate revisited : a perspective on information retrieval practice and research (1994) 0.02
```
0.01792857 = product of:
  0.026892854 = sum of:
    0.016882677 = weight(_text_:in in 7151) [ClassicSimilarity], result of:
      0.016882677 = score(doc=7151,freq=20.0), product of:
        0.07104705 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.052230705 = queryNorm
        0.2376267 = fieldWeight in 7151, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=7151)
    0.010010177 = product of:
      0.020020355 = sum of:
        0.020020355 = weight(_text_:science in 7151) [ClassicSimilarity], result of:
          0.020020355 = score(doc=7151,freq=2.0), product of:
            0.1375819 = queryWeight, product of:
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.052230705 = queryNorm
            0.1455159 = fieldWeight in 7151, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6341193 = idf(docFreq=8627, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

This article revisits the debate concerning controlled and natural indexing languages, as used in searching the databases of the online hosts, in-house information retrieval systems, online public access catalogues and databases stored on CD-ROM. The debate was first formulated in the early days of information retrieval more than a century ago but, despite significant advances in technology, remains unresolved. The article divides the history of the debate into four eras. Era one was characterised by the introduction of controlled vocabulary. Era two focused on comparisons between different indexing languages in order to assess which was best. Era three saw a number of case studies of limited generalisability and a general recognition that the best search performance can be achieved by the parallel use of the two types of indexing languages. The emphasis in Era four has been on the development of end-user-based systems, including online public access catalogues and databases on CD-ROM. Recent developments in the use of expert systems techniques to support the representation of meaning may lead to systems which offer significant support to the user in end-user searching. In the meantime, however, information retrieval in practice involves a mixture of natural and controlled indexing languages used to search a wide variety of different kinds of databases.

Source

Journal of information science. 20(1994) no.2, S.108-119
