Search (58 results, page 2 of 3)

Krovetz, R.; Croft, W.B.: Lexical ambiguity and information retrieval (1992) 0.00
```
4.604387E-4 = product of:
  0.00690658 = sum of:
    0.00690658 = product of:
      0.01381316 = sum of:
        0.01381316 = weight(_text_:information in 4028) [ClassicSimilarity], result of:
          0.01381316 = score(doc=4028,freq=8.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.27153665 = fieldWeight in 4028, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4028)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
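The score breakdown above can be reproduced by hand. The following is a minimal sketch, not the search engine's own code: it assumes the classic TF-IDF formulas that match the figures shown (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))), and the function name `classic_similarity_score` is hypothetical. Other Lucene versions may compute idf slightly differently.

```python
import math

def classic_similarity_score(freq, doc_freq, max_docs, field_norm,
                             query_norm, coord_factors=()):
    """Recompute a single-term score from the values listed in an explain tree.

    Assumed formulas (chosen because they reproduce the numbers shown):
      tf  = sqrt(freq)
      idf = 1 + ln(max_docs / (doc_freq + 1))
    """
    tf = math.sqrt(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = idf * query_norm          # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm     # fieldWeight = tf * idf * fieldNorm
    score = query_weight * field_weight
    for factor in coord_factors:             # coord(1/2), coord(1/15), ...
        score *= factor
    return score

# Values copied from the explain tree for doc 4028.
score = classic_similarity_score(
    freq=8.0,
    doc_freq=20772,
    max_docs=44218,
    field_norm=0.0546875,
    query_norm=0.028978055,
    coord_factors=(1 / 2, 1 / 15),
)
print(f"{score:.4e}")  # close to the 4.604387E-4 reported at the top of the tree
```

Because every record on this page shares the same idf and queryNorm, the ranking differences come only from the term frequency and the fieldNorm, which shrinks as the indexed field gets longer.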
Abstract

Reports on an analysis of lexical ambiguity in information retrieval text collections and on experiments to determine the utility of word meanings for separating relevant from nonrelevant documents. Results show that there is considerable ambiguity even in a specialised database. Word senses provide a significant separation between relevant and nonrelevant documents, but several factors determine whether disambiguation will improve performance. For example, resolving lexical ambiguity was found to have little impact on retrieval effectiveness for documents that have many words in common with the query. Discusses other uses of word sense disambiguation in an information retrieval context.

Source

ACM transactions on information systems. 10(1992) no.2, pp. 115-141

David, C.; Giroux, L.; Bertrand-Gastaldy, S.; Lanteigne, D.: Indexing as problem solving : a cognitive approach to consistency (1995) 0.00

4.5571616E-4 = product of:
  0.006835742 = sum of:
    0.006835742 = product of:
      0.013671484 = sum of:
        0.013671484 = weight(_text_:information in 3833) [ClassicSimilarity], result of:
          0.013671484 = score(doc=3833,freq=6.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.2687516 = fieldWeight in 3833, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3833)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Imprint: Medford, NJ : Learned Information
Source: Forging new partnerships in information: converging technologies. Proceedings of the 58th Annual Meeting of the American Society for Information Science, ASIS'95, Chicago, IL, 9-12 October 1995. Ed.: T. Kinney

Hersh, W.R.; Hickam, D.H.: A comparison of two methods for indexing and retrieval from a full-text medical database (1992) 0.00
```
3.987516E-4 = product of:
  0.005981274 = sum of:
    0.005981274 = product of:
      0.011962548 = sum of:
        0.011962548 = weight(_text_:information in 4526) [ClassicSimilarity], result of:
          0.011962548 = score(doc=4526,freq=6.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.23515764 = fieldWeight in 4526, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4526)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

Reports results of a study of 2 information retrieval systems on a 2,000-document full-text medical database. The first system, SAPHIRE, features concept-based automatic indexing and statistical retrieval techniques, while the second system, SWORD, features traditional word-based Boolean techniques. Sixteen medical students at Oregon Health Sciences Univ. each performed 10 searches, and their results, recorded in terms of recall and precision, showed nearly equal performance for both systems. SAPHIRE was also compared with a version of SWORD modified to use automatic indexing and ranked retrieval. Using batch input of queries, the latter method performed slightly better.

Imprint

Medford, NJ : Learned Information Inc.

Source

Proceedings of the 55th Annual Meeting of the American Society for Information Science, Pittsburgh, 26-29 October 1992. Ed.: D. Shaw
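The abstract of this record reports search results in terms of recall and precision. As a minimal sketch of how those two measures are computed for a single search, assuming set-based relevance judgments (the function name and the document IDs are made up for illustration and do not come from the study):

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical search: four documents retrieved, three relevant overall.
precision, recall = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d2", "d4", "d7"],
)
print(precision, recall)  # two hits: precision 2/4, recall 2/3
```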

Cleverdon, C.W.: The Cranfield tests on index language devices (1967) 0.00

3.9466174E-4 = product of:
  0.005919926 = sum of:
    0.005919926 = product of:
      0.011839852 = sum of:
        0.011839852 = weight(_text_:information in 1957) [ClassicSimilarity], result of:
          0.011839852 = score(doc=1957,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.23274569 = fieldWeight in 1957, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=1957)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Footnote: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. pp. 47-58.

Saracevic, T.: Measuring the degree of agreement between searchers (1984) 0.00

3.9466174E-4 = product of:
  0.005919926 = sum of:
    0.005919926 = product of:
      0.011839852 = sum of:
        0.011839852 = weight(_text_:information in 2410) [ClassicSimilarity], result of:
          0.011839852 = score(doc=2410,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.23274569 = fieldWeight in 2410, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=2410)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Source: Challenges to an information society : proceedings of the 47th ASIS annual Meeting, Philadelphia, Pennsylvania, October 21-25, 1984. Ed.: Barbara Flood

Edwards, S.: Indexing practices at the National Agricultural Library (1993) 0.00

3.7209064E-4 = product of:
  0.0055813594 = sum of:
    0.0055813594 = product of:
      0.011162719 = sum of:
        0.011162719 = weight(_text_:information in 555) [ClassicSimilarity], result of:
          0.011162719 = score(doc=555,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.21943474 = fieldWeight in 555, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=555)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Abstract: This article discusses indexing practices at the National Agricultural Library (NAL). Indexers at NAL scan over 2,200 incoming journals for input into its bibliographic database, AGRICOLA. The library's coverage extends worldwide and spans a broad range of agricultural subjects. Access to AGRICOLA occurs in several ways: onsite search, commercial vendors, Dialog Information Services, Inc. and BRS Information Technologies. The National Agricultural Library uses CAB THESAURUS to describe the subject content of articles in AGRICOLA.

Evedove, P.R. Dal; Evedove Tartarotti, R.C. Dal; Lopes Fujita, M.S.: Verbal protocols in Brazilian information science : a perspective from indexing studies (2018) 0.00

3.7209064E-4 = product of:
  0.0055813594 = sum of:
    0.0055813594 = product of:
      0.011162719 = sum of:
        0.011162719 = weight(_text_:information in 4783) [ClassicSimilarity], result of:
          0.011162719 = score(doc=4783,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.21943474 = fieldWeight in 4783, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4783)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Source: Challenges and opportunities for knowledge organization in the digital age: proceedings of the Fifteenth International ISKO Conference, 9-11 July 2018, Porto, Portugal / organized by: International Society for Knowledge Organization (ISKO), ISKO Spain and Portugal Chapter, University of Porto - Faculty of Arts and Humanities, Research Centre in Communication, Information and Digital Culture (CIC.digital) - Porto. Eds.: F. Ribeiro and M.E. Cerveira

Rowley, J.: The controlled versus natural indexing languages debate revisited : a perspective on information retrieval practice and research (1994) 0.00
```
3.6770437E-4 = product of:
  0.005515565 = sum of:
    0.005515565 = product of:
      0.01103113 = sum of:
        0.01103113 = weight(_text_:information in 7151) [ClassicSimilarity], result of:
          0.01103113 = score(doc=7151,freq=10.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.21684799 = fieldWeight in 7151, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

This article revisits the debate concerning controlled and natural indexing languages, as used in searching the databases of the online hosts, in-house information retrieval systems, online public access catalogues and databases stored on CD-ROM. The debate was first formulated in the early days of information retrieval more than a century ago but, despite significant advances in technology, remains unresolved. The article divides the history of the debate into four eras. Era one was characterised by the introduction of controlled vocabulary. Era two focused on comparisons between different indexing languages in order to assess which was best. Era three saw a number of case studies of limited generalisability and a general recognition that the best search performance can be achieved by the parallel use of the two types of indexing languages. The emphasis in Era four has been on the development of end-user-based systems, including online public access catalogues and databases on CD-ROM. Recent developments in the use of expert systems techniques to support the representation of meaning may lead to systems which offer significant support to the user in end-user searching. In the meantime, however, information retrieval in practice involves a mixture of natural and controlled indexing languages used to search a wide variety of different kinds of databases.

Source

Journal of information science. 20(1994) no.2, pp. 108-119

David, C.; Giroux, L.; Bertrand-Gastaldy, S.; Lanteigne, D.: Indexing as problem solving : a cognitive approach to consistency (1995) 0.00

3.4178712E-4 = product of:
  0.0051268064 = sum of:
    0.0051268064 = product of:
      0.010253613 = sum of:
        0.010253613 = weight(_text_:information in 3609) [ClassicSimilarity], result of:
          0.010253613 = score(doc=3609,freq=6.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.20156369 = fieldWeight in 3609, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3609)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Imprint: Alberta : Alberta University, School of Library and Information Studies
Source: Connectedness: information, systems, people, organizations. Proceedings of CAIS/ACSI 95, the proceedings of the 23rd Annual Conference of the Canadian Association for Information Science. Ed. by Hope A. Olson and Denis B. Ward

Broxis, P.F.: ASSIA social science information service (1989) 0.00

3.2888478E-4 = product of:
  0.0049332716 = sum of:
    0.0049332716 = product of:
      0.009866543 = sum of:
        0.009866543 = weight(_text_:information in 1511) [ClassicSimilarity], result of:
          0.009866543 = score(doc=1511,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.19395474 = fieldWeight in 1511, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=1511)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Lu, K.; Mao, J.; Li, G.: Toward effective automated weighted subject indexing : a comparison of different approaches in different environments (2018) 0.00
```
3.2888478E-4 = product of:
  0.0049332716 = sum of:
    0.0049332716 = product of:
      0.009866543 = sum of:
        0.009866543 = weight(_text_:information in 4292) [ClassicSimilarity], result of:
          0.009866543 = score(doc=4292,freq=8.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.19395474 = fieldWeight in 4292, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4292)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

Subject indexing plays an important role in supporting subject access to information resources. Current subject indexing systems do not make adequate distinctions on the importance of assigned subject descriptors. Assigning numeric weights to subject descriptors to distinguish their importance to the documents can strengthen the role of subject metadata. Automated methods are more cost-effective. This study compares different automated weighting methods in different environments. Two evaluation methods were used to assess the performance. Experiments on three datasets in the biomedical domain suggest the performance of different weighting methods depends on whether it is an abstract or full text environment. Mutual information with bag-of-words representation shows the best average performance in the full text environment, while cosine with bag-of-words representation is the best in an abstract environment. The cosine measure has relatively consistent and robust performance. A direct weighting method, IDF (Inverse Document Frequency), can produce quick and reasonable estimates of the weights. Bag-of-words representation generally outperforms the concept-based representation. Further improvement in performance can be obtained by using the learning-to-rank method to integrate different weighting methods. This study follows up Lu and Mao (Journal of the Association for Information Science and Technology, 66, 1776-1784, 2015), in which an automated weighted subject indexing method was proposed and validated. The findings from this study contribute to more effective weighted subject indexing.

Source

Journal of the Association for Information Science and Technology. 69(2018) no.1, pp. 121-133

Deaves, J.C.; Pache, J.E.: Chemical and numerical indexing for the INSPEC database (1989) 0.00
```
3.255793E-4 = product of:
  0.0048836893 = sum of:
    0.0048836893 = product of:
      0.009767379 = sum of:
        0.009767379 = weight(_text_:information in 2289) [ClassicSimilarity], result of:
          0.009767379 = score(doc=2289,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1920054 = fieldWeight in 2289, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2289)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

The wealth of chemical information on the INSPEC database is easily retrieved using the printed subject indexes to the associated abstract journals. However, this subject indexing is insufficient for machine retrieval, and free-text searching has special difficulties. An easy-to-use retrieval system has been developed which overcomes many problems, especially the retrieval of non-stoichiometric compositions, which are a feature of solid-state chemistry. The scheme is limited to inorganic material, but allows flexibility and identification of dopants, interfaces and surfaces or substrates. At the same time, a system has been introduced for the online retrieval of numerical data included in the database. This has successfully standardized the way in which such data is held for searching, enabling further refinement of searches where numerical information is significant.

Braam, R.R.; Bruil, J.: Quality of indexing information : authors' views on indexing of their articles in chemical abstracts online CA-file (1992) 0.00

2.79068E-4 = product of:
  0.0041860198 = sum of:
    0.0041860198 = product of:
      0.0083720395 = sum of:
        0.0083720395 = weight(_text_:information in 2638) [ClassicSimilarity], result of:
          0.0083720395 = score(doc=2638,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.16457605 = fieldWeight in 2638, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2638)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Source: Journal of information science. 18(1992) no.5, pp. 399-408

Tonta, Y.: A study of indexing consistency between Library of Congress and British Library catalogers (1991) 0.00

2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 2277) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=2277,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 2277, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=2277)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Abstract: Indexing consistency between Library of Congress and British Library catalogers using the LCSH is compared. A total of 82 titles published in 1987 in the field of library and information science were identified for comparison, and for each title its LC subject headings, assigned by both LC and BL catalogers, were compared. By applying Hooper's 'consistency of a pair' equation, the average indexing consistency value was calculated for the 82 titles. The average indexing consistency value between LC and BL catalogers is 16% for exact matches, and 36% for partial matches.
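Hooper's 'consistency of a pair' measure, mentioned in the abstract above, is commonly stated as A / (A + M + N), where A is the number of terms both indexers assigned and M and N are the terms assigned by only one of them. The sketch below assumes that formulation and exact string matching only (the partial-match variant is not covered); the function name and the sample headings are made up for illustration and are not data from the study.

```python
def hooper_consistency(terms_a, terms_b):
    """Consistency of a pair: A / (A + M + N).

    A = terms assigned by both indexers
    M = terms assigned only by the first indexer
    N = terms assigned only by the second indexer
    """
    a, b = set(terms_a), set(terms_b)
    agreed = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    total = agreed + only_a + only_b
    # Convention chosen here: two empty term sets score 0.0.
    return agreed / total if total else 0.0

# Hypothetical headings for one title (not taken from the study).
lc_headings = ["Libraries", "Information science", "Cataloging"]
bl_headings = ["Cataloging", "Subject headings"]

consistency = hooper_consistency(lc_headings, bl_headings)
print(consistency)  # 1 agreed term out of 4 distinct terms
```

Averaging this value over all compared titles would give a figure comparable to the exact-match percentage the abstract reports.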

Prasher, R.G.: Evaluation of indexing system (1989) 0.00

2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 4998) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=4998,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 4998, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4998)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Abstract: Describes an information system and its various components: index file construction, query formulation and searching. Discusses an indexing system, and brings out the need for its evaluation. Explains the concept of the efficiency of indexing systems and discusses factors which control this efficiency. Gives criteria for evaluation. Discusses recall and precision ratios, as well as noise ratio, novelty ratio, exhaustivity and specificity, and the impact of each on the efficiency of an indexing system. Also mentions the various steps for evaluation.

Soergel, D.: Indexing and retrieval performance : the logical evidence (1997) 0.00

2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 578) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=578,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 578, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=578)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Imprint: The Hague : International Federation for Information and Documentation (FID)

Saarti, J.: Consistency of subject indexing of novels by public library professionals and patrons (2002) 0.00

2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 4473) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=4473,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 4473, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4473)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Abstract: The paper discusses the consistency of fiction indexing of library professionals and patrons based on an empirical test. Indexing was carried out with a Finnish fictional thesaurus and all of the test persons indexed the same five novels. The consistency of indexing was determined to be low; several reasons are postulated. Also an algorithm for typified indexing of fiction is given as well as some suggestions for the development of fiction information retrieval systems and content representation.

Ballard, R.M.: Indexing and its relevance to technical processing (1993) 0.00
```
2.3255666E-4 = product of:
  0.0034883497 = sum of:
    0.0034883497 = product of:
      0.0069766995 = sum of:
        0.0069766995 = weight(_text_:information in 554) [ClassicSimilarity], result of:
          0.0069766995 = score(doc=554,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13714671 = fieldWeight in 554, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=554)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

The development of regional on-line catalogs and in-house information systems for retrieval of references provides examples of the impact of indexing theory and applications on technical processing. More emphasis must be given to understanding the techniques for evaluating the effectiveness of a file, irrespective of whether that file was created as a library catalog or an index to information sources. The most significant advances in classification theory in recent decades have come as a result of efforts to improve the effectiveness of indexing systems. Library classification systems are indexing languages or systems. Courses offered for the preparation of indexers in the United States and the United Kingdom are reviewed. A point of congruence for both the indexer and the library classifier would appear to be the need for a thorough preparation in the techniques of subject analysis. Any subject heading list will suffer from omissions as well as the inclusion of terms which the patron will never use. Indexing theory has provided the technical services department with methods for evaluation of effectiveness. The writer does not believe that these techniques are used, nor do current courses, workshops, and continuing education programs stress them. When theory is totally subjugated to practice, critical thinking and maximum effectiveness will suffer.

Olson, H.A.; Wolfram, D.: Syntagmatic relationships and indexing consistency on a larger scale (2008) 0.00
```
2.3255666E-4 = product of:
  0.0034883497 = sum of:
    0.0034883497 = product of:
      0.0069766995 = sum of:
        0.0069766995 = weight(_text_:information in 2214) [ClassicSimilarity], result of:
          0.0069766995 = score(doc=2214,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13714671 = fieldWeight in 2214, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2214)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

Purpose - The purpose of this article is to examine interindexer consistency on a larger scale than other studies have done to determine if group consensus is reached by larger numbers of indexers and what, if any, relationships emerge between assigned terms. Design/methodology/approach - In total, 64 MLIS students were recruited to assign up to five terms to a document. The authors applied basic data modeling and the exploratory statistical techniques of multi-dimensional scaling (MDS) and hierarchical cluster analysis to determine whether relationships exist in indexing consistency and the co-occurrence of assigned terms. Findings - Consistency in the assignment of indexing terms to a document follows an inverse shape, although it is not strictly power law-based unlike many other social phenomena. The exploratory techniques revealed that groups of terms clustered together. The resulting term co-occurrence relationships were largely syntagmatic. Research limitations/implications - The results are based on the indexing of one article by non-expert indexers and are, thus, not generalizable. Based on the study findings, along with the growing popularity of folksonomies and the apparent authority of communally developed information resources, communally developed indexes based on group consensus may have merit. Originality/value - Consistency in the assignment of indexing terms has been studied primarily on a small scale. Few studies have examined indexing on a larger scale with more than a handful of indexers. Recognition of the differences in indexing assignment has implications for the development of public information systems, especially those that do not use a controlled vocabulary and those tagged by end-users. In such cases, multiple access points that accommodate the different ways that users interpret content are needed so that searchers may be guided to relevant content despite using different terminology.

Lee, D.H.; Schleyer, T.: Social tagging is no substitute for controlled indexing : a comparison of Medical Subject Headings and CiteULike tags assigned to 231,388 papers (2012) 0.00
```
2.3255666E-4 = product of:
  0.0034883497 = sum of:
    0.0034883497 = product of:
      0.0069766995 = sum of:
        0.0069766995 = weight(_text_:information in 383) [ClassicSimilarity], result of:
          0.0069766995 = score(doc=383,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13714671 = fieldWeight in 383, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=383)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

Social tagging and controlled indexing both facilitate access to information resources. Given the increasing popularity of social tagging and the limitations of controlled indexing (primarily cost and scalability), it is reasonable to investigate to what degree social tagging could substitute for controlled indexing. In this study, we compared CiteULike tags to Medical Subject Headings (MeSH) terms for 231,388 citations indexed in MEDLINE. In addition to descriptive analyses of the data sets, we present a paper-by-paper analysis of tags and MeSH terms: the number of common annotations, Jaccard similarity, and coverage ratio. In the analysis, we apply three increasingly progressive levels of text processing, ranging from normalization to stemming, to reduce the impact of lexical differences. Annotations of our corpus consisted of over 76,968 distinct tags and 21,129 distinct MeSH terms. The top 20 tags/MeSH terms showed little direct overlap. On a paper-by-paper basis, the number of common annotations ranged from 0.29 to 0.5 and the Jaccard similarity from 2.12% to 3.3% using increased levels of text processing. At most, 77,834 citations (33.6%) shared at least one annotation. Our results show that CiteULike tags and MeSH terms are quite distinct lexically, reflecting different viewpoints/processes between social tagging and controlled indexing.

Source

Journal of the American Society for Information Science and Technology. 63(2012) no.9, pp. 1747-1757
