Search (93 results, page 2 of 5)

Broxis, P.F.: ASSIA social science information service (1989) 0.01

0.0070104985 = product of:
  0.017526247 = sum of:
    0.009632425 = weight(_text_:a in 1511) [ClassicSimilarity], result of:
      0.009632425 = score(doc=1511,freq=4.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.18016359 = fieldWeight in 1511, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.078125 = fieldNorm(doc=1511)
    0.007893822 = product of:
      0.015787644 = sum of:
        0.015787644 = weight(_text_:information in 1511) [ClassicSimilarity], result of:
          0.015787644 = score(doc=1511,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.19395474 = fieldWeight in 1511, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=1511)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
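
The explain tree above can be checked by hand. The following is a minimal sketch, assuming the classic TF-IDF relations that the printed numbers appear to follow: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf × queryNorm, and fieldWeight = tf × idf × fieldNorm. The function names are illustrative and are not Lucene API calls. The search engine works in single precision, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the explain output
    # (assumed form: 1 + ln(maxDocs / (docFreq + 1))).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight for a single term
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain tree for doc 1511
MAX_DOCS = 44218
QUERY_NORM = 0.046368346
FIELD_NORM = 0.078125

score_a = term_score(4.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "information" clause sits in a nested query with coord(1/2)
score_information = 0.5 * term_score(2.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The outer coord(2/5) scales the sum of the matching clauses
total = 0.4 * (score_a + score_information)
print(f"{total:.7f}")  # close to the 0.0070104985 shown above
```

The two coord factors explain why the displayed score is so much smaller than the individual term weights: only 2 of 5 top-level clauses and 1 of 2 nested clauses matched.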

Abstract: ASSIA (Applied Social Science Index and Abstracts) started in 1987 as a bimonthly indexing and abstracting service in the society field, aimed at practitioners as well as sociologists. Considers the following aspects of the service: arrangement of ASSIA; journal coverage; indexing approach; services for subscribers; and who the users are.
Type: a

Edwards, S.: Indexing practices at the National Agricultural Library (1993) 0.01

0.006654713 = product of:
  0.016636781 = sum of:
    0.00770594 = weight(_text_:a in 555) [ClassicSimilarity], result of:
      0.00770594 = score(doc=555,freq=4.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.14413087 = fieldWeight in 555, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=555)
    0.0089308405 = product of:
      0.017861681 = sum of:
        0.017861681 = weight(_text_:information in 555) [ClassicSimilarity], result of:
          0.017861681 = score(doc=555,freq=4.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.21943474 = fieldWeight in 555, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=555)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: This article discusses indexing practices at the National Agricultural Library. Indexers at NAL scan over 2,200 incoming journals for input into its bibliographic database, AGRICOLA. The National Agricultural Library's coverage extends worldwide, covering a broad range of agricultural subjects. Access to AGRICOLA occurs in several ways: onsite search, commercial vendors, Dialog Information Services, Inc. and BRS Information Technologies. The National Agricultural Library uses CAB THESAURUS to describe the subject content of articles in AGRICOLA.
Type: a

Evedove, P.R. Dal; Evedove Tartarotti, R.C. Dal; Lopes Fujita, M.S.: Verbal protocols in Brazilian information science : a perspective from indexing studies (2018) 0.01

0.006654713 = product of:
  0.016636781 = sum of:
    0.00770594 = weight(_text_:a in 4783) [ClassicSimilarity], result of:
      0.00770594 = score(doc=4783,freq=4.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.14413087 = fieldWeight in 4783, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=4783)
    0.0089308405 = product of:
      0.017861681 = sum of:
        0.017861681 = weight(_text_:information in 4783) [ClassicSimilarity], result of:
          0.017861681 = score(doc=4783,freq=4.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.21943474 = fieldWeight in 4783, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4783)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Source: Challenges and opportunities for knowledge organization in the digital age: proceedings of the Fifteenth International ISKO Conference, 9-11 July 2018, Porto, Portugal / organized by: International Society for Knowledge Organization (ISKO), ISKO Spain and Portugal Chapter, University of Porto - Faculty of Arts and Humanities, Research Centre in Communication, Information and Digital Culture (CIC.digital) - Porto. Eds.: F. Ribeiro and M.E. Cerveira
Type: a

Peset, F.; Garzón-Farinós, F.; González, L.M.; García-Massó, X.; Ferrer-Sapena, A.; Toca-Herrera, J.L.; Sánchez-Pérez, E.A.: Survival analysis of author keywords : an application to the library and information sciences area (2020) 0.01
```
0.0065874713 = product of:
  0.016468678 = sum of:
    0.009632425 = weight(_text_:a in 5774) [ClassicSimilarity], result of:
      0.009632425 = score(doc=5774,freq=16.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.18016359 = fieldWeight in 5774, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5774)
    0.006836252 = product of:
      0.013672504 = sum of:
        0.013672504 = weight(_text_:information in 5774) [ClassicSimilarity], result of:
          0.013672504 = score(doc=5774,freq=6.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.16796975 = fieldWeight in 5774, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5774)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
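
The weight for the term "a" in this record (0.009632425) is identical to the one reported for doc 1511, although the term occurs 16 times here and only 4 times there. A minimal sketch of why, assuming tf = sqrt(freq) and that fieldNorm acts as a length penalty; the helper name is illustrative only:

```python
import math

def length_normalized_tf(freq, field_norm):
    # The part of fieldWeight that depends on the document itself: tf * fieldNorm
    return math.sqrt(freq) * field_norm

# Values copied from the two explain trees
shorter_record = length_normalized_tf(4.0, 0.078125)    # doc 1511
longer_record = length_normalized_tf(16.0, 0.0390625)   # doc 5774

# Quadrupling the frequency doubles tf, while the halved fieldNorm cancels the gain
print(shorter_record, longer_record)
```

Because idf and queryNorm are the same for both records, equal tf × fieldNorm products yield equal term scores.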
Abstract

Our purpose is to adapt a statistical method for the analysis of discrete numerical series to the keywords appearing in scientific articles of a given area. As an example, we apply our methodological approach to the study of the keywords in the Library and Information Sciences (LIS) area. Our objective is to detect the new author keywords that appear in a fixed knowledge area in the period of 1 year in order to quantify the probabilities of survival for 10 years as a function of the impact of the journals where they appeared. Many of the new keywords appearing in the LIS field are ephemeral. Actually, more than half are never used again. In general, the terms most commonly used in the LIS area come from other areas. The average survival time of these keywords is approximately 3 years, being slightly higher in the case of words that were published in journals classified in the second quartile of the area. We believe that measuring the appearance and disappearance of terms will allow understanding some relevant aspects of the evolution of a discipline, providing in this way a new bibliometric approach.

Source

Journal of the Association for Information Science and Technology. 71(2020) no.4, S.462-473

Type

a

Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.01
```
0.0064903568 = product of:
  0.016225891 = sum of:
    0.012278981 = weight(_text_:a in 4214) [ClassicSimilarity], result of:
      0.012278981 = score(doc=4214,freq=26.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.22966442 = fieldWeight in 4214, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4214)
    0.003946911 = product of:
      0.007893822 = sum of:
        0.007893822 = weight(_text_:information in 4214) [ClassicSimilarity], result of:
          0.007893822 = score(doc=4214,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.09697737 = fieldWeight in 4214, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4214)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

An important stage in the process of retrieval of objects from a hypertext database is the creation of a set of internodal links that are intended to represent the relationships existing between objects; this operation is often undertaken manually, just as index terms are often manually assigned to documents in a conventional retrieval system. In an earlier article (1994), the results were published of a study in which several different sets of links were inserted, each by a different person, between the paragraphs of each of a number of full-text documents. These results showed little similarity between the link-sets, a finding that was comparable with those of studies of inter-indexer consistency, which suggest that there is generally only a low level of agreement between the sets of index terms assigned to a document by different indexers. In this article, a description is provided of an investigation into the nature of the relationship existing between (i) the levels of inter-linker consistency obtaining among the group of hypertext databases used in our earlier experiments, and (ii) the levels of effectiveness of a number of searches carried out in those databases. An account is given of the implementation of the searches and of the methods used in the calculation of numerical values expressing their effectiveness. Analysis of the results of a comparison between recorded levels of consistency and those of effectiveness does not allow us to draw conclusions about the consistency - effectiveness relationship that are equivalent to those drawn in comparable studies of inter-indexer consistency.

Source

Journal of the American Society for Information Science. 47(1996) no.4, S.287-300

Type

a

Boyce, B.R.; McLain, J.P.: Entry point depth and online search using a controlled vocabulary (1989) 0.01

0.006474727 = product of:
  0.016186817 = sum of:
    0.010661141 = weight(_text_:a in 2287) [ClassicSimilarity], result of:
      0.010661141 = score(doc=2287,freq=10.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.19940455 = fieldWeight in 2287, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2287)
    0.005525676 = product of:
      0.011051352 = sum of:
        0.011051352 = weight(_text_:information in 2287) [ClassicSimilarity], result of:
          0.011051352 = score(doc=2287,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.13576832 = fieldWeight in 2287, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2287)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: The depth of indexing, the number of terms assigned on average to each document in a retrieval system as entry points, has a significant effect on the standard retrieval performance measures in modern commercial retrieval systems, just as it did in previous experimental work. Tests on the effect of basic index search, as opposed to controlled vocabulary search, in these real systems are quite different from traditional comparisons of free text searching with controlled vocabulary searching. In modern commercial systems the controlled vocabulary serves as a precision device, since the structure of the default for unqualified search terms in these systems requires that it do so.
Source: Journal of the American Society for Information Science. 40(1989), S.273-276
Type: a

Deaves, J.C.; Pache, J.E.: Chemical and numerical indexing for the INSPEC database (1989) 0.01

0.0064290287 = product of:
  0.016072571 = sum of:
    0.008258085 = weight(_text_:a in 2289) [ClassicSimilarity], result of:
      0.008258085 = score(doc=2289,freq=6.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.1544581 = fieldWeight in 2289, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2289)
    0.007814486 = product of:
      0.015628971 = sum of:
        0.015628971 = weight(_text_:information in 2289) [ClassicSimilarity], result of:
          0.015628971 = score(doc=2289,freq=4.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.1920054 = fieldWeight in 2289, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2289)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: The wealth of chemical information on the INSPEC database is easily retrieved using the printed subject indexes to the associated abstract journals. However, this subject indexing is insufficient for machine retrieval, and free-text searching has special difficulties. An easy-to-use retrieval system has been developed which overcomes many problems, especially the retrieval of non-stoichiometric compositions, which are a feature of solid-state chemistry. The scheme is limited to inorganic material, but allows flexibility and identification of dopants, interfaces and surfaces or substrates. At the same time, a system has been introduced for the online retrieval of numerical data included in the database. This has successfully standardized the way in which such data is held for searching, enabling further refinement of searches where numerical information is significant.
Type: a

Tseng, Y.-H.: Keyword extraction techniques and relevance feedback (1997) 0.01

0.0064290287 = product of:
  0.016072571 = sum of:
    0.008258085 = weight(_text_:a in 1830) [ClassicSimilarity], result of:
      0.008258085 = score(doc=1830,freq=6.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.1544581 = fieldWeight in 1830, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1830)
    0.007814486 = product of:
      0.015628971 = sum of:
        0.015628971 = weight(_text_:information in 1830) [ClassicSimilarity], result of:
          0.015628971 = score(doc=1830,freq=4.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.1920054 = fieldWeight in 1830, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1830)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: Automatic keyword extraction is an important and fundamental technology in advanced information retrieval systems. Briefly compares several major keyword extraction methods, lists their advantages and disadvantages, and reports recent research progress in Taiwan. Also describes the application of a keyword extraction algorithm in an information retrieval system for relevance feedback. Preliminary analysis shows that the error rate of extracting relevant keywords is 18%, and that the precision rate is over 50%. The main disadvantage of this approach is that the extraction results depend on the retrieval results, which in turn depend on the data held by the database. Apart from collecting more data, this problem can be alleviated by the application of a thesaurus constructed by the same keyword extraction algorithm.
Type: a

Olson, H.A.; Wolfram, D.: Syntagmatic relationships and indexing consistency on a larger scale (2008) 0.01
```
0.0063194023 = product of:
  0.015798505 = sum of:
    0.01021673 = weight(_text_:a in 2214) [ClassicSimilarity], result of:
      0.01021673 = score(doc=2214,freq=18.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.19109234 = fieldWeight in 2214, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2214)
    0.0055817757 = product of:
      0.011163551 = sum of:
        0.011163551 = weight(_text_:information in 2214) [ClassicSimilarity], result of:
          0.011163551 = score(doc=2214,freq=4.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.13714671 = fieldWeight in 2214, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2214)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

Purpose - The purpose of this article is to examine interindexer consistency on a larger scale than other studies have done to determine if group consensus is reached by larger numbers of indexers and what, if any, relationships emerge between assigned terms. Design/methodology/approach - In total, 64 MLIS students were recruited to assign up to five terms to a document. The authors applied basic data modeling and the exploratory statistical techniques of multi-dimensional scaling (MDS) and hierarchical cluster analysis to determine whether relationships exist in indexing consistency and the cooccurrence of assigned terms. Findings - Consistency in the assignment of indexing terms to a document follows an inverse shape, although it is not strictly power law-based unlike many other social phenomena. The exploratory techniques revealed that groups of terms clustered together. The resulting term cooccurrence relationships were largely syntagmatic. Research limitations/implications - The results are based on the indexing of one article by non-expert indexers and are, thus, not generalizable. Based on the study findings, along with the growing popularity of folksonomies and the apparent authority of communally developed information resources, communally developed indexes based on group consensus may have merit. Originality/value - Consistency in the assignment of indexing terms has been studied primarily on a small scale. Few studies have examined indexing on a larger scale with more than a handful of indexers. Recognition of the differences in indexing assignment has implications for the development of public information systems, especially those that do not use a controlled vocabulary and those tagged by end-users. In such cases, multiple access points that accommodate the different ways that users interpret content are needed so that searchers may be guided to relevant content despite using different terminology.

Type

a

Tonta, Y.: A study of indexing consistency between Library of Congress and British Library catalogers (1991) 0.01

0.0063011474 = product of:
  0.015752869 = sum of:
    0.009437811 = weight(_text_:a in 2277) [ClassicSimilarity], result of:
      0.009437811 = score(doc=2277,freq=6.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.17652355 = fieldWeight in 2277, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=2277)
    0.006315058 = product of:
      0.012630116 = sum of:
        0.012630116 = weight(_text_:information in 2277) [ClassicSimilarity], result of:
          0.012630116 = score(doc=2277,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.1551638 = fieldWeight in 2277, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=2277)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: Indexing consistency between Library of Congress and British Library catalogers using the LCSH is compared. 82 titles published in 1987 in the field of library and information science were identified for comparison, and for each title its LC subject headings, assigned by both LC and BL catalogers, were compared. By applying Hooper's 'consistency of a pair' equation, the average indexing consistency value was calculated for the 82 titles. The average indexing consistency value between LC and BL catalogers is 16% for exact matches, and 36% for partial matches.
Type: a
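
Hooper's "consistency of a pair" measure named in the abstract is commonly stated as A / (A + M + N), where A is the number of terms both indexers assigned and M and N are the numbers of terms assigned by only one of them. The following is a minimal sketch for exact matches only; the function name and the sample headings are hypothetical and are not taken from the study.

```python
def hooper_consistency(terms_first, terms_second):
    """Consistency of a pair: A / (A + M + N), exact matches only."""
    first, second = set(terms_first), set(terms_second)
    agreed = len(first & second)        # A: terms both indexers assigned
    only_first = len(first - second)    # M: terms only the first indexer assigned
    only_second = len(second - first)   # N: terms only the second indexer assigned
    total = agreed + only_first + only_second
    # The measure is undefined when neither indexer assigned a term;
    # returning 0.0 here is a choice made for this sketch.
    return agreed / total if total else 0.0

# Hypothetical headings for a single title
lc_headings = {"Indexing", "Subject headings", "Cataloging"}
bl_headings = {"Indexing", "Subject cataloging"}

print(hooper_consistency(lc_headings, bl_headings))  # one shared term out of four distinct terms
```

Averaging this value over all compared titles gives a collection-level figure of the kind reported in the abstract.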

Soergel, D.: Indexing and retrieval performance : the logical evidence (1997) 0.01

0.0063011474 = product of:
  0.015752869 = sum of:
    0.009437811 = weight(_text_:a in 578) [ClassicSimilarity], result of:
      0.009437811 = score(doc=578,freq=6.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.17652355 = fieldWeight in 578, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=578)
    0.006315058 = product of:
      0.012630116 = sum of:
        0.012630116 = weight(_text_:information in 578) [ClassicSimilarity], result of:
          0.012630116 = score(doc=578,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.1551638 = fieldWeight in 578, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=578)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Imprint: The Hague : International Federation for Information and Documentation (FID)
Source: From classification to 'knowledge organization': Dorking revisited or 'past is prelude'. A collection of reprints to commemorate the forty year span between the Dorking Conference (First International Study Conference on Classification Research 1957) and the Sixth International Study Conference on Classification Research (London 1997). Ed.: A. Gilchrist
Type: a

Wolfram, D.; Zhang, J.: An investigation of the influence of indexing exhaustivity and term distributions on a document space (2002) 0.01
```
0.0060967724 = product of:
  0.01524193 = sum of:
    0.01129502 = weight(_text_:a in 5238) [ClassicSimilarity], result of:
      0.01129502 = score(doc=5238,freq=22.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.21126054 = fieldWeight in 5238, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5238)
    0.003946911 = product of:
      0.007893822 = sum of:
        0.007893822 = weight(_text_:information in 5238) [ClassicSimilarity], result of:
          0.007893822 = score(doc=5238,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.09697737 = fieldWeight in 5238, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5238)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

Wolfram and Zhang are interested in the effect of different indexing exhaustivity, by which they mean the number of terms chosen, and of different index term distributions and different term weighting methods on the resulting document cluster organization. The Distance Angle Retrieval Environment, DARE, which provides a two dimensional display of retrieved documents was used to represent the document clusters based upon a document's distance from the searcher's main interest, and on the angle formed by the document, a point representing a minor interest, and the point representing the main interest. If the centroid and the origin of the document space are assigned as major and minor points the average distance between documents and the centroid can be measured providing an indication of cluster organization in the form of a size normalized similarity measure. Using 500 records from NTIS and nine models created by intersecting low, observed, and high exhaustivity levels (based upon a negative binomial distribution) with shallow, observed, and steep term distributions (based upon a Zipf distribution) simulation runs were performed using inverse document frequency, inter-document term frequency, and inverse document frequency based upon both inter and intra-document frequencies. Low exhaustivity and shallow distributions result in a more dense document space and less effective retrieval. High exhaustivity and steeper distributions result in a more diffuse space.

Source

Journal of the American Society for Information Science and Technology. 53(2002) no.11, S.944-952

Type

a

Bellamy, L.M.; Bickham, L.: Thesaurus development for subject cataloging (1989) 0.01

0.005898641 = product of:
  0.014746603 = sum of:
    0.0100103095 = weight(_text_:a in 2262) [ClassicSimilarity], result of:
      0.0100103095 = score(doc=2262,freq=12.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.18723148 = fieldWeight in 2262, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=2262)
    0.0047362936 = product of:
      0.009472587 = sum of:
        0.009472587 = weight(_text_:information in 2262) [ClassicSimilarity], result of:
          0.009472587 = score(doc=2262,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.116372846 = fieldWeight in 2262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2262)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: The biomedical book collection in the Genentech Library and Information Services was first inventoried and cataloged in 1983 when it totaled about 2000 titles. Cataloging records were retrieved from the OCLC system and used as a basis for cataloging. A year of cataloging produced a list of 1900 subject terms. More than one term describing the same concept often appeared on the list, and no hierarchical structure related the terms to one another. As the collection grew, the subject catalog became increasingly inconsistent. To bring consistency to subject cataloging, a thesaurus of biomedical terms was constructed using the list of subject headings as a basis. This thesaurus follows the broad categories of the National Library of Medicine's Medical Subject Headings and, with some exceptions, the Guidelines for the Establishment and Development of Monolingual Thesauri. It has enabled the cataloger to provide greater in-depth subject analysis of materials added to the collection and to assign subject headings consistently to cataloging records.
Type: a

Burgin, R.: The effect of indexing exhaustivity on retrieval performance (1991) 0.01

0.005898641 = product of:
  0.014746603 = sum of:
    0.0100103095 = weight(_text_:a in 5262) [ClassicSimilarity], result of:
      0.0100103095 = score(doc=5262,freq=12.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.18723148 = fieldWeight in 5262, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046875 = fieldNorm(doc=5262)
    0.0047362936 = product of:
      0.009472587 = sum of:
        0.009472587 = weight(_text_:information in 5262) [ClassicSimilarity], result of:
          0.009472587 = score(doc=5262,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.116372846 = fieldWeight in 5262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5262)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: The study was based on the collection examined by W.H. Shaw (Inf. proc. man. 26(1990) no.6, S.693-703, 705-718), a test collection of 1239 articles, indexed with the term cystic fibrosis; and 100 queries with 3 sets of relevance evaluations from subject experts. The effect of variations in indexing exhaustivity on retrieval performance in a vector space retrieval system was investigated by using a term weight threshold to construct different document representations for a test collection. Retrieval results showed that retrieval performance, as measured by the mean optimal measure for all queries at a term weight threshold, was highest at the most exhaustive representation, and decreased slightly as terms were eliminated and the indexing representation became less exhaustive. The findings suggest that the vector space model is more robust against variations in indexing exhaustivity than is the single-link clustering model.
Source: Information processing and management. 27(1991) no.6, S.623-628
Type: a

Gil-Leiva, I.; Alonso-Arroyo, A.: Keywords given by authors of scientific articles in database descriptors (2007) 0.01
```
0.00588199 = product of:
  0.014704974 = sum of:
    0.0068111527 = weight(_text_:a in 211) [ClassicSimilarity], result of:
      0.0068111527 = score(doc=211,freq=8.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.12739488 = fieldWeight in 211, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=211)
    0.007893822 = product of:
      0.015787644 = sum of:
        0.015787644 = weight(_text_:information in 211) [ClassicSimilarity], result of:
          0.015787644 = score(doc=211,freq=8.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.19395474 = fieldWeight in 211, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=211)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

In this article, the authors analyze the keywords given by authors of scientific articles and the descriptors assigned to the articles to ascertain the presence of the keywords in the descriptors. Six-hundred forty INSPEC (Information Service for Physics, Engineering, and Computing), CAB (Current Agriculture Bibliography) abstracts, ISTA (Information Science and Technology Abstracts), and LISA (Library and Information Science Abstracts) database records were consulted. After detailed comparisons, it was found that keywords provided by authors have an important presence in the database descriptors studied; nearly 25% of all the keywords appeared in exactly the same form as descriptors, with another 21%, though normalized, still detected in the descriptors. This means that almost 46% of keywords appear in the descriptors, either as such or after normalization. Elsewhere, three distinct indexing policies appear, one represented by INSPEC and LISA (indexers seem to have freedom to assign the descriptors they deem necessary); another is represented by CAB (no record has fewer than four descriptors and, in general, a large number of descriptors is employed). In contrast, in ISTA, a certain institutional code exists towards economy in indexing because 84% of records contain only four descriptors.

Source

Journal of the American Society for Information Science and Technology. 58(2007) no.8, S.1175-1187

Type

a

Keen, E.M.: Designing and testing an interactive ranked retrieval system for professional searchers (1994) 0.01
```
0.0056654564 = product of:
  0.014163641 = sum of:
    0.01021673 = weight(_text_:a in 1066) [ClassicSimilarity], result of:
      0.01021673 = score(doc=1066,freq=18.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.19109234 = fieldWeight in 1066, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1066)
    0.003946911 = product of:
      0.007893822 = sum of:
        0.007893822 = weight(_text_:information in 1066) [ClassicSimilarity], result of:
          0.007893822 = score(doc=1066,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.09697737 = fieldWeight in 1066, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1066)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

Reports 3 explorations of ranked system design. 2 tests used a 'cystic fibrosis' test collection with 100 queries. Experiment 1 compared a Boolean with a ranked interactive system using a subject qualified trained searcher, and reported recall and precision results. Experiment 2 compared 15 different ranked match algorithms in a batch mode using 2 test collections, and included some new proximate pairs and term weighting approaches. Experiment 3 is a design plan for an interactive ranked prototype offering mid search algorithm choices plus other manual search devices (such as obligatory and unwanted terms), as influenced by thinking aloud comments from experiment 1. Concludes that, in Boolean versus ranked using inverse collection frequency, the searcher inspected more records on ranked than Boolean and so achieved a higher recall but lower precision; however, the presentation order of the relevant records was, on average, very similar in both systems. Concludes also that: query reformulation was quite strongly practised in ranked searching but does not appear to have been effective; the term pairs proximate weighting methods in experiment 2 enhanced precision on both test collections when used with inverse collection frequency weighting (ICF); and the design plan for an interactive prototype adds to a selection of match algorithms other devices, such as obligatory and unwanted term marking, evidence for this being found from think aloud comments.

Source

Journal of information science. 20(1994) no.6, S.389-398

Type

a

Chan, L.M.: Inter-indexer consistency in subject cataloging (1989) 0.01

0.0056083994 = product of:
  0.014020998 = sum of:
    0.00770594 = weight(_text_:a in 2276) [ClassicSimilarity], result of:
      0.00770594 = score(doc=2276,freq=4.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.14413087 = fieldWeight in 2276, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=2276)
    0.006315058 = product of:
      0.012630116 = sum of:
        0.012630116 = weight(_text_:information in 2276) [ClassicSimilarity], result of:
          0.012630116 = score(doc=2276,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.1551638 = fieldWeight in 2276, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=2276)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: The purpose of the current study has been twofold: (1) to develop a valid methodology for studying indexing consistency in MARC records and (2) to study such consistency in subject cataloging practice between non-LC libraries and the Library of Congress.
Source: Information technology and libraries. 8(1989), S.349-358
Type: a
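
The score breakdown for doc 2276 above can be checked by hand. The following is a minimal sketch, assuming the ClassicSimilarity conventions that the listed figures are consistent with: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The queryNorm and fieldNorm values are copied from the listing rather than derived, and the function names are hypothetical, not part of any library.

```python
import math

# Values copied from the explain output; they are not derived here.
QUERY_NORM = 0.046368346
MAX_DOCS = 44218
DOC_FREQ_A = 37942            # docFreq for the term "a"
DOC_FREQ_INFORMATION = 20772  # docFreq for the term "information"


def idf(doc_freq, max_docs):
    """Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(freq, doc_freq, field_norm):
    """queryWeight * fieldWeight for a single term, as in the explain tree."""
    tf = math.sqrt(freq)                      # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM      # idf * queryNorm
    field_weight = tf * term_idf * field_norm # tf * idf * fieldNorm
    return query_weight * field_weight


def score_doc(freq_a, freq_information, field_norm):
    """Combine both term weights with the coord factors shown in the listing."""
    w_a = term_weight(freq_a, DOC_FREQ_A, field_norm)
    # The "information" clause sits in a nested query with coord(1/2) = 0.5.
    w_info = 0.5 * term_weight(freq_information, DOC_FREQ_INFORMATION, field_norm)
    # Outer coord(2/5) = 0.4: two of five query clauses matched.
    return 0.4 * (w_a + w_info)


# Doc 2276: freq("a") = 4, freq("information") = 2, fieldNorm = 0.0625.
score_2276 = score_doc(4, 2, 0.0625)
print(f"{score_2276:.7f}")
```

The search engine works in single precision, so the last digits of a double-precision recomputation may differ slightly from the listed 0.0056083994.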

Saarti, J.: Consistency of subject indexing of novels by public library professionals and patrons (2002) 0.01

0.0056083994 = product of:
  0.014020998 = sum of:
    0.00770594 = weight(_text_:a in 4473) [ClassicSimilarity], result of:
      0.00770594 = score(doc=4473,freq=4.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.14413087 = fieldWeight in 4473, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0625 = fieldNorm(doc=4473)
    0.006315058 = product of:
      0.012630116 = sum of:
        0.012630116 = weight(_text_:information in 4473) [ClassicSimilarity], result of:
          0.012630116 = score(doc=4473,freq=2.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.1551638 = fieldWeight in 4473, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4473)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: The paper discusses the consistency of fiction indexing of library professionals and patrons based on an empirical test. Indexing was carried out with a Finnish fictional thesaurus and all of the test persons indexed the same five novels. The consistency of indexing was determined to be low; several reasons are postulated. Also an algorithm for typified indexing of fiction is given as well as some suggestions for the development of fiction information retrieval systems and content representation.
Type: a

Ballard, R.M.: Indexing and its relevance to technical processing (1993) 0.01
```
0.00556948 = product of:
  0.0139237 = sum of:
    0.008341924 = weight(_text_:a in 554) [ClassicSimilarity], result of:
      0.008341924 = score(doc=554,freq=12.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.15602624 = fieldWeight in 554, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=554)
    0.0055817757 = product of:
      0.011163551 = sum of:
        0.011163551 = weight(_text_:information in 554) [ClassicSimilarity], result of:
          0.011163551 = score(doc=554,freq=4.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.13714671 = fieldWeight in 554, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=554)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

The development of regional on-line catalogs and in-house information systems for retrieval of references provide examples of the impact of indexing theory and applications on technical processing. More emphasis must be given to understanding the techniques for evaluating the effectiveness of a file, irrespective of whether that file was created as a library catalog or an index to information sources. The most significant advances in classification theory in recent decades have been as a result of efforts to improve effectiveness of indexing systems. Library classification systems are indexing languages or systems. Courses offered for the preparation of indexers in the United States and the United Kingdom are reviewed. A point of congruence for both the indexer and the library classifier would appear to be the need for a thorough preparation in the techniques of subject analysis. Any subject heading list will suffer from omissions as well as the inclusion of terms which the patron will never use. Indexing theory has provided the technical services department with methods for evaluation of effectiveness. The writer does not believe that these techniques are used, nor do current courses, workshops, and continuing education programs stress them. When theory is totally subjugated to practice, critical thinking and maximum effectiveness will suffer.

Type

a

Lu, K.; Mao, J.; Li, G.: Toward effective automated weighted subject indexing : a comparison of different approaches in different environments (2018) 0.01
```
0.0055169817 = product of:
  0.013792454 = sum of:
    0.005898632 = weight(_text_:a in 4292) [ClassicSimilarity], result of:
      0.005898632 = score(doc=4292,freq=6.0), product of:
        0.053464882 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.046368346 = queryNorm
        0.11032722 = fieldWeight in 4292, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4292)
    0.007893822 = product of:
      0.015787644 = sum of:
        0.015787644 = weight(_text_:information in 4292) [ClassicSimilarity], result of:
          0.015787644 = score(doc=4292,freq=8.0), product of:
            0.08139861 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046368346 = queryNorm
            0.19395474 = fieldWeight in 4292, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4292)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract

Subject indexing plays an important role in supporting subject access to information resources. Current subject indexing systems do not make adequate distinctions on the importance of assigned subject descriptors. Assigning numeric weights to subject descriptors to distinguish their importance to the documents can strengthen the role of subject metadata. Automated methods are more cost-effective. This study compares different automated weighting methods in different environments. Two evaluation methods were used to assess the performance. Experiments on three datasets in the biomedical domain suggest the performance of different weighting methods depends on whether it is an abstract or full text environment. Mutual information with bag-of-words representation shows the best average performance in the full text environment, while cosine with bag-of-words representation is the best in an abstract environment. The cosine measure has relatively consistent and robust performance. A direct weighting method, IDF (Inverse Document Frequency), can produce quick and reasonable estimates of the weights. Bag-of-words representation generally outperforms the concept-based representation. Further improvement in performance can be obtained by using the learning-to-rank method to integrate different weighting methods. This study follows up Lu and Mao (Journal of the Association for Information Science and Technology, 66, 1776-1784, 2015), in which an automated weighted subject indexing method was proposed and validated. The findings from this study contribute to more effective weighted subject indexing.

Source

Journal of the Association for Information Science and Technology. 69(2018) no.1, S.121-133

Type

a
