Search (14 results, page 1 of 1)

Veenema, F.: To index or not to index (1996) 0.01

0.014140441 = product of:
  0.028280882 = sum of:
    0.028280882 = product of:
      0.056561764 = sum of:
        0.056561764 = weight(_text_:22 in 7247) [ClassicSimilarity], result of:
          0.056561764 = score(doc=7247,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.30952093 = fieldWeight in 7247, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=7247)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Canadian journal of information and library science. 21(1996) no.2, pp. 1-22
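The score breakdowns on this page follow Lucene's ClassicSimilarity (TF-IDF) formula, and the arithmetic can be checked by hand. The sketch below recomputes the final score for doc 7247 from the inputs printed in its explain tree. The function names are hypothetical and chosen only for illustration. The two coord factors of 0.5 are taken directly from the tree rather than derived from the query, which is not shown on this page.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """idf as ClassicSimilarity defines it: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """tf as ClassicSimilarity defines it: the square root of the term frequency."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
    """Recompute one single-term score the way the explain tree lays it out.

    queryWeight = idf * queryNorm
    fieldWeight = tf * idf * fieldNorm
    score       = queryWeight * fieldWeight * (each coord factor)
    """
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    score = query_weight * field_weight
    for coord in coords:  # coord values are copied from the explain output (assumption)
        score *= coord
    return score


# Inputs copied from the explain tree for doc 7247 (term "22").
score_7247 = explain_score(
    freq=2.0,
    doc_freq=3622,
    max_docs=44218,
    query_norm=0.052184064,
    field_norm=0.0625,
)

# Inputs copied from the explain tree for doc 7151 (term "systems").
score_7151 = explain_score(
    freq=8.0,
    doc_freq=5561,
    max_docs=44218,
    query_norm=0.052184064,
    field_norm=0.0390625,
)

print(f"doc 7247: {score_7247:.9f}")
print(f"doc 7151: {score_7151:.9f}")
```

Because Lucene computes these values in single precision, the recomputed scores agree with the listed 0.014140441 and 0.013613109 only to about six or seven significant digits.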

Rowley, J.: ¬The controlled versus natural indexing languages debate revisited : a perspective on information retrieval practice and research (1994) 0.01
```
0.013613109 = product of:
  0.027226217 = sum of:
    0.027226217 = product of:
      0.054452434 = sum of:
        0.054452434 = weight(_text_:systems in 7151) [ClassicSimilarity], result of:
          0.054452434 = score(doc=7151,freq=8.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.339541 = fieldWeight in 7151, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article revisits the debate concerning controlled and natural indexing languages, as used in searching the databases of the online hosts, in-house information retrieval systems, online public access catalogues and databases stored on CD-ROM. The debate was first formulated in the early days of information retrieval more than a century ago but, despite significant advances in technology, remains unresolved. The article divides the history of the debate into four eras. Era one was characterised by the introduction of controlled vocabulary. Era two focused on comparisons between different indexing languages in order to assess which was best. Era three saw a number of case studies of limited generalisability and a general recognition that the best search performance can be achieved by the parallel use of the two types of indexing languages. The emphasis in Era four has been on the development of end-user-based systems, including online public access catalogues and databases on CD-ROM. Recent developments in the use of expert systems techniques to support the representation of meaning may lead to systems which offer significant support to the user in end-user searching. In the meantime, however, information retrieval in practice involves a mixture of natural and controlled indexing languages used to search a wide variety of different kinds of databases.

Ballard, R.M.: Indexing and its relevance to technical processing (1993) 0.01
```
0.013613109 = product of:
  0.027226217 = sum of:
    0.027226217 = product of:
      0.054452434 = sum of:
        0.054452434 = weight(_text_:systems in 554) [ClassicSimilarity], result of:
          0.054452434 = score(doc=554,freq=8.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.339541 = fieldWeight in 554, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=554)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The development of regional on-line catalogs and in-house information systems for retrieval of references provides examples of the impact of indexing theory and applications on technical processing. More emphasis must be given to understanding the techniques for evaluating the effectiveness of a file, irrespective of whether that file was created as a library catalog or an index to information sources. The most significant advances in classification theory in recent decades have come as a result of efforts to improve the effectiveness of indexing systems. Library classification systems are indexing languages or systems. Courses offered for the preparation of indexers in the United States and the United Kingdom are reviewed. A point of congruence for both the indexer and the library classifier would appear to be the need for a thorough preparation in the techniques of subject analysis. Any subject heading list will suffer from omissions as well as the inclusion of terms which the patron will never use. Indexing theory has provided the technical services department with methods for evaluation of effectiveness. The writer does not believe that these techniques are used, nor do current courses, workshops, and continuing education programs stress them. When theory is totally subjugated to practice, critical thinking and maximum effectiveness will suffer.

Morris, L.R.: ¬The frequency of use of Library of Congress Classification numbers and Dewey Decimal Classification numbers in the MARC file in the field of library science (1991) 0.01
```
0.013476291 = product of:
  0.026952581 = sum of:
    0.026952581 = product of:
      0.053905163 = sum of:
        0.053905163 = weight(_text_:systems in 2308) [ClassicSimilarity], result of:
          0.053905163 = score(doc=2308,freq=4.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.33612844 = fieldWeight in 2308, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2308)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The LCC and DDC systems were devised and updated by librarians who had and have no access to the eventual frequency of use of each number in those classification systems. 80% of the monographs in a MARC file of over 1,000,000 records are classified into 20% of the classification numbers in the field of library science and only 20% of the monographs are classified into 80% of the classification numbers in the field of library science. Classification of monographs could be made easier and performed more accurately if many of the little-used and unused numbers were eliminated and many of the most crowded numbers were expanded. A number of examples are included.

Hersh, W.R.; Hickam, D.H.: ¬A comparison of two methods for indexing and retrieval from a full-text medical database (1992) 0.01
```
0.013476291 = product of:
  0.026952581 = sum of:
    0.026952581 = product of:
      0.053905163 = sum of:
        0.053905163 = weight(_text_:systems in 4526) [ClassicSimilarity], result of:
          0.053905163 = score(doc=4526,freq=4.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.33612844 = fieldWeight in 4526, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4526)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Reports results of a study of 2 information retrieval systems on a 2,000-document full-text medical database. The first system, SAPHIRE, features concept-based automatic indexing and statistical retrieval techniques, while the second system, SWORD, features traditional word-based Boolean techniques. 16 medical students at Oregon Health Sciences Univ. each performed 10 searches and their results, recorded in terms of recall and precision, showed nearly equal performance for both systems. SAPHIRE was also compared with a version of SWORD modified to use automatic indexing and ranked retrieval. Using batch input of queries, the latter method performed slightly better.

Soergel, D.: Indexing and retrieval performance : the logical evidence (1994) 0.01
```
0.013476291 = product of:
  0.026952581 = sum of:
    0.026952581 = product of:
      0.053905163 = sum of:
        0.053905163 = weight(_text_:systems in 579) [ClassicSimilarity], result of:
          0.053905163 = score(doc=579,freq=4.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.33612844 = fieldWeight in 579, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=579)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article presents a logical analysis of the characteristics of indexing and their effects on retrieval performance. It establishes the ability to ask the questions one needs to ask as the foundation of performance evaluation, and recall and discrimination as the basic quantitative performance measures for binary noninteractive retrieval systems. It then defines the characteristics of indexing that affect retrieval - namely, indexing devices, viewpoint-based and importance-based indexing exhaustivity, indexing specificity, indexing correctness, and indexing consistency - and examines in detail their effects on retrieval. It concludes that retrieval performance depends chiefly on the match between indexing and the requirements of the individual query and on the adaptation of the query formulation to the characteristics of the retrieval system, and that the ensuing complexity must be considered in the design and testing of retrieval systems.

Booth, A.: How consistent is MEDLINE indexing? (1990) 0.01

0.012372886 = product of:
  0.024745772 = sum of:
    0.024745772 = product of:
      0.049491543 = sum of:
        0.049491543 = weight(_text_:22 in 3510) [ClassicSimilarity], result of:
          0.049491543 = score(doc=3510,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.2708308 = fieldWeight in 3510, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3510)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Health libraries review. 7(1990) no.1, pp. 22-26

Haanen, E.: Specificiteit en consistentie : een kwantitatief onderzoek naar trefwoordtoekenning door UBA en UBN (1991) 0.01
```
0.010890487 = product of:
  0.021780973 = sum of:
    0.021780973 = product of:
      0.043561947 = sum of:
        0.043561947 = weight(_text_:systems in 4778) [ClassicSimilarity], result of:
          0.043561947 = score(doc=4778,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2716328 = fieldWeight in 4778, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=4778)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Online public access catalogues enable users to undertake subject searching by classification schedules, natural language, or controlled language terminology. In practice the 1st method is little used. Controlled language systems require indexers to index specifically and consistently. A comparative survey was made of indexing practices at Amsterdam and Nijmegen university libraries. On average Amsterdam assigned each document 3.5 index terms against 1.8 at Nijmegen. This discrepancy in indexing policy is the result of long-standing practices in each institution. Nijmegen has failed to utilise the advantages offered by online catalogues.

Krovetz, R.; Croft, W.B.: Lexical ambiguity and information retrieval (1992) 0.01

0.009529176 = product of:
  0.019058352 = sum of:
    0.019058352 = product of:
      0.038116705 = sum of:
        0.038116705 = weight(_text_:systems in 4028) [ClassicSimilarity], result of:
          0.038116705 = score(doc=4028,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.23767869 = fieldWeight in 4028, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4028)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: ACM transactions on information systems. 10(1992) no.2, pp. 115-141

Connell, T.H.: Use of the LCSH system : realities (1996) 0.01
```
0.009529176 = product of:
  0.019058352 = sum of:
    0.019058352 = product of:
      0.038116705 = sum of:
        0.038116705 = weight(_text_:systems in 6941) [ClassicSimilarity], result of:
          0.038116705 = score(doc=6941,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.23767869 = fieldWeight in 6941, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6941)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Explores the question of whether academic libraries keep up with the changes in the LCSH system. Analysis of the handling of 15 subject headings in 50 academic library catalogues available via the Internet found that libraries are not consistently maintaining subject authority control, or making syndetic references and scope notes in their catalogues. Discusses the results from the perspective of the libraries' performance, performance on the headings overall, performance on references, performance on the type of change made to the headings, and performance within 3 widely used online catalogue systems (DRA, INNOPAC and NOTIS). Discusses the implications of the findings in relationship to expressions of dissatisfaction with the effectiveness of subject cataloguing expressed by discussion groups on the Internet.

Tseng, Y.-H.: Keyword extraction techniques and relevance feedback (1997) 0.01
```
0.009529176 = product of:
  0.019058352 = sum of:
    0.019058352 = product of:
      0.038116705 = sum of:
        0.038116705 = weight(_text_:systems in 1830) [ClassicSimilarity], result of:
          0.038116705 = score(doc=1830,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.23767869 = fieldWeight in 1830, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1830)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Automatic keyword extraction is an important and fundamental technology in advanced information retrieval systems. Briefly compares several major keyword extraction methods, lists their advantages and disadvantages, and reports recent research progress in Taiwan. Also describes the application of a keyword extraction algorithm in an information retrieval system for relevance feedback. Preliminary analysis shows that the error rate of extracting relevant keywords is 18%, and that the precision rate is over 50%. The main disadvantage of this approach is that the extraction results depend on the retrieval results, which in turn depend on the data held by the database. Apart from collecting more data, this problem can be alleviated by the application of a thesaurus constructed by the same keyword extraction algorithm.

David, C.; Giroux, L.; Bertrand-Gastaldy, S.; Lanteigne, D.: Indexing as problem solving : a cognitive approach to consistency (1995) 0.01

0.008167865 = product of:
  0.01633573 = sum of:
    0.01633573 = product of:
      0.03267146 = sum of:
        0.03267146 = weight(_text_:systems in 3609) [ClassicSimilarity], result of:
          0.03267146 = score(doc=3609,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2037246 = fieldWeight in 3609, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=3609)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Connectedness: information, systems, people, organizations. Proceedings of CAIS/ACSI 95, the proceedings of the 23rd Annual Conference of the Canadian Association for Information Science. Ed. by Hope A. Olson and Denis B. Ward

Harter, S.P.; Cheng, Y.-R.: Colinked descriptors : improving vocabulary selection for end-user searching (1996) 0.01
```
0.008167865 = product of:
  0.01633573 = sum of:
    0.01633573 = product of:
      0.03267146 = sum of:
        0.03267146 = weight(_text_:systems in 4216) [ClassicSimilarity], result of:
          0.03267146 = score(doc=4216,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.2037246 = fieldWeight in 4216, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=4216)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article introduces a new concept and technique for information retrieval called 'colinked descriptors'. Borrowed from an analogous idea in bibliometrics - cocited references - colinked descriptors provide a theory and method for identifying search terms that, by hypothesis, will be superior to those entered initially by a searcher. The theory suggests a means of moving automatically from 2 or more initial search terms, to other terms that should be superior in retrieval performance to the 2 original terms. A research project designed to test this colinked descriptor hypothesis is reported. The results suggest that the approach is effective, although methodological problems in testing the idea are reported. Algorithms to generate colinked descriptors can be incorporated easily into system interfaces, front-end or pre-search systems, or help software, in any database that employs a thesaurus. The potential use of colinked descriptors is a strong argument for building richer and more complex thesauri that reflect as many legitimate links among descriptors as possible.

Keen, E.M.: Designing and testing an interactive ranked retrieval system for professional searchers (1994) 0.01
```
0.0068065543 = product of:
  0.013613109 = sum of:
    0.013613109 = product of:
      0.027226217 = sum of:
        0.027226217 = weight(_text_:systems in 1066) [ClassicSimilarity], result of:
          0.027226217 = score(doc=1066,freq=2.0), product of:
            0.16037072 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.052184064 = queryNorm
            0.1697705 = fieldWeight in 1066, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1066)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Reports 3 explorations of ranked system design. 2 tests used a 'cystic fibrosis' test collection with 100 queries. Experiment 1 compared a Boolean with a ranked interactive system using a subject qualified trained searcher, and reporting recall and precision results. Experiment 2 compared 15 different ranked match algorithms in a batch mode using 2 test collections, and included some new proximate pairs and term weighting approaches. Experiment 3 is a design plan for an interactive ranked prototype offering mid search algorithm choices plus other manual search devices (such as obligatory and unwanted terms), as influenced by thinking aloud comments from experiment 1. Concludes that, in Boolean versus ranked using inverse collection frequency, the searcher inspected more records on ranked than Boolean and so achieved a higher recall but lower precision; however, the presentation order of the relevant records was, on average, very similar in both systems. Concludes also that: query reformulation was quite strongly practised in ranked searching but does not appear to have been effective; the term pairs proximate weighting methods in experiment 2 enhanced precision on both test collections when used with inverse collection frequency weighting (ICF); and the design plan for an interactive prototype adds to a selection of match algorithms other devices, such as obligatory and unwanted term marking, evidence for this being found from think aloud comments.
