Search (60 results, page 1 of 3)

  • theme_ss:"Indexierungsstudien"
  1. Rowley, J.: ¬The controlled versus natural indexing languages debate revisited : a perspective on information retrieval practice and research (1994) 0.02
    0.015723431 = product of:
      0.073376015 = sum of:
        0.032137483 = weight(_text_:wide in 7151) [ClassicSimilarity], result of:
          0.032137483 = score(doc=7151,freq=2.0), product of:
            0.1312982 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029633347 = queryNorm
            0.24476713 = fieldWeight in 7151, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
        0.011280581 = weight(_text_:information in 7151) [ClassicSimilarity], result of:
          0.011280581 = score(doc=7151,freq=10.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.21684799 = fieldWeight in 7151, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
        0.029957948 = weight(_text_:retrieval in 7151) [ClassicSimilarity], result of:
          0.029957948 = score(doc=7151,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.33420905 = fieldWeight in 7151, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=7151)
      0.21428572 = coord(3/14)
    
    Abstract
    This article revisits the debate concerning controlled and natural indexing languages, as used in searching the databases of the online hosts, in-house information retrieval systems, online public access catalogues and databases stored on CD-ROM. The debate was first formulated in the early days of information retrieval more than a century ago but, despite significant advance in technology, remains unresolved. The article divides the history of the debate into four eras. Era one was characterised by the introduction of controlled vocabulary. Era two focused on comparisons between different indexing languages in order to assess which was best. Era three saw a number of case studies of limited generalisability and a general recognition that the best search performance can be achieved by the parallel use of the two types of indexing languages. The emphasis in Era four has been on the development of end-user-based systems, including online public access catalogues and databases on CD-ROM. Recent developments in the use of expert systems techniques to support the representation of meaning may lead to systems which offer significant support to the user in end-user searching. In the meantime, however, information retrieval in practice involves a mixture of natural and controlled indexing languages used to search a wide variety of different kinds of databases
    Source
    Journal of information science. 20(1994) no.2, S.108-119
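The score tree for the first hit can be reproduced by hand. The sketch below is a minimal Python illustration, not the search engine's own code: it assumes the usual ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and copies queryNorm, fieldNorm, docFreq and maxDocs from the listing above. The function and variable names are hypothetical.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 7151
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.0390625

# (termFreq, docFreq) for "wide", "information", "retrieval"
matched_terms = [(2.0, 1430), (10.0, 20772), (8.0, 5836)]

term_scores = [
    term_score(freq, df, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for freq, df in matched_terms
]

# coord(3/14): 3 of the 14 query clauses matched this document
total = sum(term_scores) * (3 / 14)
print(total)
```

Up to single-precision rounding, the printed value should agree with the 0.015723431 reported for doc 7151. queryNorm is taken as given because it depends on all 14 query clauses, which the listing does not show.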
  2. Soergel, D.: Indexing and retrieval performance : the logical evidence (1994) 0.01
    0.009482353 = product of:
      0.06637647 = sum of:
        0.0070627616 = weight(_text_:information in 579) [ClassicSimilarity], result of:
          0.0070627616 = score(doc=579,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13576832 = fieldWeight in 579, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=579)
        0.05931371 = weight(_text_:retrieval in 579) [ClassicSimilarity], result of:
          0.05931371 = score(doc=579,freq=16.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.6617001 = fieldWeight in 579, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=579)
      0.14285715 = coord(2/14)
    
    Abstract
    This article presents a logical analysis of the characteristics of indexing and their effects on retrieval performance. It establishes the ability to ask the questions one needs to ask as the foundation of performance evaluation, and recall and discrimination as the basic quantitative performance measures for binary noninteractive retrieval systems. It then defines the characteristics of indexing that affect retrieval - namely, indexing devices, viewpoint-based and importance-based indexing exhaustivity, indexing specificity, indexing correctness, and indexing consistency - and examines in detail their effects on retrieval. It concludes that retrieval performance depends chiefly on the match between indexing and the requirements of the individual query and on the adaptation of the query formulation to the characteristics of the retrieval system, and that the ensuing complexity must be considered in the design and testing of retrieval systems
    Source
    Journal of the American Society for Information Science. 45(1994) no.8, S.589-599
  3. Cleverdon, C.W.: Evaluation tests of information retrieval systems (1970) 0.01
    0.009153739 = product of:
      0.06407617 = sum of:
        0.016143454 = weight(_text_:information in 2272) [ClassicSimilarity], result of:
          0.016143454 = score(doc=2272,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3103276 = fieldWeight in 2272, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.125 = fieldNorm(doc=2272)
        0.047932718 = weight(_text_:retrieval in 2272) [ClassicSimilarity], result of:
          0.047932718 = score(doc=2272,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.5347345 = fieldWeight in 2272, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.125 = fieldNorm(doc=2272)
      0.14285715 = coord(2/14)
    
  4. Azubuike, A.A.; Umoh, J.S.: Computerized information storage and retrieval systems (1988) 0.01
    0.009153739 = product of:
      0.06407617 = sum of:
        0.016143454 = weight(_text_:information in 4153) [ClassicSimilarity], result of:
          0.016143454 = score(doc=4153,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.3103276 = fieldWeight in 4153, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.125 = fieldNorm(doc=4153)
        0.047932718 = weight(_text_:retrieval in 4153) [ClassicSimilarity], result of:
          0.047932718 = score(doc=4153,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.5347345 = fieldWeight in 4153, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.125 = fieldNorm(doc=4153)
      0.14285715 = coord(2/14)
    
  5. Krovetz, R.; Croft, W.B.: Lexical ambiguity and information retrieval (1992) 0.01
    0.008009522 = product of:
      0.05606665 = sum of:
        0.014125523 = weight(_text_:information in 4028) [ClassicSimilarity], result of:
          0.014125523 = score(doc=4028,freq=8.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.27153665 = fieldWeight in 4028, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4028)
        0.04194113 = weight(_text_:retrieval in 4028) [ClassicSimilarity], result of:
          0.04194113 = score(doc=4028,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.46789268 = fieldWeight in 4028, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4028)
      0.14285715 = coord(2/14)
    
    Abstract
    Reports on an analysis of lexical ambiguity in information retrieval text collections and on experiments to determine the utility of word meanings for separating relevant from nonrelevant documents. Results show that there is considerable ambiguity even in a specialised database. Word senses provide a significant separation between relevant and nonrelevant documents, but several factors determine whether disambiguation will improve performance; for example, resolving lexical ambiguity was found to have little impact on retrieval effectiveness for documents that have many words in common with the query. Discusses other uses of word sense disambiguation in an information retrieval context
    Source
    ACM transactions on information systems. 10(1992) no.2, S.115-141
  6. White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014) 0.01
    0.0077985805 = product of:
      0.036393374 = sum of:
        0.02465703 = weight(_text_:web in 1781) [ClassicSimilarity], result of:
          0.02465703 = score(doc=1781,freq=4.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.25496176 = fieldWeight in 1781, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1781)
        0.0050448296 = weight(_text_:information in 1781) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=1781,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 1781, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1781)
        0.0066915164 = product of:
          0.020074548 = sum of:
            0.020074548 = weight(_text_:22 in 1781) [ClassicSimilarity], result of:
              0.020074548 = score(doc=1781,freq=2.0), product of:
                0.103770934 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.029633347 = queryNorm
                0.19345059 = fieldWeight in 1781, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1781)
          0.33333334 = coord(1/3)
      0.21428572 = coord(3/14)
    
    Abstract
    Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE. Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results. Findings - Results suggest that HIVE improves inter-indexing consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges. Research limitations/implications - Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system. Originality/value - This paper showcases several features used in the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next generation machine-aided indexing (MAI) tools.
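The tree for doc 1781 contains a nested clause with its own coord factor, which the other hits lack. The following Python sketch shows how the two coord levels combine. It assumes each Boolean level scores as the sum of its matching clause scores times matched/total, and it takes the leaf scores from the explain output rather than recomputing them. The helper name `boolean_score` is hypothetical.

```python
def boolean_score(clause_scores, matched, total):
    # One Boolean level: sum of matching clause scores, scaled by coord(matched/total)
    return sum(clause_scores) * matched / total

# Leaf scores copied from the explain output for doc 1781
web_score = 0.02465703
information_score = 0.0050448296
term_22_score = 0.020074548

# Inner clause: only 1 of its 3 sub-clauses ("22") matched -> coord(1/3)
inner = boolean_score([term_22_score], matched=1, total=3)

# Outer query: 3 of 14 top-level clauses matched -> coord(3/14)
outer = boolean_score([web_score, information_score, inner], matched=3, total=14)

print(inner, outer)
```

Up to rounding, `inner` corresponds to the 0.0066915164 shown for the nested product and `outer` to the document score of 0.0077985805.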
  7. Hersh, W.R.; Hickam, D.H.: ¬A comparison of two methods for indexing and retrieval from a full-text medical database (1992) 0.01
    0.0077391705 = product of:
      0.054174192 = sum of:
        0.012233062 = weight(_text_:information in 4526) [ClassicSimilarity], result of:
          0.012233062 = score(doc=4526,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23515764 = fieldWeight in 4526, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4526)
        0.04194113 = weight(_text_:retrieval in 4526) [ClassicSimilarity], result of:
          0.04194113 = score(doc=4526,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.46789268 = fieldWeight in 4526, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4526)
      0.14285715 = coord(2/14)
    
    Abstract
    Reports results of a study of 2 information retrieval systems on a 2,000-document full-text medical database. The first system, SAPHIRE, features concept-based automatic indexing and statistical retrieval techniques, while the second system, SWORD, features traditional word-based Boolean techniques. Sixteen medical students at Oregon Health Sciences Univ. each performed 10 searches, and their results, recorded in terms of recall and precision, showed nearly equal performance for both systems. SAPHIRE was also compared with a version of SWORD modified to use automatic indexing and ranked retrieval. Using batch input of queries, the latter method performed slightly better
    Imprint
    Medford, NJ : Learned Information Inc.
    Source
    Proceedings of the 55th Annual Meeting of the American Society for Information Science, Pittsburgh, 26.-29.10.92. Ed.: D. Shaw
  8. Deaves, J.C.; Pache, J.E.: Chemical and numerical indexing for the INSPEC database (1989) 0.01
    0.0074184835 = product of:
      0.05192938 = sum of:
        0.009988253 = weight(_text_:information in 2289) [ClassicSimilarity], result of:
          0.009988253 = score(doc=2289,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1920054 = fieldWeight in 2289, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2289)
        0.04194113 = weight(_text_:retrieval in 2289) [ClassicSimilarity], result of:
          0.04194113 = score(doc=2289,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.46789268 = fieldWeight in 2289, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2289)
      0.14285715 = coord(2/14)
    
    Abstract
    The wealth of chemical information on the INSPEC database is easily retrieved using the printed subject indexes to the associated abstract journals. However, this subject indexing is insufficient for machine retrieval, and free-text searching has special difficulties. An easy-to-use retrieval system has been developed which overcomes many problems, especially the retrieval of non-stoichiometric compositions, which are a feature of solid-state chemistry. The scheme is limited to inorganic material, but allows flexibility and identification of dopants, interfaces and surfaces or substrates. At the same time, a system has been introduced for the online retrieval of numerical data included in the database. This has successfully standardized the way in which such data is held for searching, enabling further refinement of searches where numerical information is significant
  9. Cleverdon, C.W.: ¬The Cranfield tests on index language devices (1967) 0.01
    0.006865305 = product of:
      0.04805713 = sum of:
        0.012107591 = weight(_text_:information in 1957) [ClassicSimilarity], result of:
          0.012107591 = score(doc=1957,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.23274569 = fieldWeight in 1957, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=1957)
        0.03594954 = weight(_text_:retrieval in 1957) [ClassicSimilarity], result of:
          0.03594954 = score(doc=1957,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40105087 = fieldWeight in 1957, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=1957)
      0.14285715 = coord(2/14)
    
    Footnote
    Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. S.47-58.
  10. Tseng, Y.-H.: Keyword extraction techniques and relevance feedback (1997) 0.01
    0.0066157626 = product of:
      0.046310335 = sum of:
        0.009988253 = weight(_text_:information in 1830) [ClassicSimilarity], result of:
          0.009988253 = score(doc=1830,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1920054 = fieldWeight in 1830, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1830)
        0.036322083 = weight(_text_:retrieval in 1830) [ClassicSimilarity], result of:
          0.036322083 = score(doc=1830,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40520695 = fieldWeight in 1830, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1830)
      0.14285715 = coord(2/14)
    
    Abstract
    Automatic keyword extraction is an important and fundamental technology in advanced information retrieval systems. Briefly compares several major keyword extraction methods, lists their advantages and disadvantages, and reports recent research progress in Taiwan. Also describes the application of a keyword extraction algorithm in an information retrieval system for relevance feedback. Preliminary analysis shows that the error rate of extracting relevant keywords is 18%, and that the precision rate is over 50%. The main disadvantage of this approach is that the extraction results depend on the retrieval results, which in turn depend on the data held by the database. Apart from collecting more data, this problem can be alleviated by the application of a thesaurus constructed by the same keyword extraction algorithm
  11. Burgin, R.: ¬The effect of indexing exhaustivity on retrieval performance (1991) 0.01
    0.006606658 = product of:
      0.046246603 = sum of:
        0.0060537956 = weight(_text_:information in 5262) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=5262,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 5262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5262)
        0.04019281 = weight(_text_:retrieval in 5262) [ClassicSimilarity], result of:
          0.04019281 = score(doc=5262,freq=10.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.44838852 = fieldWeight in 5262, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5262)
      0.14285715 = coord(2/14)
    
    Abstract
    The study was based on the collection examined by W.H. Shaw (Inf. proc. man. 26(1990) no.6, S.693-703, 705-718), a test collection of 1239 articles, indexed with the term cystic fibrosis; and 100 queries with 3 sets of relevance evaluations from subject experts. The effect of variations in indexing exhaustivity on retrieval performance in a vector space retrieval system was investigated by using a term weight threshold to construct different document representations for a test collection. Retrieval results showed that retrieval performance, as measured by the mean optimal measure for all queries at a term weight threshold, was highest at the most exhaustive representation, and decreased slightly as terms were eliminated and the indexing representation became less exhaustive. The findings suggest that the vector space model is more robust against variations in indexing exhaustivity than is the single-link clustering model
    Source
    Information processing and management. 27(1991) no.6, S.623-628
  12. Boyce, B.R.; McLain, J.P.: Entry point depth and online search using a controlled vocabulary (1989) 0.01
    0.0061978353 = product of:
      0.043384846 = sum of:
        0.0070627616 = weight(_text_:information in 2287) [ClassicSimilarity], result of:
          0.0070627616 = score(doc=2287,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13576832 = fieldWeight in 2287, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2287)
        0.036322083 = weight(_text_:retrieval in 2287) [ClassicSimilarity], result of:
          0.036322083 = score(doc=2287,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.40520695 = fieldWeight in 2287, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2287)
      0.14285715 = coord(2/14)
    
    Abstract
    The depth of indexing, the number of terms assigned on average to each document in a retrieval system as entry points, has a significant effect on the standard retrieval performance measures in modern commercial retrieval systems, just as it did in previous experimental work. Tests on the effect of basic index search, as opposed to controlled vocabulary search, in these real systems are quite different than traditional comparisons of free text searching with controlled vocabulary searching. In modern commercial systems the controlled vocabulary serves as a precision device, since the structure of the default for unqualified search terms in these systems requires that it do so.
    Source
    Journal of the American Society for Information Science. 40(1989), S.273-276
  13. Harter, S.P.; Cheng, Y.-R.: Colinked descriptors : improving vocabulary selection for end-user searching (1996) 0.00
    0.0048545036 = product of:
      0.033981524 = sum of:
        0.00856136 = weight(_text_:information in 4216) [ClassicSimilarity], result of:
          0.00856136 = score(doc=4216,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16457605 = fieldWeight in 4216, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4216)
        0.025420163 = weight(_text_:retrieval in 4216) [ClassicSimilarity], result of:
          0.025420163 = score(doc=4216,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.2835858 = fieldWeight in 4216, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4216)
      0.14285715 = coord(2/14)
    
    Abstract
    This article introduces a new concept and technique for information retrieval called 'colinked descriptors'. Borrowed from an analogous idea in bibliometrics - cocited references - colinked descriptors provide a theory and method for identifying search terms that, by hypothesis, will be superior to those entered initially by a searcher. The theory suggests a means of moving automatically from 2 or more initial search terms, to other terms that should be superior in retrieval performance to the 2 original terms. A research project designed to test this colinked descriptor hypothesis is reported. The results suggest that the approach is effective, although methodological problems in testing the idea are reported. Algorithms to generate colinked descriptors can be incorporated easily into system interfaces, front-end or pre-search systems, or help software, in any database that employs a thesaurus. The potential use of colinked descriptors is a strong argument for building richer and more complex thesauri that reflect as many legitimate links among descriptors as possible
    Source
    Journal of the American Society for Information Science. 47(1996) no.4, S.311-325
  14. Braam, R.R.; Bruil, J.: Quality of indexing information : authors' views on indexing of their articles in chemical abstracts online CA-file (1992) 0.00
    0.0048545036 = product of:
      0.033981524 = sum of:
        0.00856136 = weight(_text_:information in 2638) [ClassicSimilarity], result of:
          0.00856136 = score(doc=2638,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.16457605 = fieldWeight in 2638, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2638)
        0.025420163 = weight(_text_:retrieval in 2638) [ClassicSimilarity], result of:
          0.025420163 = score(doc=2638,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.2835858 = fieldWeight in 2638, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2638)
      0.14285715 = coord(2/14)
    
    Abstract
    Studies the quality of subject indexing by Chemical Abstracts Indexing Service by confronting authors with the particular indexing terms attributed to their computer, for 270 articles published in 54 journals, 5 articles out of each journal. Responses (80%) indicate the superior quality of keywords, both as content descriptors and as retrieval tools. Author judgements on these 2 different aspects do not always converge, however. CAS's indexing policy to cover only 'new' aspects is reflected in authors' judgements that index lists are somewhat incomplete, in particular in the case of thesaurus terms (index headings). The large effort expended by CAS in maintaining and using a subject thesaurus, in order to select valid index headings, as compared to quick and cheap keyword postings, does not lead to clearly superior quality of thesaurus terms for document description nor in retrieval. Some 20% of papers were not placed in 'proper' CA main section, according to authors. As concerns the use of indexing data by third parties, in bibliometrics, users should be aware of the indexing policies behind the data, in order to prevent invalid interpretations
    Source
    Journal of information science. 18(1992) no.5, S.399-408
  15. Soergel, D.: Indexing and retrieval performance : the logical evidence (1997) 0.00
    0.0045768693 = product of:
      0.032038085 = sum of:
        0.008071727 = weight(_text_:information in 578) [ClassicSimilarity], result of:
          0.008071727 = score(doc=578,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1551638 = fieldWeight in 578, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=578)
        0.023966359 = weight(_text_:retrieval in 578) [ClassicSimilarity], result of:
          0.023966359 = score(doc=578,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.26736724 = fieldWeight in 578, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=578)
      0.14285715 = coord(2/14)
    
    Imprint
    The Hague : International Federation for Information and Documentation (FID)
  16. Saarti, J.: Consistency of subject indexing of novels by public library professionals and patrons (2002) 0.00
    0.0045768693 = product of:
      0.032038085 = sum of:
        0.008071727 = weight(_text_:information in 4473) [ClassicSimilarity], result of:
          0.008071727 = score(doc=4473,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1551638 = fieldWeight in 4473, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4473)
        0.023966359 = weight(_text_:retrieval in 4473) [ClassicSimilarity], result of:
          0.023966359 = score(doc=4473,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.26736724 = fieldWeight in 4473, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=4473)
      0.14285715 = coord(2/14)
    
    Abstract
    The paper discusses the consistency of fiction indexing of library professionals and patrons based on an empirical test. Indexing was carried out with a Finnish fictional thesaurus and all of the test persons indexed the same five novels. The consistency of indexing was determined to be low; several reasons are postulated. Also an algorithm for typified indexing of fiction is given as well as some suggestions for the development of fiction information retrieval systems and content representation.
  17. Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.00
    0.004427025 = product of:
      0.030989174 = sum of:
        0.0050448296 = weight(_text_:information in 4214) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=4214,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 4214, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4214)
        0.025944345 = weight(_text_:retrieval in 4214) [ClassicSimilarity], result of:
          0.025944345 = score(doc=4214,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.28943354 = fieldWeight in 4214, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4214)
      0.14285715 = coord(2/14)
    
    Abstract
    An important stage in the process of retrieval of objects from a hypertext database is the creation of a set of internodal links that are intended to represent the relationships existing between objects; this operation is often undertaken manually, just as index terms are often manually assigned to documents in a conventional retrieval system. In an earlier article (1994), the results were published of a study in which several different sets of links were inserted, each by a different person, between the paragraphs of each of a number of full-text documents. These results showed little similarity between the link-sets, a finding that was comparable with those of studies of inter-indexer consistency, which suggest that there is generally only a low level of agreement between the sets of index terms assigned to a document by different indexers. In this article, a description is provided of an investigation into the nature of the relationship existing between (i) the levels of inter-linker consistency obtaining among the group of hypertext databases used in our earlier experiments, and (ii) the levels of effectiveness of a number of searches carried out in those databases. An account is given of the implementation of the searches and of the methods used in the calculation of numerical values expressing their effectiveness. Analysis of the results of a comparison between recorded levels of consistency and those of effectiveness does not allow us to draw conclusions about the consistency - effectiveness relationship that are equivalent to those drawn in comparable studies of inter-indexer consistency
    Source
    Journal of the American Society for Information Science. 47(1996) no.4, S.287-300
  18. Wolfram, D.; Zhang, J.: ¬An investigation of the influence of indexing exhaustivity and term distributions on a document space (2002) 0.00
    0.0037468998 = product of:
      0.026228298 = sum of:
        0.0050448296 = weight(_text_:information in 5238) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=5238,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 5238, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5238)
        0.021183468 = weight(_text_:retrieval in 5238) [ClassicSimilarity], result of:
          0.021183468 = score(doc=5238,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.23632148 = fieldWeight in 5238, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5238)
      0.14285715 = coord(2/14)
    
    Abstract
    Wolfram and Zhang are interested in the effect of different indexing exhaustivity, by which they mean the number of terms chosen, and of different index term distributions and different term weighting methods on the resulting document cluster organization. The Distance Angle Retrieval Environment, DARE, which provides a two-dimensional display of retrieved documents, was used to represent the document clusters based upon a document's distance from the searcher's main interest, and on the angle formed by the document, a point representing a minor interest, and the point representing the main interest. If the centroid and the origin of the document space are assigned as major and minor points, the average distance between documents and the centroid can be measured, providing an indication of cluster organization in the form of a size-normalized similarity measure. Using 500 records from NTIS and nine models created by intersecting low, observed, and high exhaustivity levels (based upon a negative binomial distribution) with shallow, observed, and steep term distributions (based upon a Zipf distribution), simulation runs were performed using inverse document frequency, inter-document term frequency, and inverse document frequency based upon both inter and intra-document frequencies. Low exhaustivity and shallow distributions result in a more dense document space and less effective retrieval. High exhaustivity and steeper distributions result in a more diffuse space.
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.11, S.944-952
  19. Larson, R.R.: Experiments in automatic Library of Congress Classification (1992) 0.00
    0.0034326524 = product of:
      0.024028566 = sum of:
        0.0060537956 = weight(_text_:information in 1054) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=1054,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 1054, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1054)
        0.01797477 = weight(_text_:retrieval in 1054) [ClassicSimilarity], result of:
          0.01797477 = score(doc=1054,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.20052543 = fieldWeight in 1054, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1054)
      0.14285715 = coord(2/14)
    
    Abstract
    This article presents the results of research into the automatic selection of Library of Congress Classification numbers based on the titles and subject headings in MARC records. The method used in this study was based on partial match retrieval techniques using various elements of new records (i.e., those to be classified) as "queries", and a test database of classification clusters generated from previously classified MARC records. Sixty individual methods for automatic classification were tested on a set of 283 new records, using all combinations of four different partial match methods, five query types, and three representations of search terms. The results indicate that if the best method for a particular case can be determined, then up to 86% of the new records may be correctly classified. The single method with the best accuracy was able to select the correct classification for about 46% of the new records.
    Source
    Journal of the American Society for Information Science. 43(1992), S.130-148
  20. Chartron, G.; Dalbin, S.; Monteil, M.-G.; Verillon, M.: Indexation manuelle et indexation automatique : dépasser les oppositions (1989) 0.00
    0.0032137486 = product of:
      0.044992477 = sum of:
        0.044992477 = weight(_text_:wide in 3516) [ClassicSimilarity], result of:
          0.044992477 = score(doc=3516,freq=2.0), product of:
            0.1312982 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.029633347 = queryNorm
            0.342674 = fieldWeight in 3516, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3516)
      0.071428575 = coord(1/14)
    
    Abstract
    Report of a study comparing 2 methods of indexing: LEXINET, a computerised system for indexing titles and summaries only; and manual indexing of full texts, using the thesaurus developed by French Electricity (EDF). Both systems were applied to a collection of approximately 2,000 documents on artificial intelligence from the EDF data base. The results were then analysed to compare quantitative performance (number and range of terms) and qualitative performance (ambiguity of terms, specificity, variability, consistency). Overall, neither system proved ideal: LEXINET was deficient as regards lack of accessibility and excessive ambiguity; while the manual system gave rise to an over-wide variation of terms. The ideal system would appear to be a combination of automatic and manual systems, on the evidence produced here.
