Search (20 results, page 1 of 1)

Leininger, K.: Interindexer consistency in PsycINFO (2000) 0.04
```
0.039546072 = product of:
  0.11863822 = sum of:
    0.11863822 = sum of:
      0.0775097 = weight(_text_:database in 2552) [ClassicSimilarity], result of:
        0.0775097 = score(doc=2552,freq=4.0), product of:
          0.20452234 = queryWeight, product of:
            4.042444 = idf(docFreq=2109, maxDocs=44218)
            0.050593734 = queryNorm
          0.37897915 = fieldWeight in 2552, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            4.042444 = idf(docFreq=2109, maxDocs=44218)
            0.046875 = fieldNorm(doc=2552)
      0.041128512 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
        0.041128512 = score(doc=2552,freq=2.0), product of:
          0.17717063 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050593734 = queryNorm
          0.23214069 = fieldWeight in 2552, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2552)
  0.33333334 = coord(1/3)
```
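The explain tree above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of Lucene, and all numeric inputs are copied from the explain output for doc 2552.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # tf(freq)
    term_idf = idf(doc_freq, max_docs)          # idf(docFreq, maxDocs)
    query_weight = term_idf * query_norm        # queryWeight
    field_weight = tf * term_idf * field_norm   # fieldWeight
    return query_weight * field_weight


# Values taken from the explain output for doc 2552.
QUERY_NORM = 0.050593734
MAX_DOCS = 44218
FIELD_NORM = 0.046875

database = term_score(4.0, 2109, MAX_DOCS, QUERY_NORM, FIELD_NORM)
twenty_two = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The two term scores are summed, then scaled by coord(1/3).
total = (database + twenty_two) * (1 / 3)
print(f"{total:.6f}")
```

The result should agree with the reported 0.039546072 up to single-precision rounding, since Lucene computes these values in 32-bit floats.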
Abstract

Reports results of a study to examine interindexer consistency (the degree to which indexers, when assigning terms to a chosen record, will choose the same terms to reflect that record) in the PsycINFO database using 60 records that were inadvertently processed twice between 1996 and 1998. Five aspects of interindexer consistency were analysed. Two methods were used to calculate interindexer consistency: one posited by Hooper (1965) and the other by Rolling (1981). Aspects analysed were: checktag consistency (66.24% using Hooper's calculation and 77.17% using Rolling's); major-to-all term consistency (49.31% and 62.59% respectively); overall indexing consistency (49.02% and 63.32%); classification code consistency (44.17% and 45.00%); and major-to-major term consistency (43.24% and 56.09%). The average consistency across all categories was 50.4% using Hooper's method and 60.83% using Rolling's. Although comparison with previous studies is difficult due to methodological variations in the overall study of indexing consistency and the specific characteristics of the database, results generally support previous findings when trends and similar studies are analysed.
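The abstract does not spell out the two consistency measures. As a rough sketch, assuming their commonly cited formulations (Hooper: terms in agreement divided by all distinct terms used by either indexer; Rolling: twice the terms in agreement divided by the total number of terms assigned by both indexers), they could be computed as follows. The function names and the sample terms are hypothetical and are not taken from the study.

```python
def hooper_consistency(terms_a: set, terms_b: set) -> float:
    """Assumed Hooper measure: A / (A + M + N).

    A = terms both indexers assigned; M and N = terms assigned
    by only one of the two indexers.
    """
    union = terms_a | terms_b
    if not union:
        raise ValueError("at least one indexer must assign a term")
    return len(terms_a & terms_b) / len(union)


def rolling_consistency(terms_a: set, terms_b: set) -> float:
    """Assumed Rolling measure: 2A / (B + C).

    B and C = total number of terms assigned by each indexer.
    """
    total = len(terms_a) + len(terms_b)
    if total == 0:
        raise ValueError("at least one indexer must assign a term")
    return 2 * len(terms_a & terms_b) / total


# Hypothetical term sets for one record indexed twice.
first = {"memory", "aging", "cognition"}
second = {"aging", "cognition", "dementia"}

print(hooper_consistency(first, second))   # 2 shared / 4 distinct = 0.5
print(rolling_consistency(first, second))  # 4 / 6, about 0.667
```

Under these definitions Rolling's value is never lower than Hooper's for the same pair of term sets, which is in line with the higher Rolling percentages reported in the abstract.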

Date

9. 2.1997 18:44:22
Qin, J.: Semantic similarities between a keyword database and a controlled vocabulary database : an investigation in the antibiotic resistance literature (2000) 0.02
```
0.015224343 = product of:
  0.045673028 = sum of:
    0.045673028 = product of:
      0.091346055 = sum of:
        0.091346055 = weight(_text_:database in 4386) [ClassicSimilarity], result of:
          0.091346055 = score(doc=4386,freq=8.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.4466312 = fieldWeight in 4386, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4386)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

The 'KeyWords Plus' in the Science Citation Index database represents an approach to combining citation and semantic indexing in describing the document content. This paper explores the similarities or dissimilarities between citation-semantic and analytic indexing. The dataset consisted of over 400 matching records in the SCI and MEDLINE databases on antibiotic resistance in pneumonia. The degree of similarity in indexing terms was found to vary on a scale from completely different to completely identical with various levels in between. The within-document similarity in the 2 databases was measured by a variation on the Jaccard coefficient - the Inclusion Index. The average inclusion coefficient was 0.4134 for SCI and 0.3371 for MEDLINE. The 20 terms occurring most frequently in each database were identified. The 2 groups of terms shared the same terms that constitute the 'intellectual base' for the subject. Conceptual similarity was analyzed through scatterplots of matching and nonmatching terms vs. partially identical and broader/narrower terms. The study also found that both databases differed in assigning terms in various semantic categories. Implications of this research and further studies are suggested.
Deaves, J.C.; Pache, J.E.: Chemical and numerical indexing for the INSPEC database (1989) 0.02
```
0.0150713315 = product of:
  0.045213994 = sum of:
    0.045213994 = product of:
      0.09042799 = sum of:
        0.09042799 = weight(_text_:database in 2289) [ClassicSimilarity], result of:
          0.09042799 = score(doc=2289,freq=4.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.44214234 = fieldWeight in 2289, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2289)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

The wealth of chemical information on the INSPEC database is easily retrieved using the printed subject indexes to the associated abstract journals. However, this subject indexing is insufficient for machine retrieval, and free-text searching has special difficulties. An easy-to-use retrieval system has been developed which overcomes many problems, especially the retrieval of non-stoichiometric compositions, which are a feature of solid-state chemistry. The scheme is limited to inorganic material, but allows flexibility and identification of dopants, interfaces and surfaces or substrates. At the same time, a system has been introduced for the online retrieval of numerical data included in the database. This has successfully standardized the way in which such data is held for searching, enabling further refinement of searches where numerical information is significant.
Hersh, W.R.; Hickam, D.H.: ¬A comparison of two methods for indexing and retrieval from a full-text medical database (1992) 0.02
```
0.0150713315 = product of:
  0.045213994 = sum of:
    0.045213994 = product of:
      0.09042799 = sum of:
        0.09042799 = weight(_text_:database in 4526) [ClassicSimilarity], result of:
          0.09042799 = score(doc=4526,freq=4.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.44214234 = fieldWeight in 4526, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4526)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Reports results of a study of 2 information retrieval systems on a 2,000-document full-text medical database. The first system, SAPHIRE, features concept-based automatic indexing and statistical retrieval techniques, while the second system, SWORD, features traditional word-based Boolean techniques. Sixteen medical students at Oregon Health Sciences University each performed 10 searches, and their results, recorded in terms of recall and precision, showed nearly equal performance for both systems. SAPHIRE was also compared with a version of SWORD modified to use automatic indexing and ranked retrieval. Using batch input of queries, the latter method performed slightly better.

Cleverdon, C.W.: ASLIB Cranfield Research Project : Report on the first stage of an investigation into the comparative efficiency of indexing systems (1960) 0.01

0.013709504 = product of:
  0.041128512 = sum of:
    0.041128512 = product of:
      0.082257025 = sum of:
        0.082257025 = weight(_text_:22 in 6158) [ClassicSimilarity], result of:
          0.082257025 = score(doc=6158,freq=2.0), product of:
            0.17717063 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050593734 = queryNorm
            0.46428138 = fieldWeight in 6158, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=6158)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Footnote: Rez. in: College and research libraries 22(1961) no.3, S.228 (G. Jahoda)

Gil-Leiva, I.; Alonso-Arroyo, A.: Keywords given by authors of scientific articles in database descriptors (2007) 0.01
```
0.0131846685 = product of:
  0.039554004 = sum of:
    0.039554004 = product of:
      0.07910801 = sum of:
        0.07910801 = weight(_text_:database in 211) [ClassicSimilarity], result of:
          0.07910801 = score(doc=211,freq=6.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.38679397 = fieldWeight in 211, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=211)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

In this article, the authors analyze the keywords given by authors of scientific articles and the descriptors assigned to the articles to ascertain the presence of the keywords in the descriptors. Six hundred forty INSPEC (Information Service for Physics, Engineering, and Computing), CAB (Current Agriculture Bibliography) abstracts, ISTA (Information Science and Technology Abstracts), and LISA (Library and Information Science Abstracts) database records were consulted. After detailed comparisons, it was found that keywords provided by authors have an important presence in the database descriptors studied; nearly 25% of all the keywords appeared in exactly the same form as descriptors, with another 21%, though normalized, still detected in the descriptors. This means that almost 46% of keywords appear in the descriptors, either as such or after normalization. Elsewhere, three distinct indexing policies appear: one represented by INSPEC and LISA (indexers seem to have freedom to assign the descriptors they deem necessary); another represented by CAB (no record has fewer than four descriptors and, in general, a large number of descriptors is employed). In contrast, in ISTA, a certain institutional code exists towards economy in indexing because 84% of records contain only four descriptors.

Edwards, S.: Indexing practices at the National Agricultural Library (1993) 0.01

0.012179474 = product of:
  0.036538422 = sum of:
    0.036538422 = product of:
      0.073076844 = sum of:
        0.073076844 = weight(_text_:database in 555) [ClassicSimilarity], result of:
          0.073076844 = score(doc=555,freq=2.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.35730496 = fieldWeight in 555, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0625 = fieldNorm(doc=555)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Abstract: This article discusses indexing practices at the National Agricultural Library. Indexers at NAL scan over 2,200 incoming journals for input into its bibliographic database, AGRICOLA. The National Agricultural Library's coverage extends worldwide and spans a broad range of agricultural subjects. Access to AGRICOLA occurs in several ways: onsite search, commercial vendors, Dialog Information Services, Inc. and BRS Information Technologies. The National Agricultural Library uses the CAB THESAURUS to describe the subject content of articles in AGRICOLA.

Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.01
```
0.010765236 = product of:
  0.032295708 = sum of:
    0.032295708 = product of:
      0.064591415 = sum of:
        0.064591415 = weight(_text_:database in 4152) [ClassicSimilarity], result of:
          0.064591415 = score(doc=4152,freq=4.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.31581596 = fieldWeight in 4152, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4152)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary - the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 75% of the uniqueness words - were compared and uniqueness factors developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subsets of terms produced by the exact and centroid discrimination value were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid). An end-user evaluation showed that the generated terms were largely distinct and had values of word precision which were consistent with values of the search precision.
Bade, D.: ¬The creation and persistence of misinformation in shared library catalogs : language and subject knowledge in a technological era (2002) 0.01
```
0.010659572 = product of:
  0.031978715 = sum of:
    0.031978715 = sum of:
      0.018269211 = weight(_text_:database in 1858) [ClassicSimilarity], result of:
        0.018269211 = score(doc=1858,freq=2.0), product of:
          0.20452234 = queryWeight, product of:
            4.042444 = idf(docFreq=2109, maxDocs=44218)
            0.050593734 = queryNorm
          0.08932624 = fieldWeight in 1858, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            4.042444 = idf(docFreq=2109, maxDocs=44218)
            0.015625 = fieldNorm(doc=1858)
      0.013709505 = weight(_text_:22 in 1858) [ClassicSimilarity], result of:
        0.013709505 = score(doc=1858,freq=2.0), product of:
          0.17717063 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050593734 = queryNorm
          0.07738023 = fieldWeight in 1858, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.015625 = fieldNorm(doc=1858)
  0.33333334 = coord(1/3)
```
Date

22. 9.1997 19:16:05

Footnote

Bade begins his discussion of errors in subject analysis by summarizing the contents of seven records containing what he considers to be egregious errors. The examples were drawn only from items that he has encountered in the course of his work. Five of the seven records were full-level ("I" level) records for Eastern European materials created between 1996 and 2000 in the OCLC WorldCat database. The final two examples were taken from records created by Bade himself over an unspecified period of time. Although he is to be commended for examining the actual items cataloged and for examining mostly items that he claims to have adequate linguistic and subject expertise to evaluate reliably, Bade's methodology has major flaws. First and foremost, the number of examples provided is completely inadequate to draw any conclusions about the extent of the problem. Although an in-depth qualitative analysis of a small number of records might have yielded some valuable insight into factors that contribute to errors in subject analysis, Bade provides no information about the circumstances under which the live OCLC records he critiques were created. Instead, he offers simplistic explanations for the errors based solely on his own assumptions. He supplements his analysis of examples with an extremely brief survey of other studies regarding errors in subject analysis, which consists primarily of criticism of work done by Sheila Intner. In the end, it is impossible to draw any reliable conclusions about the nature or extent of errors in subject analysis found in records in shared bibliographic databases based on Bade's analysis. In the final third of the essay, Bade finally reveals his true concern: the deintellectualization of cataloging. It would strengthen the essay tremendously to present this as the primary premise from the very beginning, as this section offers glimpses of a compelling argument.
Bade laments, "Many librarians simply do not see cataloging as an intellectual activity requiring an educated mind" (p. 20). Commenting on recent trends in copy cataloging practice, he declares, "The disaster of our time is that this work is being done more and more by people who can neither evaluate nor correct imported errors and often are forbidden from even thinking about it" (p. 26). Bade argues that the most valuable content found in catalog records is the intellectual content contributed by knowledgeable catalogers, and he asserts that to perform intellectually demanding tasks such as subject analysis reliably and effectively, catalogers must have the linguistic and subject knowledge required to gain at least a rudimentary understanding of the materials that they describe. He contends that requiring catalogers to quickly dispense with materials in unfamiliar languages and subjects clearly undermines their ability to perform the intellectual work of cataloging and leads to an increasing number of errors in the bibliographic records contributed to shared databases.
Krovetz, R.; Croft, W.B.: Lexical ambiguity and information retrieval (1992) 0.01
```
0.01065704 = product of:
  0.03197112 = sum of:
    0.03197112 = product of:
      0.06394224 = sum of:
        0.06394224 = weight(_text_:database in 4028) [ClassicSimilarity], result of:
          0.06394224 = score(doc=4028,freq=2.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.31264183 = fieldWeight in 4028, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4028)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Reports on an analysis of lexical ambiguity in information retrieval text collections and on experiments to determine the utility of word meanings for separating relevant from nonrelevant documents. Results show that there is considerable ambiguity even in a specialised database. Word senses provide a significant separation between relevant and nonrelevant documents, but several factors determine whether disambiguation will improve performance. For example, resolving lexical ambiguity was found to have little impact on retrieval effectiveness for documents that have many words in common with the query. Discusses other uses of word sense disambiguation in an information retrieval context.
Tseng, Y.-H.: Keyword extraction techniques and relevance feedback (1997) 0.01
```
0.01065704 = product of:
  0.03197112 = sum of:
    0.03197112 = product of:
      0.06394224 = sum of:
        0.06394224 = weight(_text_:database in 1830) [ClassicSimilarity], result of:
          0.06394224 = score(doc=1830,freq=2.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.31264183 = fieldWeight in 1830, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1830)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Automatic keyword extraction is an important and fundamental technology in advanced information retrieval systems. Briefly compares several major keyword extraction methods, lists their advantages and disadvantages, and reports recent research progress in Taiwan. Also describes the application of a keyword extraction algorithm in an information retrieval system for relevance feedback. Preliminary analysis shows that the error rate of extracting relevant keywords is 18%, and that the precision rate is over 50%. The main disadvantage of this approach is that the extraction results depend on the retrieval results, which in turn depend on the data held by the database. Apart from collecting more data, this problem can be alleviated by the application of a thesaurus constructed by the same keyword extraction algorithm.

Veenema, F.: To index or not to index (1996) 0.01

0.00913967 = product of:
  0.02741901 = sum of:
    0.02741901 = product of:
      0.05483802 = sum of:
        0.05483802 = weight(_text_:22 in 7247) [ClassicSimilarity], result of:
          0.05483802 = score(doc=7247,freq=2.0), product of:
            0.17717063 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050593734 = queryNorm
            0.30952093 = fieldWeight in 7247, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=7247)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Canadian journal of information and library science. 21(1996) no.2, S.1-22

Larson, R.R.: Experiments in automatic Library of Congress Classification (1992) 0.01
```
0.009134606 = product of:
  0.027403818 = sum of:
    0.027403818 = product of:
      0.054807637 = sum of:
        0.054807637 = weight(_text_:database in 1054) [ClassicSimilarity], result of:
          0.054807637 = score(doc=1054,freq=2.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.26797873 = fieldWeight in 1054, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.046875 = fieldNorm(doc=1054)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

This article presents the results of research into the automatic selection of Library of Congress Classification numbers based on the titles and subject headings in MARC records. The method used in this study was based on partial match retrieval techniques using various elements of new records (i.e., those to be classified) as "queries", and a test database of classification clusters generated from previously classified MARC records. Sixty individual methods for automatic classification were tested on a set of 283 new records, using all combinations of four different partial match methods, five query types, and three representations of search terms. The results indicate that if the best method for a particular case can be determined, then up to 86% of the new records may be correctly classified. The single method with the best accuracy was able to select the correct classification for about 46% of the new records.
Harter, S.P.; Cheng, Y.-R.: Colinked descriptors : improving vocabulary selection for end-user searching (1996) 0.01
```
0.009134606 = product of:
  0.027403818 = sum of:
    0.027403818 = product of:
      0.054807637 = sum of:
        0.054807637 = weight(_text_:database in 4216) [ClassicSimilarity], result of:
          0.054807637 = score(doc=4216,freq=2.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.26797873 = fieldWeight in 4216, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.046875 = fieldNorm(doc=4216)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

This article introduces a new concept and technique for information retrieval called 'colinked descriptors'. Borrowed from an analogous idea in bibliometrics - cocited references - colinked descriptors provide a theory and method for identifying search terms that, by hypothesis, will be superior to those entered initially by a searcher. The theory suggests a means of moving automatically from 2 or more initial search terms to other terms that should be superior in retrieval performance to the original terms. A research project designed to test this colinked descriptor hypothesis is reported. The results suggest that the approach is effective, although methodological problems in testing the idea are reported. Algorithms to generate colinked descriptors can be incorporated easily into system interfaces, front-end or pre-search systems, or help software, in any database that employs a thesaurus. The potential use of colinked descriptors is a strong argument for building richer and more complex thesauri that reflect as many legitimate links among descriptors as possible.

Booth, A.: How consistent is MEDLINE indexing? (1990) 0.01

0.007997211 = product of:
  0.023991633 = sum of:
    0.023991633 = product of:
      0.047983266 = sum of:
        0.047983266 = weight(_text_:22 in 3510) [ClassicSimilarity], result of:
          0.047983266 = score(doc=3510,freq=2.0), product of:
            0.17717063 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050593734 = queryNorm
            0.2708308 = fieldWeight in 3510, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3510)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Source: Health libraries review. 7(1990) no.1, S.22-26

Neshat, N.; Horri, A.: ¬A study of subject indexing consistency between the National Library of Iran and Humanities Libraries in the area of Iranian studies (2006) 0.01

0.007997211 = product of:
  0.023991633 = sum of:
    0.023991633 = product of:
      0.047983266 = sum of:
        0.047983266 = weight(_text_:22 in 230) [ClassicSimilarity], result of:
          0.047983266 = score(doc=230,freq=2.0), product of:
            0.17717063 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050593734 = queryNorm
            0.2708308 = fieldWeight in 230, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=230)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 4. 1.2007 10:22:26

Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.01
```
0.0076121716 = product of:
  0.022836514 = sum of:
    0.022836514 = product of:
      0.045673028 = sum of:
        0.045673028 = weight(_text_:database in 4214) [ClassicSimilarity], result of:
          0.045673028 = score(doc=4214,freq=2.0), product of:
            0.20452234 = queryWeight, product of:
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.050593734 = queryNorm
            0.2233156 = fieldWeight in 4214, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.042444 = idf(docFreq=2109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4214)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

An important stage in the process of retrieval of objects from a hypertext database is the creation of a set of internodal links that are intended to represent the relationships existing between objects; this operation is often undertaken manually, just as index terms are often manually assigned to documents in a conventional retrieval system. In an earlier article (1994), the results were published of a study in which several different sets of links were inserted, each by a different person, between the paragraphs of each of a number of full-text documents. These results showed little similarity between the link-sets, a finding that was comparable with those of studies of inter-indexer consistency, which suggest that there is generally only a low level of agreement between the sets of index terms assigned to a document by different indexers. In this article, a description is provided of an investigation into the nature of the relationship existing between (i) the levels of inter-linker consistency obtaining among the group of hypertext databases used in our earlier experiments, and (ii) the levels of effectiveness of a number of searches carried out in those databases. An account is given of the implementation of the searches and of the methods used in the calculation of numerical values expressing their effectiveness. Analysis of the results of a comparison between recorded levels of consistency and those of effectiveness does not allow us to draw conclusions about the consistency - effectiveness relationship that are equivalent to those drawn in comparable studies of inter-indexer consistency.

Taniguchi, S.: Recording evidence in bibliographic records and descriptive metadata (2005) 0.01

0.006854752 = product of:
  0.020564256 = sum of:
    0.020564256 = product of:
      0.041128512 = sum of:
        0.041128512 = weight(_text_:22 in 3565) [ClassicSimilarity], result of:
          0.041128512 = score(doc=3565,freq=2.0), product of:
            0.17717063 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050593734 = queryNorm
            0.23214069 = fieldWeight in 3565, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=3565)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 18. 6.2005 13:16:22

Subrahmanyam, B.: Library of Congress Classification numbers : issues of consistency and their implications for union catalogs (2006) 0.01

0.005712294 = product of:
  0.017136881 = sum of:
    0.017136881 = product of:
      0.034273762 = sum of:
        0.034273762 = weight(_text_:22 in 5784) [ClassicSimilarity], result of:
          0.034273762 = score(doc=5784,freq=2.0), product of:
            0.17717063 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050593734 = queryNorm
            0.19345059 = fieldWeight in 5784, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5784)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 10. 9.2000 17:38:22

White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014) 0.01
```
0.005712294 = product of:
  0.017136881 = sum of:
    0.017136881 = product of:
      0.034273762 = sum of:
        0.034273762 = weight(_text_:22 in 1781) [ClassicSimilarity], result of:
          0.034273762 = score(doc=1781,freq=2.0), product of:
            0.17717063 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050593734 = queryNorm
            0.19345059 = fieldWeight in 1781, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1781)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE. Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results. Findings - Results suggest that HIVE improves inter-indexing consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges. Research limitations/implications - Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system. Originality/value - This paper showcases several features used in the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next generation machine-aided indexing (MAI) tools.
