Search (9 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Indexierungsstudien"
  • type_ss:"a"
  1. Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.03
    0.030214114 = product of:
      0.06042823 = sum of:
        0.06042823 = product of:
          0.12085646 = sum of:
            0.12085646 = weight(_text_:assessment in 4152) [ClassicSimilarity], result of:
              0.12085646 = score(doc=4152,freq=4.0), product of:
                0.2801951 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.050750602 = queryNorm
                0.43132967 = fieldWeight in 4152, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4152)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
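    The breakdown above is Lucene's ClassicSimilarity explain output, and its arithmetic can be reproduced directly. The following sketch recomputes the score from the constants in the tree (the constants are copied from the explanation, not derived here):

    ```python
    # Recomputing the ClassicSimilarity score shown in the explain tree.
    # All constants (freq, idf, norms) are taken verbatim from the output above.
    import math

    freq = 4.0                # termFreq of "assessment" in doc 4152
    idf = 5.52102             # idf(docFreq=480, maxDocs=44218)
    query_norm = 0.050750602  # queryNorm
    field_norm = 0.0390625    # fieldNorm(doc=4152)

    tf = math.sqrt(freq)                     # 2.0
    query_weight = idf * query_norm          # 0.2801951
    field_weight = tf * idf * field_norm     # 0.43132967
    raw_score = query_weight * field_weight  # 0.12085646

    # The two coord(1/2) factors each halve the score.
    final_score = raw_score * 0.5 * 0.5      # ~0.030214114
    print(final_score)
    ```

    The same pattern (tf x idf x fieldNorm, scaled by queryWeight and coord) explains every score tree in this result list.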
    
    Abstract
    A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining the relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary, and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary - the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 75% of the uniqueness words - were compared and uniqueness factors developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subsets of terms produced by the exact and centroid discrimination values were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid) methods. An end-user evaluation showed that the generated terms were largely distinct and had values of word precision which were consistent with values of the search precision.
  2. Westerman, S.J.; Cribbin, T.; Collins, J.: Human assessments of document similarity (2010) 0.03
    0.025637524 = product of:
      0.05127505 = sum of:
        0.05127505 = product of:
          0.1025501 = sum of:
            0.1025501 = weight(_text_:assessment in 3915) [ClassicSimilarity], result of:
              0.1025501 = score(doc=3915,freq=2.0), product of:
                0.2801951 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.050750602 = queryNorm
                0.36599535 = fieldWeight in 3915, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3915)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Two studies are reported that examined the reliability of human assessments of document similarity and the association between human ratings and the results of n-gram automatic text analysis (ATA). Human interassessor reliability (IAR) was moderate to poor. However, correlations between average human ratings and n-gram solutions were strong. The average correlation between ATA and individual human solutions was greater than IAR. N-gram length influenced the strength of association, but optimum string length depended on the nature of the text (technical vs. nontechnical). We conclude that the methodology applied in previous studies may have led to overoptimistic views on human reliability, but that an optimal n-gram solution can provide a good approximation of the average human assessment of document similarity, a result that has important implications for future development of document visualization systems.
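    The n-gram automatic text analysis (ATA) described in this abstract can be sketched as a character n-gram profile comparison. The function names and the use of cosine similarity below are illustrative assumptions, not the authors' exact method:

    ```python
    # A minimal sketch of character n-gram text comparison: each document is
    # reduced to a vector of overlapping n-gram counts, and two documents are
    # compared by the cosine of their count vectors. Illustrative only.
    import math
    from collections import Counter

    def ngram_profile(text: str, n: int = 3) -> Counter:
        """Count overlapping character n-grams of length n."""
        text = text.lower()
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))

    def cosine_similarity(a: Counter, b: Counter) -> float:
        """Cosine of the angle between two n-gram count vectors."""
        dot = sum(a[g] * b[g] for g in set(a) & set(b))
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    doc1 = "human assessments of document similarity"
    doc2 = "automatic assessment of document similarity"
    print(cosine_similarity(ngram_profile(doc1), ngram_profile(doc2)))
    ```

    Varying `n` here corresponds to the string-length effect the study reports: the best n-gram length depends on whether the text is technical or nontechnical.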
  3. Veenema, F.: To index or not to index (1996) 0.01
    0.0137520125 = product of:
      0.027504025 = sum of:
        0.027504025 = product of:
          0.05500805 = sum of:
            0.05500805 = weight(_text_:22 in 7247) [ClassicSimilarity], result of:
              0.05500805 = score(doc=7247,freq=2.0), product of:
                0.17771997 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050750602 = queryNorm
                0.30952093 = fieldWeight in 7247, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7247)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Canadian journal of information and library science. 21(1996) no.2, S.1-22
  4. Booth, A.: How consistent is MEDLINE indexing? (1990) 0.01
    0.012033011 = product of:
      0.024066022 = sum of:
        0.024066022 = product of:
          0.048132043 = sum of:
            0.048132043 = weight(_text_:22 in 3510) [ClassicSimilarity], result of:
              0.048132043 = score(doc=3510,freq=2.0), product of:
                0.17771997 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050750602 = queryNorm
                0.2708308 = fieldWeight in 3510, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3510)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Health libraries review. 7(1990) no.1, S.22-26
  5. Neshat, N.; Horri, A.: A study of subject indexing consistency between the National Library of Iran and Humanities Libraries in the area of Iranian studies (2006) 0.01
    0.012033011 = product of:
      0.024066022 = sum of:
        0.024066022 = product of:
          0.048132043 = sum of:
            0.048132043 = weight(_text_:22 in 230) [ClassicSimilarity], result of:
              0.048132043 = score(doc=230,freq=2.0), product of:
                0.17771997 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050750602 = queryNorm
                0.2708308 = fieldWeight in 230, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=230)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    4. 1.2007 10:22:26
  6. Taniguchi, S.: Recording evidence in bibliographic records and descriptive metadata (2005) 0.01
    0.010314009 = product of:
      0.020628018 = sum of:
        0.020628018 = product of:
          0.041256037 = sum of:
            0.041256037 = weight(_text_:22 in 3565) [ClassicSimilarity], result of:
              0.041256037 = score(doc=3565,freq=2.0), product of:
                0.17771997 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050750602 = queryNorm
                0.23214069 = fieldWeight in 3565, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3565)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    18. 6.2005 13:16:22
  7. Leininger, K.: Interindexer consistency in PsychINFO (2000) 0.01
    0.010314009 = product of:
      0.020628018 = sum of:
        0.020628018 = product of:
          0.041256037 = sum of:
            0.041256037 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
              0.041256037 = score(doc=2552,freq=2.0), product of:
                0.17771997 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050750602 = queryNorm
                0.23214069 = fieldWeight in 2552, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2552)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    9. 2.1997 18:44:22
  8. Subrahmanyam, B.: Library of Congress Classification numbers : issues of consistency and their implications for union catalogs (2006) 0.01
    0.0085950075 = product of:
      0.017190015 = sum of:
        0.017190015 = product of:
          0.03438003 = sum of:
            0.03438003 = weight(_text_:22 in 5784) [ClassicSimilarity], result of:
              0.03438003 = score(doc=5784,freq=2.0), product of:
                0.17771997 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050750602 = queryNorm
                0.19345059 = fieldWeight in 5784, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5784)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    10. 9.2000 17:38:22
  9. White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014) 0.01
    0.0085950075 = product of:
      0.017190015 = sum of:
        0.017190015 = product of:
          0.03438003 = sum of:
            0.03438003 = weight(_text_:22 in 1781) [ClassicSimilarity], result of:
              0.03438003 = score(doc=1781,freq=2.0), product of:
                0.17771997 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050750602 = queryNorm
                0.19345059 = fieldWeight in 1781, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1781)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE.
    Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results.
    Findings - Results suggest that HIVE improves inter-indexing consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges.
    Research limitations/implications - Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system.
    Originality/value - This paper showcases several features used in the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next-generation machine-aided indexing (MAI) tools.
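    The two consistency measures named in the abstract have standard formulations in the indexing literature: Hooper's measure divides the common terms by the union of terms assigned, while Rolling's divides twice the common terms by the total terms assigned. A sketch (the example term sets are invented):

    ```python
    # Standard inter-indexer consistency measures. Given the term sets two
    # indexers assigned to the same document:
    #   Hooper:  C / (A + B - C)   (common terms over the union)
    #   Rolling: 2C / (A + B)      (common terms over total assigned)
    # where A and B are the counts of terms assigned and C the count in common.
    def hooper(terms_a: set, terms_b: set) -> float:
        common = len(terms_a & terms_b)
        union = len(terms_a) + len(terms_b) - common
        return common / union if union else 0.0

    def rolling(terms_a: set, terms_b: set) -> float:
        total = len(terms_a) + len(terms_b)
        return 2 * len(terms_a & terms_b) / total if total else 0.0

    # Invented example: two indexers agree on 2 of 4 distinct terms.
    indexer1 = {"indexing", "consistency", "vocabulary"}
    indexer2 = {"indexing", "vocabulary", "semantic web"}
    print(hooper(indexer1, indexer2))   # 2 / 4 = 0.5
    print(rolling(indexer1, indexer2))  # 4 / 6 ~ 0.667
    ```

    Rolling's measure is always at least as large as Hooper's on the same data, which is consistent with the different percentage gains (22 vs. 25 percent) the study reports for the two measures.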