Search (23 results, page 1 of 2)

  • Active filter: theme_ss:"Indexierungsstudien"
  1. Hughes, A.V.; Rafferty, P.: Inter-indexer consistency in graphic materials indexing at the National Library of Wales (2011) 0.06
    0.06119176 = product of:
      0.24476704 = sum of:
        0.24476704 = weight(_text_:graphic in 4488) [ClassicSimilarity], result of:
          0.24476704 = score(doc=4488,freq=10.0), product of:
            0.29924196 = queryWeight, product of:
              6.6217136 = idf(docFreq=159, maxDocs=44218)
              0.045191016 = queryNorm
            0.8179569 = fieldWeight in 4488, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              6.6217136 = idf(docFreq=159, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4488)
      0.25 = coord(1/4)
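    The indented trace above is Lucene's "explain" output for the ClassicSimilarity (TF-IDF) scoring model. As a minimal sketch, its arithmetic can be reproduced in Python, assuming Lucene's standard formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs/(docFreq+1)); all constants below are copied from the trace:

      import math

      # Constants copied from the explain trace for result 1.
      max_docs, doc_freq = 44218, 159   # collection size; docs containing "graphic"
      freq = 10.0                       # occurrences of "graphic" in the field
      query_norm = 0.045191016          # query normalisation constant
      field_norm = 0.0390625            # stored length norm for the field
      coord = 1 / 4                     # 1 of 4 query clauses matched

      idf = 1 + math.log(max_docs / (doc_freq + 1))  # 6.6217136
      tf = math.sqrt(freq)                           # 3.1622777
      query_weight = idf * query_norm                # 0.29924196
      field_weight = tf * idf * field_norm           # 0.8179569
      score = query_weight * field_weight * coord    # 0.06119176
      print(f"score = {score:.8f}")

    The same structure recurs in every trace below; only the term statistics and the coord factor change.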
    
    Abstract
    Purpose - This paper seeks to report a project to investigate the degree of inter-indexer consistency in the assignment of controlled vocabulary topical subject index terms to identical graphical images by different indexers at the National Library of Wales (NLW).
    Design/methodology/approach - An experimental quantitative methodology was devised to investigate inter-indexer consistency. Additionally, the project investigated the relationship, if any, between indexing exhaustivity and consistency, and between indexing consistency/exhaustivity and the broad category of graphic format.
    Findings - Inter-indexer consistency in the assignment of topical subject index terms to graphic materials at the NLW was found to be generally low and highly variable, falling within the range of 10.8 per cent to 48.0 per cent. Indexing exhaustivity varied substantially from indexer to indexer, with a mean assignment of 3.8 terms by each indexer to each image, within the range of 2.5 to 4.7 terms. The broad category of graphic format, whether photographic or non-photographic, was found to have little influence on either inter-indexer consistency or indexing exhaustivity. Indexing exhaustivity and inter-indexer consistency exhibited a tendency toward a direct, positive relationship. The findings are necessarily limited, as this is a small-scale study within a single institution.
    Originality/value - Previous consistency studies have almost exclusively investigated the indexing of print materials, with very little research published for non-print media. With the literature also rich in discussion of the added complexities of subjectively representing the intellectual content of visual media, this study attempts to enrich existing knowledge on indexing consistency for graphic materials and to address a noticeable gap in information theory.
  2. Leininger, K.: Interindexer consistency in PsycINFO (2000) 0.02
    0.021289835 = product of:
      0.08515934 = sum of:
        0.08515934 = sum of:
          0.048422787 = weight(_text_:methods in 2552) [ClassicSimilarity], result of:
            0.048422787 = score(doc=2552,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.26651827 = fieldWeight in 2552, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.046875 = fieldNorm(doc=2552)
          0.03673655 = weight(_text_:22 in 2552) [ClassicSimilarity], result of:
            0.03673655 = score(doc=2552,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.23214069 = fieldWeight in 2552, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2552)
      0.25 = coord(1/4)
    
    Abstract
    Reports results of a study to examine interindexer consistency (the degree to which indexers, when assigning terms to a chosen record, will choose the same terms to reflect that record) in the PsycINFO database using 60 records that were inadvertently processed twice between 1996 and 1998. Five aspects of interindexer consistency were analysed. Two methods were used to calculate interindexer consistency: one posited by Hooper (1965) and the other by Rollin (1981). Aspects analysed were: checktag consistency (66.24% using Hooper's calculation and 77.17% using Rollin's); major-to-all term consistency (49.31% and 62.59% respectively); overall indexing consistency (49.02% and 63.32%); classification code consistency (44.17% and 45.00%); and major-to-major term consistency (43.24% and 56.09%). The average consistency across all categories was 50.4% using Hooper's method and 60.83% using Rollin's. Although comparison with previous studies is difficult due to methodological variations in the overall study of indexing consistency and the specific characteristics of the database, results generally support previous findings when trends and similar studies are analysed.
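    The abstract does not restate the two formulas. As a sketch, assuming the formulations usually attributed to Hooper (1965) - agreements over agreements plus terms unique to either indexer - and to Rollin (1981) - twice the agreements over the total number of terms assigned - with hypothetical term sets:

      def hooper(a: set, b: set) -> float:
          # Hooper (1965): A / (A + M + N), where A = terms in common,
          # M and N = terms unique to each indexer.
          common = len(a & b)
          return common / (common + len(a - b) + len(b - a))

      def rollin(a: set, b: set) -> float:
          # Rollin (1981): 2A / (total terms assigned by both indexers).
          return 2 * len(a & b) / (len(a) + len(b))

      x = {"indexing", "consistency", "databases", "psychology"}
      y = {"indexing", "consistency", "information retrieval"}
      print(f"Hooper: {hooper(x, y):.1%}, Rollin: {rollin(x, y):.1%}")

    By construction the second measure is never lower than the first, which matches the pattern of the percentages reported above.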
    Date
    9. 2.1997 18:44:22
  3. Lu, K.; Mao, J.; Li, G.: Toward effective automated weighted subject indexing : a comparison of different approaches in different environments (2018) 0.01
    0.011278818 = product of:
      0.045115273 = sum of:
        0.045115273 = product of:
          0.09023055 = sum of:
            0.09023055 = weight(_text_:methods in 4292) [ClassicSimilarity], result of:
              0.09023055 = score(doc=4292,freq=10.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.4966275 = fieldWeight in 4292, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4292)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Subject indexing plays an important role in supporting subject access to information resources. Current subject indexing systems do not make adequate distinctions regarding the importance of assigned subject descriptors. Assigning numeric weights to subject descriptors to distinguish their importance to the documents can strengthen the role of subject metadata, and automated weighting methods are more cost-effective than manual assignment. This study compares different automated weighting methods in different environments. Two evaluation methods were used to assess the performance. Experiments on three datasets in the biomedical domain suggest that the performance of different weighting methods depends on whether it is an abstract or a full-text environment. Mutual information with bag-of-words representation shows the best average performance in the full-text environment, while cosine with bag-of-words representation is the best in an abstract environment. The cosine measure has relatively consistent and robust performance. A direct weighting method, IDF (Inverse Document Frequency), can produce quick and reasonable estimates of the weights. Bag-of-words representation generally outperforms the concept-based representation. Further improvement in performance can be obtained by using the learning-to-rank method to integrate different weighting methods. This study follows up Lu and Mao (Journal of the Association for Information Science and Technology, 66, 1776-1784, 2015), in which an automated weighted subject indexing method was proposed and validated. The findings from this study contribute to more effective weighted subject indexing.
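    As one concrete reading of the "direct weighting" approach mentioned above, the sketch below assigns IDF weights to subject descriptors. The corpus and descriptors are hypothetical:

      import math
      from collections import Counter

      # Hypothetical corpus: each document is its list of assigned descriptors.
      docs = [
          ["neoplasms", "genetics", "mice"],
          ["neoplasms", "drug therapy"],
          ["neoplasms", "genetics", "mutation"],
      ]
      df = Counter(t for d in docs for t in set(d))  # document frequency
      n = len(docs)

      def idf_weights(descriptors):
          # Rarer descriptors get higher weights; a descriptor assigned to
          # every document gets weight 0 (no smoothing applied here).
          return {t: math.log(n / df[t]) for t in descriptors}

      print(idf_weights(docs[0]))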
  4. Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.01
    0.010088081 = product of:
      0.040352322 = sum of:
        0.040352322 = product of:
          0.080704644 = sum of:
            0.080704644 = weight(_text_:methods in 4152) [ClassicSimilarity], result of:
              0.080704644 = score(doc=4152,freq=8.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.4441971 = fieldWeight in 4152, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4152)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining the relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary, and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary - the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 75% of the uniqueness words - were compared and uniqueness factors developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subsets of terms produced by the exact and centroid discrimination values were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid) methods. An end-user evaluation showed that the generated terms were largely distinct and had values of word precision consistent with the values of search precision.
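    One possible reading of the uniqueness comparison, as a sketch: take the top fraction of each method's ranked term list and measure the share of one method's terms that the other's subset lacks. The rankings and the measure itself are illustrative assumptions, not the paper's data or metric:

      def top(ranking, fraction):
          # Top fraction of a ranked term list (at least one term).
          return set(ranking[:max(1, int(len(ranking) * fraction))])

      def uniqueness(rank_a, rank_b, fraction):
          # Share of method A's top terms absent from method B's top set.
          a, b = top(rank_a, fraction), top(rank_b, fraction)
          return len(a - b) / len(a)

      occurrence = ["laser", "optics", "beam", "power", "sensor", "fiber"]
      signal     = ["laser", "beam", "optics", "fiber", "noise", "sensor"]
      for f in (0.2, 0.4, 0.6):
          print(f"top {f:.0%}: uniqueness = {uniqueness(occurrence, signal, f):.0%}")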
  5. Cleverdon, C.W.: ASLIB Cranfield Research Project : Report on the first stage of an investigation into the comparative efficiency of indexing systems (1960) 0.01
    0.009184138 = product of:
      0.03673655 = sum of:
        0.03673655 = product of:
          0.0734731 = sum of:
            0.0734731 = weight(_text_:22 in 6158) [ClassicSimilarity], result of:
              0.0734731 = score(doc=6158,freq=2.0), product of:
                0.15825124 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045191016 = queryNorm
                0.46428138 = fieldWeight in 6158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6158)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Footnote
    Review in: College and research libraries 22(1961) no.3, p.228 (G. Jahoda)
  6. Larson, R.R.: Experiments in automatic Library of Congress Classification (1992) 0.01
    0.0085600205 = product of:
      0.034240082 = sum of:
        0.034240082 = product of:
          0.068480164 = sum of:
            0.068480164 = weight(_text_:methods in 1054) [ClassicSimilarity], result of:
              0.068480164 = score(doc=1054,freq=4.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.37691376 = fieldWeight in 1054, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1054)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    This article presents the results of research into the automatic selection of Library of Congress Classification numbers based on the titles and subject headings in MARC records. The method used in this study was based on partial match retrieval techniques using various elements of new records (i.e., those to be classified) as "queries", and a test database of classification clusters generated from previously classified MARC records. Sixty individual methods for automatic classification were tested on a set of 283 new records, using all combinations of four different partial match methods, five query types, and three representations of search terms. The results indicate that if the best method for a particular case can be determined, then up to 86% of the new records may be correctly classified. The single method with the best accuracy was able to select the correct classification for about 46% of the new records.
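    A minimal sketch of the partial-match idea, assuming cosine similarity between a new record's terms and per-class term clusters; the clusters, terms, and the choice of cosine are illustrative assumptions, not the study's actual method:

      import math
      from collections import Counter

      # Hypothetical clusters: LCC number -> term frequencies from
      # previously classified MARC records.
      clusters = {
          "Z699": Counter({"information": 4, "retrieval": 3, "indexing": 2}),
          "QA76": Counter({"computer": 5, "programming": 3, "software": 2}),
      }

      def cosine(q: Counter, c: Counter) -> float:
          dot = sum(q[t] * c[t] for t in q)
          norm = math.sqrt(sum(v * v for v in q.values())) * \
                 math.sqrt(sum(v * v for v in c.values()))
          return dot / norm if norm else 0.0

      def classify(record_terms):
          # Rank candidate classes by partial match against the new record.
          q = Counter(record_terms)
          return max(clusters, key=lambda lcc: cosine(q, clusters[lcc]))

      print(classify(["automatic", "indexing", "retrieval"]))  # -> Z699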
  7. Hersh, W.R.; Hickam, D.H.: A comparison of two methods for indexing and retrieval from a full-text medical database (1992) 0.01
    0.0070616566 = product of:
      0.028246626 = sum of:
        0.028246626 = product of:
          0.056493253 = sum of:
            0.056493253 = weight(_text_:methods in 4526) [ClassicSimilarity], result of:
              0.056493253 = score(doc=4526,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.31093797 = fieldWeight in 4526, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4526)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
  8. Chartron, G.; Dalbin, S.; Monteil, M.-G.; Verillon, M.: Indexation manuelle et indexation automatique : dépasser les oppositions (1989) 0.01
    0.0070616566 = product of:
      0.028246626 = sum of:
        0.028246626 = product of:
          0.056493253 = sum of:
            0.056493253 = weight(_text_:methods in 3516) [ClassicSimilarity], result of:
              0.056493253 = score(doc=3516,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.31093797 = fieldWeight in 3516, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3516)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Report of a study comparing 2 methods of indexing: LEXINET, a computerised system for indexing titles and summaries only; and manual indexing of full texts, using the thesaurus developed by French Electricity (EDF). Both systems were applied to a collection of approximately 2,000 documents on artificial intelligence from the EDF database. The results were then analysed to compare quantitative performance (number and range of terms) and qualitative performance (ambiguity of terms, specificity, variability, consistency). Overall, neither system proved ideal: LEXINET was deficient as regards lack of accessibility and excessive ambiguity, while the manual system gave rise to an over-wide variation of terms. On the evidence produced here, the ideal system would appear to be a combination of automatic and manual systems.
  9. Tseng, Y.-H.: Keyword extraction techniques and relevance feedback (1997) 0.01
    0.0070616566 = product of:
      0.028246626 = sum of:
        0.028246626 = product of:
          0.056493253 = sum of:
            0.056493253 = weight(_text_:methods in 1830) [ClassicSimilarity], result of:
              0.056493253 = score(doc=1830,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.31093797 = fieldWeight in 1830, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1830)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Automatic keyword extraction is an important and fundamental technology in advanced information retrieval systems. Briefly compares several major keyword extraction methods, lists their advantages and disadvantages, and reports recent research progress in Taiwan. Also describes the application of a keyword extraction algorithm in an information retrieval system for relevance feedback. Preliminary analysis shows that the error rate of extracting relevant keywords is 18%, and that the precision rate is over 50%. The main disadvantage of this approach is that the extraction results depend on the retrieval results, which in turn depend on the data held by the database. Apart from collecting more data, this problem can be alleviated by the application of a thesaurus constructed by the same keyword extraction algorithm.
  10. Veenema, F.: To index or not to index (1996) 0.01
    0.006122759 = product of:
      0.024491036 = sum of:
        0.024491036 = product of:
          0.048982073 = sum of:
            0.048982073 = weight(_text_:22 in 7247) [ClassicSimilarity], result of:
              0.048982073 = score(doc=7247,freq=2.0), product of:
                0.15825124 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045191016 = queryNorm
                0.30952093 = fieldWeight in 7247, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7247)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Canadian journal of information and library science. 21(1996) no.2, p.1-22
  11. David, C.; Giroux, L.; Bertrand-Gastaldy, S.; Lanteigne, D.: Indexing as problem solving : a cognitive approach to consistency (1995) 0.01
    0.0060528484 = product of:
      0.024211394 = sum of:
        0.024211394 = product of:
          0.048422787 = sum of:
            0.048422787 = weight(_text_:methods in 3609) [ClassicSimilarity], result of:
              0.048422787 = score(doc=3609,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.26651827 = fieldWeight in 3609, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3609)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Indexers differ in their judgement as to which terms adequately reflect the content of a document. Studies of inter-indexer consistency have identified several factors associated with low consistency, but have failed to provide a comprehensive model of this phenomenon. Our research applies theories and methods from cognitive psychology to the study of indexing behavior. From a theoretical standpoint, indexing is considered a problem-solving situation. To access the cognitive processes of indexers, 3 kinds of verbal reports are used. We will present results of an experiment in which 4 experienced indexers indexed the same documents. It will be shown that the 3 kinds of verbal reports provide complementary data on strategic behavior, and that it is of prime importance to consider the indexing task as an ill-defined problem, where the solution is partly defined by the indexer him- or herself.
  12. Hudon, M.: Conceptual compatibility in controlled language tools used to index and access the content of moving image collections (2004) 0.01
    0.0060528484 = product of:
      0.024211394 = sum of:
        0.024211394 = product of:
          0.048422787 = sum of:
            0.048422787 = weight(_text_:methods in 2655) [ClassicSimilarity], result of:
              0.048422787 = score(doc=2655,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.26651827 = fieldWeight in 2655, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2655)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Five controlled vocabularies currently used for content representation in collections of non-art moving images were examined to determine their level of conceptual compatibility. Methods borrowed from previous research in the area of indexing language compatibility were used. Quantitative data and qualitative observations allowed us to estimate more precisely and realistically the actual degree of conceptual redundancy in these indexing languages. It was found that the conceptual overlap is high enough to justify the pursuit of research and development work on a common basic indexing and access language that could be used to name the objects, events, categories of persons, and relations most frequently depicted in non-art moving image collections.
  13. Booth, A.: How consistent is MEDLINE indexing? (1990) 0.01
    0.005357414 = product of:
      0.021429656 = sum of:
        0.021429656 = product of:
          0.042859312 = sum of:
            0.042859312 = weight(_text_:22 in 3510) [ClassicSimilarity], result of:
              0.042859312 = score(doc=3510,freq=2.0), product of:
                0.15825124 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045191016 = queryNorm
                0.2708308 = fieldWeight in 3510, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3510)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Health libraries review. 7(1990) no.1, p.22-26
  14. Neshat, N.; Horri, A.: ¬A study of subject indexing consistency between the National Library of Iran and Humanities Libraries in the area of Iranian studies (2006) 0.01
    0.005357414 = product of:
      0.021429656 = sum of:
        0.021429656 = product of:
          0.042859312 = sum of:
            0.042859312 = weight(_text_:22 in 230) [ClassicSimilarity], result of:
              0.042859312 = score(doc=230,freq=2.0), product of:
                0.15825124 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045191016 = queryNorm
                0.2708308 = fieldWeight in 230, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=230)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    4. 1.2007 10:22:26
  15. Keen, E.M.: Designing and testing an interactive ranked retrieval system for professional searchers (1994) 0.01
    0.0050440403 = product of:
      0.020176161 = sum of:
        0.020176161 = product of:
          0.040352322 = sum of:
            0.040352322 = weight(_text_:methods in 1066) [ClassicSimilarity], result of:
              0.040352322 = score(doc=1066,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.22209854 = fieldWeight in 1066, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1066)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Reports 3 explorations of ranked system design. 2 tests used a 'cystic fibrosis' test collection with 100 queries. Experiment 1 compared a Boolean with a ranked interactive system, using a subject-qualified trained searcher and reporting recall and precision results. Experiment 2 compared 15 different ranked match algorithms in a batch mode using 2 test collections, and included some new proximate-pair and term-weighting approaches. Experiment 3 is a design plan for an interactive ranked prototype offering mid-search algorithm choices plus other manual search devices (such as obligatory and unwanted terms), as influenced by think-aloud comments from experiment 1. Concludes that, in Boolean versus ranked searching using inverse collection frequency, the searcher inspected more records on the ranked system than on the Boolean one and so achieved higher recall but lower precision; however, the presentation order of the relevant records was, on average, very similar in both systems. Concludes also that: query reformulation was quite strongly practised in ranked searching but does not appear to have been effective; the term-pair proximity weighting methods in experiment 2 enhanced precision on both test collections when used with inverse collection frequency weighting (ICF); and the design plan for an interactive prototype adds to a selection of match algorithms other devices, such as obligatory and unwanted term marking, evidence for this being found in think-aloud comments.
  16. Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.01
    0.0050440403 = product of:
      0.020176161 = sum of:
        0.020176161 = product of:
          0.040352322 = sum of:
            0.040352322 = weight(_text_:methods in 4214) [ClassicSimilarity], result of:
              0.040352322 = score(doc=4214,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.22209854 = fieldWeight in 4214, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4214)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    An important stage in the process of retrieval of objects from a hypertext database is the creation of a set of internodal links that are intended to represent the relationships existing between objects; this operation is often undertaken manually, just as index terms are often manually assigned to documents in a conventional retrieval system. In an earlier article (1994), the results were published of a study in which several different sets of links were inserted, each by a different person, between the paragraphs of each of a number of full-text documents. These results showed little similarity between the link-sets, a finding comparable with those of studies of inter-indexer consistency, which suggest that there is generally only a low level of agreement between the sets of index terms assigned to a document by different indexers. In this article, a description is provided of an investigation into the nature of the relationship existing between (i) the levels of inter-linker consistency obtaining among the group of hypertext databases used in our earlier experiments, and (ii) the levels of effectiveness of a number of searches carried out in those databases. An account is given of the implementation of the searches and of the methods used in the calculation of numerical values expressing their effectiveness. Analysis of the results of a comparison between recorded levels of consistency and those of effectiveness does not allow us to draw conclusions about the consistency-effectiveness relationship that are equivalent to those drawn in comparable studies of inter-indexer consistency.
  17. Wolfram, D.; Zhang, J.: An investigation of the influence of indexing exhaustivity and term distributions on a document space (2002) 0.01
    0.0050440403 = product of:
      0.020176161 = sum of:
        0.020176161 = product of:
          0.040352322 = sum of:
            0.040352322 = weight(_text_:methods in 5238) [ClassicSimilarity], result of:
              0.040352322 = score(doc=5238,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.22209854 = fieldWeight in 5238, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5238)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Wolfram and Zhang are interested in the effect of different indexing exhaustivity (by which they mean the number of terms chosen), of different index term distributions, and of different term weighting methods on the resulting document cluster organization. The Distance Angle Retrieval Environment, DARE, which provides a two-dimensional display of retrieved documents, was used to represent the document clusters based upon a document's distance from the searcher's main interest, and upon the angle formed by the document, a point representing a minor interest, and the point representing the main interest. If the centroid and the origin of the document space are assigned as major and minor points, the average distance between documents and the centroid can be measured, providing an indication of cluster organization in the form of a size-normalized similarity measure. Using 500 records from NTIS and nine models created by intersecting low, observed, and high exhaustivity levels (based upon a negative binomial distribution) with shallow, observed, and steep term distributions (based upon a Zipf distribution), simulation runs were performed using inverse document frequency, inter-document term frequency, and inverse document frequency based upon both inter- and intra-document frequencies. Low exhaustivity and shallow distributions result in a more dense document space and less effective retrieval. High exhaustivity and steeper distributions result in a more diffuse space.
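    The density indicator described above can be sketched as the mean distance from each document vector to the centroid of the space; the toy matrix is hypothetical and the study's exact size normalization is not reproduced:

      import numpy as np

      # Hypothetical document-term matrix (rows = documents, TF-IDF weights).
      docs = np.array([[0.2, 0.7, 0.1],
                       [0.3, 0.6, 0.0],
                       [0.9, 0.1, 0.4]])
      centroid = docs.mean(axis=0)
      # Smaller mean distance = denser document space, which the study
      # associates with less effective retrieval.
      density = np.linalg.norm(docs - centroid, axis=1).mean()
      print(f"mean distance to centroid: {density:.4f}")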
  18. Ballard, R.M.: Indexing and its relevance to technical processing (1993) 0.01
    0.0050440403 = product of:
      0.020176161 = sum of:
        0.020176161 = product of:
          0.040352322 = sum of:
            0.040352322 = weight(_text_:methods in 554) [ClassicSimilarity], result of:
              0.040352322 = score(doc=554,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.22209854 = fieldWeight in 554, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=554)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    The development of regional on-line catalogs and in-house information systems for the retrieval of references provides examples of the impact of indexing theory and applications on technical processing. More emphasis must be given to understanding the techniques for evaluating the effectiveness of a file, irrespective of whether that file was created as a library catalog or as an index to information sources. The most significant advances in classification theory in recent decades have been the result of efforts to improve the effectiveness of indexing systems. Library classification systems are indexing languages or systems. Courses offered for the preparation of indexers in the United States and the United Kingdom are reviewed. A point of congruence for both the indexer and the library classifier would appear to be the need for a thorough preparation in the techniques of subject analysis. Any subject heading list will suffer from omissions as well as from the inclusion of terms which the patron will never use. Indexing theory has provided the technical services department with methods for the evaluation of effectiveness. The writer does not believe that these techniques are used, nor do current courses, workshops, and continuing education programs stress them. When theory is totally subjugated to practice, critical thinking and maximum effectiveness will suffer.
  19. Taniguchi, S.: Recording evidence in bibliographic records and descriptive metadata (2005) 0.00
    0.004592069 = product of:
      0.018368276 = sum of:
        0.018368276 = product of:
          0.03673655 = sum of:
            0.03673655 = weight(_text_:22 in 3565) [ClassicSimilarity], result of:
              0.03673655 = score(doc=3565,freq=2.0), product of:
                0.15825124 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045191016 = queryNorm
                0.23214069 = fieldWeight in 3565, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3565)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    18. 6.2005 13:16:22
  20. Lin, Y.-l.; Trattner, C.; Brusilovsky, P.; He, D.: The impact of image descriptions on user tagging behavior : a study of the nature and functionality of crowdsourced tags (2015) 0.00
    0.004035232 = product of:
      0.016140928 = sum of:
        0.016140928 = product of:
          0.032281857 = sum of:
            0.032281857 = weight(_text_:methods in 2159) [ClassicSimilarity], result of:
              0.032281857 = score(doc=2159,freq=2.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.17767884 = fieldWeight in 2159, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2159)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Crowdsourcing has emerged as a way to harvest social wisdom from thousands of volunteers performing a series of tasks online. However, little research has been devoted to exploring the impact of factors such as the content of a resource or the crowdsourcing interface design on user tagging behavior. Although images' titles and descriptions are frequently available in image digital libraries, it is not clear whether they should be displayed to crowdworkers engaged in tagging. This paper focuses on offering insight to the curators of digital image libraries who face this dilemma by examining (i) how descriptions influence users' tagging behavior and (ii) how this relates to (a) the nature of the tags, (b) the emergent folksonomy, and (c) the findability of the images in the tagging system. We compared two different methods for collecting image tags from Amazon's Mechanical Turk crowdworkers: with and without image descriptions. Several properties of the generated tags were examined from different perspectives: diversity, specificity, reusability, quality, similarity, descriptiveness, and so on. In addition, a study was carried out to examine the impact of image descriptions on supporting users' information seeking with a tag cloud interface. The results showed that the properties of tags are affected by the crowdsourcing approach. Tags from the "with description" condition are more diverse and more specific than tags from the "without description" condition, while the latter has a higher tag reuse rate. A user study also revealed that different tag sets provided different support for search. Tags produced "with description" shortened the path to the target results, whereas tags produced without descriptions increased user success in the search task.
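    Two of the tag properties compared above - distinctness and reuse - can be sketched over hypothetical tag assignments; the reuse-rate definition here (share of assignments repeating an earlier tag for the same image) is an assumption, not necessarily the paper's:

      from collections import Counter

      with_desc = ["castle", "wales", "ruins", "castle", "medieval", "stone"]
      without_desc = ["castle", "castle", "building", "castle", "old", "building"]

      def reuse_rate(tags):
          # Assignments that repeat a tag already used for this image.
          return sum(c - 1 for c in Counter(tags).values()) / len(tags)

      for name, tags in (("with description", with_desc),
                         ("without description", without_desc)):
          print(name, f"distinct={len(set(tags))}", f"reuse={reuse_rate(tags):.0%}")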