Search (4 results, page 1 of 1)

Losee, R.M.: Term dependence : a basis for Luhn and Zipf models (2001) 0.01
```
0.013040888 = product of:
  0.06520444 = sum of:
    0.06520444 = weight(_text_:index in 6976) [ClassicSimilarity], result of:
      0.06520444 = score(doc=6976,freq=2.0), product of:
        0.2250935 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.051511593 = queryNorm
        0.28967714 = fieldWeight in 6976, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.046875 = fieldNorm(doc=6976)
  0.2 = coord(1/5)
```
Abstract

There are regularities in the statistical information provided by natural language terms about neighboring terms. We find that when phrase rank increases, moving from common to less common phrases, the value of the expected mutual information measure (EMIM) between the terms regularly decreases. Luhn's model suggests that midrange terms are the best index terms and relevance discriminators. We suggest reasons for this principle based on the empirical relationships shown here between the rank of terms within phrases and the average mutual information between terms, which we refer to as the Inverse Representation- EMIM principle. We also suggest an Inverse EMIM term weight for indexing or retrieval applications that is consistent with Luhn's distribution. An information theoretic interpretation of Zipf's Law is provided. Using the regularity noted here, we suggest that Zipf's Law is a consequence of the statistical dependencies that exist between terms, described here using information theoretic concepts.
Losee, R.M.: Decisions in thesaurus construction and use (2007) 0.01
```
0.013040888 = product of:
  0.06520444 = sum of:
    0.06520444 = weight(_text_:index in 924) [ClassicSimilarity], result of:
      0.06520444 = score(doc=924,freq=2.0), product of:
        0.2250935 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.051511593 = queryNorm
        0.28967714 = fieldWeight in 924, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.046875 = fieldNorm(doc=924)
  0.2 = coord(1/5)
```
Abstract

A thesaurus and an ontology provide a set of structured terms, phrases, and metadata, often in a hierarchical arrangement, that may be used to index, search, and mine documents. We describe the decisions that should be made when including a term, deciding whether a term should be subdivided into its subclasses, or determining which of more than one set of possible subclasses should be used. Based on retrospective measurements or estimates of future performance when using thesaurus terms in document ordering, decisions are made so as to maximize performance. These decisions may be used in the automatic construction of a thesaurus. The evaluation of an existing thesaurus is described, consistent with the decision criteria developed here. These kinds of user-focused decision-theoretic techniques may be applied to other hierarchical applications, such as faceted classification systems used in information architecture or the use of hierarchical terms in "breadcrumb navigation".
Losee, R.M.: ¬The effect of assigning a metadata or indexing term on document ordering (2013) 0.01
```
0.013040888 = product of:
  0.06520444 = sum of:
    0.06520444 = weight(_text_:index in 1100) [ClassicSimilarity], result of:
      0.06520444 = score(doc=1100,freq=2.0), product of:
        0.2250935 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.051511593 = queryNorm
        0.28967714 = fieldWeight in 1100, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.046875 = fieldNorm(doc=1100)
  0.2 = coord(1/5)
```
Abstract

The assignment of indexing terms and metadata to documents, data, and other information representations is considered useful, but the utility of including a single term is seldom discussed. The author discusses a simple model of document ordering and then shows how assigning index and metadata labels improves or decreases retrieval performance. The Indexing and Metadata Advantage (IMA) factor measures how indexing or assigning a metadata term helps (or hurts) ordering performance. Performance values and the associated IMA expressions are computed, consistent with several different assumptions. The economic value associated with various term assignment decisions is developed. The IMA term advantage model itself is empirically validated with computer software that shows that the analytic results obtained agree completely with the actual performance gains and losses found when ordering all sets of 14 or fewer documents. When the formulas in the software are changed to differ from this model, the predictions of the actual performance are erroneous.

Losee, R.M.: Determining information retrieval and filtering performance without experimentation (1995) 0.01

0.009770754 = product of:
  0.04885377 = sum of:
    0.04885377 = weight(_text_:22 in 3368) [ClassicSimilarity], result of:
      0.04885377 = score(doc=3368,freq=2.0), product of:
        0.18038483 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051511593 = queryNorm
        0.2708308 = fieldWeight in 3368, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3368)
  0.2 = coord(1/5)

Date: 22. 2.1996 13:14:10

Search (4 results, page 1 of 1)

Years

Themes