Search (23 results, page 1 of 2)

Losee, R.M.: Determining information retrieval and filtering performance without experimentation (1995) 0.01

0.014677027 = product of:
  0.06849279 = sum of:
    0.012233062 = weight(_text_:information in 3368) [ClassicSimilarity], result of:
      0.012233062 = score(doc=3368,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23515764 = fieldWeight in 3368, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3368)
    0.046891607 = weight(_text_:retrieval in 3368) [ClassicSimilarity], result of:
      0.046891607 = score(doc=3368,freq=10.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.5231199 = fieldWeight in 3368, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3368)
    0.009368123 = product of:
      0.028104367 = sum of:
        0.028104367 = weight(_text_:22 in 3368) [ClassicSimilarity], result of:
          0.028104367 = score(doc=3368,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.2708308 = fieldWeight in 3368, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3368)
      0.33333334 = coord(1/3)
  0.21428572 = coord(3/14)

Abstract: The performance of an information retrieval or text and media filtering system may be determined through analytic methods as well as by traditional simulation or experimental methods. These analytic methods can provide precise statements about expected performance. They can thus determine which of 2 similarly performing systems is superior. For both a single query term and a multiple query term retrieval model, a model for comparing the performance of different probabilistic retrieval methods is developed. This method may be used in computing the average search length for a query, given only knowledge of database parameter values. Describes predictive models for inverse document frequency, binary independence, and relevance feedback based retrieval and filtering. Simulations illustrate how the single term model performs and sample performance predictions are given for single term and multiple term problems
Date: 22. 2.1996 13:14:10
Source: Information processing and management. 31(1995) no.4, S.555-572
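
Note on the score breakdown: the explain tree for the first record can be re-derived by hand. The sketch below is a minimal reconstruction, assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is what the displayed numbers are consistent with. The function names are illustrative and not part of any library; the search engine works in single precision, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed idf form; it reproduces the idf values shown in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain tree of doc 3368
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.0546875

information = term_score(6.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)
retrieval = term_score(10.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query that matched 1 of its 3 clauses.
nested_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 3)

# The outer query matched 3 of its 14 clauses, hence coord(3/14).
total = (information + retrieval + nested_22) * (3 / 14)
print(round(total, 6))
```

The coord factor explains why this record outranks the others: it matched three clauses (including the date token "22"), while the remaining records matched two and are scaled by 2/14.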

Losee, R.M.: Term dependence : truncating the Bahadur Lazarsfeld expansion (1994) 0.01

0.009709007 = product of:
  0.06796305 = sum of:
    0.01712272 = weight(_text_:information in 7390) [ClassicSimilarity], result of:
      0.01712272 = score(doc=7390,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3291521 = fieldWeight in 7390, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=7390)
    0.050840326 = weight(_text_:retrieval in 7390) [ClassicSimilarity], result of:
      0.050840326 = score(doc=7390,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.5671716 = fieldWeight in 7390, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=7390)
  0.14285715 = coord(2/14)

Abstract: Studies the performance of probabilistic information retrieval systems where differing statistical dependence assumptions are used when estimating the probabilities inherent in the retrieval model. Uses the Bahadur Lazarsfeld expansion model
Source: Information processing and management. 30(1994) no.2, S.293-303

Losee, R.M.; Church Jr., L.: Are two document clusters better than one? : the cluster performance question for information retrieval (2005) 0.01

0.008009522 = product of:
  0.05606665 = sum of:
    0.014125523 = weight(_text_:information in 3270) [ClassicSimilarity], result of:
      0.014125523 = score(doc=3270,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.27153665 = fieldWeight in 3270, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3270)
    0.04194113 = weight(_text_:retrieval in 3270) [ClassicSimilarity], result of:
      0.04194113 = score(doc=3270,freq=8.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.46789268 = fieldWeight in 3270, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3270)
  0.14285715 = coord(2/14)

Abstract: When do information retrieval systems using two document clusters provide better retrieval performance than systems using no clustering? We answer this question for one set of assumptions and suggest how this may be studied with other assumptions. The "Cluster Hypothesis" asks an empirical question about the relationships between documents and user-supplied relevance judgments, while the "Cluster Performance Question" proposed here focuses on the when and why of information retrieval or digital library performance for clustered and unclustered text databases. This may be generalized to study the relative performance of m versus n clusters.
Source: Journal of the American Society for Information Science and Technology. 56(2005) no.1, S.106-108

Losee, R.M.: Upper bounds for retrieval performance and their use measuring performance and generating optimal queries : can it get any better than this? (1994) 0.01

0.007658652 = product of:
  0.053610563 = sum of:
    0.0060537956 = weight(_text_:information in 7418) [ClassicSimilarity], result of:
      0.0060537956 = score(doc=7418,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.116372846 = fieldWeight in 7418, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=7418)
    0.04755677 = weight(_text_:retrieval in 7418) [ClassicSimilarity], result of:
      0.04755677 = score(doc=7418,freq=14.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.5305404 = fieldWeight in 7418, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=7418)
  0.14285715 = coord(2/14)

Abstract: The best-case, random and worst-case document rankings and retrieval performance may be determined using a method discussed here. Knowledge of the best case performance allows users and system designers to determine how close to the optimum condition their search is and select queries and matching functions that will produce the best results. Suggests a method for deriving the optimal Boolean query for a given level of recall and a method for determining the quality of a Boolean query. Measures are proposed that modify conventional text retrieval measures such as precision, E, and average search length, so that the values for these measures are 1 when retrieval is optimal, 0 when retrieval is random, and -1 when worst-case. Tests using one of these measures show that many retrievals are optimal. Consequences for retrieval research are examined
Source: Information processing and management. 30(1994) no.2, S.193-203

Losee, R.M.: Evaluating retrieval performance given database and query characteristics : analytic determination of performance surfaces (1996) 0.01
```
0.006964882 = product of:
  0.04875417 = sum of:
    0.00856136 = weight(_text_:information in 4162) [ClassicSimilarity], result of:
      0.00856136 = score(doc=4162,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 4162, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4162)
    0.04019281 = weight(_text_:retrieval in 4162) [ClassicSimilarity], result of:
      0.04019281 = score(doc=4162,freq=10.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.44838852 = fieldWeight in 4162, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4162)
  0.14285715 = coord(2/14)
```
Abstract

An analytic method of information retrieval and filtering evaluation can quantitatively predict the expected number of documents examined in retrieving a relevant document. It also allows researchers and practitioners to qualitatively understand how varying different estimates of query parameter values affects retrieval performance. The incorporation of relevance feedback to increase our knowledge about the parameters of relevant documents and the robustness of parameter estimates is modeled. Single term and two term independence models, as well as a complete term dependence model, are developed. An economic model of retrieval performance may be used to study the effects of database size and to provide analytic answers to questions comparing retrieval from small and large databases, as well as questions about the number of terms in a query. Results are presented as a performance surface, a three dimensional graph showing the effects of two independent variables on performance.

Source

Journal of the American Society for Information Science. 47(1996) no.1, S.95-105
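
Note on the measure: the "expected number of documents examined in retrieving a relevant document" can be illustrated empirically. The sketch below is not the analytic model of the paper, which predicts this quantity from query and database parameters; it only measures the mean rank of the relevant documents in one known ranking. The function name and the toy rankings are hypothetical.

```python
def average_search_length(ranking):
    """Mean 1-based position of the relevant documents in a ranked list.

    `ranking` is a list of booleans in rank order, True meaning relevant.
    """
    positions = [rank for rank, relevant in enumerate(ranking, start=1) if relevant]
    if not positions:
        raise ValueError("ranking contains no relevant documents")
    return sum(positions) / len(positions)

# Toy ranking of five documents, two of them relevant (ranks 1 and 3).
observed = [True, False, True, False, False]
# Best case for the same collection: both relevant documents ranked first.
best_case = [True, True, False, False, False]

print(average_search_length(observed))   # (1 + 3) / 2
print(average_search_length(best_case))  # (1 + 2) / 2
```

Lower values are better: a searcher working down the list reaches the relevant documents sooner.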

Losee, R.M.: Comparing Boolean and probabilistic information retrieval systems across queries and disciplines (1997) 0.01

0.0069364496 = product of:
  0.048555143 = sum of:
    0.012233062 = weight(_text_:information in 7709) [ClassicSimilarity], result of:
      0.012233062 = score(doc=7709,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23515764 = fieldWeight in 7709, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7709)
    0.036322083 = weight(_text_:retrieval in 7709) [ClassicSimilarity], result of:
      0.036322083 = score(doc=7709,freq=6.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.40520695 = fieldWeight in 7709, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7709)
  0.14285715 = coord(2/14)

Abstract: Suggests a method for comparison of the use of Boolean queries and ranking documents using document and term weights, and examines their relative merits. The performance of information retrieval may be determined either by using experimental simulation, or through the application of analytic techniques that estimate the retrieval performance, given values for query and database characteristics. Using these performance predicting techniques, sample performance figures are provided for queries using the Boolean operators AND and OR, as well as for probabilistic systems assuming statistical term independence or term dependence. Examines the performance of models failing to meet statistical and other assumptions
Source: Journal of the American Society for Information Science. 48(1997) no.2, S.143-156

Spink, A.; Losee, R.M.: Feedback in information retrieval (1996) 0.01

0.00683917 = product of:
  0.04787419 = sum of:
    0.013980643 = weight(_text_:information in 7441) [ClassicSimilarity], result of:
      0.013980643 = score(doc=7441,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.2687516 = fieldWeight in 7441, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=7441)
    0.033893548 = weight(_text_:retrieval in 7441) [ClassicSimilarity], result of:
      0.033893548 = score(doc=7441,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.37811437 = fieldWeight in 7441, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=7441)
  0.14285715 = coord(2/14)

Abstract: State of the art review of the mechanisms of feedback in information retrieval (IR) in terms of feedback concepts and models in cybernetics and social sciences. Critically evaluates feedback research based on the traditional IR models and compares the different approaches to automatic relevance feedback techniques, and feedback research within the framework of interactive IR models. Calls for an extension of the concept of feedback beyond relevance feedback to interactive feedback. Cites specific examples of feedback models used within IR research and presents 6 challenges to future research
Source: Annual review of information science and technology. 31(1996), S.33-78

Losee, R.M.: ¬A Gray code based ordering for documents on shelves : classification for browsing and retrieval (1992) 0.01
```
0.005984274 = product of:
  0.041889917 = sum of:
    0.012233062 = weight(_text_:information in 2335) [ClassicSimilarity], result of:
      0.012233062 = score(doc=2335,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23515764 = fieldWeight in 2335, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2335)
    0.029656855 = weight(_text_:retrieval in 2335) [ClassicSimilarity], result of:
      0.029656855 = score(doc=2335,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33085006 = fieldWeight in 2335, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2335)
  0.14285715 = coord(2/14)
```
Abstract

A document classifier places documents together in a linear arrangement for browsing or high-speed access by human or computerised information retrieval systems. Requirements for document classification and browsing systems are developed from similarity measures, distance measures, and the notion of subject aboutness. A requirement that documents be arranged in decreasing order of similarity as the distance from a given document increases can often not be met. Based on these requirements, information-theoretic considerations, and the Gray code, a classification system is proposed that can classify documents without human intervention. A measure of classifier performance is developed, and used to evaluate experimental results comparing the distance between subject headings assigned to documents given classifications from the proposed system and the Library of Congress Classification (LCC) system

Source

Journal of the American Society for Information Science. 43(1992) no.4, S.312-322

Losee, R.M.: When information retrieval measures agree about the relative quality of document rankings (2000) 0.01

0.005984274 = product of:
  0.041889917 = sum of:
    0.012233062 = weight(_text_:information in 4860) [ClassicSimilarity], result of:
      0.012233062 = score(doc=4860,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23515764 = fieldWeight in 4860, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4860)
    0.029656855 = weight(_text_:retrieval in 4860) [ClassicSimilarity], result of:
      0.029656855 = score(doc=4860,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33085006 = fieldWeight in 4860, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4860)
  0.14285715 = coord(2/14)

Abstract: The variety of performance measures available for information retrieval systems, search engines, and network filtering agents can be confusing to both practitioners and scholars. Most discussions about these measures address their theoretical foundations and the characteristics of a measure that make it desirable for a particular application. In this work, we consider how measures of performance at a point in a search may be formally compared. Criteria are developed that allow one to determine the percent of time or conditions under which 2 different performance measures suggest that one document ordering is superior to another ordering, or when the 2 measures disagree about the relative value of document orderings. As an example, graphs provide illustrations of the relationships between precision and F
Source: Journal of the American Society for Information Science. 51(2000) no.9, S.834-840

Losee, R.M.: Browsing mixed structured and unstructured data (2006) 0.01
```
0.005129378 = product of:
  0.035905644 = sum of:
    0.0104854815 = weight(_text_:information in 173) [ClassicSimilarity], result of:
      0.0104854815 = score(doc=173,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.20156369 = fieldWeight in 173, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=173)
    0.025420163 = weight(_text_:retrieval in 173) [ClassicSimilarity], result of:
      0.025420163 = score(doc=173,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.2835858 = fieldWeight in 173, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=173)
  0.14285715 = coord(2/14)
```
Abstract

Both structured and unstructured data, as well as structured data representing several different types of tuples, may be integrated into a single list for browsing or retrieval. Data may be arranged in the Gray code order of the features and metadata, producing optimal ordering for browsing. We provide several metrics for evaluating the performance of systems supporting browsing, given some constraints. Metadata and indexing terms are used for sorting keys and attributes for structured data, as well as for semi-structured or unstructured documents, images, media, etc. Economic and information theoretic models are suggested that enable the ordering to adapt to user preferences. Different relational structures and unstructured data may be integrated into a single, optimal ordering for browsing or for displaying tables in digital libraries, database management systems, or information retrieval systems. Adaptive displays of data are discussed.

Source

Information processing and management. 42(2006) no.2, S.440-452
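
Note on Gray code order: the idea of arranging items "in the Gray code order of the features" can be illustrated with the standard binary reflected Gray code. The sketch below assumes each item is described by a tuple of binary features, which is a simplification made here for illustration; the item names, features, and helper function are hypothetical and are not taken from the paper.

```python
def gray_rank(bits):
    """Position of a bit tuple in the binary reflected Gray code sequence.

    Decoding rule: each binary digit is the XOR of all Gray digits seen
    so far, reading from the most significant bit.
    """
    rank = 0
    running_xor = 0
    for bit in bits:
        running_xor ^= bit
        rank = (rank << 1) | running_xor
    return rank

# Hypothetical items, each with three binary features (1 = feature present).
items = {
    "d1": (0, 0, 0),
    "d2": (0, 1, 1),
    "d3": (0, 0, 1),
    "d4": (1, 1, 0),
    "d5": (0, 1, 0),
}

# Sort items by where their feature vector falls in the Gray code sequence.
ordered = sorted(items, key=lambda name: gray_rank(items[name]))
print(ordered)
```

In this toy case the shelf order is d1, d3, d2, d5, d4, and each neighbouring pair differs in exactly one feature, which is the property that makes the ordering attractive for browsing. When not every feature combination is present, neighbours can differ in more than one feature.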

Losee, R.M.; Haas, S.W.: Sublanguage terms : dictionaries, usage, and automatic classification (1995) 0.01

0.005054501 = product of:
  0.035381503 = sum of:
    0.011415146 = weight(_text_:information in 2650) [ClassicSimilarity], result of:
      0.011415146 = score(doc=2650,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.21943474 = fieldWeight in 2650, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=2650)
    0.023966359 = weight(_text_:retrieval in 2650) [ClassicSimilarity], result of:
      0.023966359 = score(doc=2650,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.26736724 = fieldWeight in 2650, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=2650)
  0.14285715 = coord(2/14)

Abstract: The use of terms from natural and social science titles and abstracts is studied from the perspective of sublanguages and their specialized dictionaries. Explores different notions of sublanguage distinctiveness. Objective methods for separating hard and soft sciences are suggested based on measures of sublanguage use, dictionary characteristics, and sublanguage distinctiveness. Abstracts were automatically classified with a high degree of accuracy by using a formula that considers the degree of uniqueness of terms in each sublanguage. This may prove useful for text filtering of information retrieval systems
Source: Journal of the American Society for Information Science. 46(1995) no.7, S.519-529

Haas, S.W.; Losee, R.M.: Looking in text windows : their size and composition (1994) 0.00
```
0.0048545036 = product of:
  0.033981524 = sum of:
    0.00856136 = weight(_text_:information in 8525) [ClassicSimilarity], result of:
      0.00856136 = score(doc=8525,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 8525, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=8525)
    0.025420163 = weight(_text_:retrieval in 8525) [ClassicSimilarity], result of:
      0.025420163 = score(doc=8525,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.2835858 = fieldWeight in 8525, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=8525)
  0.14285715 = coord(2/14)
```
Abstract

A text window is a group of words appearing in contiguous positions in text used to exploit a variety of lexical, syntactic, and semantic relationships without having to analyze the text explicitly for their structure. This supports the previously suggested idea that natural groupings of words are best treated as a unit of size 7 to 11 words, that is, plus or minus 3 to 5 words. The text retrieval experiments varying the size of windows, both with full text and with stopwords removed, support these size ranges. The characteristics of windows that best match terms in queries are examined in detail, revealing interesting differences between those for queries with good results and those for queries with poorer results. Queries with good results tend to contain more content word phrases and few terms with high frequency of use in the database. Information retrieval systems may benefit from expanding thesaurus-style relationships or incorporating statistical dependencies for terms within these windows

Source

Information processing and management. 30(1994) no.5, S.619-629

Losee, R.M.: Learning syntactic rules and tags with genetic algorithms for information retrieval and filtering : an empirical basis for grammatical rules (1996) 0.00
```
0.0048545036 = product of:
  0.033981524 = sum of:
    0.00856136 = weight(_text_:information in 4068) [ClassicSimilarity], result of:
      0.00856136 = score(doc=4068,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 4068, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4068)
    0.025420163 = weight(_text_:retrieval in 4068) [ClassicSimilarity], result of:
      0.025420163 = score(doc=4068,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.2835858 = fieldWeight in 4068, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4068)
  0.14285715 = coord(2/14)
```
Abstract

The grammars of natural languages may be learned by using genetic algorithms that reproduce and mutate grammatical rules and parts of speech tags, improving the quality of later generations of grammatical components. Syntactic rules are randomly generated and then evolve; those rules resulting in improved parsing and occasionally improved filtering performance are allowed to further propagate. The LUST system learns the characteristics of the language or sublanguage used in document abstracts by learning from the document rankings obtained from the parsed abstracts. Unlike the application of traditional linguistic rules to retrieval and filtering applications, LUST develops grammatical structures and tags without the prior imposition of some common grammatical assumptions (e.g. part of speech assumptions), producing grammars that are empirically based and are optimized for this particular application

Source

Information processing and management. 32(1996) no.2, S.185-197

Losee, R.M.: Term dependence : a basis for Luhn and Zipf models (2001) 0.00
```
0.0046862117 = product of:
  0.03280348 = sum of:
    0.01482871 = weight(_text_:information in 6976) [ClassicSimilarity], result of:
      0.01482871 = score(doc=6976,freq=12.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.2850541 = fieldWeight in 6976, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=6976)
    0.01797477 = weight(_text_:retrieval in 6976) [ClassicSimilarity], result of:
      0.01797477 = score(doc=6976,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 6976, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=6976)
  0.14285715 = coord(2/14)
```
Abstract

There are regularities in the statistical information provided by natural language terms about neighboring terms. We find that when phrase rank increases, moving from common to less common phrases, the value of the expected mutual information measure (EMIM) between the terms regularly decreases. Luhn's model suggests that midrange terms are the best index terms and relevance discriminators. We suggest reasons for this principle based on the empirical relationships shown here between the rank of terms within phrases and the average mutual information between terms, which we refer to as the Inverse Representation-EMIM principle. We also suggest an Inverse EMIM term weight for indexing or retrieval applications that is consistent with Luhn's distribution. An information theoretic interpretation of Zipf's Law is provided. Using the regularity noted here, we suggest that Zipf's Law is a consequence of the statistical dependencies that exist between terms, described here using information theoretic concepts.

Source

Journal of the American Society for Information Science and Technology. 52(2001) no.12, S.1019-1025
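
Note on EMIM: the expected mutual information measure between two terms is commonly written as the sum, over the four presence/absence combinations, of P(x, y) log(P(x, y) / (P(x) P(y))). The sketch below computes it from a 2x2 co-occurrence table; the counts are invented for illustration, the function name is hypothetical, and the choice of base 2 (bits) is an assumption made here rather than something the abstract specifies.

```python
import math

def emim(n11, n10, n01, n00):
    """Expected mutual information (in bits) between two binary term events.

    n11: units containing both terms, n10: only the first term,
    n01: only the second term, n00: neither term.
    """
    total = n11 + n10 + n01 + n00
    joint = {
        (1, 1): n11 / total,
        (1, 0): n10 / total,
        (0, 1): n01 / total,
        (0, 0): n00 / total,
    }
    # Marginal probabilities of presence (1) and absence (0) for each term.
    p_first = {1: joint[(1, 1)] + joint[(1, 0)], 0: joint[(0, 1)] + joint[(0, 0)]}
    p_second = {1: joint[(1, 1)] + joint[(0, 1)], 0: joint[(1, 0)] + joint[(0, 0)]}
    value = 0.0
    for (x, y), p in joint.items():
        if p > 0:  # empty cells contribute nothing
            value += p * math.log2(p / (p_first[x] * p_second[y]))
    return value

print(emim(25, 25, 25, 25))  # statistically independent terms
print(emim(50, 0, 0, 50))    # terms that always occur together
```

Independent terms give 0 bits and perfectly dependent terms give 1 bit in this example, so a decreasing EMIM indicates that one term tells less about its neighbour.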

Losee, R.M.: Text windows and phrases differing by discipline, location in document, and syntactic structure (1996) 0.00

0.0044226884 = product of:
  0.030958816 = sum of:
    0.009988253 = weight(_text_:information in 6962) [ClassicSimilarity], result of:
      0.009988253 = score(doc=6962,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.1920054 = fieldWeight in 6962, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=6962)
    0.020970564 = weight(_text_:retrieval in 6962) [ClassicSimilarity], result of:
      0.020970564 = score(doc=6962,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23394634 = fieldWeight in 6962, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=6962)
  0.14285715 = coord(2/14)

Abstract: Knowledge of window style, content, location, and grammatical structure may be used to classify documents as originating within a particular discipline or may be used to place a document on a theory vs. practice spectrum. Examines characteristics of phrases and text windows, including their number, location in documents, and grammatical construction, in addition to studying variations in these window characteristics across disciplines. Examines some of the linguistic regularities for individual disciplines, and suggests families of regularities that may prove helpful for the automatic classification of documents, as well as for information retrieval and filtering applications
Source: Information processing and management. 32(1996) no.6, S.747-767

Losee, R.M.: ¬The effect of assigning a metadata or indexing term on document ordering (2013) 0.00
```
0.003790876 = product of:
  0.02653613 = sum of:
    0.00856136 = weight(_text_:information in 1100) [ClassicSimilarity], result of:
      0.00856136 = score(doc=1100,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 1100, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1100)
    0.01797477 = weight(_text_:retrieval in 1100) [ClassicSimilarity], result of:
      0.01797477 = score(doc=1100,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 1100, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1100)
  0.14285715 = coord(2/14)
```
Abstract

The assignment of indexing terms and metadata to documents, data, and other information representations is considered useful, but the utility of including a single term is seldom discussed. The author discusses a simple model of document ordering and then shows how assigning index and metadata labels improves or decreases retrieval performance. The Indexing and Metadata Advantage (IMA) factor measures how indexing or assigning a metadata term helps (or hurts) ordering performance. Performance values and the associated IMA expressions are computed, consistent with several different assumptions. The economic value associated with various term assignment decisions is developed. The IMA term advantage model itself is empirically validated with computer software that shows that the analytic results obtained agree completely with the actual performance gains and losses found when ordering all sets of 14 or fewer documents. When the formulas in the software are changed to differ from this model, the predictions of the actual performance are erroneous.

Source

Journal of the American Society for Information Science and Technology. 64(2013) no.11, S.2191-2200

Losee, R.M.: Improving collection browsing : small world networking and Gray code ordering (2017) 0.00
```
0.0037468998 = product of:
  0.026228298 = sum of:
    0.0050448296 = weight(_text_:information in 5148) [ClassicSimilarity], result of:
      0.0050448296 = score(doc=5148,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.09697737 = fieldWeight in 5148, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5148)
    0.021183468 = weight(_text_:retrieval in 5148) [ClassicSimilarity], result of:
      0.021183468 = score(doc=5148,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.23632148 = fieldWeight in 5148, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5148)
  0.14285715 = coord(2/14)
```
Abstract

Documents in digital and paper libraries may be arranged, based on their topics, in order to facilitate browsing. It may seem intuitively obvious that ordering documents by their subject should improve browsing performance; the results presented in this article suggest that ordering library materials by their Gray code values and through using links consistent with the small world model of document relationships is consistent with improving browsing performance. Below, library circulation data, including ordering with Library of Congress Classification numbers and Library of Congress Subject Headings, are used to provide information useful in generating user-centered document arrangements, as well as user-independent arrangements. Documents may be linearly arranged so they can be placed in a line by topic, such as on a library shelf, or in a list on a computer display. Crossover links, jumps between a document and another document to which it is not adjacent, can be used in library databases to allow additional paths that one might take when browsing. The improvement that is obtained with different combinations of document orderings and different crossovers is examined and applications suggested.

Theme

Klassifikationssysteme im Online-Retrieval
Verbale Doksprachen im Online-Retrieval

Losee, R.M.: ¬The science of information : measurement and applications (1990) 0.00

0.0019338143 = product of:
  0.027073398 = sum of:
    0.027073398 = weight(_text_:information in 813) [ClassicSimilarity], result of:
      0.027073398 = score(doc=813,freq=10.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.5204352 = fieldWeight in 813, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=813)
  0.071428575 = coord(1/14)

COMPASS: Information science
Series: Library and information science
Subject: Information science
Theme: Information

Losee, R.M.: ¬A discipline independent definition of information (1997) 0.00
```
0.0012892094 = product of:
  0.01804893 = sum of:
    0.01804893 = weight(_text_:information in 380) [ClassicSimilarity], result of:
      0.01804893 = score(doc=380,freq=10.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3469568 = fieldWeight in 380, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=380)
  0.071428575 = coord(1/14)
```
Abstract

Information may be defined as the characteristics of the output of a process, these being informative about the process and the input. This discipline independent definition may be applied to all domains, from physics to epistemology. Hierarchies of processes, linked together, provide a communication channel between each of the corresponding functions and layers in the hierarchies. Models of communication, perception, observation, belief, and knowledge are suggested that are consistent with this conceptual framework of information as the value of the output of any process in a hierarchy of processes. Misinformation and errors are considered

Source

Journal of the American Society for Information Science. 48(1997) no.3, S.254-269

Theme

Information

Losee, R.M.; Paris, L.A.H.: Measuring search-engine quality and query difficulty : ranking with Target and Freestyle (1999) 0.00

8.64828E-4 = product of:
  0.012107591 = sum of:
    0.012107591 = weight(_text_:information in 4310) [ClassicSimilarity], result of:
      0.012107591 = score(doc=4310,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23274569 = fieldWeight in 4310, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=4310)
  0.071428575 = coord(1/14)

Source: Journal of the American Society for Information Science. 50(1999) no.10, S.882-889
