Search (8 results, page 1 of 1)

Dominich, S.; Kiezer, T.: A measure theoretic approach to information retrieval (2007) 0.32
```
0.32321647 = product of:
  0.4309553 = sum of:
    0.23081185 = weight(_text_:vector in 445) [ClassicSimilarity], result of:
      0.23081185 = score(doc=445,freq=14.0), product of:
        0.30654848 = queryWeight, product of:
          6.439392 = idf(docFreq=191, maxDocs=44218)
          0.047605187 = queryNorm
        0.7529375 = fieldWeight in 445, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          6.439392 = idf(docFreq=191, maxDocs=44218)
          0.03125 = fieldNorm(doc=445)
    0.16204487 = weight(_text_:space in 445) [ClassicSimilarity], result of:
      0.16204487 = score(doc=445,freq=16.0), product of:
        0.24842183 = queryWeight, product of:
          5.2183776 = idf(docFreq=650, maxDocs=44218)
          0.047605187 = queryNorm
        0.6522972 = fieldWeight in 445, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          5.2183776 = idf(docFreq=650, maxDocs=44218)
          0.03125 = fieldNorm(doc=445)
    0.03809857 = product of:
      0.07619714 = sum of:
        0.07619714 = weight(_text_:model in 445) [ClassicSimilarity], result of:
          0.07619714 = score(doc=445,freq=12.0), product of:
            0.1830527 = queryWeight, product of:
              3.845226 = idf(docFreq=2569, maxDocs=44218)
              0.047605187 = queryNorm
            0.41625792 = fieldWeight in 445, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.845226 = idf(docFreq=2569, maxDocs=44218)
              0.03125 = fieldNorm(doc=445)
      0.5 = coord(1/2)
  0.75 = coord(3/4)
```
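The explain tree above can be re-derived by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`, which agree with the figures printed in the trace. `queryNorm`, `fieldNorm`, and the `coord` factors are copied from the trace rather than recomputed, and the function names are illustrative only.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.047605187  # copied from the explain output, not recomputed


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight, as in the explain tree
    tf = math.sqrt(freq)                         # tf(freq) = sqrt(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm         # idf * queryNorm
    field_weight = tf * term_idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight


# Values for doc 445, read off the trace above
vector = term_score(freq=14, doc_freq=191, field_norm=0.03125)
space = term_score(freq=16, doc_freq=650, field_norm=0.03125)
model = term_score(freq=12, doc_freq=2569, field_norm=0.03125) * 0.5  # coord(1/2)

total = (vector + space + model) * 0.75  # coord(3/4)
print(round(total, 6))
```

Because Lucene computes in single precision and this sketch uses Python doubles, the result should match the reported 0.32321647 only up to small rounding differences.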
Abstract

The vector space model of information retrieval is one of the classical and widely applied retrieval models. Paradoxically, it has been characterized by a discrepancy between its formal framework and implementable form. The underlying concepts of the vector space model are mathematical terms: linear space, vector, and inner product. However, in the vector space model, the mathematical meaning of these concepts is not preserved. They are used as mere computational constructs or metaphors. Thus, the vector space model actually does not follow formally from the mathematical concepts on which it has been claimed to rest. This problem has been recognized for more than two decades, but no proper solution has emerged so far. The present article proposes a solution to this problem. First, the concept of retrieval is defined based on the mathematical measure theory. Then, retrieval is particularized using fuzzy set theory. As a result, the retrieval function is conceived as the cardinality of the intersection of two fuzzy sets. This view makes it possible to build a connection to linear spaces. It is shown that the classical and the generalized vector space models, as well as the latent semantic indexing model, gain a correct formal background with which they are consistent. At the same time it becomes clear that the inner product is not a necessary ingredient of the vector space model, and hence of Information Retrieval (IR). The Principle of Object Invariance is introduced to handle this situation. Moreover, this view makes it possible to consistently formulate new retrieval methods: in linear space with general basis, entropy-based, and probability-based. It is also shown that Information Retrieval may be viewed as integral calculus, and thus it gains a very compact and elegant mathematical way of writing. Also, Information Retrieval may thus be conceived as an application of mathematical measure theory.
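As a rough illustration of the idea that a retrieval function can be read as the cardinality of the intersection of two fuzzy sets, the sketch below treats a document and a query as fuzzy sets of terms. It is not taken from the article: the membership values are made up, the cardinality is assumed to be the sigma-count (sum of memberships), and the two intersection operators (minimum and algebraic product) are common textbook choices rather than the authors' stated definitions.

```python
def sigma_count(fuzzy_set):
    # Assumed cardinality: sum of membership degrees (sigma-count)
    return sum(fuzzy_set.values())


def intersect(a, b, t_norm):
    # Fuzzy intersection over shared terms; terms missing from either set
    # have membership 0 and drop out under both operators used here
    return {term: t_norm(a[term], b[term]) for term in a.keys() & b.keys()}


# Hypothetical membership degrees (e.g. normalised term weights)
doc = {"vector": 0.8, "space": 0.5, "model": 0.2}
query = {"vector": 0.9, "space": 0.4, "retrieval": 0.7}

rsv_min = sigma_count(intersect(doc, query, min))
rsv_prod = sigma_count(intersect(doc, query, lambda x, y: x * y))

# With the algebraic product, the cardinality coincides with the dot product
# of the two weight vectors, which hints at the link to linear spaces
terms = set(doc) | set(query)
dot = sum(doc.get(t, 0.0) * query.get(t, 0.0) for t in terms)

print(rsv_min, rsv_prod, dot)
```

Under these assumptions the product-based value equals the familiar inner-product score, while the minimum-based value is a different ranking function built from the same sets, so the inner product appears as one choice among several.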

Dominich, S.; Góth, J.; Kiezer, T.; Szlávik, Z.: An entropy-based interpretation of retrieval status value-based retrieval, and its application to the computation of term and query discrimination value (2004) 0.29
```
0.28576547 = product of:
  0.38102064 = sum of:
    0.21809667 = weight(_text_:vector in 2237) [ClassicSimilarity], result of:
      0.21809667 = score(doc=2237,freq=8.0), product of:
        0.30654848 = queryWeight, product of:
          6.439392 = idf(docFreq=191, maxDocs=44218)
          0.047605187 = queryNorm
        0.711459 = fieldWeight in 2237, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          6.439392 = idf(docFreq=191, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2237)
    0.12403977 = weight(_text_:space in 2237) [ClassicSimilarity], result of:
      0.12403977 = score(doc=2237,freq=6.0), product of:
        0.24842183 = queryWeight, product of:
          5.2183776 = idf(docFreq=650, maxDocs=44218)
          0.047605187 = queryNorm
        0.49931106 = fieldWeight in 2237, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          5.2183776 = idf(docFreq=650, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2237)
    0.03888419 = product of:
      0.07776838 = sum of:
        0.07776838 = weight(_text_:model in 2237) [ClassicSimilarity], result of:
          0.07776838 = score(doc=2237,freq=8.0), product of:
            0.1830527 = queryWeight, product of:
              3.845226 = idf(docFreq=2569, maxDocs=44218)
              0.047605187 = queryNorm
            0.42484146 = fieldWeight in 2237, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.845226 = idf(docFreq=2569, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2237)
      0.5 = coord(1/2)
  0.75 = coord(3/4)
```
Abstract

The concepts of Shannon information and entropy have been applied to a number of information retrieval tasks such as to formalize the probabilistic model, to design practical retrieval systems, to cluster documents, and to model texture in image retrieval. In this report, the concept of entropy is used for a different purpose. It is shown that any positive Retrieval Status Value (RSV)-based retrieval system may be conceived as a special probability space in which the amount of the associated Shannon information is being reduced; in this view, the retrieval system is referred to as Uncertainty Decreasing Operation (UDO). The concept of UDO is then proposed as a theoretical background for term and query discrimination power, and it is applied to the computation of term and query discrimination values in the vector space retrieval model. Experimental evidence is given as regards such computation; the results obtained compare well to those obtained using vector-based calculation of term discrimination values. The UDO-based computation, however, presents advantages over the vector-based calculation: It is faster, easier to assess and handle in practice, and its application is not restricted to the vector space model. Based on the ADI test collection, it is shown that the UDO-based Term Discrimination Value (TDV) weighting scheme yields better retrieval effectiveness than using the vector-based TDV weighting scheme. Also, experimental evidence is given to the intuition that the choice of an appropriate weighting scheme and similarity measure depends on collection properties, and thus the UDO approach may be used as a theoretical basis for this intuition.

Dominich, S.: A unified mathematical definition of classical information retrieval (2000) 0.05

```
0.05452417 = product of:
  0.21809667 = sum of:
    0.21809667 = weight(_text_:vector in 4768) [ClassicSimilarity], result of:
      0.21809667 = score(doc=4768,freq=2.0), product of:
        0.30654848 = queryWeight, product of:
          6.439392 = idf(docFreq=191, maxDocs=44218)
          0.047605187 = queryNorm
        0.711459 = fieldWeight in 4768, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.439392 = idf(docFreq=191, maxDocs=44218)
          0.078125 = fieldNorm(doc=4768)
  0.25 = coord(1/4)
```

Abstract: A unified mathematical definition for the classical (vector and probabilistic) models of information retrieval is given. Also, a mathematical structure (Diophantine set) behind relevance feedback is identified.

Dominich, S.: The interaction-based information retrieval paradigm (1997) 0.01

```
0.012899691 = product of:
  0.051598765 = sum of:
    0.051598765 = product of:
      0.10319753 = sum of:
        0.10319753 = weight(_text_:22 in 7782) [ClassicSimilarity], result of:
          0.10319753 = score(doc=7782,freq=2.0), product of:
            0.16670525 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047605187 = queryNorm
            0.61904186 = fieldWeight in 7782, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=7782)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Source: Encyclopedia of library and information science. Vol.59, [=Suppl.22]

Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Special issue on model design, formulation and explanation in information retrieval using mathematics (2006) 0.01

```
0.011665257 = product of:
  0.046661027 = sum of:
    0.046661027 = product of:
      0.09332205 = sum of:
        0.09332205 = weight(_text_:model in 110) [ClassicSimilarity], result of:
          0.09332205 = score(doc=110,freq=2.0), product of:
            0.1830527 = queryWeight, product of:
              3.845226 = idf(docFreq=2569, maxDocs=44218)
              0.047605187 = queryNorm
            0.50980973 = fieldWeight in 110, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.845226 = idf(docFreq=2569, maxDocs=44218)
              0.09375 = fieldNorm(doc=110)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Dominich, S.: Interaction information retrieval (1994) 0.01
```
0.006804733 = product of:
  0.027218932 = sum of:
    0.027218932 = product of:
      0.054437865 = sum of:
        0.054437865 = weight(_text_:model in 8157) [ClassicSimilarity], result of:
          0.054437865 = score(doc=8157,freq=2.0), product of:
            0.1830527 = queryWeight, product of:
              3.845226 = idf(docFreq=2569, maxDocs=44218)
              0.047605187 = queryNorm
            0.29738903 = fieldWeight in 8157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.845226 = idf(docFreq=2569, maxDocs=44218)
              0.0546875 = fieldNorm(doc=8157)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Abstract

In existing information retrieval models there are three different ways documents are represented for retrieval purposes: vectors of weights, collections of sentences and artificial neurons. Accordingly, retrieval depends on a similarity function, or means an inference, or is a spreading of activation. Relevancy is considered to be a critical modelling parameter which is either a priori or it is not treated at all. Assuming that relevancy may equally be an emergent entity, thus not requiring any a priori modelling, the paper proposes the Interaction Information Retrieval model in which documents are interconnected, queries and documents are treated in the same way, and in which retrieval is the result of the interconnection between query and documents. Algorithms and experiences gained with practical applications are presented. A theoretical mathematical formulation of this type of retrieval is also given.

Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.00

```
0.004837384 = product of:
  0.019349536 = sum of:
    0.019349536 = product of:
      0.03869907 = sum of:
        0.03869907 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
          0.03869907 = score(doc=1451,freq=2.0), product of:
            0.16670525 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047605187 = queryNorm
            0.23214069 = fieldWeight in 1451, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Date: 22. 3.2003 19:27:36

Dominich, S.: Mathematical foundations of information retrieval (2001) 0.00

```
0.0040311534 = product of:
  0.016124614 = sum of:
    0.016124614 = product of:
      0.032249227 = sum of:
        0.032249227 = weight(_text_:22 in 1753) [ClassicSimilarity], result of:
          0.032249227 = score(doc=1753,freq=2.0), product of:
            0.16670525 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.047605187 = queryNorm
            0.19345059 = fieldWeight in 1753, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1753)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Date: 22. 3.2008 12:26:32
