Search (70 results, page 1 of 4)

Schiminovich, S.: Automatic classification and retrieval of documents by means of a bibliographic pattern discovery algorithm (1971) 0.04

0.03527055 = product of:
  0.1645959 = sum of:
    0.047104023 = weight(_text_:classification in 4846) [ClassicSimilarity], result of:
      0.047104023 = score(doc=4846,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.49260917 = fieldWeight in 4846, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.109375 = fieldNorm(doc=4846)
    0.070387855 = weight(_text_:bibliographic in 4846) [ClassicSimilarity], result of:
      0.070387855 = score(doc=4846,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.6021745 = fieldWeight in 4846, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.109375 = fieldNorm(doc=4846)
    0.047104023 = weight(_text_:classification in 4846) [ClassicSimilarity], result of:
      0.047104023 = score(doc=4846,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.49260917 = fieldWeight in 4846, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.109375 = fieldNorm(doc=4846)
  0.21428572 = coord(3/14)
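
The explain tree above can be reproduced by hand. The following is a minimal sketch, assuming Lucene's ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). It takes queryNorm, fieldNorm, and the 3/14 coord factor from the output as given rather than deriving them. The function and variable names are illustrative and are not part of Lucene.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.03002521  # copied from the explain output, not derived here


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """queryWeight * fieldWeight for a single term clause."""
    query_weight = idf(doc_freq) * query_norm
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


# Clauses matched by doc 4846; "classification" appears twice in the tree.
classification = term_score(freq=2.0, doc_freq=4974, field_norm=0.109375)
bibliographic = term_score(freq=2.0, doc_freq=2449, field_norm=0.109375)

coord = 3 / 14  # 3 of 14 query clauses matched
doc_4846_score = (2 * classification + bibliographic) * coord
print(doc_4846_score)
```

The result should agree with the displayed 0.03527055 up to single-precision rounding, since Lucene computes these values as 32-bit floats.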

Srinivasan, P.: Intelligent information retrieval using rough set approximations (1989) 0.02

0.024607476 = product of:
  0.11483489 = sum of:
    0.04079328 = weight(_text_:classification in 2526) [ClassicSimilarity], result of:
      0.04079328 = score(doc=2526,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.42661208 = fieldWeight in 2526, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2526)
    0.03324832 = product of:
      0.06649664 = sum of:
        0.06649664 = weight(_text_:schemes in 2526) [ClassicSimilarity], result of:
          0.06649664 = score(doc=2526,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.41386467 = fieldWeight in 2526, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2526)
      0.5 = coord(1/2)
    0.04079328 = weight(_text_:classification in 2526) [ClassicSimilarity], result of:
      0.04079328 = score(doc=2526,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.42661208 = fieldWeight in 2526, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2526)
  0.21428572 = coord(3/14)
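
This tree contains a nested clause: the "schemes" weight is halved by coord(1/2) before it enters the outer sum. The sketch below assumes the same ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and reads coord(1/2) as "one of two inner clauses matched". The helper names are hypothetical.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.03002521  # taken from the explain output
FIELD_NORM = 0.0546875   # fieldNorm(doc=2526)


def term_score(freq, doc_freq, field_norm):
    """queryWeight * fieldWeight for one term, ClassicSimilarity style."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)


def boolean_score(clause_scores, matched, total):
    """Sum the matching clause scores, then scale by coord(matched/total)."""
    return sum(clause_scores) * (matched / total)


classification = term_score(6.0, 4974, FIELD_NORM)

# Inner group: only "schemes" matched, so its weight is scaled by 1/2.
schemes_group = boolean_score(
    [term_score(2.0, 569, FIELD_NORM)], matched=1, total=2
)

# Outer query: 3 of 14 clauses matched.
doc_2526_score = boolean_score(
    [classification, schemes_group, classification], matched=3, total=14
)
print(schemes_group, doc_2526_score)
```

Treating each "product of / sum of" level as a sum scaled by its own coord factor keeps the nesting explicit and mirrors the indentation of the explain output.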

Abstract: The theory of rough sets was introduced in 1982. It allows the classification of objects into sets of equivalent members based on their attributes. Any combination of the same objects (or even their attributes) may be examined using the resultant classification. The theory has direct applications in the design and evaluation of classification schemes and the selection of discriminating attributes. Introductory papers discuss its application in the domain of medical diagnostic systems and the design of information retrieval systems accessing collections of documents. Advantages offered by the theory are: the implicit inclusion of Boolean logic; term weighting; and the ability to rank retrieved documents.

Savoy, J.: Ranking schemes in hybrid Boolean systems : a new approach (1997) 0.02

0.020556571 = product of:
  0.095930666 = sum of:
    0.02546139 = weight(_text_:subject in 393) [ClassicSimilarity], result of:
      0.02546139 = score(doc=393,freq=2.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.23709705 = fieldWeight in 393, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=393)
    0.040303055 = product of:
      0.08060611 = sum of:
        0.08060611 = weight(_text_:schemes in 393) [ClassicSimilarity], result of:
          0.08060611 = score(doc=393,freq=4.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.5016798 = fieldWeight in 393, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.046875 = fieldNorm(doc=393)
      0.5 = coord(1/2)
    0.030166224 = weight(_text_:bibliographic in 393) [ClassicSimilarity], result of:
      0.030166224 = score(doc=393,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.2580748 = fieldWeight in 393, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.046875 = fieldNorm(doc=393)
  0.21428572 = coord(3/14)

Abstract: In most commercial online systems, the retrieval system is based on the Boolean model and its inverted file organization. Since the investment in these systems is so great and changing them could be economically unfeasible, this article suggests a new ranking scheme especially adapted for hypertext environments in order to produce more effective retrieval results and yet maintain the effectiveness of the investment made to date in the Boolean model. To select the retrieved documents, the suggested ranking strategy uses multiple sources of document content evidence. The proposed scheme integrates both the information provided by the index and query terms, and the inherent relationships between documents such as bibliographic references or hypertext links. We will demonstrate that our scheme represents an integration of both subject and citation indexing, and results in a significant improvement over classical ranking schemes used in hybrid Boolean systems, while preserving its efficiency. Moreover, through knowing the nearest neighbor and the hypertext links which constitute additional sources of evidence, our strategy will take them into account in order to further improve retrieval effectiveness and to provide 'good' starting points for browsing in a hypertext or hypermedia environment.

Guerrero-Bote, V.P.; Moya Anegón, F. de; Herrero Solana, V.: Document organization using Kohonen's algorithm (2002) 0.02

0.020154601 = product of:
  0.0940548 = sum of:
    0.026916584 = weight(_text_:classification in 2564) [ClassicSimilarity], result of:
      0.026916584 = score(doc=2564,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.28149095 = fieldWeight in 2564, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=2564)
    0.04022163 = weight(_text_:bibliographic in 2564) [ClassicSimilarity], result of:
      0.04022163 = score(doc=2564,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.34409973 = fieldWeight in 2564, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.0625 = fieldNorm(doc=2564)
    0.026916584 = weight(_text_:classification in 2564) [ClassicSimilarity], result of:
      0.026916584 = score(doc=2564,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.28149095 = fieldWeight in 2564, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=2564)
  0.21428572 = coord(3/14)

Abstract: The classification of documents from a bibliographic database is a task that is linked to processes of information retrieval based on partial matching. A method is described of vectorizing reference documents from LISA which permits their topological organization using Kohonen's algorithm. As an example a map is generated of 202 documents from LISA, and an analysis is made of the possibilities of this type of neural network with respect to the development of information retrieval systems based on graphical browsing.

Kang, I.-H.; Kim, G.C.: Integration of multiple evidences based on a query type for web search (2004) 0.02

0.017829034 = product of:
  0.08320216 = sum of:
    0.029138058 = weight(_text_:classification in 2568) [ClassicSimilarity], result of:
      0.029138058 = score(doc=2568,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.3047229 = fieldWeight in 2568, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2568)
    0.029138058 = weight(_text_:classification in 2568) [ClassicSimilarity], result of:
      0.029138058 = score(doc=2568,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.3047229 = fieldWeight in 2568, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2568)
    0.024926046 = product of:
      0.04985209 = sum of:
        0.04985209 = weight(_text_:texts in 2568) [ClassicSimilarity], result of:
          0.04985209 = score(doc=2568,freq=2.0), product of:
            0.16460659 = queryWeight, product of:
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03002521 = queryNorm
            0.302856 = fieldWeight in 2568, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2568)
      0.5 = coord(1/2)
  0.21428572 = coord(3/14)

Abstract: The massive and heterogeneous Web exacerbates IR problems and short user queries make them worse. The contents of web pages are not enough to find answer pages. PageRank compensates for the insufficiencies of content information. The content information and PageRank are combined to get better results. However, static combination of multiple evidences may lower the retrieval performance. We have to use different strategies to meet the need of a user. We can classify user queries into three categories according to users' intent: the topic relevance task, the homepage finding task, and the service finding task. In this paper, we present a user query classification method. The difference of distribution, mutual information, the usage rate as anchor texts and the POS information are used for the classification. After we classify a user query, we apply different algorithms and information for better results. For the topic relevance task, we emphasize the content information; for the homepage finding task, on the other hand, we emphasize the link information and the URL information. We could get the best performance when our proposed classification method with the OKAPI scoring algorithm was used.

Faloutsos, C.: Signature files (1992) 0.02

0.015022537 = product of:
  0.07010517 = sum of:
    0.026916584 = weight(_text_:classification in 3499) [ClassicSimilarity], result of:
      0.026916584 = score(doc=3499,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.28149095 = fieldWeight in 3499, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=3499)
    0.026916584 = weight(_text_:classification in 3499) [ClassicSimilarity], result of:
      0.026916584 = score(doc=3499,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.28149095 = fieldWeight in 3499, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=3499)
    0.016272005 = product of:
      0.03254401 = sum of:
        0.03254401 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
          0.03254401 = score(doc=3499,freq=2.0), product of:
            0.10514317 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03002521 = queryNorm
            0.30952093 = fieldWeight in 3499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
      0.5 = coord(1/2)
  0.21428572 = coord(3/14)

Abstract: Presents a survey and discussion on signature-based text retrieval methods. It describes the main idea behind the signature approach and its advantages over other text retrieval methods; provides a classification of the signature methods that have appeared in the literature; describes the main representatives of each class, together with their relative advantages and drawbacks; and gives a list of applications as well as commercial or university prototypes that use the signature approach.
Date: 7. 5.1999 15:22:48

Koumenides, C.L.; Shadbolt, N.R.: Ranking methods for entity-oriented semantic web search (2014) 0.01

0.014758593 = product of:
  0.068873435 = sum of:
    0.02018744 = weight(_text_:classification in 1280) [ClassicSimilarity], result of:
      0.02018744 = score(doc=1280,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 1280, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=1280)
    0.02849856 = product of:
      0.05699712 = sum of:
        0.05699712 = weight(_text_:schemes in 1280) [ClassicSimilarity], result of:
          0.05699712 = score(doc=1280,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.35474116 = fieldWeight in 1280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.046875 = fieldNorm(doc=1280)
      0.5 = coord(1/2)
    0.02018744 = weight(_text_:classification in 1280) [ClassicSimilarity], result of:
      0.02018744 = score(doc=1280,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 1280, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=1280)
  0.21428572 = coord(3/14)

Abstract: This article provides a technical review of semantic search methods used to support text-based search over formal Semantic Web knowledge bases. Our focus is on ranking methods and auxiliary processes explored by existing semantic search systems, outlined within broad areas of classification. We present reflective examples from the literature in some detail, which should appeal to readers interested in a deeper perspective on the various methods and systems implemented in the outlined literature. The presentation covers graph exploration and propagation methods, adaptations of classic probabilistic retrieval models, and query-independent link analysis via flexible extensions to the PageRank algorithm. Future research directions are discussed, including development of more cohesive retrieval models to unlock further potentials and uses, data indexing schemes, integration with user interfaces, and building community consensus for more systematic evaluation and gradual development.

Green, R.: Topical relevance relationships : 2: an exploratory study and preliminary typology (1995) 0.01

0.012545095 = product of:
  0.08781566 = sum of:
    0.036007844 = weight(_text_:subject in 3724) [ClassicSimilarity], result of:
      0.036007844 = score(doc=3724,freq=4.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.33530587 = fieldWeight in 3724, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=3724)
    0.051807817 = product of:
      0.103615634 = sum of:
        0.103615634 = weight(_text_:texts in 3724) [ClassicSimilarity], result of:
          0.103615634 = score(doc=3724,freq=6.0), product of:
            0.16460659 = queryWeight, product of:
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03002521 = queryNorm
            0.6294744 = fieldWeight in 3724, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.046875 = fieldNorm(doc=3724)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)

Abstract: The assumption of topic matching between user needs and texts topically relevant to those needs is often erroneous. Reports an empirical investigation of the question 'what relationship types actually account for topical relevance?' In order to avoid the bias to topic matching search strategies, user needs are back generated from a randomly selected subset of the subject headings employed in a user oriented topical concordance. The corresponding relevant texts are those indicated in the concordance under the subject heading. Compares the topics of the user needs with the topics of the relevant texts to determine the relationships between them. Topical relevance relationships include a large variety of relationships, only some of which are matching relationships. Others are examples of paradigmatic or syntagmatic relationships. There appear to be no constraints on the kinds of relationships that can function as topical relevance relationships. They are distinguishable from other types of relationships only on functional grounds.

Carpineto, C.; Romano, G.: Information retrieval through hybrid navigation of lattice representations (1996) 0.01

0.010012914 = product of:
  0.0700904 = sum of:
    0.035193928 = weight(_text_:bibliographic in 7434) [ClassicSimilarity], result of:
      0.035193928 = score(doc=7434,freq=2.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.30108726 = fieldWeight in 7434, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7434)
    0.034896467 = product of:
      0.069792934 = sum of:
        0.069792934 = weight(_text_:texts in 7434) [ClassicSimilarity], result of:
          0.069792934 = score(doc=7434,freq=2.0), product of:
            0.16460659 = queryWeight, product of:
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03002521 = queryNorm
            0.42399842 = fieldWeight in 7434, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7434)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)

Abstract: Presents a comprehensive approach to automatic organization and hybrid navigation of text databases. An organizing stage builds a particular lattice representation of the data, through text indexing followed by lattice clustering of the indexed texts. The lattice representation supports the navigation state of the system, a visual retrieval interface that combines 3 main retrieval strategies: browsing, querying, and bounding. Such a hybrid paradigm permits high flexibility in trading off information exploration and retrieval, and has good retrieval performance. Compares information retrieval using lattice-based hybrid navigation with conventional Boolean querying. Experiments conducted on 2 medium-sized bibliographic databases showed that the performance of lattice retrieval was comparable to or better than Boolean retrieval.

Bauckhage, C.: Marginalizing over the PageRank damping factor (2014) 0.01

0.009613066 = product of:
  0.06729146 = sum of:
    0.03364573 = weight(_text_:classification in 928) [ClassicSimilarity], result of:
      0.03364573 = score(doc=928,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.35186368 = fieldWeight in 928, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.078125 = fieldNorm(doc=928)
    0.03364573 = weight(_text_:classification in 928) [ClassicSimilarity], result of:
      0.03364573 = score(doc=928,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.35186368 = fieldWeight in 928, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.078125 = fieldNorm(doc=928)
  0.14285715 = coord(2/14)

Abstract: In this note, we show how to marginalize over the damping parameter of the PageRank equation so as to obtain a parameter-free version known as TotalRank. Our discussion is meant as a reference and intended to provide a guided tour towards an interesting result that has applications in information retrieval and classification.

González-Ibáñez, R.; Esparza-Villamán, A.; Vargas-Godoy, J.C.; Shah, C.: A comparison of unimodal and multimodal models for implicit detection of relevance in interactive IR (2019) 0.01
```
0.00832516 = product of:
  0.058276117 = sum of:
    0.029138058 = weight(_text_:classification in 5417) [ClassicSimilarity], result of:
      0.029138058 = score(doc=5417,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.3047229 = fieldWeight in 5417, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5417)
    0.029138058 = weight(_text_:classification in 5417) [ClassicSimilarity], result of:
      0.029138058 = score(doc=5417,freq=6.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.3047229 = fieldWeight in 5417, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5417)
  0.14285715 = coord(2/14)
```
Abstract: Implicit detection of relevance has been approached by many during the last decade. From the use of individual measures to the use of multiple features from different sources (multimodality), studies have shown the feasibility to automatically detect whether a document is relevant. Despite promising results, it is not clear yet to what extent multimodality constitutes an effective approach compared to unimodality. In this article, we hypothesize that it is possible to build unimodal models capable of outperforming multimodal models in the detection of perceived relevance. To test this hypothesis, we conducted three experiments to compare unimodal and multimodal classification models built using a combination of 24 features. Our classification experiments showed that a univariate unimodal model based on the left-click feature supports our hypothesis. On the other hand, our prediction experiment suggests that multimodality slightly improves early classification compared to the best unimodal models. Based on our results, we argue that the feasibility for practical applications of state-of-the-art multimodal approaches may be strongly constrained by technology, cultural, ethical, and legal aspects, in which case unimodality may offer a better alternative today for supporting relevance detection in interactive information retrieval systems.

Biskri, I.; Rompré, L.: Using association rules for query reformulation (2012) 0.01

0.008156957 = product of:
  0.057098698 = sum of:
    0.028549349 = weight(_text_:classification in 92) [ClassicSimilarity], result of:
      0.028549349 = score(doc=92,freq=4.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.29856625 = fieldWeight in 92, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=92)
    0.028549349 = weight(_text_:classification in 92) [ClassicSimilarity], result of:
      0.028549349 = score(doc=92,freq=4.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.29856625 = fieldWeight in 92, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=92)
  0.14285715 = coord(2/14)

Abstract: In this paper the authors will present research on the combination of two methods of data mining: text classification and maximal association rules. Text classification has been the focus of interest of many researchers for a long time. However, the results take the form of lists of words (classes) that people often do not know what to do with. The use of maximal association rules induced a number of advantages: (1) the detection of dependencies and correlations between the relevant units of information (words) of different classes, (2) the extraction of hidden knowledge, often relevant, from a large volume of data. The authors will show how this combination can improve the process of information retrieval.

Baeza-Yates, R.; Navarro, G.: Block addressing indices for approximate text retrieval (2000) 0.01
```
0.007708565 = product of:
  0.05395995 = sum of:
    0.02546139 = weight(_text_:subject in 4295) [ClassicSimilarity], result of:
      0.02546139 = score(doc=4295,freq=2.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.23709705 = fieldWeight in 4295, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=4295)
    0.02849856 = product of:
      0.05699712 = sum of:
        0.05699712 = weight(_text_:schemes in 4295) [ClassicSimilarity], result of:
          0.05699712 = score(doc=4295,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.35474116 = fieldWeight in 4295, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.046875 = fieldNorm(doc=4295)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)
```
Abstract: The issue of reducing the space overhead when indexing large text databases is becoming more and more important, as text collections grow in size. Another subject, which is gaining importance as text databases grow and get more heterogeneous and error prone, is that of flexible string matching. One of the best tools to make the search more flexible is to allow a limited number of differences between the words found and those sought. This is called 'approximate text searching', which is becoming more and more popular. In recent years some indexing schemes with very low space overhead have appeared, some of them dealing with approximate searching. These low overhead indices (whose most notorious exponent is Glimpse) are modified inverted files, where space is saved by making the lists of occurrences point to text blocks instead of exact word positions. Despite their existence, little is known about the expected behaviour of these 'block addressing' indices, and even less is known when it comes to coping with approximate search. Our main contribution is an analytical study of the space-time trade-offs for indexed text searching.

Hofferer, M.: Heuristic search in information retrieval (1994) 0.01

0.007690453 = product of:
  0.053833168 = sum of:
    0.026916584 = weight(_text_:classification in 1070) [ClassicSimilarity], result of:
      0.026916584 = score(doc=1070,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.28149095 = fieldWeight in 1070, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=1070)
    0.026916584 = weight(_text_:classification in 1070) [ClassicSimilarity], result of:
      0.026916584 = score(doc=1070,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.28149095 = fieldWeight in 1070, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0625 = fieldNorm(doc=1070)
  0.14285715 = coord(2/14)

Abstract: Describes an adaptive information retrieval system, the Information Retrieval Algorithm System (IRAS), that uses heuristic searching to sample a document space and retrieve relevant documents according to users' requests. The system also includes a learning module, based on a knowledge representation system and an approximate probabilistic characterization of relevant documents, that reproduces a user's classification of relevant documents and provides a rule-controlled ranking.

Soulier, L.; Jabeur, L.B.; Tamine, L.; Bahsoun, W.: On ranking relevant entities in heterogeneous networks using a language-based model (2013) 0.01
```
0.007673029 = product of:
  0.0537112 = sum of:
    0.043541197 = weight(_text_:bibliographic in 664) [ClassicSimilarity], result of:
      0.043541197 = score(doc=664,freq=6.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.3724989 = fieldWeight in 664, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.0390625 = fieldNorm(doc=664)
    0.010170003 = product of:
      0.020340007 = sum of:
        0.020340007 = weight(_text_:22 in 664) [ClassicSimilarity], result of:
          0.020340007 = score(doc=664,freq=2.0), product of:
            0.10514317 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03002521 = queryNorm
            0.19345059 = fieldWeight in 664, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=664)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)
```

Abstract: A new challenge, accessing multiple relevant entities, arises from the availability of linked heterogeneous data. In this article, we address more specifically the problem of accessing relevant entities, such as publications and authors within a bibliographic network, given an information need. We propose a novel algorithm, called BibRank, that estimates a joint relevance of documents and authors within a bibliographic network. This model ranks each type of entity using a score propagation algorithm with respect to the query topic and the structure of the underlying bi-type information entity network. Evidence sources, namely content-based and network-based scores, are both used to estimate the topical similarity between connected entities. For this purpose, authorship relationships are analyzed through a language model-based score on the one hand and, on the other hand, non-topically related entities of the same type are detected through marginal citations. The article reports the results of experiments using the BibRank algorithm for an information retrieval task. The CiteSeerX bibliographic data set forms the basis for the automatic generation and evaluation of topical queries. We show that a statistically significant improvement over closely related ranking models is achieved.
Date: 22. 3.2013 19:34:49

Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.01

0.0073792967 = product of:
  0.034436718 = sum of:
    0.01009372 = weight(_text_:classification in 6) [ClassicSimilarity], result of:
      0.01009372 = score(doc=6,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.10555911 = fieldWeight in 6, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0234375 = fieldNorm(doc=6)
    0.01424928 = product of:
      0.02849856 = sum of:
        0.02849856 = weight(_text_:schemes in 6) [ClassicSimilarity], result of:
          0.02849856 = score(doc=6,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.17737058 = fieldWeight in 6, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.5 = coord(1/2)
    0.01009372 = weight(_text_:classification in 6) [ClassicSimilarity], result of:
      0.01009372 = score(doc=6,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.10555911 = fieldWeight in 6, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.0234375 = fieldNorm(doc=6)
  0.21428572 = coord(3/14)

Content:
Chapter 9. Accelerating the Computation of PageRank: 9.1 An Adaptive Power Method - 9.2 Extrapolation - 9.3 Aggregation - 9.4 Other Numerical Methods
Chapter 10. Updating the PageRank Vector: 10.1 The Two Updating Problems and their History - 10.2 Restarting the Power Method - 10.3 Approximate Updating Using Approximate Aggregation - 10.4 Exact Aggregation - 10.5 Exact vs. Approximate Aggregation - 10.6 Updating with Iterative Aggregation - 10.7 Determining the Partition - 10.8 Conclusions
Chapter 11. The HITS Method for Ranking Webpages: 11.1 The HITS Algorithm - 11.2 HITS Implementation - 11.3 HITS Convergence - 11.4 HITS Example - 11.5 Strengths and Weaknesses of HITS - 11.6 HITS's Relationship to Bibliometrics - 11.7 Query-Independent HITS - 11.8 Accelerating HITS - 11.9 HITS Sensitivity
Chapter 12. Other Link Methods for Ranking Webpages: 12.1 SALSA - 12.2 Hybrid Ranking Methods - 12.3 Rankings based on Traffic Flow
Chapter 13. The Future of Web Information Retrieval: 13.1 Spam - 13.2 Personalization - 13.3 Clustering - 13.4 Intelligent Agents - 13.5 Trends and Time-Sensitive Search - 13.6 Privacy and Censorship - 13.7 Library Classification Schemes - 13.8 Data Fusion
Chapter 14. Resources for Web Information Retrieval: 14.1 Resources for Getting Started - 14.2 Resources for Serious Study
Chapter 15. The Mathematics Guide: 15.1 Linear Algebra - 15.2 Perron-Frobenius Theory - 15.3 Markov Chains - 15.4 Perron Complementation - 15.5 Stochastic Complementation - 15.6 Censoring - 15.7 Aggregation - 15.8 Disaggregation

Efron, M.: Query expansion and dimensionality reduction : Notions of optimality in Rocchio relevance feedback and latent semantic indexing (2008) 0.01

0.00576784 = product of:
  0.04037488 = sum of:
    0.02018744 = weight(_text_:classification in 2020) [ClassicSimilarity], result of:
      0.02018744 = score(doc=2020,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 2020, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=2020)
    0.02018744 = weight(_text_:classification in 2020) [ClassicSimilarity], result of:
      0.02018744 = score(doc=2020,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 2020, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=2020)
  0.14285715 = coord(2/14)

Abstract: Rocchio relevance feedback and latent semantic indexing (LSI) are well-known extensions of the vector space model for information retrieval (IR). This paper analyzes the statistical relationship between these extensions. The analysis focuses on each method's basis in least-squares optimization. Noting that LSI and Rocchio relevance feedback both alter the vector space model in a way that is in some sense least-squares optimal, we ask: what is the relationship between LSI's and Rocchio's notions of optimality? What does this relationship imply for IR? Using an analytical approach, we argue that Rocchio relevance feedback is optimal if we understand retrieval as a simplified classification problem. On the other hand, LSI's motivation comes to the fore if we understand it as a biased regression technique, where projection onto a low-dimensional orthogonal subspace of the documents reduces model variance.
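
For readers unfamiliar with Rocchio relevance feedback, the textbook form of the update moves the query vector toward the centroid of the relevant documents and away from the centroid of the non-relevant ones. The sketch below shows only that standard update, not the paper's least-squares analysis; the function name and the default weights are illustrative assumptions.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return alpha*q + beta*centroid(relevant) - gamma*centroid(non_relevant).

    All vectors are plain lists of floats of equal length.
    The default weights are illustrative, not prescribed by the source.
    """
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    rel_centroid = centroid(relevant)
    non_centroid = centroid(non_relevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel_centroid, non_centroid)]


# Hypothetical three-term vocabulary; the user judged two documents
# relevant and one non-relevant.
new_query = rocchio(
    query=[1.0, 0.0, 0.0],
    relevant=[[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]],
    non_relevant=[[0.0, 0.0, 1.0]],
    alpha=1.0, beta=0.5, gamma=0.25,
)
print(new_query)
```

Terms that occur mainly in relevant documents gain weight in the new query, while terms shared with non-relevant documents are damped.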

Zhang, W.; Yoshida, T.; Tang, X.: A comparative study of TF*IDF, LSI and multi-words for text classification (2011) 0.01

0.00576784 = product of:
  0.04037488 = sum of:
    0.02018744 = weight(_text_:classification in 1165) [ClassicSimilarity], result of:
      0.02018744 = score(doc=1165,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 1165, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=1165)
    0.02018744 = weight(_text_:classification in 1165) [ClassicSimilarity], result of:
      0.02018744 = score(doc=1165,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 1165, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=1165)
  0.14285715 = coord(2/14)

Jindal, V.; Bawa, S.; Batra, S.: A review of ranking approaches for semantic search on Web (2014) 0.01
```
0.00576784 = product of:
  0.04037488 = sum of:
    0.02018744 = weight(_text_:classification in 2799) [ClassicSimilarity], result of:
      0.02018744 = score(doc=2799,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 2799, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=2799)
    0.02018744 = weight(_text_:classification in 2799) [ClassicSimilarity], result of:
      0.02018744 = score(doc=2799,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.21111822 = fieldWeight in 2799, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.046875 = fieldNorm(doc=2799)
  0.14285715 = coord(2/14)
```

Abstract: With ever-increasing information being available to end users, search engines have become the most powerful tools for obtaining useful information scattered on the Web. However, it is very common that even the most renowned search engines return result sets containing pages that are not very useful to the user. Research on semantic search aims to improve traditional information search and retrieval methods, where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work is an attempt to explore different semantics-based relevancy ranking approaches that are considered appropriate for the retrieval of relevant information. In this paper, various pilot projects and their corresponding outcomes have been investigated based on the methodologies adopted and their most distinctive characteristics with respect to ranking. An overview of selected approaches and their comparison by means of the classification criteria has been presented. With the help of this comparison, some common concepts and outstanding features have been identified.

Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.01

0.0053807707 = product of:
  0.037665393 = sum of:
    0.02546139 = weight(_text_:subject in 1451) [ClassicSimilarity], result of:
      0.02546139 = score(doc=1451,freq=2.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.23709705 = fieldWeight in 1451, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.046875 = fieldNorm(doc=1451)
    0.0122040035 = product of:
      0.024408007 = sum of:
        0.024408007 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
          0.024408007 = score(doc=1451,freq=2.0), product of:
            0.10514317 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03002521 = queryNorm
            0.23214069 = fieldWeight in 1451, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
      0.5 = coord(1/2)
  0.14285715 = coord(2/14)

Abstract: Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
Date: 22. 3.2003 19:27:36
