Search (83 results, page 1 of 5)

  • Filter: theme_ss:"Retrievalalgorithmen"
  1. Wilhelmy, A.: Phonetische Ähnlichkeitssuche in Datenbanken (1991) 0.01
    0.008485726 = product of:
      0.118800156 = sum of:
        0.118800156 = weight(_text_:1938 in 5684) [ClassicSimilarity], result of:
          0.118800156 = score(doc=5684,freq=2.0), product of:
            0.21236381 = queryWeight, product of:
              8.43879 = idf(docFreq=25, maxDocs=44218)
              0.025165197 = queryNorm
            0.5594181 = fieldWeight in 5684, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.43879 = idf(docFreq=25, maxDocs=44218)
              0.046875 = fieldNorm(doc=5684)
      0.071428575 = coord(1/14)
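    (The nested breakdown above is Lucene's "explain" output for ClassicSimilarity scoring; every hit below repeats the same pattern. As a minimal sketch, assuming Lucene's documented classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs/(docFreq+1)), the factors for this hit combine as follows; coord(1/14) means 1 of the 14 query clauses matched.)

    ```python
    import math

    def idf(doc_freq, max_docs):
        # Lucene ClassicSimilarity: idf(t) = 1 + ln(maxDocs / (docFreq + 1))
        return 1.0 + math.log(max_docs / (doc_freq + 1))

    def tf(freq):
        # tf(t in d) = sqrt(termFreq)
        return math.sqrt(freq)

    # Reproduce the tree above (term "1938" in doc 5684):
    query_norm = 0.025165197
    idf_1938 = idf(25, 44218)                       # 8.43879
    query_weight = idf_1938 * query_norm            # 0.21236381
    field_norm = 0.046875                           # index-time length normalization
    field_weight = tf(2.0) * idf_1938 * field_norm  # 0.5594181
    score = query_weight * field_weight             # 0.118800156
    final = score * (1 / 14)                        # coord(1/14) -> 0.008485726
    ```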
    
    Abstract
    In dialogue-driven information retrieval systems (IRS), the interplay between human and computer can, roughly speaking, be understood as an iterative process of increasing the precision of the retrieved references on the one hand and their recall on the other. The article presents a machine-applicable procedure that goes back to the phonological work of the linguist Nikolaj S. Trubetzkoy (1890-1938). In its essentials it can contribute substantially to improving recall. Because it brings the 'similarity neighborhoods' of search terms into the search, it proves advantageous above all for systems with coordinate machine indexing. For alphabetic terms, introducing such a procedure, initially oriented only towards the user, also turns out to be favorable from a technical point of view, since it keeps the number of accesses during search operations low even for large data volumes.
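    (The article's Trubetzkoy-based procedure is not spelled out in the abstract. As a generic illustration of searching a phonetic "similarity neighborhood", here is a minimal sketch built on a Soundex-style key; the key function and the term list are illustrative assumptions, not the article's method.)

    ```python
    def soundex(word: str) -> str:
        # Classic Soundex: keep the first letter, encode the rest by sound class.
        codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
                 **dict.fromkeys("dt", "3"), "l": "4",
                 **dict.fromkeys("mn", "5"), "r": "6"}
        word = word.lower()
        key, last = word[0].upper(), codes.get(word[0], "")
        for ch in word[1:]:
            code = codes.get(ch, "")
            if code and code != last:
                key += code
            if ch not in "hw":          # h and w do not reset the previous code
                last = code
        return (key + "000")[:4]

    # All index terms sharing the query's key form its similarity neighborhood,
    # so one key lookup replaces many single-term probes:
    index_terms = ["meyer", "maier", "mayer", "meier", "miller"]
    neighborhood = [t for t in index_terms if soundex(t) == soundex("meyer")]
    # ['meyer', 'maier', 'mayer', 'meier']
    ```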
  2. Song, D.; Bruza, P.D.: Towards context sensitive information inference (2003) 0.01
    0.0067573944 = product of:
      0.047301758 = sum of:
        0.041619197 = weight(_text_:representation in 1428) [ClassicSimilarity], result of:
          0.041619197 = score(doc=1428,freq=4.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.35945266 = fieldWeight in 1428, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1428)
        0.0056825615 = product of:
          0.017047685 = sum of:
            0.017047685 = weight(_text_:22 in 1428) [ClassicSimilarity], result of:
              0.017047685 = score(doc=1428,freq=2.0), product of:
                0.08812423 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.025165197 = queryNorm
                0.19345059 = fieldWeight in 1428, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1428)
          0.33333334 = coord(1/3)
      0.14285715 = coord(2/14)
    
    Abstract
    Humans can make hasty, but generally robust, judgements about what a text fragment is, or is not, about. Such judgements are termed information inference. This article furnishes an account of information inference from a psychologistic stance. By drawing on theories from nonclassical logic and applied cognition, an information inference mechanism is proposed that makes inferences via computations of information flow through an approximation of a conceptual space. Within a conceptual space information is represented geometrically. In this article, geometric representations of words are realized as vectors in a high-dimensional semantic space, which is automatically constructed from a text corpus. Two approaches are presented for priming vector representations according to context. The first approach uses a concept combination heuristic to adjust the vector representation of a concept in the light of the representation of another concept. The second approach computes a prototypical concept on the basis of exemplar trace texts and moves it in the semantic space according to the context. Information inference is evaluated by measuring the effectiveness of query models derived by information flow computations. Results show that information flow contributes significantly to query model effectiveness, particularly with respect to precision. Moreover, retrieval effectiveness compares favorably with two probabilistic query models, and another based on semantic association. More generally, this article can be seen as a contribution towards realizing operational systems that mimic text-based human reasoning.
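    (A sketch of the kind of corpus-derived semantic space the abstract describes: a sliding-window co-occurrence construction in the HAL style is one standard way to build such word vectors; the authors' exact scheme may differ.)

    ```python
    import math
    from collections import defaultdict

    def build_space(tokens, window=3):
        # Each word's vector = distance-weighted co-occurrence counts.
        vec = defaultdict(lambda: defaultdict(float))
        for i, w in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vec[w][tokens[j]] += 1.0 / abs(i - j)  # nearer words weigh more
        return vec

    def cosine(u, v):
        dot = sum(w * v.get(k, 0.0) for k, w in u.items())
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    space = build_space("the cat sat on the mat the dog sat on the rug".split())
    print(cosine(space["cat"], space["dog"]))  # words in similar contexts score high
    ```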
    Date
    22. 3.2003 19:35:46
  3. Furner, J.: A unifying model of document relatedness for hybrid search engines (2003) 0.01
    0.006019162 = product of:
      0.042134132 = sum of:
        0.03531506 = weight(_text_:representation in 2717) [ClassicSimilarity], result of:
          0.03531506 = score(doc=2717,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.3050057 = fieldWeight in 2717, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.046875 = fieldNorm(doc=2717)
        0.006819073 = product of:
          0.02045722 = sum of:
            0.02045722 = weight(_text_:22 in 2717) [ClassicSimilarity], result of:
              0.02045722 = score(doc=2717,freq=2.0), product of:
                0.08812423 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.025165197 = queryNorm
                0.23214069 = fieldWeight in 2717, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2717)
          0.33333334 = coord(1/3)
      0.14285715 = coord(2/14)
    
    Date
    11. 9.2004 17:32:22
    Source
    Challenges in knowledge representation and organization for the 21st century: Integration of knowledge across boundaries. Proceedings of the 7th ISKO International Conference Granada, Spain, July 10-13, 2002. Ed.: M. López-Huertas
  4. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
    0.006019162 = product of:
      0.042134132 = sum of:
        0.03531506 = weight(_text_:representation in 2419) [ClassicSimilarity], result of:
          0.03531506 = score(doc=2419,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.3050057 = fieldWeight in 2419, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.046875 = fieldNorm(doc=2419)
        0.006819073 = product of:
          0.02045722 = sum of:
            0.02045722 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
              0.02045722 = score(doc=2419,freq=2.0), product of:
                0.08812423 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.025165197 = queryNorm
                0.23214069 = fieldWeight in 2419, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.33333334 = coord(1/3)
      0.14285715 = coord(2/14)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring, and managing digital library objects it provides user-customisable information-seeking patterns over a federation of heterogeneous digital libraries. In this paper, evaluation results with respect to retrieval effectiveness, efficiency, and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data-source selection through information seeking to the representation, organisation, and reuse of information. By embedding high-level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil and then subjected to a qualitative evaluation. The evaluation was conducted with 28 participants, ranging from information-seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
  5. Liddy, E.D.: An alternative representation for documents and queries (1993) 0.01
    0.005045009 = product of:
      0.07063012 = sum of:
        0.07063012 = weight(_text_:representation in 7813) [ClassicSimilarity], result of:
          0.07063012 = score(doc=7813,freq=8.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.6100114 = fieldWeight in 7813, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.046875 = fieldNorm(doc=7813)
      0.071428575 = coord(1/14)
    
    Abstract
    Describes a method of representing documents and queries in information retrieval systems that is an alternative to the two most common methods: free-text, natural language representation and controlled-language representation. The alternative method combines the advantages of both traditional approaches and overcomes the difficulties associated with each. The scheme was developed for use with Longman's Dictionary of Contemporary English and uses a computerized version of the dictionary for the automatic generation of summary-level semantic representations of each document and query. The system tags each word in a document with the appropriate Subject Field Code (SFC) from the dictionary; the SFCs are summed and normalized to produce a weighted, fixed-length SFC vector. The search system matches the query SFC vector to the document SFC vectors in the database, and the documents are then ranked on the basis of their vectors' similarity to the query vector.
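    (A minimal sketch of the SFC-vector scheme as described; the word-to-code lexicon below is a made-up stand-in for the LDOCE Subject Field Codes.)

    ```python
    import math
    from collections import Counter

    # Hypothetical word -> Subject Field Code mapping (not the real LDOCE codes).
    SFC = {"court": "LAW", "judge": "LAW", "patient": "MED",
           "drug": "MED", "bank": "ECON"}

    def sfc_vector(text):
        # Tag each known word with its SFC, sum the codes, normalize to unit length.
        counts = Counter(SFC[w] for w in text.lower().split() if w in SFC)
        norm = math.sqrt(sum(c * c for c in counts.values()))
        return {code: c / norm for code, c in counts.items()} if norm else {}

    def match(query_vec, doc_vec):
        # Dot product of unit vectors = cosine similarity; rank documents by it.
        return sum(w * doc_vec.get(code, 0.0) for code, w in query_vec.items())

    q = sfc_vector("drug trial before the judge")
    d = sfc_vector("the patient responded to the drug")
    print(match(q, d))
    ```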
  6. Dannenberg, R.B.; Birmingham, W.P.; Pardo, B.; Hu, N.; Meek, C.; Tzanetakis, G.: A comparative evaluation of search techniques for query-by-humming using the MUSART testbed (2007) 0.01
    0.005023338 = product of:
      0.035163365 = sum of:
        0.02942922 = weight(_text_:representation in 269) [ClassicSimilarity], result of:
          0.02942922 = score(doc=269,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.25417143 = fieldWeight in 269, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0390625 = fieldNorm(doc=269)
        0.005734144 = product of:
          0.017202431 = sum of:
            0.017202431 = weight(_text_:29 in 269) [ClassicSimilarity], result of:
              0.017202431 = score(doc=269,freq=2.0), product of:
                0.08852329 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.025165197 = queryNorm
                0.19432661 = fieldWeight in 269, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=269)
          0.33333334 = coord(1/3)
      0.14285715 = coord(2/14)
    
    Abstract
    Query-by-humming systems offer content-based searching for melodies and require no special musical training or knowledge. Many such systems have been built, but there has not been much useful evaluation and comparison in the literature due to the lack of shared databases and queries. The MUSART project testbed allows various search algorithms to be compared using a shared framework that automatically runs experiments and summarizes results. Using this testbed, the authors compared algorithms based on string alignment, melodic contour matching, a hidden Markov model, n-grams, and CubyHum. Retrieval performance is very sensitive to distance functions and the representation of pitch and rhythm, which raises questions about some previously published conclusions. Some algorithms are particularly sensitive to the quality of queries. Our queries, which are taken from human subjects in a realistic setting, are quite difficult, especially for n-gram models. Finally, simulations on query-by-humming performance as a function of database size indicate that retrieval performance falls only slowly as the database size increases.
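    (Of the compared algorithm families, string alignment is the easiest to sketch: a local alignment over pitch intervals, which makes matching transposition-invariant. This is an illustrative toy, not the MUSART implementation.)

    ```python
    def align(query, target, gap=-1, match=2, mismatch=-1):
        # Smith-Waterman-style local alignment of two melodies encoded as
        # sequences of pitch intervals (semitones between successive notes).
        best = 0
        row = [0] * (len(target) + 1)
        for i in range(1, len(query) + 1):
            prev_diag = 0                       # old row[0]
            for j in range(1, len(target) + 1):
                prev_diag, row[j] = row[j], max(
                    0,
                    prev_diag + (match if query[i-1] == target[j-1] else mismatch),
                    row[j] + gap,               # skip a sung note
                    row[j-1] + gap,             # skip a database note
                )
                best = max(best, row[j])
        return best

    hummed = [2, 2, 1, -5]           # intervals between successive sung notes
    song = [2, 2, 1, -5, 7, -2]      # database melody
    print(align(hummed, song))       # high score -> likely match
    ```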
    Date
    29. 4.2007 19:45:32
  7. Tsai, C.-F.; Hu, Y.-H.; Chen, Z.-Y.: Factors affecting Rocchio-based pseudorelevance feedback in image retrieval (2015) 0.00
    0.0047004092 = product of:
      0.065805726 = sum of:
        0.065805726 = weight(_text_:representation in 1607) [ClassicSimilarity], result of:
          0.065805726 = score(doc=1607,freq=10.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.56834453 = fieldWeight in 1607, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1607)
      0.071428575 = coord(1/14)
    
    Abstract
    Pseudorelevance feedback (PRF) was proposed to overcome the limitation of relevance feedback (RF), which is based on a user-in-the-loop process. In PRF, the top-k retrieved images are treated as pseudorelevant. Although the PRF set contains noise, PRF has proven effective for automatically improving the overall retrieval result. To implement PRF, the Rocchio algorithm has been considered a reasonable and well-established baseline. However, the performance of Rocchio-based PRF is subject to various representation choices (or factors). In this article, we examine the factors that affect the performance of Rocchio-based PRF, including image-feature representation, the number of top-ranked images, the weighting parameters of Rocchio, and the similarity measure. We offer practical insights on how to optimize the performance of Rocchio-based PRF by choosing appropriate representation choices. Our extensive experiments on the NUS-WIDE-LITE and Caltech 101 + Corel 5000 data sets show that the optimal feature representation is color moment + wavelet texture in terms of retrieval efficiency and effectiveness. For the other representation choices, using the top-20 ranked images as the pseudopositive and pseudonegative feedback sets with equal weight (i.e., 0.5), together with the correlation and cosine distance functions, produces the optimal retrieval result.
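    (A minimal sketch of Rocchio-based PRF in the usual form q' = a*q + b*mean(pos) - c*mean(neg); taking bottom-ranked items as pseudonegatives is one common convention, not necessarily the article's exact setup.)

    ```python
    import numpy as np

    def rocchio_prf(query, ranked, k=20, alpha=1.0, beta=0.5, gamma=0.5):
        # ranked: feature vectors of the first-pass results, best first.
        # Top-k serve as pseudopositive feedback, bottom-k as pseudonegative.
        pos = ranked[:k].mean(axis=0)
        neg = ranked[-k:].mean(axis=0)
        return alpha * query + beta * pos - gamma * neg

    rng = np.random.default_rng(0)
    ranked = rng.random((100, 64))       # 100 items, 64-dim image features
    q = rng.random(64)
    q_new = rocchio_prf(q, ranked)       # re-query with the adjusted vector
    ```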
  8. Zhang, W.; Yoshida, T.; Tang, X.: ¬A comparative study of TF*IDF, LSI and multi-words for text classification (2011) 0.00
    0.004369106 = product of:
      0.061167482 = sum of:
        0.061167482 = weight(_text_:representation in 1165) [ClassicSimilarity], result of:
          0.061167482 = score(doc=1165,freq=6.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.5282854 = fieldWeight in 1165, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.046875 = fieldNorm(doc=1165)
      0.071428575 = coord(1/14)
    
    Abstract
    One of the main themes in text mining is text representation, which is fundamental and indispensable for text-based intelligent information processing. Generally, text representation includes two tasks: indexing and weighting. This paper comparatively studies TF*IDF, LSI and multi-words for text representation. We used a Chinese and an English document collection to evaluate the three methods in information retrieval and text categorization. Experimental results demonstrate that in text categorization, LSI has better performance than the other methods in both document collections. LSI also produced the best performance in retrieving English documents. This outcome shows that LSI has both favorable semantic and statistical qualities, and contradicts the claim that LSI cannot produce discriminative power for indexing.
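    (A compact sketch of the two representations under comparison: TF*IDF weighting of a term-document matrix, then LSI as a truncated SVD of that matrix; the paper's exact weighting variants may differ.)

    ```python
    import numpy as np

    def tfidf(counts):
        # counts: term-by-document matrix of raw frequencies.
        tf = counts / np.maximum(counts.sum(axis=0, keepdims=True), 1)
        df = (counts > 0).sum(axis=1, keepdims=True)
        idf = np.log(counts.shape[1] / np.maximum(df, 1))
        return tf * idf

    def lsi(weighted, k=2):
        # Truncated SVD projects each document into a k-dim latent space.
        u, s, vt = np.linalg.svd(weighted, full_matrices=False)
        return (np.diag(s[:k]) @ vt[:k]).T     # one k-dim row per document

    counts = np.array([[2, 0, 1], [0, 3, 1], [1, 1, 0]], dtype=float)
    docs_latent = lsi(tfidf(counts), k=2)      # input to categorization/retrieval
    ```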
  9. Carpineto, C.; Romano, G.: Information retrieval through hybrid navigation of lattice representations (1996) 0.00
    0.00416192 = product of:
      0.058266878 = sum of:
        0.058266878 = weight(_text_:representation in 7434) [ClassicSimilarity], result of:
          0.058266878 = score(doc=7434,freq=4.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.50323373 = fieldWeight in 7434, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7434)
      0.071428575 = coord(1/14)
    
    Abstract
    Presents a comprehensive approach to automatic organization and hybrid navigation of text databases. An organizing stage builds a particular lattice representation of the data, through text indexing followed by lattice clustering of the indexed texts. The lattice representation supports the navigation stage of the system, a visual retrieval interface that combines 3 main retrieval strategies: browsing, querying, and bounding. Such a hybrid paradigm permits high flexibility in trading off information exploration and retrieval, and has good retrieval performance. Compares information retrieval using lattice-based hybrid navigation with conventional Boolean querying. Experiments conducted on 2 medium-sized bibliographic databases showed that the performance of lattice retrieval was comparable to or better than Boolean retrieval.
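    (The organizing stage rests on formal concept analysis: every concept intent is an intersection of document term sets. A naive sketch of enumerating the lattice's concepts, suitable only for toy data:)

    ```python
    def concepts(context):
        # context: document -> set of index terms.
        # Concept intents are exactly the intersections of document intents,
        # plus the full term set (the intent of the empty extent).
        all_terms = frozenset(t for terms in context.values() for t in terms)
        intents = {all_terms}
        for terms in context.values():
            intents |= {frozenset(terms) & i for i in intents}
        # Pair each intent with its extent: all documents carrying every term.
        return [(sorted(d for d, t in context.items() if i <= t), sorted(i))
                for i in intents]

    ctx = {"d1": {"lattice", "retrieval"},
           "d2": {"retrieval", "browsing"},
           "d3": {"lattice", "browsing", "retrieval"}}
    for extent, intent in concepts(ctx):
        print(extent, intent)    # nodes a user can browse, query, or bound
    ```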
  10. Lalmas, M.; Ruthven, I.: Representing and retrieving structured documents using the Dempster-Shafer theory of evidence : modelling and evaluation (1998) 0.00
    0.00416192 = product of:
      0.058266878 = sum of:
        0.058266878 = weight(_text_:representation in 1076) [ClassicSimilarity], result of:
          0.058266878 = score(doc=1076,freq=4.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.50323373 = fieldWeight in 1076, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1076)
      0.071428575 = coord(1/14)
    
    Abstract
    Reports on a theoretical model of structured document indexing and retrieval based on the Dempster-Shafer theory of evidence. Includes a description of the model of structured document retrieval, the representation of structured documents, the representation of individual components, how components are combined, details of the combination process, and how relevance is captured within the model. Also presents a detailed account of an implementation of the model, and an evaluation scheme designed to test the effectiveness of the model.
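    (The combination step uses Dempster's rule of combination; a minimal sketch over mass functions with set-valued focal elements, with illustrative masses:)

    ```python
    from itertools import product

    def combine(m1, m2):
        # Dempster's rule: intersect focal elements, multiply masses,
        # and renormalize by 1 - conflict (mass lost to empty intersections).
        combined, conflict = {}, 0.0
        for (a, w1), (b, w2) in product(m1.items(), m2.items()):
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + w1 * w2
            else:
                conflict += w1 * w2
        return {s: w / (1.0 - conflict) for s, w in combined.items()}

    # Two components of a structured document each yield evidence over topics:
    m_title = {frozenset({"IR"}): 0.6, frozenset({"IR", "DB"}): 0.4}
    m_body = {frozenset({"IR"}): 0.3, frozenset({"DB"}): 0.2,
              frozenset({"IR", "DB"}): 0.5}
    print(combine(m_title, m_body))
    ```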
  11. Ayadi, H.; Torjmen-Khemakhem, M.; Daoud, M.; Xiangji Huang, J.; Ben Jemaa, M.: MF-Re-Rank : a modality feature-based re-ranking model for medical image retrieval (2018) 0.00
    0.00401867 = product of:
      0.02813069 = sum of:
        0.023543375 = weight(_text_:representation in 4459) [ClassicSimilarity], result of:
          0.023543375 = score(doc=4459,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.20333713 = fieldWeight in 4459, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.03125 = fieldNorm(doc=4459)
        0.004587315 = product of:
          0.013761944 = sum of:
            0.013761944 = weight(_text_:29 in 4459) [ClassicSimilarity], result of:
              0.013761944 = score(doc=4459,freq=2.0), product of:
                0.08852329 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.025165197 = queryNorm
                0.15546128 = fieldWeight in 4459, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4459)
          0.33333334 = coord(1/3)
      0.14285715 = coord(2/14)
    
    Abstract
    One of the main challenges in medical image retrieval is the increasing volume of image data, which renders it difficult for domain experts to find relevant information in large data sets. Effective and efficient medical image retrieval systems are required to better manage medical image information. Text-based image retrieval (TBIR) has been very successful in retrieving images with textual descriptions. Several TBIR approaches rely on bag-of-words models, in which the image retrieval problem turns into standard text-based information retrieval, where the meanings and values of specific medical entities in the text and metadata are ignored in the image representation and retrieval process. However, we believe that TBIR should extract specific medical entities and terms and then exploit these elements to achieve better image retrieval results. Therefore, we propose a novel reranking method based on medical-image-dependent features. These features are manually selected by a medical expert from imaging modalities and medical terminology. First, we represent queries and images using only medical-image-dependent features such as image modality and image scale. Second, we exploit the defined features in a new reranking method for medical image retrieval. Our motivation is the large influence of image modality in medical image retrieval and its impact on image-relevance scores. To evaluate our approach, we performed a series of experiments on the medical ImageCLEF data sets from 2009 to 2013. The BM25 model, a language model, and an image-relevance feedback model are used as baselines to evaluate our approach. The experimental results show that, compared to the BM25 model, the proposed model significantly enhances image retrieval performance. We also compared our approach with other state-of-the-art approaches and show that it performs comparably to the top three runs in the official ImageCLEF competition.
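    (The abstract does not give the fusion formula. The simplest shape such a re-ranker can take is a linear combination of the first-pass text score with a modality-match score; the weight lam, the scores, and the modality labels below are all illustrative assumptions.)

    ```python
    def rerank(results, query_modality, lam=0.7):
        # results: (doc_id, text_score, extracted_modality) from a first-pass
        # text retrieval run (e.g., BM25). Boost documents whose modality
        # (CT, MRI, X-ray, ...) matches the one detected in the query.
        rescored = [(doc, lam * s + (1 - lam) * (1.0 if m == query_modality else 0.0))
                    for doc, s, m in results]
        return sorted(rescored, key=lambda t: t[1], reverse=True)

    first_pass = [("img1", 0.82, "MRI"), ("img2", 0.79, "CT"), ("img3", 0.55, "CT")]
    print(rerank(first_pass, query_modality="CT"))
    ```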
    Date
    29. 9.2018 11:43:31
  12. Hofferer, M.: Heuristic search in information retrieval (1994) 0.00
    0.0033633395 = product of:
      0.04708675 = sum of:
        0.04708675 = weight(_text_:representation in 1070) [ClassicSimilarity], result of:
          0.04708675 = score(doc=1070,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.40667427 = fieldWeight in 1070, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0625 = fieldNorm(doc=1070)
      0.071428575 = coord(1/14)
    
    Abstract
    Describes an adaptive information retrieval system, the Information Retrieval Algorithm System (IRAS), that uses heuristic searching to sample a document space and retrieve relevant documents according to users' requests. IRAS also contains a learning module, based on a knowledge representation system and an approximate probabilistic characterization of relevant documents, that reproduces a user's classification of relevant documents and provides rule-controlled ranking.
  13. Hoenkamp, E.: Unitary operators on the document space (2003) 0.00
    0.0029727998 = product of:
      0.041619197 = sum of:
        0.041619197 = weight(_text_:representation in 3457) [ClassicSimilarity], result of:
          0.041619197 = score(doc=3457,freq=4.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.35945266 = fieldWeight in 3457, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3457)
      0.071428575 = coord(1/14)
    
    Abstract
    When people search for documents, they eventually want content, not words. Hence, search engines should relate documents more by their underlying concepts than by the words they contain. One promising technique to do so is Latent Semantic Indexing (LSI). LSI dramatically reduces the dimension of the document space by mapping it into a space spanned by conceptual indices. Empirically, the number of concepts needed to represent the documents is far smaller than the great variety of words in the textual representation. Although this almost obviates the problem of lexical matching, the mapping incurs a high computational cost compared to document parsing, indexing, query matching, and updating. This article accomplishes several things. First, it shows how the technique underlying LSI is just one example of a unitary operator, for which there are computationally more attractive alternatives. Second, it proposes the Haar transform as such an alternative, as it is memory efficient and can be computed in linear to sublinear time. Third, it generalizes LSI by a multiresolution representation of the document space. The approach not only preserves the advantages of LSI at drastically reduced computational costs, it also opens a spectrum of possibilities for new research.
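    (A minimal sketch of the proposed alternative: the normalized Haar transform runs in O(n) over a length-2^k vector, and truncating its coefficients leaves a coarse, multiresolution summary of the document.)

    ```python
    import math

    def haar(vec):
        # Full Haar wavelet transform: repeatedly replace the low band with
        # pairwise averages and keep the pairwise differences as detail.
        out, n = list(vec), len(vec)
        while n > 1:
            half = n // 2
            avg = [(out[2*i] + out[2*i+1]) / math.sqrt(2) for i in range(half)]
            dif = [(out[2*i] - out[2*i+1]) / math.sqrt(2) for i in range(half)]
            out[:n] = avg + dif
            n = half
        return out

    doc = [4.0, 2.0, 5.0, 5.0, 1.0, 1.0, 0.0, 2.0]  # toy term-weight vector
    coeffs = haar(doc)
    coarse = coeffs[:2]   # first coefficients = low-resolution document summary
    ```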
  14. Liddy, E.D.; Paik, W.; McKenna, M.; Yu, E.S.: A natural language text retrieval system with relevance feedback (1995) 0.00
    0.0029429218 = product of:
      0.041200902 = sum of:
        0.041200902 = weight(_text_:representation in 3131) [ClassicSimilarity], result of:
          0.041200902 = score(doc=3131,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.35583997 = fieldWeight in 3131, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3131)
      0.071428575 = coord(1/14)
    
    Abstract
    Outlines a fully integrated retrieval engine that processes documents and queries at the multiple, complex linguistic levels that humans use to construe meaning. Currently undergoing beta-site trials, the DR-LINK natural language text retrieval system allows searchers to state queries as fully formed, natural sentences. The meaning and matching of both queries and documents is accomplished at the conceptual level of human expression, not by the simple co-occurrence of keywords. Furthermore, the natural browsing behaviour of information searchers is accommodated by allowing documents identified as potentially relevant by the explicit semantics of the system to be used as relevance feedback queries, which provide an appropriate implicit semantic representation of the information seeker's need.
  15. Herrera-Viedma, E.; Cordón, O.; Herrera, J.C.; Luqe, M.: An IRS based on multi-granular linguistic information (2003) 0.00
    0.0025225044 = product of:
      0.03531506 = sum of:
        0.03531506 = weight(_text_:representation in 2740) [ClassicSimilarity], result of:
          0.03531506 = score(doc=2740,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.3050057 = fieldWeight in 2740, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.046875 = fieldNorm(doc=2740)
      0.071428575 = coord(1/14)
    
    Source
    Challenges in knowledge representation and organization for the 21st century: Integration of knowledge across boundaries. Proceedings of the 7th ISKO International Conference Granada, Spain, July 10-13, 2002. Ed.: M. López-Huertas
  16. Li, H.; Wu, H.; Li, D.; Lin, S.; Su, Z.; Luo, X.: PSI: A probabilistic semantic interpretable framework for fine-grained image ranking (2018) 0.00
    0.0025225044 = product of:
      0.03531506 = sum of:
        0.03531506 = weight(_text_:representation in 4577) [ClassicSimilarity], result of:
          0.03531506 = score(doc=4577,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.3050057 = fieldWeight in 4577, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.046875 = fieldNorm(doc=4577)
      0.071428575 = coord(1/14)
    
    Abstract
    Image ranking is one of the key problems in information science research. However, most current methods focus on improving performance and leave intact the semantic gap problem: the learned ranking models are hard to understand. Therefore, in this article, we aim at learning an interpretable ranking model to tackle the semantic gap in fine-grained image ranking. We propose to combine attribute-based representation and online passive-aggressive (PA) learning-based ranking models to achieve this goal. Besides, considering the highly localized instances in fine-grained image ranking, we introduce a supervised constrained clustering method to gather class-balanced training instances for local PA-based models, and incorporate the learned local models into a unified probabilistic framework. Extensive experiments on the benchmark demonstrate that the proposed framework outperforms state-of-the-art methods in terms of accuracy and speed.
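    (The online passive-aggressive ranking update the abstract refers to can be sketched as follows: stay passive while a pairwise preference holds with margin, otherwise take the minimal corrective step; the margin and the data are illustrative.)

    ```python
    import numpy as np

    def pa_rank_update(w, x_pos, x_neg, margin=1.0):
        # Enforce w.x_pos - w.x_neg >= margin with the smallest change to w.
        diff = x_pos - x_neg
        loss = max(0.0, margin - w @ diff)
        if loss > 0.0:
            w = w + (loss / (diff @ diff)) * diff
        return w

    rng = np.random.default_rng(1)
    w = np.zeros(16)
    for _ in range(100):                 # stream of (preferred, less-preferred) pairs
        better, worse = rng.random(16), 0.5 * rng.random(16)
        w = pa_rank_update(w, better, worse)
    ```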
  17. Hoenkamp, E.; Bruza, P.D.; Song, D.; Huang, Q.: ¬An effective approach to verbose queries using a limited dependencies language model (2009) 0.00
    0.00237824 = product of:
      0.03329536 = sum of:
        0.03329536 = weight(_text_:representation in 2122) [ClassicSimilarity], result of:
          0.03329536 = score(doc=2122,freq=4.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.28756213 = fieldWeight in 2122, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.03125 = fieldNorm(doc=2122)
      0.071428575 = coord(1/14)
    
    Abstract
    Intuitively, any 'bag of words' approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies into more useful statistics. This is done in three steps. The term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation than just the document's initial distribution. A secondary contribution is to investigate the practical application of this representation as queries become increasingly verbose. In the experiments (based on Lemur's search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par with or better than more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.
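    (A minimal sketch of the core construction: turn term co-occurrence statistics into a Markov chain and use its stationary distribution as the query/document model; plain power iteration is shown for brevity.)

    ```python
    import numpy as np

    def stationary(co_occurrence, iters=200):
        # Row-normalize counts into a transition matrix, then power-iterate.
        # For an ergodic chain the result is unique and independent of the
        # starting distribution, which is the paper's key property.
        p = co_occurrence / co_occurrence.sum(axis=1, keepdims=True)
        pi = np.full(p.shape[0], 1.0 / p.shape[0])
        for _ in range(iters):
            pi = pi @ p
        return pi

    counts = np.array([[4.0, 2.0, 1.0],
                       [2.0, 6.0, 2.0],
                       [1.0, 2.0, 3.0]])   # toy term co-occurrence statistics
    print(stationary(counts))              # stable term weights for the model
    ```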
  18. Carpineto, C.; Romano, G.: Order-theoretical ranking (2000) 0.00
    0.0021020873 = product of:
      0.02942922 = sum of:
        0.02942922 = weight(_text_:representation in 4766) [ClassicSimilarity], result of:
          0.02942922 = score(doc=4766,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.25417143 = fieldWeight in 4766, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4766)
      0.071428575 = coord(1/14)
    
    Abstract
    Current best-match ranking (BMR) systems perform well but cannot handle word mismatch between a query and a document. The best-known alternative ranking method, hierarchical clustering-based ranking (HCR), seems to be more robust than BMR with respect to this problem, but it is hampered by theoretical and practical limitations. We present an approach to document ranking that explicitly addresses the word mismatch problem by exploiting interdocument similarity information in a novel way. Document ranking is seen as a query-document transformation driven by a conceptual representation of the whole document collection, into which the query is merged. Our approach is based on the theory of concept (or Galois) lattices, which, we argue, provides a powerful, well-founded, and computationally tractable framework to model the space in which documents and query are represented and to compute such a transformation. We compared information retrieval using concept lattice-based ranking (CLR) to BMR and HCR. The results showed that HCR was outperformed by CLR as well as BMR, and suggested that, of the two best methods, BMR achieved better performance than CLR on the whole document set, whereas CLR compared more favorably when only the first retrieved documents were used for evaluation. We also evaluated the three methods' specific ability to rank documents that did not match the query, in which case the superiority of CLR over BMR and HCR was apparent.
  19. Picard, J.; Savoy, J.: Enhancing retrieval with hyperlinks : a general model based on propositional argumentation systems (2003) 0.00
    0.0021020873 = product of:
      0.02942922 = sum of:
        0.02942922 = weight(_text_:representation in 1427) [ClassicSimilarity], result of:
          0.02942922 = score(doc=1427,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.25417143 = fieldWeight in 1427, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1427)
      0.071428575 = coord(1/14)
    
    Abstract
    Fast, effective, and adaptable techniques are needed to automatically organize and retrieve information on the ever-growing World Wide Web. In that respect, different strategies have been suggested to take hypertext links into account. For example, hyperlinks have been used to (1) enhance document representation, (2) improve document ranking by propagating document scores, (3) provide an indicator of popularity, and (4) find hubs and authorities for a given topic. Although the TREC experiments have not demonstrated the usefulness of hyperlinks for retrieval, the hypertext structure is nevertheless an essential aspect of the Web and, as such, should not be ignored. The development of abstract models of the IR task was a key factor in the improvement of search engines. However, at this time conceptual tools for modeling the hypertext retrieval task are lacking, making it difficult to compare, improve, and reason about the existing techniques. This article proposes a general model for using hyperlinks based on Probabilistic Argumentation Systems, in which each of the above-mentioned techniques can be stated. This model makes it possible to discover some inconsistencies in those techniques and to take a higher-level, systematic approach to using hyperlinks for retrieval.
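    (Technique (2) above, score propagation along hyperlinks, can be sketched as follows; the damping factor lam and the single-round update are illustrative choices.)

    ```python
    def propagate(scores, in_links, lam=0.3, rounds=1):
        # Each document inherits a fraction of the retrieval scores of the
        # documents that link to it.
        for _ in range(rounds):
            scores = {d: s + lam * sum(scores.get(src, 0.0)
                                       for src in in_links.get(d, []))
                      for d, s in scores.items()}
        return scores

    scores = {"A": 0.9, "B": 0.4, "C": 0.1}
    in_links = {"B": ["A"], "C": ["A", "B"]}   # C is linked from A and B
    print(propagate(scores, in_links))         # {'A': 0.9, 'B': 0.67, 'C': 0.49}
    ```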
  20. Quiroga, L.M.; Mostafa, J.: An experiment in building profiles in information filtering : the role of context of user relevance feedback (2002) 0.00
    0.0021020873 = product of:
      0.02942922 = sum of:
        0.02942922 = weight(_text_:representation in 2579) [ClassicSimilarity], result of:
          0.02942922 = score(doc=2579,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.25417143 = fieldWeight in 2579, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2579)
      0.071428575 = coord(1/14)
    
    Abstract
    An experiment was conducted to see how relevance feedback could be used to build and adjust profiles to improve the performance of filtering systems. Data was collected during the system interaction of 18 graduate students with SIFTER (Smart Information Filtering Technology for Electronic Resources), a filtering system that ranks incoming information based on users' profiles. The data set came from a collection of 6000 records concerning consumer health. In the first phase of the study, three different modes of profile acquisition were compared. The explicit mode allowed users to directly specify the profile; the implicit mode utilized relevance feedback to create and refine the profile; and the combined mode allowed users to initialize the profile and to continuously refine it using relevance feedback. Filtering performance, measured in terms of normalized precision, showed that the three approaches were significantly different (α = 0.05, p = 0.012). The explicit mode of profile acquisition consistently produced superior results. Exclusive reliance on relevance feedback in the implicit mode resulted in inferior performance. The low performance obtained by the implicit acquisition mode motivated the second phase of the study, which aimed to clarify the role of context in relevance feedback judgments. An inductive content analysis of think-aloud protocols showed dimensions that were highly situational, establishing the importance context plays in relevance feedback assessments. Results suggest the need for better representation of documents, profiles, and relevance feedback mechanisms that incorporate the dimensions identified in this research.

Languages

  • e 70
  • d 12
  • pt 1

Types

  • a 78
  • m 3
  • el 1
  • r 1
  • s 1
  • x 1