Search (149 results, page 1 of 8)

  • theme_ss:"Retrievalalgorithmen"
  1. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.05
    0.048646726 = product of:
      0.11350903 = sum of:
        0.042067256 = weight(_text_:processing in 7) [ClassicSimilarity], result of:
          0.042067256 = score(doc=7,freq=4.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.2530092 = fieldWeight in 7, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.03522524 = weight(_text_:techniques in 7) [ClassicSimilarity], result of:
          0.03522524 = score(doc=7,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.19468555 = fieldWeight in 7, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.03621654 = product of:
          0.07243308 = sum of:
            0.07243308 = weight(_text_:mathematics in 7) [ClassicSimilarity], result of:
              0.07243308 = score(doc=7,freq=2.0), product of:
                0.25945482 = queryWeight, product of:
                  6.31699 = idf(docFreq=216, maxDocs=44218)
                  0.04107254 = queryNorm
                0.27917415 = fieldWeight in 7, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.31699 = idf(docFreq=216, maxDocs=44218)
                  0.03125 = fieldNorm(doc=7)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
    The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example, the addition of a new chapter on the link-structure algorithms used in search engines such as Google. The chapter on user interfaces has been rewritten to focus specifically on search engine usability. In addition, the authors have added new recommendations for further reading, expanded the bibliography, and updated and streamlined the index to make it more reader-friendly.
    Content
    Contents:
      Introduction: Document File Preparation - Manual Indexing - Information Extraction - Vector Space Modeling - Matrix Decompositions - Query Representations - Ranking and Relevance Feedback - Searching by Link Structure - User Interface - Book Format
      Document File Preparation: Document Purification and Analysis - Text Formatting - Validation - Manual Indexing - Automatic Indexing - Item Normalization - Inverted File Structures - Document File - Dictionary List - Inversion List - Other File Structures
      Vector Space Models: Construction - Term-by-Document Matrices - Simple Query Matching - Design Issues - Term Weighting - Sparse Matrix Storage - Low-Rank Approximations
      Matrix Decompositions: QR Factorization - Singular Value Decomposition - Low-Rank Approximations - Query Matching - Software - Semidiscrete Decomposition - Updating Techniques
      Query Management: Query Binding - Types of Queries - Boolean Queries - Natural Language Queries - Thesaurus Queries - Fuzzy Queries - Term Searches - Probabilistic Queries
      Ranking and Relevance Feedback: Performance Evaluation - Precision - Recall - Average Precision - Genetic Algorithms - Relevance Feedback
      Searching by Link Structure: HITS Method - HITS Implementation - HITS Summary - PageRank Method - PageRank Adjustments - PageRank Implementation - PageRank Summary
      User Interface Considerations: General Guidelines - Search Engine Interfaces - Form Fill-in - Display Considerations - Progress Indication - No Penalties for Error - Results - Test and Retest - Final Considerations
      Further Reading
    LCSH
    Text processing (Computer science)
    Subject
    Text processing (Computer science)
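    The score breakdown attached to each entry is standard Lucene ClassicSimilarity (tf-idf) explain output. As a minimal sketch of the arithmetic, the "processing" leaf of entry 1 can be reproduced from the statistics in the tree above, using Lucene's formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)):

      import math

      # Statistics copied from the explain tree for doc 7, term "processing".
      freq       = 4.0         # termFreq within the field
      doc_freq   = 2097        # docFreq
      max_docs   = 44218       # maxDocs
      query_norm = 0.04107254  # queryNorm
      field_norm = 0.03125     # fieldNorm(doc=7)

      tf  = math.sqrt(freq)                            # 2.0
      idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 4.048147

      query_weight = idf * query_norm                  # 0.1662677 (queryWeight)
      field_weight = tf * idf * field_norm             # 0.2530092 (fieldWeight)
      print(query_weight * field_weight)               # 0.0420672 (the leaf's weight)

    The three clause weights sum to 0.11350903, which is then multiplied by coord(3/7) = 0.42857143, the fraction of query clauses that matched, giving the entry's total score of 0.048646726.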
  2. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.05
    0.046714935 = product of:
      0.16350226 = sum of:
        0.11898416 = weight(_text_:processing in 402) [ClassicSimilarity], result of:
          0.11898416 = score(doc=402,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.7156181 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.044518095 = product of:
          0.08903619 = sum of:
            0.08903619 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.08903619 = score(doc=402,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  3. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.05
    0.046354763 = product of:
      0.16224167 = sum of:
        0.12328834 = weight(_text_:techniques in 3445) [ClassicSimilarity], result of:
          0.12328834 = score(doc=3445,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.6813994 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
        0.038953334 = product of:
          0.07790667 = sum of:
            0.07790667 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.07790667 = score(doc=3445,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Date
    25. 8.2005 17:42:22
  4. Paris, L.A.H.; Tibbo, H.R.: Freestyle vs. Boolean : a comparison of partial and exact match retrieval systems (1998) 0.05
    0.045378976 = product of:
      0.15882641 = sum of:
        0.05205557 = weight(_text_:processing in 3329) [ClassicSimilarity], result of:
          0.05205557 = score(doc=3329,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 3329, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3329)
        0.106770836 = weight(_text_:techniques in 3329) [ClassicSimilarity], result of:
          0.106770836 = score(doc=3329,freq=6.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.5901092 = fieldWeight in 3329, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3329)
      0.2857143 = coord(2/7)
    
    Abstract
    Compares the performance of a partial match option, LEXIS/NEXIS's Freestyle, with that of traditional Boolean retrieval. Defines natural language and describes the natural language search engines currently available. Although the Boolean searches had better results more often than the Freestyle searches, neither mechanism demonstrated superior performance for every query. These results do not in any way prove the superiority of partial match techniques or exact match techniques, but they do suggest that different queries demand different techniques. Further study and analysis are needed to determine which elements of a query make it best suited for partial match or exact match retrieval.
    Source
    Information processing and management. 34(1998) nos.2/3, S.175-190
  5. Pfeifer, U.; Pennekamp, S.: Incremental processing of vague queries in interactive retrieval systems (1997) 0.04
    0.044167142 = product of:
      0.15458499 = sum of:
        0.08413451 = weight(_text_:processing in 735) [ClassicSimilarity], result of:
          0.08413451 = score(doc=735,freq=4.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.5060184 = fieldWeight in 735, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0625 = fieldNorm(doc=735)
        0.07045048 = weight(_text_:techniques in 735) [ClassicSimilarity], result of:
          0.07045048 = score(doc=735,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3893711 = fieldWeight in 735, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0625 = fieldNorm(doc=735)
      0.2857143 = coord(2/7)
    
    Abstract
    The application of information retrieval techniques in interactive environments requires systems capable of efficiently processing vague queries. To reach reasonable response times, new data structures and algorithms have to be developed. In this paper we describe an approach that takes advantage of the conditions of interactive usage and special access paths. To have a reference, we investigated text queries and compared our algorithms to the well-known Buckley/Lewit algorithm. We achieved significant improvements in response times.
  6. Na, S.-H.; Kang, I.-S.; Roh, J.-E.; Lee, J.-H.: ¬An empirical study of query expansion and cluster-based retrieval in language modeling approach (2007) 0.04
    0.039781027 = product of:
      0.13923359 = sum of:
        0.05205557 = weight(_text_:processing in 906) [ClassicSimilarity], result of:
          0.05205557 = score(doc=906,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 906, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=906)
        0.08717802 = weight(_text_:techniques in 906) [ClassicSimilarity], result of:
          0.08717802 = score(doc=906,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.48182213 = fieldWeight in 906, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=906)
      0.2857143 = coord(2/7)
    
    Abstract
    The term mismatch problem in information retrieval is a critical problem, and several techniques have been developed to resolve it, such as query expansion, cluster-based retrieval, and dimensionality reduction. Of these techniques, this paper performs an empirical study on query expansion and cluster-based retrieval. We examine the effect of using parsimony in query expansion and the effect of clustering algorithms in cluster-based retrieval. In addition, query expansion and cluster-based retrieval are compared, and their combinations are evaluated in terms of retrieval performance by performing experiments on seven test collections of NTCIR and TREC.
    Source
    Information processing and management. 43(2007) no.2, S.302-314
  7. Zhu, B.; Chen, H.: Validating a geographical image retrieval system (2000) 0.04
    0.039378542 = product of:
      0.1378249 = sum of:
        0.06310088 = weight(_text_:processing in 4769) [ClassicSimilarity], result of:
          0.06310088 = score(doc=4769,freq=4.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3795138 = fieldWeight in 4769, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=4769)
        0.07472401 = weight(_text_:techniques in 4769) [ClassicSimilarity], result of:
          0.07472401 = score(doc=4769,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.4129904 = fieldWeight in 4769, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=4769)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. By using an image as its interface, the prototype system addresses a troublesome aspect of traditional retrieval models, which require users to have complete knowledge of the low-level features of an image. In addition, we describe an experiment that validates the system's performance against that of human subjects, in an effort to address the scarcity of research evaluating an algorithm's performance against that of human beings. The results of the experiment indicate that the system could do as well as human subjects in accomplishing the tasks of similarity analysis and image categorization. We also found that under some circumstances the texture features of an image are insufficient to represent a geographic image. We believe, however, that our image retrieval system provides a promising approach to integrating image processing techniques and information retrieval algorithms.
  8. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.03
    0.0335502 = product of:
      0.117425695 = sum of:
        0.06310088 = weight(_text_:processing in 5777) [ClassicSimilarity], result of:
          0.06310088 = score(doc=5777,freq=4.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3795138 = fieldWeight in 5777, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
        0.05432481 = product of:
          0.10864962 = sum of:
            0.10864962 = weight(_text_:mathematics in 5777) [ClassicSimilarity], result of:
              0.10864962 = score(doc=5777,freq=2.0), product of:
                0.25945482 = queryWeight, product of:
                  6.31699 = idf(docFreq=216, maxDocs=44218)
                  0.04107254 = queryNorm
                0.41876122 = fieldWeight in 5777, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.31699 = idf(docFreq=216, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5777)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    This book discusses many of the key design issues for building search engines and emphasizes the important role that applied mathematics can play in improving information retrieval. The authors discuss not only important data structures, algorithms, and software but also user-centered issues such as interfaces, manual indexing, and document preparation. They also present some of the current problems in information retrieval that may not be familiar to applied mathematicians and computer scientists, and some of the driving computational methods (SVD, SDD) for automated conceptual indexing.
    LCSH
    Text processing (Computer science)
    Subject
    Text processing (Computer science)
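    The SVD-based conceptual indexing mentioned here is the core of latent semantic indexing: queries and documents are compared in a rank-k "concept" space rather than in raw term space. A minimal NumPy sketch (the toy term-by-document matrix, query, and k are illustrative assumptions, not data from the book):

      import numpy as np

      # Toy term-by-document matrix A: rows are terms, columns are documents.
      A = np.array([[1., 0., 1.],
                    [0., 1., 1.],
                    [1., 1., 0.],
                    [0., 0., 1.]])

      U, s, Vt = np.linalg.svd(A, full_matrices=False)
      k = 2                                    # rank of the low-rank approximation
      Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

      q = np.array([1., 0., 1., 0.])           # query as a term vector
      q_hat = np.diag(1.0 / sk) @ Uk.T @ q     # fold the query into concept space

      docs = Vtk.T                             # document coordinates in concept space
      cos = (docs @ q_hat) / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat))
      print(np.argsort(-cos))                  # documents ranked by cosine similarity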
  9. Abdelali, A.; Cowie, J.; Soliman, H.S.: Improving query precision using semantic expansion (2007) 0.03
    0.03248564 = product of:
      0.11369974 = sum of:
        0.05205557 = weight(_text_:processing in 917) [ClassicSimilarity], result of:
          0.05205557 = score(doc=917,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 917, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=917)
        0.06164417 = weight(_text_:techniques in 917) [ClassicSimilarity], result of:
          0.06164417 = score(doc=917,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 917, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=917)
      0.2857143 = coord(2/7)
    
    Abstract
    Query Expansion (QE) is one of the most important mechanisms in the information retrieval field. A typical short Internet query will go through a process of refinement to improve its retrieval power. Most of the existing QE techniques suffer from retrieval performance degradation due to an imprecise choice of the query's additive terms in the QE process. In this paper, we introduce a novel automated QE mechanism. The new expansion process is guided by the semantic relations between the original query and the expanding words, in the context of the utilized corpus. Experimental results of our "controlled" query expansion, using the Arabic TREC-10 data, show a significant enhancement of recall and precision over existing mechanisms in the field.
    Source
    Information processing and management. 43(2007) no.3, S.705-716
  10. Torra, V.; Miyamoto, S.; Lanau, S.: Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system (2005) 0.03
    0.03248564 = product of:
      0.11369974 = sum of:
        0.05205557 = weight(_text_:processing in 1028) [ClassicSimilarity], result of:
          0.05205557 = score(doc=1028,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 1028, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1028)
        0.06164417 = weight(_text_:techniques in 1028) [ClassicSimilarity], result of:
          0.06164417 = score(doc=1028,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 1028, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1028)
      0.2857143 = coord(2/7)
    
    Abstract
    The Internet, together with the large amount of textual information available in document archives, has increased the relevance of information retrieval related tools. In this work we present an extension of the Gambal system for clustering and visualization of documents based on fuzzy clustering techniques. The tool allows the user to structure the set of documents hierarchically (using a fuzzy hierarchical structure) and represents this structure in a graphical interface (a 3D sphere) over which the user can navigate. Gambal allows the analysis of the documents and the computation of their similarity not only on the basis of the syntactic similarity between words but also based on a dictionary (WordNet 1.7) and latent semantic analysis.
    Source
    Information processing and management. 41(2005) no.3, S.587-598
  11. López-Pujalte, C.; Guerrero-Bote, V.P.; Moya-Anegón, F. de: Genetic algorithms in relevance feedback : a second test and new contributions (2003) 0.03
    0.03248564 = product of:
      0.11369974 = sum of:
        0.05205557 = weight(_text_:processing in 1076) [ClassicSimilarity], result of:
          0.05205557 = score(doc=1076,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 1076, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1076)
        0.06164417 = weight(_text_:techniques in 1076) [ClassicSimilarity], result of:
          0.06164417 = score(doc=1076,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 1076, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1076)
      0.2857143 = coord(2/7)
    
    Abstract
    The present work is the continuation of an earlier study which reviewed the literature on genetic relevance feedback techniques that follow the vector space model (the model most commonly used in this type of application) and implemented them so that they could be compared with each other as well as with one of the best traditional methods of relevance feedback, the Ide dec-hi method. Here we carry out the comparisons on more test collections (Cranfield, CISI, Medline, and NPL), using the residual collection method for their evaluation, as is recommended for this type of technique. We also add some fitness functions of our own design.
    Source
    Information processing and management. 39(2003) no.5, S.669-687
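    The Ide dec-hi baseline named above is a one-line vector-space update: add every judged-relevant document vector to the query and subtract only the highest-ranked non-relevant one. A minimal sketch (vector representations assumed; not the authors' code):

      import numpy as np

      def ide_dec_hi(query, relevant_docs, top_nonrelevant_doc):
          """Ide dec-hi: q' = q + sum of relevant vectors - highest-ranked non-relevant vector."""
          return query + np.sum(relevant_docs, axis=0) - top_nonrelevant_doc

      # Example: two relevant documents pull the query toward them, one non-relevant pushes away.
      q_new = ide_dec_hi(np.array([1., 0.]),
                         np.array([[0., 1.], [1., 1.]]),
                         np.array([1., 0.]))     # -> array([1., 2.])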
  12. Otterbacher, J.; Erkan, G.; Radev, D.R.: Biased LexRank : passage retrieval using random walks with question-based priors (2009) 0.03
    0.03248564 = product of:
      0.11369974 = sum of:
        0.05205557 = weight(_text_:processing in 2450) [ClassicSimilarity], result of:
          0.05205557 = score(doc=2450,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 2450, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2450)
        0.06164417 = weight(_text_:techniques in 2450) [ClassicSimilarity], result of:
          0.06164417 = score(doc=2450,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 2450, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2450)
      0.2857143 = coord(2/7)
    
    Abstract
    We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user's natural language question. We then perform a random walk on the lexical similarity graph in order to recursively retrieve additional passages that are similar to other relevant passages. We present results on several benchmarks that show the applicability of our work to question answering and topic-focused text summarization.
    Source
    Information processing and management. 45(2009) no.1, S.42-54
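    The random walk described above amounts to a personalized PageRank over the passage-similarity graph, with the teleport distribution biased toward passages that match the question. A minimal power-iteration sketch (function name, damping value, and mixing convention are illustrative assumptions, not the paper's exact parametrization):

      import numpy as np

      def biased_lexrank(S, bias, d=0.85, iters=100, tol=1e-8):
          """S: n x n pairwise lexical similarity matrix (non-negative, no zero rows).
          bias: per-passage relevance to the question; d: weight of the similarity walk."""
          P = S / S.sum(axis=1, keepdims=True)      # row-stochastic transition matrix
          b = bias / bias.sum()                     # normalized teleport distribution
          r = np.full(len(S), 1.0 / len(S))
          for _ in range(iters):
              r_new = (1 - d) * b + d * (P.T @ r)   # biased teleport + similarity walk
              if np.abs(r_new - r).sum() < tol:
                  break
              r = r_new
          return r                                  # stationary scores = passage salience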
  13. García Cumbreras, M.A.; Perea-Ortega, J.M.; García Vega, M.; Ureña López, L.A.: Information retrieval with geographical references : relevant documents filtering vs. query expansion (2009) 0.03
    0.03248564 = product of:
      0.11369974 = sum of:
        0.05205557 = weight(_text_:processing in 4222) [ClassicSimilarity], result of:
          0.05205557 = score(doc=4222,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 4222, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4222)
        0.06164417 = weight(_text_:techniques in 4222) [ClassicSimilarity], result of:
          0.06164417 = score(doc=4222,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 4222, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4222)
      0.2857143 = coord(2/7)
    
    Abstract
    This is a thorough analysis of two techniques applied to Geographic Information Retrieval (GIR). Previous studies have researched the application of query expansion to improve the selection process of information retrieval systems. This paper emphasizes the effectiveness of filtering relevant documents in a GIR system instead of query expansion. Based on the available CLEF (Cross-Language Evaluation Forum) framework, several experiments have been run, some based on query expansion and some on the filtering of relevant documents. The results show that filtering works better in a GIR environment, because relevant documents are not reordered in the final list.
    Source
    Information processing and management. 45(2009) no.5, S.605-614
  14. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.03
    0.030398162 = product of:
      0.10639356 = sum of:
        0.08413451 = weight(_text_:processing in 5108) [ClassicSimilarity], result of:
          0.08413451 = score(doc=5108,freq=4.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.5060184 = fieldWeight in 5108, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0625 = fieldNorm(doc=5108)
        0.022259047 = product of:
          0.044518095 = sum of:
            0.044518095 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.044518095 = score(doc=5108,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    In this paper, methods for both speeding up passage processing and examining more passages using parallel computers are explored. The number of passages processed is varied in order to examine the effect on retrieval effectiveness and efficiency. The particular algorithm applied has previously been used to good effect in Okapi experiments at TREC. This algorithm and the mechanism for applying parallel computing to speed up processing are described.
    Date
    20. 1.2007 18:30:22
  15. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.03
    0.028978147 = product of:
      0.10142351 = sum of:
        0.084729224 = weight(_text_:digital in 2419) [ClassicSimilarity], result of:
          0.084729224 = score(doc=2419,freq=8.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.52297866 = fieldWeight in 2419, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.046875 = fieldNorm(doc=2419)
        0.016694285 = product of:
          0.03338857 = sum of:
            0.03338857 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
              0.03338857 = score(doc=2419,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.23214069 = fieldWeight in 2419, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring, and managing digital library objects, it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper, evaluation results with respect to retrieval effectiveness, efficiency, and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection through information seeking to the representation, organisation, and reuse of information. By embedding high-level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil and followed by a qualitative evaluation. The evaluation was conducted with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
    Source
    Research and advanced technology for digital libraries : 8th European conference, ECDL 2004, Bath, UK, September 12-17, 2004 : proceedings. Eds.: Heery, R. u. E. Lyon
  16. Shah, B.; Raghavan, V.; Dhatric, P.; Zhao, X.: ¬A cluster-based approach for efficient content-based image retrieval using a similarity-preserving space transformation method (2006) 0.03
    0.02841502 = product of:
      0.09945257 = sum of:
        0.03718255 = weight(_text_:processing in 6118) [ClassicSimilarity], result of:
          0.03718255 = score(doc=6118,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.22363065 = fieldWeight in 6118, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6118)
        0.062270015 = weight(_text_:techniques in 6118) [ClassicSimilarity], result of:
          0.062270015 = score(doc=6118,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.34415868 = fieldWeight in 6118, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6118)
      0.2857143 = coord(2/7)
    
    Abstract
    The techniques of clustering and space transformation have been successfully used in the past to solve a number of pattern recognition problems. In this article, the authors propose a new approach to content-based image retrieval (CBIR) that uses (a) a newly proposed similarity-preserving space transformation method to transform the original low-level image space into a high-level vector space that enables efficient query processing, and (b) a clustering scheme that further improves the efficiency of our retrieval system. This combination is unique and the resulting system provides synergistic advantages of using both clustering and space transformation. The proposed space transformation method is shown to preserve the order of the distances in the transformed feature space. This strategy makes this approach to retrieval generic, as it can be applied to object types other than images and to feature spaces more general than metric spaces. The CBIR approach uses the inexpensive "estimated" distance in the transformed space, as opposed to the computationally inefficient "real" distance in the original space, to retrieve the desired results for a given query image. The authors also provide a theoretical analysis of the complexity of their CBIR approach when used for color-based retrieval, which shows that it is computationally more efficient than other comparable approaches. An extensive set of experiments to test the efficiency and effectiveness of the proposed approach has been performed. The results show that the approach offers superior response time (improvement of 1-2 orders of magnitude compared to retrieval approaches that either use pruning techniques like indexing, clustering, etc., or space transformation, but not both) with sufficiently high retrieval accuracy.
  17. Lee, J.; Min, J.-K.; Oh, A.; Chung, C.-W.: Effective ranking and search techniques for Web resources considering semantic relationships (2014) 0.03
    0.02841502 = product of:
      0.09945257 = sum of:
        0.03718255 = weight(_text_:processing in 2670) [ClassicSimilarity], result of:
          0.03718255 = score(doc=2670,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.22363065 = fieldWeight in 2670, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2670)
        0.062270015 = weight(_text_:techniques in 2670) [ClassicSimilarity], result of:
          0.062270015 = score(doc=2670,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.34415868 = fieldWeight in 2670, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2670)
      0.2857143 = coord(2/7)
    
    Abstract
    On the Semantic Web, the types of resources and the semantic relationships between resources are defined in an ontology. By using that information, the accuracy of information retrieval can be improved. In this paper, we present effective ranking and search techniques that consider the semantic relationships in an ontology. Our technique retrieves the top-k resources that are most relevant to the query keywords through the semantic relationships. To do this, we propose a weighting measure for the semantic relationship. Based on this measure, we propose a novel ranking method which considers the number of meaningful semantic relationships between a resource and the keywords as well as the coverage and discriminating power of the keywords. In order to improve the efficiency of the search, we prune the unnecessary search space using length and weight thresholds on the semantic relationship path. In addition, we exploit the Threshold Algorithm, based on an extended inverted index, to answer top-k queries efficiently. Experimental results using real data sets demonstrate that our retrieval method using the semantic information generates accurate results efficiently compared to the traditional methods.
    Source
    Information processing and management. 50(2014) no.1, S.132-155
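    The Threshold Algorithm referenced above (Fagin's TA) merges several score-sorted postings lists and stops early: scan the lists in parallel in descending score order, fetch each newly seen document's full aggregate score by random access, and halt once the k-th best aggregate reaches the threshold formed from the last score seen in each list. A generic sketch with summation as the monotone aggregate (the data layout is an illustrative assumption):

      import heapq

      def threshold_algorithm(lists, k):
          """lists: per-keyword postings [(score, doc_id), ...], sorted by descending score.
          Returns the top-k (aggregate_score, doc_id) pairs under summation."""
          index = [dict((d, s) for s, d in lst) for lst in lists]  # random-access view
          seen, top = set(), []                    # top is a min-heap of size <= k
          for depth in range(max(len(lst) for lst in lists)):
              threshold = 0.0
              for lst in lists:
                  if depth >= len(lst):
                      continue                     # exhausted lists add nothing
                  s, d = lst[depth]
                  threshold += s                   # sum of the last scores seen per list
                  if d not in seen:
                      seen.add(d)
                      agg = sum(ix.get(d, 0.0) for ix in index)  # random access
                      heapq.heappush(top, (agg, d))
                      if len(top) > k:
                          heapq.heappop(top)
              if len(top) == k and top[0][0] >= threshold:
                  break                            # k-th best already meets the threshold
          return sorted(top, reverse=True)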
  18. Chen, H.; Zhang, Y.; Houston, A.L.: Semantic indexing and searching using a Hopfield net (1998) 0.03
    0.027844835 = product of:
      0.09745692 = sum of:
        0.04461906 = weight(_text_:processing in 5704) [ClassicSimilarity], result of:
          0.04461906 = score(doc=5704,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 5704, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=5704)
        0.052837856 = weight(_text_:techniques in 5704) [ClassicSimilarity], result of:
          0.052837856 = score(doc=5704,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 5704, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=5704)
      0.2857143 = coord(2/7)
    
    Abstract
    Presents a neural network approach to document semantic indexing. Reports results of a study applying a Hopfield net algorithm to simulate human associative memory for concept exploration in the domain of computer science and engineering. The INSPEC database, consisting of 320,000 abstracts from leading periodical articles, was used as the document test bed. Benchmark tests confirmed that three parameters (maximum number of activated nodes, maximum allowable error, and maximum number of iterations) were useful in positively influencing network convergence behaviour without negatively impacting central processing unit performance. Another series of benchmark tests was performed to determine the effectiveness of various filtering techniques in reducing the negative impact of noisy input terms. Preliminary user tests confirmed expectations that the Hopfield net is potentially useful as an associative memory technique to improve document recall and precision by resolving discrepancies between indexer vocabularies and end user vocabularies.
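    As a minimal sketch of the associative-memory idea behind this approach: a binary Hopfield net stores patterns as Hebbian outer products and recalls the closest stored pattern by iterating to a fixed point (this is the textbook bipolar variant, not the authors' term-network formulation; names are illustrative):

      import numpy as np

      def train_hopfield(patterns):
          """Hebbian storage: W is the averaged sum of outer products of +/-1 patterns."""
          n = patterns.shape[1]
          W = sum(np.outer(p, p) for p in patterns) / n
          np.fill_diagonal(W, 0.0)           # no self-connections
          return W

      def recall(W, state, iters=20):
          """Iterate until the network settles on a stored (associated) pattern."""
          for _ in range(iters):
              new = np.sign(W @ state)
              new[new == 0] = 1              # break ties toward +1
              if np.array_equal(new, state):
                  break
              state = new
          return state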
  19. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.03
    0.027844835 = product of:
      0.09745692 = sum of:
        0.04461906 = weight(_text_:processing in 2502) [ClassicSimilarity], result of:
          0.04461906 = score(doc=2502,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 2502, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=2502)
        0.052837856 = weight(_text_:techniques in 2502) [ClassicSimilarity], result of:
          0.052837856 = score(doc=2502,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 2502, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=2502)
      0.2857143 = coord(2/7)
    
    Abstract
    Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system would lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform well when applied to this problem. Detailed results and analyses are included to support our conclusions.
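    A classic family of fusion methods for combining such result sets is CombSUM/CombMNZ: sum each document's normalized scores across runs and, for CombMNZ, multiply by the number of runs that retrieved the document. A minimal sketch for background (CombMNZ is the standard baseline of this kind, not necessarily the exact scheme the paper evaluates):

      def comb_mnz(runs):
          """runs: one {doc_id: normalized_score} dict per retrieval strategy."""
          fused = {}
          for run in runs:
              for doc, score in run.items():
                  total, hits = fused.get(doc, (0.0, 0))
                  fused[doc] = (total + score, hits + 1)
          # CombMNZ: summed score times the number of runs containing the document
          return sorted(((s * h, d) for d, (s, h) in fused.items()), reverse=True)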
  20. Lin, J.; Katz, B.: Building a reusable test collection for question answering (2006) 0.03
    0.027844835 = product of:
      0.09745692 = sum of:
        0.04461906 = weight(_text_:processing in 5045) [ClassicSimilarity], result of:
          0.04461906 = score(doc=5045,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 5045, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=5045)
        0.052837856 = weight(_text_:techniques in 5045) [ClassicSimilarity], result of:
          0.052837856 = score(doc=5045,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 5045, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=5045)
      0.2857143 = coord(2/7)
    
    Abstract
    In contrast to traditional information retrieval systems, which return ranked lists of documents that users must manually browse through, a question answering system attempts to directly answer natural language questions posed by the user. Although such systems possess language-processing capabilities, they still rely on traditional document retrieval techniques to generate an initial candidate set of documents. In this article, the authors argue that document retrieval for question answering represents a task different from retrieving documents in response to more general retrospective information needs. Thus, to guide future system development, specialized question answering test collections must be constructed. They show that the current evaluation resources have major shortcomings; to remedy the situation, they have manually created a small, reusable question answering test collection for research purposes. In this article they describe their methodology for building this test collection and discuss issues they encountered regarding the notion of "answer correctness."

Languages

  • e 142
  • d 5
  • chi 1

Types

  • a 137
  • m 6
  • el 3
  • s 3
  • r 2
  • x 1