Search (140 results, page 1 of 7)

Dominich, S.: Mathematical foundations of information retrieval (2001) 0.08

0.08417838 = product of:
  0.12626757 = sum of:
    0.10910148 = weight(_text_:book in 1753) [ClassicSimilarity], result of:
      0.10910148 = score(doc=1753,freq=8.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.4876966 = fieldWeight in 1753, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1753)
    0.017166087 = product of:
      0.034332175 = sum of:
        0.034332175 = weight(_text_:22 in 1753) [ClassicSimilarity], result of:
          0.034332175 = score(doc=1753,freq=2.0), product of:
            0.17747258 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050679956 = queryNorm
            0.19345059 = fieldWeight in 1753, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1753)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
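
The score tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (`tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`); the function names are illustrative and not part of any library, and the constants are copied from the tree for doc 1753.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # Assumed ClassicSimilarity tf: square root of the term frequency
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight

# Constants copied from the explain output for doc 1753
MAX_DOCS = 44218
QUERY_NORM = 0.050679956
FIELD_NORM = 0.0390625

book = term_score(8.0, 1454, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 clauses matched: coord(1/2)
nested = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# 2 of 3 top-level clauses matched: coord(2/3)
total = (book + nested) * (2.0 / 3.0)

print(round(book, 6), round(nested, 6), round(total, 6))
```

Small differences in the last digits are expected, since Lucene computes these values in 32-bit floats while Python uses 64-bit floats.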

Abstract: This book offers a comprehensive and consistent mathematical approach to information retrieval (IR) without which no implementation is possible, and sheds an entirely new light upon the structure of IR models. It contains the descriptions of all IR models in a unified formal style and language, along with examples for each, thus offering a comprehensive overview of them. The book also creates mathematical foundations and a consistent mathematical theory (including all mathematical results achieved so far) of IR as a stand-alone mathematical discipline, which thus can be read and taught independently. Also, the book contains all necessary mathematical knowledge on which IR relies, to help the reader avoid searching different sources. The book will be of interest to computer or information scientists, librarians, mathematicians, undergraduate students and researchers whose work involves information retrieval.
Date: 22. 3.2008 12:26:32

Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.07

0.07069763 = product of:
  0.10604644 = sum of:
    0.0654609 = weight(_text_:book in 5777) [ClassicSimilarity], result of:
      0.0654609 = score(doc=5777,freq=2.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.29261798 = fieldWeight in 5777, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.046875 = fieldNorm(doc=5777)
    0.04058554 = product of:
      0.08117108 = sum of:
        0.08117108 = weight(_text_:search in 5777) [ClassicSimilarity], result of:
          0.08117108 = score(doc=5777,freq=8.0), product of:
            0.17614716 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.050679956 = queryNorm
            0.460814 = fieldWeight in 5777, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: This book discusses many of the key design issues for building search engines and emphasizes the important role that applied mathematics can play in improving information retrieval. The authors discuss not only important data structures, algorithms, and software but also user-centered issues such as interfaces, manual indexing, and document preparation. They also present some of the current problems in information retrieval that may not be familiar to applied mathematicians and computer scientists, and some of the driving computational methods (SVD, SDD) for automated conceptual indexing.
LCSH: Web search engines
Subject: Web search engines

Habernal, I.; Konopík, M.; Rohlík, O.: Question answering (2012) 0.07

0.067072675 = product of:
  0.100609004 = sum of:
    0.0654609 = weight(_text_:book in 101) [ClassicSimilarity], result of:
      0.0654609 = score(doc=101,freq=2.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.29261798 = fieldWeight in 101, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.046875 = fieldNorm(doc=101)
    0.03514811 = product of:
      0.07029622 = sum of:
        0.07029622 = weight(_text_:search in 101) [ClassicSimilarity], result of:
          0.07029622 = score(doc=101,freq=6.0), product of:
            0.17614716 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.050679956 = queryNorm
            0.39907667 = fieldWeight in 101, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=101)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Abstract: Question Answering is an area of information retrieval with the added challenge of applying sophisticated techniques to identify the complex syntactic and semantic relationships present in text in order to provide a more sophisticated and satisfactory response to the user's information needs. For this reason, the authors see question answering as the next step beyond standard information retrieval. This chapter covers state-of-the-art question answering, with an overview of systems, techniques and approaches that are likely to be employed in the next generations of search engines. Special attention is paid to question answering using the World Wide Web as the data source and to question answering exploiting the possibilities of the Semantic Web. Considerations about the current issues and prospects for promising future research are also provided.
Footnote: Cf.: http://www.igi-global.com/book/next-generation-search-engines/64431.
Source: Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.

Back, J.: An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.06

0.0636099 = product of:
  0.19082968 = sum of:
    0.19082968 = sum of:
      0.0946996 = weight(_text_:search in 3445) [ClassicSimilarity], result of:
        0.0946996 = score(doc=3445,freq=2.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.5376164 = fieldWeight in 3445, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.109375 = fieldNorm(doc=3445)
      0.09613008 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
        0.09613008 = score(doc=3445,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.5416616 = fieldWeight in 3445, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.109375 = fieldNorm(doc=3445)
  0.33333334 = coord(1/3)

Date: 25. 8.2005 17:42:22

Biskri, I.; Rompré, L.: Using association rules for query reformulation (2012) 0.06

0.06277281 = product of:
  0.09415921 = sum of:
    0.0654609 = weight(_text_:book in 92) [ClassicSimilarity], result of:
      0.0654609 = score(doc=92,freq=2.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.29261798 = fieldWeight in 92, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.046875 = fieldNorm(doc=92)
    0.02869831 = product of:
      0.05739662 = sum of:
        0.05739662 = weight(_text_:search in 92) [ClassicSimilarity], result of:
          0.05739662 = score(doc=92,freq=4.0), product of:
            0.17614716 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.050679956 = queryNorm
            0.3258447 = fieldWeight in 92, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=92)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Footnote: Cf.: http://www.igi-global.com/book/next-generation-search-engines/64430.
Source: Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.

Information retrieval : data structures and algorithms (1992) 0.06
```
0.0627047 = product of:
  0.094057046 = sum of:
    0.0771464 = weight(_text_:book in 3495) [ClassicSimilarity], result of:
      0.0771464 = score(doc=3495,freq=4.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.34485358 = fieldWeight in 3495, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3495)
    0.016910642 = product of:
      0.033821285 = sum of:
        0.033821285 = weight(_text_:search in 3495) [ClassicSimilarity], result of:
          0.033821285 = score(doc=3495,freq=2.0), product of:
            0.17614716 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.050679956 = queryNorm
            0.19200584 = fieldWeight in 3495, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3495)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The book consists of separate chapters by some 20 different authors. It covers many of the information retrieval algorithms, including methods of file organization, file search and access, and query processing.

Content

An edited volume containing data structures and algorithms for information retrieval, including a disk with examples written in C. For programmers and students interested in parsing text and automated indexing, it is the first collection in book form of the basic data structures and algorithms that are critical to the storage and retrieval of documents.

Contains the chapters: FRAKES, W.B.: Introduction to information storage and retrieval systems; BAEZA-YATES, R.S.: Introduction to data structures and algorithms related to information retrieval; HARMAN, D. et al.: Inverted files; FALOUTSOS, C.: Signature files; GONNET, G.H. et al.: New indices for text: PAT trees and PAT arrays; FORD, D.A. and S. CHRISTODOULAKIS: File organizations for optical disks; FOX, C.: Lexical analysis and stoplists; FRAKES, W.B.: Stemming algorithms; SRINIVASAN, P.: Thesaurus construction; BAEZA-YATES, R.A.: String searching algorithms; HARMAN, D.: Relevance feedback and other query modification techniques; WARTIK, S.: Boolean operators; WARTIK, S. et al.: Hashing algorithms; HARMAN, D.: Ranking algorithms; FOX, E. et al.: Extended Boolean models; RASMUSSEN, E.: Clustering algorithms; HOLLAAR, L.: Special-purpose hardware for information retrieval; STANFILL, C.: Parallel information retrieval algorithms
Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.06
```
0.055690415 = product of:
  0.08353562 = sum of:
    0.056690805 = weight(_text_:book in 6) [ClassicSimilarity], result of:
      0.056690805 = score(doc=6,freq=6.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.25341463 = fieldWeight in 6, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.0234375 = fieldNorm(doc=6)
    0.026844814 = product of:
      0.05368963 = sum of:
        0.05368963 = weight(_text_:search in 6) [ClassicSimilarity], result of:
          0.05368963 = score(doc=6,freq=14.0), product of:
            0.17614716 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.050679956 = queryNorm
            0.30479985 = fieldWeight in 6, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Why doesn't your home page appear on the first page of search results, even when you query your own name? How do other Web pages always appear at the top? What creates these powerful rankings? And how? The first book ever about the science of Web page rankings, "Google's PageRank and Beyond" supplies the answers to these and other questions and more. The book serves two very different audiences: the curious science reader and the technical computational reader. The chapters build in mathematical sophistication, so that the first five are accessible to the general academic reader. While other chapters are much more mathematical in nature, each one contains something for both audiences. For example, the authors include entertaining asides such as how search engines make money and how the Great Firewall of China influences research. The book includes an extensive background chapter designed to help readers learn more about the mathematics of search engines, and it contains several MATLAB codes and links to sample Web data sets. The philosophy throughout is to encourage readers to experiment with the ideas and algorithms in the text. Any business seriously interested in improving its rankings in the major search engines can benefit from the clear examples, sample code, and list of resources provided. It includes: many illustrative examples and entertaining asides; MATLAB code; accessible and informal style; and complete and self-contained section for mathematics review.

Content

Contents:
Chapter 1. Introduction to Web Search Engines: 1.1 A Short History of Information Retrieval - 1.2 An Overview of Traditional Information Retrieval - 1.3 Web Information Retrieval
Chapter 2. Crawling, Indexing, and Query Processing: 2.1 Crawling - 2.2 The Content Index - 2.3 Query Processing
Chapter 3. Ranking Webpages by Popularity: 3.1 The Scene in 1998 - 3.2 Two Theses - 3.3 Query-Independence
Chapter 4. The Mathematics of Google's PageRank: 4.1 The Original Summation Formula for PageRank - 4.2 Matrix Representation of the Summation Equations - 4.3 Problems with the Iterative Process - 4.4 A Little Markov Chain Theory - 4.5 Early Adjustments to the Basic Model - 4.6 Computation of the PageRank Vector - 4.7 Theorem and Proof for Spectrum of the Google Matrix
Chapter 5. Parameters in the PageRank Model: 5.1 The alpha Factor - 5.2 The Hyperlink Matrix H - 5.3 The Teleportation Matrix E
Chapter 6. The Sensitivity of PageRank: 6.1 Sensitivity with respect to alpha - 6.2 Sensitivity with respect to H - 6.3 Sensitivity with respect to vT - 6.4 Other Analyses of Sensitivity - 6.5 Sensitivity Theorems and Proofs
Chapter 7. The PageRank Problem as a Linear System: 7.1 Properties of (I - alphaS) - 7.2 Properties of (I - alphaH) - 7.3 Proof of the PageRank Sparse Linear System
Chapter 8. Issues in Large-Scale Implementation of PageRank: 8.1 Storage Issues - 8.2 Convergence Criterion - 8.3 Accuracy - 8.4 Dangling Nodes - 8.5 Back Button Modeling
Chapter 9. Accelerating the Computation of PageRank: 9.1 An Adaptive Power Method - 9.2 Extrapolation - 9.3 Aggregation - 9.4 Other Numerical Methods
Chapter 10. Updating the PageRank Vector: 10.1 The Two Updating Problems and their History - 10.2 Restarting the Power Method - 10.3 Approximate Updating Using Approximate Aggregation - 10.4 Exact Aggregation - 10.5 Exact vs. Approximate Aggregation - 10.6 Updating with Iterative Aggregation - 10.7 Determining the Partition - 10.8 Conclusions
Chapter 11. The HITS Method for Ranking Webpages: 11.1 The HITS Algorithm - 11.2 HITS Implementation - 11.3 HITS Convergence - 11.4 HITS Example - 11.5 Strengths and Weaknesses of HITS - 11.6 HITS's Relationship to Bibliometrics - 11.7 Query-Independent HITS - 11.8 Accelerating HITS - 11.9 HITS Sensitivity
Chapter 12. Other Link Methods for Ranking Webpages: 12.1 SALSA - 12.2 Hybrid Ranking Methods - 12.3 Rankings based on Traffic Flow
Chapter 13. The Future of Web Information Retrieval: 13.1 Spam - 13.2 Personalization - 13.3 Clustering - 13.4 Intelligent Agents - 13.5 Trends and Time-Sensitive Search - 13.6 Privacy and Censorship - 13.7 Library Classification Schemes - 13.8 Data Fusion
Chapter 14. Resources for Web Information Retrieval: 14.1 Resources for Getting Started - 14.2 Resources for Serious Study
Chapter 15. The Mathematics Guide: 15.1 Linear Algebra - 15.2 Perron-Frobenius Theory - 15.3 Markov Chains - 15.4 Perron Complementation - 15.5 Stochastic Complementation - 15.6 Censoring - 15.7 Aggregation - 15.8 Disaggregation
Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.05
```
0.05460334 = product of:
  0.08190501 = sum of:
    0.043640595 = weight(_text_:book in 7) [ClassicSimilarity], result of:
      0.043640595 = score(doc=7,freq=2.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.19507864 = fieldWeight in 7, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.03125 = fieldNorm(doc=7)
    0.038264416 = product of:
      0.07652883 = sum of:
        0.07652883 = weight(_text_:search in 7) [ClassicSimilarity], result of:
          0.07652883 = score(doc=7,freq=16.0), product of:
            0.17614716 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.050679956 = queryNorm
            0.43445963 = fieldWeight in 7, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example the addition of a new chapter on link-structure algorithms used in search engines such as Google. The chapter on user interface has been rewritten to specifically focus on search engine usability. In addition the authors have added new recommendations for further reading and expanded the bibliography, and have updated and streamlined the index to make it more reader friendly.

Content

Contents:
Introduction: Document File Preparation - Manual Indexing - Information Extraction - Vector Space Modeling - Matrix Decompositions - Query Representations - Ranking and Relevance Feedback - Searching by Link Structure - User Interface - Book Format
Document File Preparation: Document Purification and Analysis - Text Formatting - Validation - Manual Indexing - Automatic Indexing - Item Normalization - Inverted File Structures - Document File - Dictionary List - Inversion List - Other File Structures
Vector Space Models: Construction - Term-by-Document Matrices - Simple Query Matching - Design Issues - Term Weighting - Sparse Matrix Storage - Low-Rank Approximations
Matrix Decompositions: QR Factorization - Singular Value Decomposition - Low-Rank Approximations - Query Matching - Software - Semidiscrete Decomposition - Updating Techniques
Query Management: Query Binding - Types of Queries - Boolean Queries - Natural Language Queries - Thesaurus Queries - Fuzzy Queries - Term Searches - Probabilistic Queries
Ranking and Relevance Feedback: Performance Evaluation - Precision - Recall - Average Precision - Genetic Algorithms - Relevance Feedback
Searching by Link Structure: HITS Method - HITS Implementation - HITS Summary - PageRank Method - PageRank Adjustments - PageRank Implementation - PageRank Summary
User Interface Considerations: General Guidelines - Search Engine Interfaces - Form Fill-in - Display Considerations - Progress Indication - No Penalties for Error - Results - Test and Retest - Final Considerations
Further Reading

LCSH

Web search engines

Subject

Web search engines
Khoo, C.S.G.; Wan, K.-W.: A simple relevancy-ranking strategy for an interface to Boolean OPACs (2004) 0.05
```
0.04502588 = product of:
  0.13507764 = sum of:
    0.13507764 = sum of:
      0.11104512 = weight(_text_:search in 2509) [ClassicSimilarity], result of:
        0.11104512 = score(doc=2509,freq=44.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.6304111 = fieldWeight in 2509, product of:
            6.6332498 = tf(freq=44.0), with freq of:
              44.0 = termFreq=44.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.02734375 = fieldNorm(doc=2509)
      0.02403252 = weight(_text_:22 in 2509) [ClassicSimilarity], result of:
        0.02403252 = score(doc=2509,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.1354154 = fieldWeight in 2509, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.02734375 = fieldNorm(doc=2509)
  0.33333334 = coord(1/3)
```
Abstract

A relevancy-ranking algorithm for a natural language interface to Boolean online public access catalogs (OPACs) was formulated and compared with that currently used in a knowledge-based search interface called the E-Referencer, being developed by the authors. The algorithm makes use of seven well-known ranking criteria: breadth of match, section weighting, proximity of query words, variant word forms (stemming), document frequency, term frequency and document length. The algorithm converts a natural language query into a series of increasingly broader Boolean search statements. In a small experiment with ten subjects in which the algorithm was simulated by hand, the algorithm obtained good results with a mean overall precision of 0.42 and mean average precision of 0.62, representing a 27 percent improvement in precision and 41 percent improvement in average precision compared to the E-Referencer. The usefulness of each step in the algorithm was analyzed and suggestions are made for improving the algorithm.
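
The idea of turning one natural language query into a series of increasingly broader Boolean statements can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a simple breadth-of-match strategy (require all terms, then any n-1 of them, and so on), and the function name and stopword list are hypothetical.

```python
from itertools import combinations

# Hypothetical stopword list, for illustration only
STOPWORDS = frozenset({"a", "an", "the", "of", "for", "in", "on", "to", "and"})

def broadening_statements(query):
    """Return Boolean statements ordered from narrowest to broadest."""
    terms = [w for w in query.lower().split() if w not in STOPWORDS]
    statements = []
    # Start by requiring every term, then relax to fewer terms per group
    for size in range(len(terms), 0, -1):
        groups = ["(" + " AND ".join(combo) + ")"
                  for combo in combinations(terms, size)]
        statements.append(" OR ".join(groups))
    return statements

for statement in broadening_statements("ranking of library catalogs"):
    print(statement)
```

Records retrieved by an earlier, narrower statement would naturally rank above those that only match a later, broader one.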

Content

"Most Web search engines accept natural language queries, perform some kind of fuzzy matching and produce ranked output, displaying first the documents that are most likely to be relevant. On the other hand, most library online public access catalogs (OPACs) on the Web are still Boolean retrieval systems that perform exact matching, and require users to express their search requests precisely in a Boolean search language and to refine their search statements to improve the search results. It is well-documented that users have difficulty searching Boolean OPACs effectively (e.g. Borgman, 1996; Ensor, 1992; Wallace, 1993). One approach to making OPACs easier to use is to develop a natural language search interface that acts as a middleware between the user's Web browser and the OPAC system. The search interface can accept a natural language query from the user and reformulate it as a series of Boolean search statements that are then submitted to the OPAC. The records retrieved by the OPAC are ranked by the search interface before forwarding them to the user's Web browser. The user, then, does not need to interact directly with the Boolean OPAC but with the natural language search interface or search intermediary. The search interface interacts with the OPAC system on the user's behalf. The advantage of this approach is that no modification to the OPAC or library system is required. Furthermore, the search interface can access multiple OPACs, acting as a meta search engine, and integrate search results from various OPACs before sending them to the user. The search interface needs to incorporate a method for converting the user's natural language query into a series of Boolean search statements, and for ranking the OPAC records retrieved. The purpose of this study was to develop a relevancy-ranking algorithm for a search interface to Boolean OPAC systems.
This is part of an on-going effort to develop a knowledge-based search interface to OPACs called the E-Referencer (Khoo et al., 1998, 1999; Poo et al., 2000). E-Referencer v. 2 that has been implemented applies a repertoire of initial search strategies and reformulation strategies to retrieve records from OPACs using the Z39.50 protocol, and also assists users in mapping query keywords to the Library of Congress subject headings."

Source

Electronic library. 22(2004) no.2, pp.112-120
Shiri, A.A.; Revie, C.: Query expansion behavior within a thesaurus-enhanced search environment : a user-centered evaluation (2006) 0.04
```
0.041271627 = product of:
  0.12381488 = sum of:
    0.12381488 = sum of:
      0.08948271 = weight(_text_:search in 56) [ClassicSimilarity], result of:
        0.08948271 = score(doc=56,freq=14.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.5079997 = fieldWeight in 56, product of:
            3.7416575 = tf(freq=14.0), with freq of:
              14.0 = termFreq=14.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.0390625 = fieldNorm(doc=56)
      0.034332175 = weight(_text_:22 in 56) [ClassicSimilarity], result of:
        0.034332175 = score(doc=56,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.19345059 = fieldWeight in 56, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=56)
  0.33333334 = coord(1/3)
```
Abstract

The study reported here investigated the query expansion behavior of end-users interacting with a thesaurus-enhanced search system on the Web. Two groups, namely academic staff and postgraduate students, were recruited into this study. Data were collected from 90 searches performed by 30 users using the OVID interface to the CAB abstracts database. Data-gathering techniques included questionnaires, screen capturing software, and interviews. The results presented here relate to issues of search-topic and search-term characteristics, number and types of expanded queries, usefulness of thesaurus terms, and behavioral differences between academic staff and postgraduate students in their interaction. The key conclusions drawn were that (a) academic staff chose more narrow and synonymous terms than did postgraduate students, who generally selected broader and related terms; (b) topic complexity affected users' interaction with the thesaurus in that complex topics required more query expansion and search term selection; (c) users' prior topic-search experience appeared to have a significant effect on their selection and evaluation of thesaurus terms; (d) in 50% of the searches where additional terms were suggested from the thesaurus, users stated that they had not been aware of the terms at the beginning of the search; this observation was particularly noticeable in the case of postgraduate students.

Date

22. 7.2006 16:32:43
Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: The effects of fitness functions on genetic programming-based ranking discovery for Web search (2004) 0.04
```
0.03716494 = product of:
  0.111494824 = sum of:
    0.111494824 = sum of:
      0.07029622 = weight(_text_:search in 2239) [ClassicSimilarity], result of:
        0.07029622 = score(doc=2239,freq=6.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.39907667 = fieldWeight in 2239, product of:
            2.4494898 = tf(freq=6.0), with freq of:
              6.0 = termFreq=6.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.046875 = fieldNorm(doc=2239)
      0.041198608 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
        0.041198608 = score(doc=2239,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.23214069 = fieldWeight in 2239, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2239)
  0.33333334 = coord(1/3)
```
Abstract

Genetic-based evolutionary learning algorithms, such as genetic algorithms (GAs) and genetic programming (GP), have been applied to information retrieval (IR) since the 1980s. Recently, GP has been applied to a new IR task, the discovery of ranking functions for Web search, and has achieved very promising results. However, in our prior research, only one fitness function has been used for GP-based learning. It is unclear how other fitness functions may affect ranking function discovery for Web search, especially since it is well known that choosing a proper fitness function is very important for the effectiveness and efficiency of evolutionary algorithms. In this article, we report our experience in contrasting different fitness function designs on GP-based learning using a very large Web corpus. Our results indicate that the design of fitness functions is instrumental in performance improvement. We also give recommendations on the design of fitness functions for genetic-based information retrieval experiments.

Date

31. 5.2004 19:22:06
Kelledy, F.; Smeaton, A.F.: Signature files and beyond (1996) 0.03
```
0.032865077 = product of:
  0.09859523 = sum of:
    0.09859523 = sum of:
      0.05739662 = weight(_text_:search in 6973) [ClassicSimilarity], result of:
        0.05739662 = score(doc=6973,freq=4.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.3258447 = fieldWeight in 6973, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.046875 = fieldNorm(doc=6973)
      0.041198608 = weight(_text_:22 in 6973) [ClassicSimilarity], result of:
        0.041198608 = score(doc=6973,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.23214069 = fieldWeight in 6973, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=6973)
  0.33333334 = coord(1/3)
```
Abstract

Proposes that signature files be used as a viable alternative to other indexing strategies such as inverted files for searching through large volumes of text. Demonstrates through simulation that search times can be further reduced by enhancing the basic signature file concept using deterministic partitioning algorithms which eliminate the need for an exhaustive search of the entire signature file. Reports research to evaluate the performance of some deterministic partitioning algorithms in a non-simulated environment using 276 MB of raw newspaper text (taken from the Wall Street Journal) and real user queries. Presents a selection of results to illustrate trends and highlight important aspects of the performance of these methods under realistic rather than simulated operating conditions. As a result of the research reported here, certain aspects of this approach to signature files are found wanting and require improvement. Suggests lines of future research on the partitioning of signature files.

Source

Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
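
The score breakdown shown for this record (doc 6973) can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`. The `queryNorm` and `fieldNorm` values are copied from the listing rather than derived, and the function names are illustrative, not part of any library.

```python
import math

# Values copied from the explanation tree for doc 6973.
QUERY_NORM = 0.050679956
MAX_DOCS = 44218


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM              # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# "search" occurs 4 times, "22" occurs twice; both use fieldNorm 0.046875.
search_part = term_score(freq=4.0, doc_freq=3718, field_norm=0.046875)
date_part = term_score(freq=2.0, doc_freq=3622, field_norm=0.046875)

# One of the three top-level query clauses matched, hence coord(1/3).
total = (search_part + date_part) * (1.0 / 3.0)
print(f"search={search_part:.8f} 22={date_part:.8f} total={total:.8f}")
```

The result agrees with the listed 0.032865077 up to rounding, since Lucene computes these values in single precision while Python uses doubles.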

Furner, J.: ¬A unifying model of document relatedness for hybrid search engines (2003) 0.03

0.032865077 = product of:
  0.09859523 = sum of:
    0.09859523 = sum of:
      0.05739662 = weight(_text_:search in 2717) [ClassicSimilarity], result of:
        0.05739662 = score(doc=2717,freq=4.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.3258447 = fieldWeight in 2717, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.046875 = fieldNorm(doc=2717)
      0.041198608 = weight(_text_:22 in 2717) [ClassicSimilarity], result of:
        0.041198608 = score(doc=2717,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.23214069 = fieldWeight in 2717, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2717)
  0.33333334 = coord(1/3)

Abstract: Previous work on search-engine design has indicated that information-seekers may benefit from being given the opportunity to exploit multiple sources of evidence of document relatedness. Few existing systems, however, give users more than minimal control over the selections that may be made among methods of exploitation. By applying the methods of "document network analysis" (DNA), a unifying, graph-theoretic model of content-, collaboration-, and context-based systems (CCC) may be developed in which the nature of the similarities between types of document relatedness and document ranking are clarified. The usefulness of the approach to system design suggested by this model may be tested by constructing and evaluating a prototype system (UCXtra) that allows searchers to maintain control over the multiple ways in which document collections may be ranked and re-ranked.
Date: 11. 9.2004 17:32:22

Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.03
```
0.032865077 = product of:
  0.09859523 = sum of:
    0.09859523 = sum of:
      0.05739662 = weight(_text_:search in 2419) [ClassicSimilarity], result of:
        0.05739662 = score(doc=2419,freq=4.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.3258447 = fieldWeight in 2419, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.046875 = fieldNorm(doc=2419)
      0.041198608 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
        0.041198608 = score(doc=2419,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.23214069 = fieldWeight in 2419, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2419)
  0.33333334 = coord(1/3)
```
Abstract

The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection through information seeking to the representation, organisation and reuse of information. By embedding high level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil and were then assessed in a qualitative evaluation. The evaluation has been conducted with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.

Date

16.11.2008 16:22:48

Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.03

0.03180495 = product of:
  0.09541484 = sum of:
    0.09541484 = sum of:
      0.0473498 = weight(_text_:search in 1319) [ClassicSimilarity], result of:
        0.0473498 = score(doc=1319,freq=2.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.2688082 = fieldWeight in 1319, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.0546875 = fieldNorm(doc=1319)
      0.04806504 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
        0.04806504 = score(doc=1319,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.2708308 = fieldWeight in 1319, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0546875 = fieldNorm(doc=1319)
  0.33333334 = coord(1/3)

Abstract: Keyword-based querying has been an immediate and efficient way to specify and retrieve related information that the user inquires about. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes integrating 2 existing techniques, query expansion and relevance feedback, to achieve concept-based information search for the Web.
Date: 1. 8.1996 22:08:06

Lalmas, M.: XML retrieval (2009) 0.03
```
0.03149489 = product of:
  0.094484664 = sum of:
    0.094484664 = weight(_text_:book in 4998) [ClassicSimilarity], result of:
      0.094484664 = score(doc=4998,freq=6.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.42235768 = fieldWeight in 4998, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4998)
  0.33333334 = coord(1/3)
```
Abstract

Documents usually have a content and a structure. The content refers to the text of the document, whereas the structure refers to how a document is logically organized. An increasingly common way to encode the structure is through the use of a mark-up language. Nowadays, the most widely used mark-up language for representing structure is the eXtensible Mark-up Language (XML). XML can be used to provide a focused access to documents, i.e. returning XML elements, such as sections and paragraphs, instead of whole documents in response to a query. Such focused strategies are of particular benefit for information repositories containing long documents, or documents covering a wide variety of topics, where users are directed to the most relevant content within a document. The increased adoption of XML to represent a document structure requires the development of tools to effectively access documents marked-up in XML. This book provides a detailed description of query languages, indexing strategies, ranking algorithms, and presentation scenarios developed to access XML documents. Major advances in XML retrieval were seen from 2002 as a result of INEX, the Initiative for Evaluation of XML Retrieval. INEX, also described in this book, provided test sets for evaluating XML retrieval effectiveness. Many of the developments and results described in this book were investigated within INEX.

Wartik, S.; Fox, E.; Heath, L.; Chen, Q.-F.: Hashing algorithms (1992) 0.03

0.029093731 = product of:
  0.08728119 = sum of:
    0.08728119 = weight(_text_:book in 3510) [ClassicSimilarity], result of:
      0.08728119 = score(doc=3510,freq=2.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.39015728 = fieldWeight in 3510, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.0625 = fieldNorm(doc=3510)
  0.33333334 = coord(1/3)

Abstract: Discusses hashing, an information storage and retrieval technique useful for implementing many of the other structures in this book. The concepts underlying hashing are presented, along with 2 implementation strategies. The chapter also contains an extensive discussion of perfect hashing, an important optimization in information retrieval, and an O(n) algorithm to find minimal perfect hash functions for a set of keys.

Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.03

0.027261382 = product of:
  0.081784144 = sum of:
    0.081784144 = sum of:
      0.04058554 = weight(_text_:search in 5123) [ClassicSimilarity], result of:
        0.04058554 = score(doc=5123,freq=2.0), product of:
          0.17614716 = queryWeight, product of:
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.050679956 = queryNorm
          0.230407 = fieldWeight in 5123, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.475677 = idf(docFreq=3718, maxDocs=44218)
            0.046875 = fieldNorm(doc=5123)
      0.041198608 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
        0.041198608 = score(doc=5123,freq=2.0), product of:
          0.17747258 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050679956 = queryNorm
          0.23214069 = fieldWeight in 5123, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=5123)
  0.33333334 = coord(1/3)

Date: 12. 9.1996 13:56:22

Lavrenko, V.: ¬A generative theory of relevance (2009) 0.03
```
0.025715468 = product of:
  0.0771464 = sum of:
    0.0771464 = weight(_text_:book in 3306) [ClassicSimilarity], result of:
      0.0771464 = score(doc=3306,freq=4.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.34485358 = fieldWeight in 3306, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3306)
  0.33333334 = coord(1/3)
```
Abstract

A modern information retrieval system must have the capability to find, organize and present very different manifestations of information - such as text, pictures, videos or database records - any of which may be of relevance to the user. However, the concept of relevance, while seemingly intuitive, is actually hard to define, and it's even harder to model in a formal way. Lavrenko does not attempt to bring forth a new definition of relevance, nor provide arguments as to why any particular definition might be theoretically superior or more complete. Instead, he takes a widely accepted, albeit somewhat conservative definition, makes several assumptions, and from them develops a new probabilistic model that explicitly captures that notion of relevance. With this book, he makes two major contributions to the field of information retrieval: first, a new way to look at topical relevance, complementing the two dominant models, i.e., the classical probabilistic model and the language modeling approach, and which explicitly combines documents, queries, and relevance in a single formalism; second, a new method for modeling exchangeable sequences of discrete random variables which does not make any structural assumptions about the data and which can also handle rare events. Thus his book is of major interest to researchers and graduate students in information retrieval who specialize in relevance modeling, ranking algorithms, and language modeling.

Lalmas, M.: XML information retrieval (2009) 0.03

0.025457015 = product of:
  0.076371044 = sum of:
    0.076371044 = weight(_text_:book in 3880) [ClassicSimilarity], result of:
      0.076371044 = score(doc=3880,freq=2.0), product of:
        0.2237077 = queryWeight, product of:
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.050679956 = queryNorm
        0.34138763 = fieldWeight in 3880, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.414126 = idf(docFreq=1454, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3880)
  0.33333334 = coord(1/3)

Footnote: Vgl.: http://www.tandfonline.com/doi/book/10.1081/E-ELIS3.
