Search (12 results, page 1 of 1)

  • × theme_ss:"Retrievalalgorithmen"
  • × type_ss:"m"
  1. Cross-language information retrieval (1998) 0.01
    0.011067658 = product of:
      0.030436058 = sum of:
        0.005938409 = weight(_text_:a in 6299) [ClassicSimilarity], result of:
          0.005938409 = score(doc=6299,freq=74.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.19372822 = fieldWeight in 6299, product of:
              8.602325 = tf(freq=74.0), with freq of:
                74.0 = termFreq=74.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.0015034355 = weight(_text_:s in 6299) [ClassicSimilarity], result of:
          0.0015034355 = score(doc=6299,freq=6.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.052015185 = fieldWeight in 6299, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.013636759 = weight(_text_:u in 6299) [ClassicSimilarity], result of:
          0.013636759 = score(doc=6299,freq=6.0), product of:
            0.08704981 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.026584605 = queryNorm
            0.15665466 = fieldWeight in 6299, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.009357453 = weight(_text_:k in 6299) [ClassicSimilarity], result of:
          0.009357453 = score(doc=6299,freq=2.0), product of:
            0.09490114 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.026584605 = queryNorm
            0.098602116 = fieldWeight in 6299, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
      0.36363637 = coord(4/11)
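    The explain tree above can be checked by hand. The following is a minimal sketch, assuming the formulas that reproduce the printed values: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), and a per-term score of queryWeight * fieldWeight. The function names are illustrative and not part of any library; queryNorm and fieldNorm are copied from the output rather than recomputed.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.026584605   # copied from the explain output, not recomputed
FIELD_NORM = 0.01953125    # fieldNorm(doc=6299), copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency in the form that matches the printed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# (termFreq, docFreq) for the four matching terms of doc 6299
terms = {"a": (74.0, 37942), "s": (6.0, 40523), "u": (6.0, 4547), "k": (2.0, 3384)}

total = sum(term_score(freq, df, FIELD_NORM) for freq, df in terms.values())
coord = 4 / 11             # 4 of the 11 query clauses matched
score = total * coord
```

    Under these assumptions `score` agrees with the printed 0.011067658 up to rounding; Lucene computes in 32-bit floats, so the last digits can differ slightly.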
    
    Content
    Contains the contributions: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. and W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. et al.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. and C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. et al.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. et al.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. et al.: Building a Large Multilingual Test Collection from Comparable News Documents; OARD, D.W. and B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness
    Footnote
    Reviewed in: Machine translation review: 1999, no.10, pp.26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davis (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts. 
Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
    Christian Fluhr et al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real-time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process. 
Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
    The retrieved output from a query including the phrase 'big rockets' may be, for instance, a sentence containing 'giant rocket' which is semantically ranked above 'military rocket'. David Hull (Xerox Research Centre, Grenoble) describes an implementation of a weighted Boolean model for Spanish-English CLIR. Users construct Boolean-type queries, weighting each term in the query, which is then translated by an on-line dictionary before being applied to the database. Comparisons with the performance of unweighted free-form queries ('vector space' models) proved encouraging. Two contributions consider the evaluation of CLIR systems. In order to by-pass the time-consuming and expensive process of assembling a standard collection of documents and of user queries against which the performance of a CLIR system is manually assessed, Páraic Sheridan et al (ETH Zurich) propose a method based on retrieving 'seed documents'. This involves identifying a unique document in a database (the 'seed document') and, for a number of queries, measuring how fast it is retrieved. The authors have also assembled a large database of multilingual news documents for testing purposes. By storing the (fairly short) documents in a structured form tagged with descriptor codes (e.g. for topic, country and area), the test suite is easily expanded while remaining consistent for the purposes of testing. Douglas Oard and Bonnie Dorr (University of Maryland) describe an evaluation methodology which appears to apply LSI techniques in order to filter and rank incoming documents designed for testing CLIR systems. The volume provides the reader with an excellent overview of several projects in CLIR. It is well supported with references and is intended as a secondary text for researchers and practitioners. It highlights the need for a good, general tutorial introduction to the field."
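    The definitions of recall and precision given in the review can be made concrete with a small sketch. The document identifiers and the helper name `precision_recall` are hypothetical, chosen only for illustration; the formulas follow the review's wording (recall: share of relevant documents that are found; precision: share of found documents that are relevant).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: documents returned by the engine
    relevant:  documents judged relevant to the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    # Guard against empty sets so the ratios are always defined.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: four documents returned, three judged relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
precision, recall = precision_recall(retrieved, relevant)
```

    Here two of the four returned documents are relevant (precision 0.5), and two of the three relevant documents were found (recall 2/3).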
    Pages
    VII, 182 pp.
    Type
    s
  2. Lalmas, M.: XML retrieval (2009) 0.01
    0.009457834 = product of:
      0.034678724 = sum of:
        0.0064758323 = weight(_text_:a in 4998) [ClassicSimilarity], result of:
          0.0064758323 = score(doc=4998,freq=22.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.21126054 = fieldWeight in 4998, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4998)
        0.0017360178 = weight(_text_:s in 4998) [ClassicSimilarity], result of:
          0.0017360178 = score(doc=4998,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.060061958 = fieldWeight in 4998, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4998)
        0.026466874 = weight(_text_:k in 4998) [ClassicSimilarity], result of:
          0.026466874 = score(doc=4998,freq=4.0), product of:
            0.09490114 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.026584605 = queryNorm
            0.2788889 = fieldWeight in 4998, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4998)
      0.27272728 = coord(3/11)
    
    Abstract
    Documents usually have content and structure. The content refers to the text of the document, whereas the structure refers to how a document is logically organized. An increasingly common way to encode the structure is through the use of a mark-up language. Nowadays, the most widely used mark-up language for representing structure is the eXtensible Mark-up Language (XML). XML can be used to provide focused access to documents, i.e. returning XML elements, such as sections and paragraphs, instead of whole documents in response to a query. Such focused strategies are of particular benefit for information repositories containing long documents, or documents covering a wide variety of topics, where users are directed to the most relevant content within a document. The increased adoption of XML to represent document structure requires the development of tools to effectively access documents marked up in XML. This book provides a detailed description of the query languages, indexing strategies, ranking algorithms, and presentation scenarios developed to access XML documents. Major advances in XML retrieval were seen from 2002 as a result of INEX, the Initiative for Evaluation of XML Retrieval. INEX, also described in this book, provided test sets for evaluating XML retrieval effectiveness. Many of the developments and results described in this book were investigated within INEX.
    Classification
    BCA (FH K)
    GHBS
    BCA (FH K)
    Pages
    IX, 99 pp.
  3. Lavrenko, V.: A generative theory of relevance (2009) 0.01
    0.0066560227 = product of:
      0.024405416 = sum of:
        0.0058576106 = weight(_text_:a in 3306) [ClassicSimilarity], result of:
          0.0058576106 = score(doc=3306,freq=18.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.19109234 = fieldWeight in 3306, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3306)
        0.016092705 = weight(_text_:r in 3306) [ClassicSimilarity], result of:
          0.016092705 = score(doc=3306,freq=2.0), product of:
            0.088001914 = queryWeight, product of:
              3.3102584 = idf(docFreq=4387, maxDocs=44218)
              0.026584605 = queryNorm
            0.18286766 = fieldWeight in 3306, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3102584 = idf(docFreq=4387, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3306)
        0.0024550997 = weight(_text_:s in 3306) [ClassicSimilarity], result of:
          0.0024550997 = score(doc=3306,freq=4.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.08494043 = fieldWeight in 3306, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3306)
      0.27272728 = coord(3/11)
    
    Abstract
    A modern information retrieval system must have the capability to find, organize and present very different manifestations of information - such as text, pictures, videos or database records - any of which may be of relevance to the user. However, the concept of relevance, while seemingly intuitive, is actually hard to define, and it's even harder to model in a formal way. Lavrenko does not attempt to bring forth a new definition of relevance, nor provide arguments as to why any particular definition might be theoretically superior or more complete. Instead, he takes a widely accepted, albeit somewhat conservative definition, makes several assumptions, and from them develops a new probabilistic model that explicitly captures that notion of relevance. With this book, he makes two major contributions to the field of information retrieval: first, a new way to look at topical relevance, complementing the two dominant models, i.e., the classical probabilistic model and the language modeling approach, and which explicitly combines documents, queries, and relevance in a single formalism; second, a new method for modeling exchangeable sequences of discrete random variables which does not make any structural assumptions about the data and which can also handle rare events. Thus his book is of major interest to researchers and graduate students in information retrieval who specialize in relevance modeling, ranking algorithms, and language modeling.
    Footnote
    Reviewed in: JASIST 60(2009) no.12, pp.2587-2588 (R. Luk)
    Pages
    XX, 197 pp.
  4. Effektive Information Retrieval Verfahren in Theorie und Praxis : ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005 (2006) 0.01
    0.0061091823 = product of:
      0.01680025 = sum of:
        0.0018129903 = product of:
          0.0036259806 = sum of:
            0.0036259806 = weight(_text_:h in 5973) [ClassicSimilarity], result of:
              0.0036259806 = score(doc=5973,freq=2.0), product of:
                0.0660481 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.026584605 = queryNorm
                0.05489909 = fieldWeight in 5973, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.015625 = fieldNorm(doc=5973)
          0.5 = coord(1/2)
        0.0012027485 = weight(_text_:s in 5973) [ClassicSimilarity], result of:
          0.0012027485 = score(doc=5973,freq=6.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.04161215 = fieldWeight in 5973, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.015625 = fieldNorm(doc=5973)
        0.0062985485 = weight(_text_:u in 5973) [ClassicSimilarity], result of:
          0.0062985485 = score(doc=5973,freq=2.0), product of:
            0.08704981 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.026584605 = queryNorm
            0.07235568 = fieldWeight in 5973, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.015625 = fieldNorm(doc=5973)
        0.0074859625 = weight(_text_:k in 5973) [ClassicSimilarity], result of:
          0.0074859625 = score(doc=5973,freq=2.0), product of:
            0.09490114 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.026584605 = queryNorm
            0.078881696 = fieldWeight in 5973, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.015625 = fieldNorm(doc=5973)
      0.36363637 = coord(4/11)
    
    Editor
    Mandl, T. and C. Womser-Hacker
    Footnote
    Reviewed in: Information - Wissenschaft und Praxis 57(2006) no.5, pp.290-291 (C. Schindler): "Less than a year after the "Fourth Hildesheim Evaluation and Retrieval Workshop" (HIER 2005) in July 2005, the accompanying proceedings volume has appeared. The Hildesheim information science group had issued the invitation in order to present its research results, and those of several external experts, on the topic of information retrieval to a specialist audience and to open them to discussion. Under the title "Effektive Information Retrieval Verfahren in Theorie und Praxis", nearly all of the workshop contributions are collected in the newly published volume of 15 papers. With its focus on information retrieval (IR), the volume presents a subfield of information science that has always been at the centre of information science research. Whether through the growth in performance of processors and storage media, through the spread of the Internet across national borders, or through the steady increase in knowledge production, it is clear that in an increasingly interconnected world, orientation within large knowledge collections and the finding of documents in them have become a central challenge. The new volume presents current approaches to this topic, information retrieval, by means of practice-oriented projects and theoretical discussions. The core topic of information retrieval is subdivided in the collection into the areas of retrieval systems, digital library, evaluation, and multilingual systems. The articles in the individual sections are on the whole quite heterogeneous and therefore do not overlap in content. However, complete thematic coverage of the different areas is likewise not achieved, which can only be expected to a limited extent when presenting the research results of one institute and its cooperation partners. 
Thus, both in the structure and in the individual contributions, a thematic concentration can be discerned that reflects the special profile and distinctiveness of Hildesheim information science in the field of information retrieval. Part of this is its multilingual and interdisciplinary orientation, which focuses on the interfaces between information science, linguistics, and computer science in its practice-oriented and international research.
    "Evaluation", the topic of the third chapter, is in its breadth not restricted to information retrieval but also includes individual aspects of human-computer interaction and of e-learning. Michael Muck and Marco Winter, of the Stiftung Wissenschaft und Politik and the Informationszentrum Sozialwissenschaften, address in their contribution the influence of the question posed (the topic) on the assessment of relevance and present procedures for topic creation that are used at the Cross Language Evaluation Forum (CLEF). In the following paper, Thomas Mandl presents various evaluation initiatives in information retrieval and current developments. Joachim Pfister explains in his contribution the automated grouping, so-called clustering, of patent documents in the databases of the Fachinformationszentrum Karlsruhe and evaluates different clustering methods on the basis of user ratings. Ralph Kölle, Glenn Langemeier, and Wolfgang Semar turn to collaborative learning under the special conditions of programming. Here the system VitaminL, for synchronous work on programming tasks, and the metrics system K-3, for assessing collaborative work, are applied in a course. The current research focus of Hildesheim information science emerges in the fourth chapter under the topic "Multilingual Systems". Most of the contributions in the proceedings are found here. Olga Tartakovski and Margaryta Shramko describe and test the system LangIdent, which identifies the language of mono- and multilingual texts. Nina Kummer presents the peculiarities of Japanese characters and experimentally compares the different indexing techniques. 
Suriya Na Nhongkai and Hans-Joachim Bentz present and test a bilingual search based on concept networks, in which the concept structure is the connecting element between the two text collections. The development and evaluation of a multilingual question-answering system within the Cross Language Evaluation Forum (CLEF), which allows concrete questions to be formulated in everyday language, is the subject of the contribution by Robert Strötgen, Thomas Mandl, and Rene Schneider. The volume closes with the paper by Niels Jensen, who evaluates a multilingual web retrieval system, likewise in connection with CLEF, using the multilingual EuroGOV corpus.
    Pages
    VIII, 244 pp.
    Type
    s
  5. Grossman, D.A.; Frieder, O.: Information retrieval : algorithms and heuristics (1998) 0.01
    0.0051043504 = product of:
      0.028073926 = sum of:
        0.022181686 = product of:
          0.088726744 = sum of:
            0.088726744 = weight(_text_:o in 2182) [ClassicSimilarity], result of:
              0.088726744 = score(doc=2182,freq=2.0), product of:
                0.13338262 = queryWeight, product of:
                  5.017288 = idf(docFreq=795, maxDocs=44218)
                  0.026584605 = queryNorm
                0.6652047 = fieldWeight in 2182, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.017288 = idf(docFreq=795, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2182)
          0.25 = coord(1/4)
        0.00589224 = weight(_text_:s in 2182) [ClassicSimilarity], result of:
          0.00589224 = score(doc=2182,freq=4.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.20385705 = fieldWeight in 2182, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.09375 = fieldNorm(doc=2182)
      0.18181819 = coord(2/11)
    
    Footnote
    Reviewed in: JASIS 51(2000) no.11, pp.1063-1064 (H.E. Williams)
    Pages
    254 pp.
  6. Dominich, S.: Mathematical foundations of information retrieval (2001) 0.00
    0.0043161064 = product of:
      0.015825722 = sum of:
        0.0043660053 = weight(_text_:a in 1753) [ClassicSimilarity], result of:
          0.0043660053 = score(doc=1753,freq=10.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.14243183 = fieldWeight in 1753, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1753)
        0.0024550997 = weight(_text_:s in 1753) [ClassicSimilarity], result of:
          0.0024550997 = score(doc=1753,freq=4.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.08494043 = fieldWeight in 1753, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1753)
        0.009004618 = product of:
          0.018009236 = sum of:
            0.018009236 = weight(_text_:22 in 1753) [ClassicSimilarity], result of:
              0.018009236 = score(doc=1753,freq=2.0), product of:
                0.09309476 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.026584605 = queryNorm
                0.19345059 = fieldWeight in 1753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1753)
          0.5 = coord(1/2)
      0.27272728 = coord(3/11)
    
    Abstract
    This book offers a comprehensive and consistent mathematical approach to information retrieval (IR) without which no implementation is possible, and sheds an entirely new light upon the structure of IR models. It contains the descriptions of all IR models in a unified formal style and language, along with examples for each, thus offering a comprehensive overview of them. The book also creates mathematical foundations and a consistent mathematical theory (including all mathematical results achieved so far) of IR as a stand-alone mathematical discipline, which thus can be read and taught independently. Also, the book contains all necessary mathematical knowledge on which IR relies, to help the reader avoid searching different sources. The book will be of interest to computer or information scientists, librarians, mathematicians, undergraduate students and researchers whose work involves information retrieval.
    Date
    22. 3.2008 12:26:32
    Pages
    XX, 284 pp.
  7. Mandl, T.: Tolerantes Information Retrieval : Neuronale Netze zur Erhöhung der Adaptivität und Flexibilität bei der Informationssuche (2001) 0.00
    0.002480067 = product of:
      0.009093579 = sum of:
        0.0018129903 = product of:
          0.0036259806 = sum of:
            0.0036259806 = weight(_text_:h in 5965) [ClassicSimilarity], result of:
              0.0036259806 = score(doc=5965,freq=2.0), product of:
                0.0660481 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.026584605 = queryNorm
                0.05489909 = fieldWeight in 5965, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.015625 = fieldNorm(doc=5965)
          0.5 = coord(1/2)
        9.8204E-4 = weight(_text_:s in 5965) [ClassicSimilarity], result of:
          9.8204E-4 = score(doc=5965,freq=4.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.033976175 = fieldWeight in 5965, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.015625 = fieldNorm(doc=5965)
        0.0062985485 = weight(_text_:u in 5965) [ClassicSimilarity], result of:
          0.0062985485 = score(doc=5965,freq=2.0), product of:
            0.08704981 = queryWeight, product of:
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.026584605 = queryNorm
            0.07235568 = fieldWeight in 5965, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2744443 = idf(docFreq=4547, maxDocs=44218)
              0.015625 = fieldNorm(doc=5965)
      0.27272728 = coord(3/11)
    
    Footnote
    Reviewed in: nfd - Information 54(2003) no.6, pp.379-380 (U. Thiel): "Was G. Salton aware, when developing the vector space model, of the cybernetically oriented experiments with associative memory structures? The subject of the present book reminded me of this and similar conjectures, which I discussed some years ago with Reginald Ferber and other colleagues. It can at any rate be stated that the vector representation is a brilliantly simple representation both of the "inverted files" used as the basic data structure in information retrieval (IR) and of the associative memory matrices, which over time evolved via perceptrons into neural networks (NN). This formal connection subsequently stimulated a series of approaches to using networks in retrieval, where, as in the present volume, hybrid approaches that combine methods from both disciplines prove very suitable. But one thing at a time... The book was submitted by the author as a dissertation to Faculty IV "Sprachen und Technik" (Languages and Technology) of the University of Hildesheim and results from a series of research contributions to several projects in which the author was involved at various locations between 1995 and 2000. This explains the unusual breadth of applications, scenarios, and domains in which the results were obtained. Thus the COSIMIR model (COgnitive SIMilarity learning in Information Retrieval) developed in the work is not only evaluated on the classic Cranfield collection, but is also used in the WING project of the University of Regensburg for fact retrieval from a materials database. Further experiments with the component called the "transformation network", whose task is to map weighting functions between two term spaces, round off the spectrum of experiments. 
But it is not only the results presented that are varied; the "state-of-the-art" overview offered to the reader also summarizes, in highly informative breadth, the essentials from the fields of IR and NN and sheds light on the points where the two areas intersect. Thus, alongside the foundations of text and fact retrieval, approaches to improving adaptivity and to coping with heterogeneity are presented, while as foundations of neural networks, alongside a general introduction to the basic concepts, the backpropagation model, Kohonen networks, and Adaptive Resonance Theory (ART), among others, are described. A further chapter presents the previous NN-oriented approaches in IR and rounds off the outline of the relevant research landscape. In preparation for the presentation of the COSIMIR model, the author inserts at this point a discursive chapter on heterogeneity in IR, in which the goals and basic assumptions of the work are reflected on once more. The object type, the quality of the objects and of their indexing, and multilinguality are named as dimensions of heterogeneity. Even though this classification essentially places the emphasis on problems from the projects touched on here, and aims less at a comprehensive treatment of, for example, the literature on the problem of relevance, it is nevertheless helpful for understanding the design decisions in the conception of the developed prototypes, which are often addressed only implicitly in the following chapters. The approach of handling heterogeneity through transformations is made concrete in the specific context of NN, while other possibilities, for example using instruments of logic and probability theory, are discussed only briefly. A more extensive analysis would probably also have stretched the scope of the work too far,
    Pages
    IX, 283 S
  8. Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.00
    0.0019719787 = product of:
      0.0072305882 = sum of:
        0.0038459331 = product of:
          0.0076918663 = sum of:
            0.0076918663 = weight(_text_:h in 6) [ClassicSimilarity], result of:
              0.0076918663 = score(doc=6,freq=4.0), product of:
                0.0660481 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.026584605 = queryNorm
                0.11645855 = fieldWeight in 6, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=6)
          0.5 = coord(1/2)
        0.0023430442 = weight(_text_:a in 6) [ClassicSimilarity], result of:
          0.0023430442 = score(doc=6,freq=8.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.07643694 = fieldWeight in 6, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
        0.0010416106 = weight(_text_:s in 6) [ClassicSimilarity], result of:
          0.0010416106 = score(doc=6,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.036037173 = fieldWeight in 6, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.27272728 = coord(3/11)
    
    Content
    Contents: Chapter 1. Introduction to Web Search Engines: 1.1 A Short History of Information Retrieval - 1.2 An Overview of Traditional Information Retrieval - 1.3 Web Information Retrieval Chapter 2. Crawling, Indexing, and Query Processing: 2.1 Crawling - 2.2 The Content Index - 2.3 Query Processing Chapter 3. Ranking Webpages by Popularity: 3.1 The Scene in 1998 - 3.2 Two Theses - 3.3 Query-Independence Chapter 4. The Mathematics of Google's PageRank: 4.1 The Original Summation Formula for PageRank - 4.2 Matrix Representation of the Summation Equations - 4.3 Problems with the Iterative Process - 4.4 A Little Markov Chain Theory - 4.5 Early Adjustments to the Basic Model - 4.6 Computation of the PageRank Vector - 4.7 Theorem and Proof for Spectrum of the Google Matrix Chapter 5. Parameters in the PageRank Model: 5.1 The alpha Factor - 5.2 The Hyperlink Matrix H - 5.3 The Teleportation Matrix E Chapter 6. The Sensitivity of PageRank: 6.1 Sensitivity with respect to alpha - 6.2 Sensitivity with respect to H - 6.3 Sensitivity with respect to vT - 6.4 Other Analyses of Sensitivity - 6.5 Sensitivity Theorems and Proofs Chapter 7. The PageRank Problem as a Linear System: 7.1 Properties of (I - alphaS) - 7.2 Properties of (I - alphaH) - 7.3 Proof of the PageRank Sparse Linear System Chapter 8. Issues in Large-Scale Implementation of PageRank: 8.1 Storage Issues - 8.2 Convergence Criterion - 8.3 Accuracy - 8.4 Dangling Nodes - 8.5 Back Button Modeling
    Pages
    X, 224 S
  9. Brenner, E.H.: Beyond Boolean : new approaches in information retrieval; the quest for intuitive online search systems past, present & future (1995) 0.00
    0.0011219439 = product of:
      0.0061706915 = sum of:
        0.0027335514 = weight(_text_:a in 2547) [ClassicSimilarity], result of:
          0.0027335514 = score(doc=2547,freq=2.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.089176424 = fieldWeight in 2547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
        0.00343714 = weight(_text_:s in 2547) [ClassicSimilarity], result of:
          0.00343714 = score(doc=2547,freq=4.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.118916616 = fieldWeight in 2547, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
      0.18181819 = coord(2/11)
    
    Issue
    A collection of writings.
    Pages
    XV,143 S
    Type
    s
  10. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.00
    5.36517E-4 = product of:
      0.0029508434 = sum of:
        0.0015620294 = weight(_text_:a in 7) [ClassicSimilarity], result of:
          0.0015620294 = score(doc=7,freq=2.0), product of:
            0.030653298 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.026584605 = queryNorm
            0.050957955 = fieldWeight in 7, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.0013888142 = weight(_text_:s in 7) [ClassicSimilarity], result of:
          0.0013888142 = score(doc=7,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.048049565 = fieldWeight in 7, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.18181819 = coord(2/11)
    
    Abstract
    The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example the addition of a new chapter on link-structure algorithms used in search engines such as Google. The chapter on user interface has been rewritten to specifically focus on search engine usability. In addition the authors have added new recommendations for further reading and expanded the bibliography, and have updated and streamlined the index to make it more reader friendly.
    Pages
    XVII, 117 S
  11. Computational information retrieval (2001) 0.00
    2.678291E-4 = product of:
      0.00294612 = sum of:
        0.00294612 = weight(_text_:s in 4167) [ClassicSimilarity], result of:
          0.00294612 = score(doc=4167,freq=4.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.101928525 = fieldWeight in 4167, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=4167)
      0.09090909 = coord(1/11)
    
    Pages
    XII,185 S
    Type
    s
  12. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.00
    1.8938375E-4 = product of:
      0.0020832212 = sum of:
        0.0020832212 = weight(_text_:s in 5777) [ClassicSimilarity], result of:
          0.0020832212 = score(doc=5777,freq=2.0), product of:
            0.028903782 = queryWeight, product of:
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.026584605 = queryNorm
            0.072074346 = fieldWeight in 5777, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.0872376 = idf(docFreq=40523, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.09090909 = coord(1/11)
    
    Pages
    XIII, 116 S
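
The score trees in this listing can be checked by hand. The following is a minimal sketch, not the catalogue's own code: it assumes the usual Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes queryNorm and fieldNorm as given from the listing rather than deriving them. The function names are hypothetical. It recomputes the single-term scores of entries 11 and 12.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
    # weight = queryWeight * fieldWeight, as in the explanation trees
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

def score(term_weights, matched, total):
    # Final score = sum of term weights * coord(matched/total)
    return sum(term_weights) * (matched / total)

QUERY_NORM = 0.026584605  # copied from the listing, not derived here

# Entry 12: term "s", freq=2.0, docFreq=40523, maxDocs=44218, fieldNorm=0.046875
w12 = term_weight(2.0, 40523, 44218, QUERY_NORM, 0.046875)
score12 = score([w12], 1, 11)

# Entry 11: same term, freq=4.0, same fieldNorm
w11 = term_weight(4.0, 40523, 44218, QUERY_NORM, 0.046875)
score11 = score([w11], 1, 11)

print(score12, score11)
```

Under these assumptions the results agree with the listed values 1.8938375E-4 (entry 12) and 2.678291E-4 (entry 11) up to single-precision rounding. Entries with a nested coord factor, such as the `h` term of entry 8, would need that inner factor applied to the term weight before summing.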

Languages