Search (32 results, page 1 of 2)

Croft, W.B.; Metzler, D.; Strohman, T.: Search engines : information retrieval in practice (2010) 0.14

```
0.13513544 = product of:
  0.20270315 = sum of:
    0.10651682 = weight(_text_:search in 2605) [ClassicSimilarity], result of:
      0.10651682 = score(doc=2605,freq=14.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.6095997 = fieldWeight in 2605, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.046875 = fieldNorm(doc=2605)
    0.09618632 = product of:
      0.19237264 = sum of:
        0.19237264 = weight(_text_:engines in 2605) [ClassicSimilarity], result of:
          0.19237264 = score(doc=2605,freq=10.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.75313926 = fieldWeight in 2605, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=2605)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
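
The score tree above can be checked by hand. The following is a minimal sketch, not the catalog's actual code: it assumes the classic TF-IDF formulas that the tree itself suggests (`tf = sqrt(freq)`, `score = queryWeight * fieldWeight`), and it assumes `idf = 1 + ln(maxDocs / (docFreq + 1))`, which is not stated in the output but appears to reproduce the idf values shown. The function names `idf` and `term_score` are hypothetical helpers.

```python
import math

# Collection statistics copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.05027291


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency (assumed form; it matches the values shown)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for a single matching term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


# Document 2605: "search" occurs 14 times, "engines" 10 times.
search = term_score(freq=14.0, doc_freq=3718, field_norm=0.046875)
engines = term_score(freq=10.0, doc_freq=746, field_norm=0.046875)

# The "engines" clause matched 1 of its 2 sub-clauses (coord 1/2);
# the top-level query matched 2 of its 3 clauses (coord 2/3).
total = (search + 0.5 * engines) * (2.0 / 3.0)

print(f"search={search:.8f} engines={engines:.8f} total={total:.8f}")
```

The printed values should agree with the tree to within single-precision rounding, since the explain output reports 32-bit floats while this sketch uses Python doubles.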

Abstract: For introductory information retrieval courses at the undergraduate and graduate level in computer science, information science and computer engineering departments. Written by a leader in the field of information retrieval, Search Engines: Information Retrieval in Practice is designed to give undergraduate students the understanding and tools they need to evaluate, compare and modify search engines. Coverage of the underlying IR and mathematical models reinforces key concepts. The book's numerous programming exercises make extensive use of Galago, a Java-based open source search engine. SUPPLEMENTS / Extensive lecture slides (in PDF and PPT format) / Solutions to selected end of chapter problems (Instructors only) / Test collections for exercises / Galago search engine
LCSH: Search engines / Programming
Subject: Search engines / Programming

Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.11

```
0.11103386 = product of:
  0.16655079 = sum of:
    0.08051914 = weight(_text_:search in 5777) [ClassicSimilarity], result of:
      0.08051914 = score(doc=5777,freq=8.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.460814 = fieldWeight in 5777, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.046875 = fieldNorm(doc=5777)
    0.08603165 = product of:
      0.1720633 = sum of:
        0.1720633 = weight(_text_:engines in 5777) [ClassicSimilarity], result of:
          0.1720633 = score(doc=5777,freq=8.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.67362815 = fieldWeight in 5777, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```

Abstract: This book discusses many of the key design issues for building search engines and emphasizes the important role that applied mathematics can play in improving information retrieval. The authors discuss not only important data structures, algorithms, and software but also user-centered issues such as interfaces, manual indexing, and document preparation. They also present some of the current problems in information retrieval that may not be familiar to applied mathematicians and computer scientists and some of the driving computational methods (SVD, SDD) for automated conceptual indexing.
LCSH: Web search engines
Subject: Web search engines

White, R.W.; Roth, R.A.: Exploratory search : beyond the query-response paradigm (2009) 0.11
```
0.11052248 = product of:
  0.16578372 = sum of:
    0.1299372 = weight(_text_:search in 0) [ClassicSimilarity], result of:
      0.1299372 = score(doc=0,freq=30.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.7436354 = fieldWeight in 0, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=0)
    0.03584652 = product of:
      0.07169304 = sum of:
        0.07169304 = weight(_text_:engines in 0) [ClassicSimilarity], result of:
          0.07169304 = score(doc=0,freq=2.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.2806784 = fieldWeight in 0, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=0)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

As information becomes more ubiquitous and the demands that searchers have on search systems grow, there is a need to support search behaviors beyond simple lookup. Information seeking is the process or activity of attempting to obtain information in both human and technological contexts. Exploratory search describes an information-seeking problem context that is open-ended, persistent, and multifaceted, and information-seeking processes that are opportunistic, iterative, and multitactical. Exploratory searchers aim to solve complex problems and develop enhanced mental capacities. Exploratory search systems support this through symbiotic human-machine relationships that provide guidance in exploring unfamiliar information landscapes. Exploratory search has gained prominence in recent years. There is an increased interest from the information retrieval, information science, and human-computer interaction communities in moving beyond the traditional turn-taking interaction model supported by major Web search engines, and toward support for human intelligence amplification and information use. In this lecture, we introduce exploratory search, relate it to relevant extant research, outline the features of exploratory search systems, discuss the evaluation of these systems, and suggest some future directions for supporting exploratory search. Exploratory search is a new frontier in the search domain and is becoming increasingly important in shaping our future world.

Content

Table of Contents: Introduction / Defining Exploratory Search / Related Work / Features of Exploratory Search Systems / Evaluation of Exploratory Search Systems / Future Directions and Concluding Remarks

Sherman, C.: Google power : Unleash the full potential of Google (2005) 0.10

```
0.10057114 = product of:
  0.1508567 = sum of:
    0.09002314 = weight(_text_:search in 3185) [ClassicSimilarity], result of:
      0.09002314 = score(doc=3185,freq=10.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.51520574 = fieldWeight in 3185, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.046875 = fieldNorm(doc=3185)
    0.060833566 = product of:
      0.12166713 = sum of:
        0.12166713 = weight(_text_:engines in 3185) [ClassicSimilarity], result of:
          0.12166713 = score(doc=3185,freq=4.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.47632706 = fieldWeight in 3185, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=3185)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```

Abstract: With this title, readers learn to push the search engine to its limits and extract the best content from Google, without having to learn complicated code. "Google Power" takes Google users under the hood and teaches them a wide range of advanced web search techniques through practical examples. Its content is organised by topic, so the reader learns how to conduct in-depth searches on the most popular search topics, from health to government listings to people.
LCSH: Web search engines
Subject: Web search engines

Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.10
```
0.097439155 = product of:
  0.14615873 = sum of:
    0.075914174 = weight(_text_:search in 7) [ClassicSimilarity], result of:
      0.075914174 = score(doc=7,freq=16.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.43445963 = fieldWeight in 7, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=7)
    0.07024455 = product of:
      0.1404891 = sum of:
        0.1404891 = weight(_text_:engines in 7) [ClassicSimilarity], result of:
          0.1404891 = score(doc=7,freq=12.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.5500151 = fieldWeight in 7, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example the addition of a new chapter on link-structure algorithms used in search engines such as Google. The chapter on user interface has been rewritten to specifically focus on search engine usability. In addition the authors have added new recommendations for further reading and expanded the bibliography, and have updated and streamlined the index to make it more reader friendly.

Content

Contents: Introduction: Document File Preparation - Manual Indexing - Information Extraction - Vector Space Modeling - Matrix Decompositions - Query Representations - Ranking and Relevance Feedback - Searching by Link Structure - User Interface - Book Format / Document File Preparation: Document Purification and Analysis - Text Formatting - Validation - Manual Indexing - Automatic Indexing - Item Normalization - Inverted File Structures - Document File - Dictionary List - Inversion List - Other File Structures / Vector Space Models: Construction - Term-by-Document Matrices - Simple Query Matching - Design Issues - Term Weighting - Sparse Matrix Storage - Low-Rank Approximations / Matrix Decompositions: QR Factorization - Singular Value Decomposition - Low-Rank Approximations - Query Matching - Software - Semidiscrete Decomposition - Updating Techniques / Query Management: Query Binding - Types of Queries - Boolean Queries - Natural Language Queries - Thesaurus Queries - Fuzzy Queries - Term Searches - Probabilistic Queries / Ranking and Relevance Feedback: Performance Evaluation - Precision - Recall - Average Precision - Genetic Algorithms - Relevance Feedback / Searching by Link Structure: HITS Method - HITS Implementation - HITS Summary - PageRank Method - PageRank Adjustments - PageRank Implementation - PageRank Summary / User Interface Considerations: General Guidelines - Search Engine Interfaces - Form Fill-in - Display Considerations - Progress Indication - No Penalties for Error - Results - Test and Retest - Final Considerations / Further Reading

LCSH

Web search engines

Subject

Web search engines

Belew, R.K.: Finding out about : a cognitive perspective on search engine technology and the WWW (2001) 0.10
```
0.097439155 = product of:
  0.14615873 = sum of:
    0.075914174 = weight(_text_:search in 3346) [ClassicSimilarity], result of:
      0.075914174 = score(doc=3346,freq=16.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.43445963 = fieldWeight in 3346, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=3346)
    0.07024455 = product of:
      0.1404891 = sum of:
        0.1404891 = weight(_text_:engines in 3346) [ClassicSimilarity], result of:
          0.1404891 = score(doc=3346,freq=12.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.5500151 = fieldWeight in 3346, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=3346)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The World Wide Web is rapidly filling with more text than anyone could have imagined even a short time ago, but the task of isolating relevant parts of this vast information has become just that much more daunting. Richard Belew brings a cognitive perspective to the study of information retrieval as a discipline within computer science. He introduces the idea of Finding Out About (FOA) as the process of actively seeking out information relevant to a topic of interest and describes its many facets - ranging from creating a good characterization of what the user seeks, to what documents actually mean, to methods of inferring semantic clues about each document, to the problem of evaluating whether our search engines are performing as we have intended. Finding Out About explains how to build the tools that are useful for searching collections of text and other media. In the process it takes a close look at the properties of textual documents that do not become clear until very large collections of them are brought together and shows that the construction of effective search engines requires knowledge of the statistical and mathematical properties of linguistic phenomena, as well as an appreciation for the cognitive foundation we bring to the task as language users. The unique approach of this book is its even handling of the phenomena of both numbers and words, making it accessible to a wide audience. The textbook is usable in both undergraduate and graduate classes on information retrieval, library science, and computational linguistics. The text is accompanied by a CD-ROM that contains a hypertext version of the book, including additional topics and notes not present in the printed edition. In addition, the CD contains the full text of C.J. "Keith" van Rijsbergen's famous textbook, Information Retrieval (now out of print). Many active links from Belew's to van Rijsbergen's hypertexts help to unite the material.
Several test corpora and indexing tools are provided, to support the design of your own search engine. Additional exercises using these corpora and code are available to instructors. Also supporting this book is a Web site that will include recent additions to the book, as well as links to sites of new topics and methods.

LCSH

Search engines / Programming
Web search engines

Subject

Search engines / Programming
Web search engines

Tunkelang, D.: Faceted search (2009) 0.09
```
0.08902081 = product of:
  0.13353121 = sum of:
    0.0929755 = weight(_text_:search in 26) [ClassicSimilarity], result of:
      0.0929755 = score(doc=26,freq=24.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.5321022 = fieldWeight in 26, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=26)
    0.04055571 = product of:
      0.08111142 = sum of:
        0.08111142 = weight(_text_:engines in 26) [ClassicSimilarity], result of:
          0.08111142 = score(doc=26,freq=4.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.31755137 = fieldWeight in 26, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=26)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

We live in an information age that requires us, more than ever, to represent, access, and use information. Over the last several decades, we have developed a modern science and technology for information retrieval, relentlessly pursuing the vision of a "memex" that Vannevar Bush proposed in his seminal article, "As We May Think." Faceted search plays a key role in this program. Faceted search addresses weaknesses of conventional search approaches and has emerged as a foundation for interactive information retrieval. User studies demonstrate that faceted search provides more effective information-seeking support to users than best-first search. Indeed, faceted search has become increasingly prevalent in online information access systems, particularly for e-commerce and site search. In this lecture, we explore the history, theory, and practice of faceted search. Although we cannot hope to be exhaustive, our aim is to provide sufficient depth and breadth to offer a useful resource to both researchers and practitioners. Because faceted search is an area of interest to computer scientists, information scientists, interface designers, and usability researchers, we do not assume that the reader is a specialist in any of these fields. Rather, we offer a self-contained treatment of the topic, with an extensive bibliography for those who would like to pursue particular aspects in more depth.

LCSH

Web search engines / Research

Subject

Web search engines / Research

Multimedia content and the Semantic Web : methods, standards, and tools (2005) 0.07
```
0.0659266 = product of:
  0.09888989 = sum of:
    0.03354964 = weight(_text_:search in 150) [ClassicSimilarity], result of:
      0.03354964 = score(doc=150,freq=8.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19200584 = fieldWeight in 150, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.01953125 = fieldNorm(doc=150)
    0.06534025 = sum of:
      0.03584652 = weight(_text_:engines in 150) [ClassicSimilarity], result of:
        0.03584652 = score(doc=150,freq=2.0), product of:
          0.25542772 = queryWeight, product of:
            5.080822 = idf(docFreq=746, maxDocs=44218)
            0.05027291 = queryNorm
          0.1403392 = fieldWeight in 150, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.080822 = idf(docFreq=746, maxDocs=44218)
            0.01953125 = fieldNorm(doc=150)
      0.029493734 = weight(_text_:22 in 150) [ClassicSimilarity], result of:
        0.029493734 = score(doc=150,freq=6.0), product of:
          0.17604718 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05027291 = queryNorm
          0.16753313 = fieldWeight in 150, product of:
            2.4494898 = tf(freq=6.0), with freq of:
              6.0 = termFreq=6.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.01953125 = fieldNorm(doc=150)
  0.6666667 = coord(2/3)
```
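
This tree has a different shape from the others on this page: the nested clause is a plain sum over two matching terms (`engines` and `22`) with no `coord(1/2)` factor, presumably because both of its sub-clauses matched. The sketch below only re-does the arithmetic with the idf, queryNorm and fieldNorm values copied from the tree; `term_score` is a hypothetical helper, and `tf = sqrt(freq)` is taken from the tf lines in the output.

```python
import math

# Values copied from the explain output for doc=150.
QUERY_NORM = 0.05027291
FIELD_NORM = 0.01953125


def term_score(freq: float, idf: float) -> float:
    """queryWeight * fieldWeight for one term, with tf = sqrt(freq)."""
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


search = term_score(freq=8.0, idf=3.475677)
engines = term_score(freq=2.0, idf=5.080822)
term_22 = term_score(freq=6.0, idf=3.5018296)

# Inner clause: both terms matched, so their scores are simply added.
# Outer query: 2 of 3 top-level clauses matched, hence coord(2/3).
total = (search + (engines + term_22)) * (2.0 / 3.0)

print(f"search={search:.8f} engines={engines:.8f} 22={term_22:.8f} total={total:.8f}")
```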
Classification

006.7 22

Date

7. 3.2007 19:30:22

DDC

006.7 22

Footnote

Review in: JASIST 58(2007) no.3, pp. 457-458 (A.M.A. Ahmad): "The concept of the semantic web has emerged because search engines and text-based searching are no longer adequate, as these approaches involve an extensive information retrieval process. The deployed searching and retrieving descriptors are naturally subjective and their deployment is often restricted to the specific application domain for which the descriptors were configured. The new era of information technology imposes different kinds of requirements and challenges. Automatic extracted audiovisual features are required, as these features are more objective, domain-independent, and more native to audiovisual content. This book is a useful guide for researchers, experts, students, and practitioners; it is a very valuable reference and can lead them through their exploration and research in multimedia content and the semantic web. The book is well organized, and introduces the concept of the semantic web and multimedia content analysis to the reader through a logical sequence from standards and hypotheses through system examples, presenting relevant tools and methods. But in some chapters readers will need a good technical background to understand some of the details. Readers may attain sufficient knowledge here to start projects or research related to the book's theme; recent results and articles related to the active research area of integrating multimedia with semantic web technologies are included. This book includes full descriptions of approaches to specific problem domains such as content search, indexing, and retrieval. This book will be very useful to researchers in the multimedia content analysis field who wish to explore the benefits of emerging semantic web technologies in applying multimedia content approaches. The first part of the book covers the definition of the two basic terms multimedia content and semantic web.
The Moving Picture Experts Group standards MPEG7 and MPEG21 are quoted extensively. In addition, the means of multimedia content description are elaborated upon and schematically drawn. This extensive description is introduced by authors who are actively involved in those standards and have been participating in the work of the International Organization for Standardization (ISO)/MPEG for many years. On the other hand, this results in bias against the ad hoc or nonstandard tools for multimedia description in favor of the standard approaches. This is a general book for multimedia content; more emphasis on the general multimedia description and extraction could be provided.
The final part of the book discusses research in multimedia content management systems and the semantic web, and presents examples and applications for semantic multimedia analysis in search and retrieval systems. These chapters describe example systems in which current projects have been implemented, and include extensive results and real demonstrations. For example, real case scenarios such as ECommerce medical applications and Web services have been introduced. Topics in natural language, speech and image processing techniques and their application for multimedia indexing, and content-based retrieval have been elaborated upon with extensive examples and deployment methods. The editors of the book themselves provide the readers with a chapter about their latest research results on knowledge-based multimedia content indexing and retrieval. Some interesting applications for multimedia content and the semantic web are introduced. Applications that have taken advantage of the metadata provided by MPEG7 in order to realize advance-access services for multimedia content have been provided. The applications discussed in the third part of the book provide useful guidance to researchers and practitioners properly planning to implement semantic multimedia analysis techniques in new research and development projects in both academia and industry. A fourth part should be added to this book: performance measurements for integrated approaches of multimedia analysis and the semantic web. Performance of the semantic approach is a very sophisticated issue and requires extensive elaboration and effort. Measuring the semantic search is an ongoing research area; several chapters concerning performance measurement and analysis would be required to adequately cover this area and introduce it to readers."

Calishain, T.; Dornfest, R.; Adam, D.J.: Google Pocket Guide (2003) 0.07

```
0.06542733 = product of:
  0.098141 = sum of:
    0.04744636 = weight(_text_:search in 6) [ClassicSimilarity], result of:
      0.04744636 = score(doc=6,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.27153727 = fieldWeight in 6, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6)
    0.05069464 = product of:
      0.10138928 = sum of:
        0.10138928 = weight(_text_:engines in 6) [ClassicSimilarity], result of:
          0.10138928 = score(doc=6,freq=4.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.39693922 = fieldWeight in 6, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```

LCSH: Web search engines / Handbooks, manuals, etc.
Subject: Web search engines / Handbooks, manuals, etc.

Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.06
```
0.06418283 = product of:
  0.09627424 = sum of:
    0.05325841 = weight(_text_:search in 6) [ClassicSimilarity], result of:
      0.05325841 = score(doc=6,freq=14.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.30479985 = fieldWeight in 6, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0234375 = fieldNorm(doc=6)
    0.043015826 = product of:
      0.08603165 = sum of:
        0.08603165 = weight(_text_:engines in 6) [ClassicSimilarity], result of:
          0.08603165 = score(doc=6,freq=8.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.33681408 = fieldWeight in 6, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Why doesn't your home page appear on the first page of search results, even when you query your own name? How do other Web pages always appear at the top? What creates these powerful rankings? And how? The first book ever about the science of Web page rankings, "Google's PageRank and Beyond" supplies the answers to these and other questions and more. The book serves two very different audiences: the curious science reader and the technical computational reader. The chapters build in mathematical sophistication, so that the first five are accessible to the general academic reader. While other chapters are much more mathematical in nature, each one contains something for both audiences. For example, the authors include entertaining asides such as how search engines make money and how the Great Firewall of China influences research. The book includes an extensive background chapter designed to help readers learn more about the mathematics of search engines, and it contains several MATLAB codes and links to sample Web data sets. The philosophy throughout is to encourage readers to experiment with the ideas and algorithms in the text. Any business seriously interested in improving its rankings in the major search engines can benefit from the clear examples, sample code, and list of resources provided. It includes: many illustrative examples and entertaining asides; MATLAB code; accessible and informal style; and complete and self-contained section for mathematics review.

Content

Contents: Chapter 1. Introduction to Web Search Engines: 1.1 A Short History of Information Retrieval - 1.2 An Overview of Traditional Information Retrieval - 1.3 Web Information Retrieval Chapter 2. Crawling, Indexing, and Query Processing: 2.1 Crawling - 2.2 The Content Index - 2.3 Query Processing Chapter 3. Ranking Webpages by Popularity: 3.1 The Scene in 1998 - 3.2 Two Theses - 3.3 Query-Independence Chapter 4. The Mathematics of Google's PageRank: 4.1 The Original Summation Formula for PageRank - 4.2 Matrix Representation of the Summation Equations - 4.3 Problems with the Iterative Process - 4.4 A Little Markov Chain Theory - 4.5 Early Adjustments to the Basic Model - 4.6 Computation of the PageRank Vector - 4.7 Theorem and Proof for Spectrum of the Google Matrix Chapter 5. Parameters in the PageRank Model: 5.1 The alpha Factor - 5.2 The Hyperlink Matrix H - 5.3 The Teleportation Matrix E Chapter 6. The Sensitivity of PageRank: 6.1 Sensitivity with respect to alpha - 6.2 Sensitivity with respect to H - 6.3 Sensitivity with respect to vT - 6.4 Other Analyses of Sensitivity - 6.5 Sensitivity Theorems and Proofs Chapter 7. The PageRank Problem as a Linear System: 7.1 Properties of (I - alphaS) - 7.2 Properties of (I - alphaH) - 7.3 Proof of the PageRank Sparse Linear System Chapter 8. Issues in Large-Scale Implementation of PageRank: 8.1 Storage Issues - 8.2 Convergence Criterion - 8.3 Accuracy - 8.4 Dangling Nodes - 8.5 Back Button Modeling
Chapter 9. Accelerating the Computation of PageRank: 9.1 An Adaptive Power Method - 9.2 Extrapolation - 9.3 Aggregation - 9.4 Other Numerical Methods Chapter 10. Updating the PageRank Vector: 10.1 The Two Updating Problems and their History - 10.2 Restarting the Power Method - 10.3 Approximate Updating Using Approximate Aggregation - 10.4 Exact Aggregation - 10.5 Exact vs. Approximate Aggregation - 10.6 Updating with Iterative Aggregation - 10.7 Determining the Partition - 10.8 Conclusions Chapter 11. The HITS Method for Ranking Webpages: 11.1 The HITS Algorithm - 11.2 HITS Implementation - 11.3 HITS Convergence - 11.4 HITS Example - 11.5 Strengths and Weaknesses of HITS - 11.6 HITS's Relationship to Bibliometrics - 11.7 Query-Independent HITS - 11.8 Accelerating HITS - 11.9 HITS Sensitivity Chapter 12. Other Link Methods for Ranking Webpages: 12.1 SALSA - 12.2 Hybrid Ranking Methods - 12.3 Rankings based on Traffic Flow Chapter 13. The Future of Web Information Retrieval: 13.1 Spam - 13.2 Personalization - 13.3 Clustering - 13.4 Intelligent Agents - 13.5 Trends and Time-Sensitive Search - 13.6 Privacy and Censorship - 13.7 Library Classification Schemes - 13.8 Data Fusion Chapter 14. Resources for Web Information Retrieval: 14.1 Resources for Getting Started - 14.2 Resources for Serious Study Chapter 15. The Mathematics Guide: 15.1 Linear Algebra - 15.2 Perron-Frobenius Theory - 15.3 Markov Chains - 15.4 Perron Complementation - 15.5 Stochastic Complementation - 15.6 Censoring - 15.7 Aggregation - 15.8 Disaggregation

Franke, F.; Klein, A.; Schüller-Zwierlein, A.: Schlüsselkompetenzen : Literatur recherchieren in Bibliotheken und Internet (2010) 0.05

0.05234187 = product of:
  0.0785128 = sum of:
    0.037957087 = weight(_text_:search in 4721) [ClassicSimilarity], result of:
      0.037957087 = score(doc=4721,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.21722981 = fieldWeight in 4721, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=4721)
    0.04055571 = product of:
      0.08111142 = sum of:
        0.08111142 = weight(_text_:engines in 4721) [ClassicSimilarity], result of:
          0.08111142 = score(doc=4721,freq=4.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.31755137 = fieldWeight in 4721, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=4721)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

LCSH: Web search engines
Subject: Web search engines

Research and advanced technology for digital libraries : 9th European conference, ECDL 2005, Vienna, Austria, September 18 - 23, 2005 ; proceedings (2005) 0.04

0.044422872 = product of:
  0.066634305 = sum of:
    0.037957087 = weight(_text_:search in 2423) [ClassicSimilarity], result of:
      0.037957087 = score(doc=2423,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.21722981 = fieldWeight in 2423, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=2423)
    0.028677218 = product of:
      0.057354435 = sum of:
        0.057354435 = weight(_text_:engines in 2423) [ClassicSimilarity], result of:
          0.057354435 = score(doc=2423,freq=2.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.22454272 = fieldWeight in 2423, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=2423)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Content: Includes, among others: - Digital Library Models and Architectures - Multimedia and Hypermedia Digital Libraries - XML - Building Digital Libraries - User Studies - Digital Preservation - Metadata - Digital Libraries and e-Learning - Text Classification in Digital Libraries - Searching - - Focused Crawling Using Latent Semantic Indexing - An Application for Vertical Search Engines / George Almpanidis, Constantine Kotropoulos, Ioannis Pitas - - Active Support for Query Formulation in Virtual Digital Libraries: A Case Study with DAFFODIL / Andre Schaefer, Matthias Jordan, Claus-Peter Klas, Norbert Fuhr - - Expression of Z39.50 Supported Search Capabilities by Applying Formal Descriptions / Michalis Sfakakis, Sarantos Kapidakis - Text Digital Libraries

Innovations in information retrieval : perspectives for theory and practice (2011) 0.04
```
0.044422872 = product of:
  0.066634305 = sum of:
    0.037957087 = weight(_text_:search in 1757) [ClassicSimilarity], result of:
      0.037957087 = score(doc=1757,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.21722981 = fieldWeight in 1757, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=1757)
    0.028677218 = product of:
      0.057354435 = sum of:
        0.057354435 = weight(_text_:engines in 1757) [ClassicSimilarity], result of:
          0.057354435 = score(doc=1757,freq=2.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.22454272 = fieldWeight in 1757, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=1757)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
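The explanation tree above is plain arithmetic, so it can be checked by hand. The sketch below recomputes the score for doc 1757 from the values shown in the tree, assuming the usual ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The names `idf`, `term_score`, and `score_doc_1757` are illustrative only and are not part of Lucene; small differences in the last digits are expected because Lucene computes in 32-bit floats.

```python
import math

MAX_DOCS = 44218          # maxDocs from the explanation
QUERY_NORM = 0.05027291   # queryNorm from the explanation


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Hypothetical helper: queryWeight * fieldWeight for one term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM                   # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def score_doc_1757() -> float:
    """Rebuild the tree: (search + engines * coord(1/2)) * coord(2/3)."""
    search = term_score(freq=4.0, doc_freq=3718, field_norm=0.03125)
    engines = term_score(freq=2.0, doc_freq=746, field_norm=0.03125)
    return (search + engines * 0.5) * (2.0 / 3.0)


print(round(score_doc_1757(), 6))  # close to the listed 0.044422872
```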
Abstract

The advent of new information retrieval (IR) technologies and approaches to storage and retrieval provides communities with previously unheard-of opportunities for mass documentation, digitization, and the recording of information in all its forms. This book introduces and contextualizes these developments and looks at supporting research in IR, the debates, theories and issues. Contributed by an international team of experts, each authored chapter provides a snapshot of changes in the field, as well as the importance of developing innovation, creativity and thinking in IR practice and research. Key discussion areas include: browsing in new information environments; classification revisited: a web of knowledge; approaches to fiction retrieval research; music information retrieval research; folksonomies, social tagging and information retrieval; digital information interaction as semantic navigation; assessing web search machines: a webometric approach. The questions raised are of significance to the whole international library and information science community, and this is essential reading for LIS professionals, researchers and students, and for all those interested in the future of IR.

Content

Contents: Bawden, D.: Encountering on the road to serendip? Browsing in new information environments. - Slavic, A.: Classification revisited: a web of knowledge. - Vernitski, A. and P. Rafferty: Approaches to fiction retrieval research, from theory to practice? - Inskip, C.: Music information retrieval research. - Peters, I.: Folksonomies, social tagging and information retrieval. - Kopak, R., L. Freund and H. O'Brien: Digital information interaction as semantic navigation. - Thelwall, M.: Assessing web search engines: a webometric approach

Spinning the Semantic Web : bringing the World Wide Web to its full potential (2003) 0.04
```
0.03887001 = product of:
  0.058305014 = sum of:
    0.03321245 = weight(_text_:search in 1981) [ClassicSimilarity], result of:
      0.03321245 = score(doc=1981,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19007608 = fieldWeight in 1981, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1981)
    0.025092565 = product of:
      0.05018513 = sum of:
        0.05018513 = weight(_text_:engines in 1981) [ClassicSimilarity], result of:
          0.05018513 = score(doc=1981,freq=2.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.19647488 = fieldWeight in 1981, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1981)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

As the World Wide Web continues to expand, it becomes increasingly difficult for users to obtain information efficiently. Because most search engines read format languages such as HTML or SGML, search results reflect formatting tags more than actual page content, which is expressed in natural language. Spinning the Semantic Web describes an exciting new type of hierarchy and standardization that will replace the current "Web of links" with a "Web of meaning." Using a flexible set of languages and tools, the Semantic Web will make all available information - display elements, metadata, services, images, and especially content - accessible. The result will be an immense repository of information accessible for a wide range of new applications. This first handbook for the Semantic Web covers, among other topics, software agents that can negotiate and collect information, markup languages that can tag many more types of information in a document, and knowledge systems that enable machines to read Web pages and determine their reliability. The truly interdisciplinary Semantic Web combines aspects of artificial intelligence, markup languages, natural language processing, information retrieval, knowledge representation, intelligent agents, and databases.
TREC: experiment and evaluation in information retrieval (2005) 0.03
```
0.027764294 = product of:
  0.04164644 = sum of:
    0.02372318 = weight(_text_:search in 636) [ClassicSimilarity], result of:
      0.02372318 = score(doc=636,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.13576864 = fieldWeight in 636, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.01953125 = fieldNorm(doc=636)
    0.01792326 = product of:
      0.03584652 = sum of:
        0.03584652 = weight(_text_:engines in 636) [ClassicSimilarity], result of:
          0.03584652 = score(doc=636,freq=2.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.1403392 = fieldWeight in 636, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.01953125 = fieldNorm(doc=636)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

The Text REtrieval Conference (TREC), a yearly workshop hosted by the US government's National Institute of Standards and Technology, provides the infrastructure necessary for large-scale evaluation of text retrieval methodologies. With the goal of accelerating research in this area, TREC created the first large test collections of full-text documents and standardized retrieval evaluation. The impact has been significant; since TREC's beginning in 1992, retrieval effectiveness has approximately doubled. TREC has built a variety of large test collections, including collections for such specialized retrieval tasks as cross-language retrieval and retrieval of speech. Moreover, TREC has accelerated the transfer of research ideas into commercial systems, as demonstrated in the number of retrieval techniques developed in TREC that are now used in Web search engines. This book provides a comprehensive review of TREC research, summarizing the variety of TREC results, documenting the best practices in experimental information retrieval, and suggesting areas for further research. The first part of the book describes TREC's history, test collections, and retrieval methodology. Next, the book provides "track" reports -- describing the evaluations of specific tasks, including routing and filtering, interactive retrieval, and retrieving noisy text. The final part of the book offers perspectives on TREC from such participants as Microsoft Research, University of Massachusetts, Cornell University, University of Waterloo, City University of New York, and IBM. The book will be of interest to researchers in information retrieval and related technologies, including natural language processing.

Content

Contains the contributions: 1. The Text REtrieval Conference - Ellen M. Voorhees and Donna K. Harman 2. The TREC Test Collections - Donna K. Harman 3. Retrieval System Evaluation - Chris Buckley and Ellen M. Voorhees 4. The TREC Ad Hoc Experiments - Donna K. Harman 5. Routing and Filtering - Stephen Robertson and Jamie Callan 6. The TREC Interactive Tracks: Putting the User into Search - Susan T. Dumais and Nicholas J. Belkin 7. Beyond English - Donna K. Harman 8. Retrieving Noisy Text - Ellen M. Voorhees and John S. Garofolo 9. The Very Large Collection and Web Tracks - David Hawking and Nick Craswell 10. Question Answering in TREC - Ellen M. Voorhees 11. The University of Massachusetts and a Dozen TRECs - James Allan, W. Bruce Croft and Jamie Callan 12. How Okapi Came to TREC - Stephen Robertson 13. The SMART Project at TREC - Chris Buckley 14. Ten Years of Ad Hoc Retrieval at TREC Using PIRCS - Kui-Lam Kwok 15. MultiText Experiments for TREC - Gordon V. Cormack, Charles L. A. Clarke, Christopher R. Palmer and Thomas R. Lynam 16. A Language-Modeling Approach to TREC - Djoerd Hiemstra and Wessel Kraaij 17. IBM Research Activities at TREC - Eric W. Brown, David Carmel, Martin Franz, Abraham Ittycheriah, Tapas Kanungo, Yoelle Maarek, J. Scott McCarley, Robert L. Mack, John M. Prager, John R. Smith, Aya Soffer, Jason Y. Zien and Alan D. Marwick Epilogue: Metareflections on TREC - Karen Sparck Jones

Research and advanced technology for digital libraries : 10th European conference, ECDL 2006, Alicante, Spain, September 17 - 22, 2006 ; proceedings (2006) 0.03

0.026974857 = product of:
  0.040462285 = sum of:
    0.026839713 = weight(_text_:search in 2428) [ClassicSimilarity], result of:
      0.026839713 = score(doc=2428,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.15360467 = fieldWeight in 2428, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=2428)
    0.013622572 = product of:
      0.027245143 = sum of:
        0.027245143 = weight(_text_:22 in 2428) [ClassicSimilarity], result of:
          0.027245143 = score(doc=2428,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.15476047 = fieldWeight in 2428, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2428)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

Content: Includes, among others: Architectures I Preservation Retrieval - The Use of Summaries in XML Retrieval / Zoltán Szlávik, Anastasios Tombros, Mounia Lalmas - An Enhanced Search Interface for Information Discovery from Digital Libraries / Georgia Koutrika, Alkis Simitsis - The TIP/Greenstone Bridge: A Service for Mobile Location-Based Access to Digital Libraries / Annika Hinze, Xin Gao, David Bainbridge Architectures II Applications Methodology Metadata Evaluation User Studies Modeling Audiovisual Content Language Technologies - Incorporating Cross-Document Relationships Between Sentences for Single Document Summarizations / Xiaojun Wan, Jianwu Yang, Jianguo Xiao - Semantic Web Techniques for Multiple Views on Heterogeneous Collections: A Case Study / Marjolein van Gendt, Antoine Isaac, Lourens van der Meij, Stefan Schlobach Posters - A Tool for Converting from MARC to FRBR / Trond Aalberg, Frank Berg Haugen, Ole Husby

Jeanneney, J.-N.: Googles Herausforderung : Für eine europäische Bibliothek (2006) 0.03

0.026170935 = product of:
  0.0392564 = sum of:
    0.018978544 = weight(_text_:search in 46) [ClassicSimilarity], result of:
      0.018978544 = score(doc=46,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.10861491 = fieldWeight in 46, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.015625 = fieldNorm(doc=46)
    0.020277856 = product of:
      0.04055571 = sum of:
        0.04055571 = weight(_text_:engines in 46) [ClassicSimilarity], result of:
          0.04055571 = score(doc=46,freq=4.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.15877569 = fieldWeight in 46, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.015625 = fieldNorm(doc=46)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)

LCSH: Web search engines
Subject: Web search engines

Grossman, D.A.; Frieder, O.: Information retrieval : algorithms and heuristics (2004) 0.02
```
0.017893143 = product of:
  0.053679425 = sum of:
    0.053679425 = weight(_text_:search in 1486) [ClassicSimilarity], result of:
      0.053679425 = score(doc=1486,freq=8.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.30720934 = fieldWeight in 1486, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=1486)
  0.33333334 = coord(1/3)
```
Abstract

Interested in how an efficient search engine works? Want to know what algorithms are used to rank resulting documents in response to user requests? The authors answer these and other key information retrieval design and implementation questions. This book is not yet another high-level text. Instead, algorithms are thoroughly described, making this book ideally suited for both computer science students and practitioners who work on search-related applications. As stated in the foreword, this book provides a current, broad, and detailed overview of the field and is the only one that does so. Examples are used throughout to illustrate the algorithms. The authors explain how a query is ranked against a document collection using either a single or a combination of retrieval strategies, and how an assortment of utilities are integrated into the query processing scheme to improve these rankings. Methods for building and compressing text indexes, querying and retrieving documents in multiple languages, and using parallel or distributed processing to expedite the search are likewise described. This edition is a major expansion of the one published in 1998. New edition 2005: Besides updating the entire book with current techniques, it includes new sections on language models, cross-language information retrieval, peer-to-peer processing, XML search, mediators, and duplicate document detection.

Proceedings of the Second ACM/IEEE-CS Joint Conference on Digital Libraries : July 14 - 18, 2002, Portland, Oregon, USA. (2002) 0.01
```
0.012652363 = product of:
  0.037957087 = sum of:
    0.037957087 = weight(_text_:search in 172) [ClassicSimilarity], result of:
      0.037957087 = score(doc=172,freq=16.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.21722981 = fieldWeight in 172, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.015625 = fieldNorm(doc=172)
  0.33333334 = coord(1/3)
```
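The two single-term explanations above show a length-normalization effect: document 172 contains "search" twice as often as document 1486 (freq 16 versus 8) yet contributes less, because its fieldNorm is half as large. The sketch below illustrates that trade-off. It assumes ClassicSimilarity's length normalization of 1/sqrt(number of terms in the field), no index-time boost, and it ignores the lossy one-byte encoding of the norm, so the implied field lengths are only rough estimates. Both helper names are hypothetical.

```python
import math


def tf_times_norm(freq: float, field_norm: float) -> float:
    """Document-dependent part of fieldWeight: sqrt(freq) * fieldNorm."""
    return math.sqrt(freq) * field_norm


def approx_field_length(field_norm: float) -> int:
    """Rough term count implied by fieldNorm, assuming norm = 1/sqrt(length)."""
    return round(1.0 / field_norm ** 2)


doc_1486 = tf_times_norm(8.0, 0.03125)    # shorter field, fewer matches
doc_172 = tf_times_norm(16.0, 0.015625)   # longer field, more matches

# The shorter document wins despite having half as many occurrences.
print(doc_1486 > doc_172)

# Implied field lengths under the stated assumptions (estimates only).
print(approx_field_length(0.03125), approx_field_length(0.015625))
```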
Content

Contents: SESSION: Building and using cultural digital libraries Primarily history: historians and the search for primary source materials (Helen R. Tibbo) - Using the Gamera framework for the recognition of cultural heritage materials (Michael Droettboom, Ichiro Fujinaga, Karl MacMillan, G. Sayeed Choudhury, Tim DiLauro, Mark Patton, Teal Anderson) - Supporting access to large digital oral history archives (Samuel Gustman, Dagobert Soergel, Douglas Oard, William Byrne, Michael Picheny, Bhuvana Ramabhadran, Douglas Greenberg) SESSION: Summarization and question answering Using sentence-selection heuristics to rank text segments in TXTRACTOR (Daniel McDonald, Hsinchun Chen) - Using librarian techniques in automatic text summarization for information retrieval (Min-Yen Kan, Judith L. Klavans) - QuASM: a system for question answering using semi-structured data (David Pinto, Michael Branstein, Ryan Coleman, W. Bruce Croft, Matthew King, Wei Li, Xing Wei) SESSION: Studying users Reading-in-the-small: a study of reading on small form factor devices (Catherine C. Marshall, Christine Ruotolo) - A graph-based recommender system for digital library (Zan Huang, Wingyan Chung, Thian-Huat Ong, Hsinchun Chen) - The effects of topic familiarity on information search behavior (Diane Kelly, Colleen Cool) SESSION: Classification and browsing A language modelling approach to relevance profiling for document browsing (David J. Harper, Sara Coulthard, Sun Yixing) - Compound descriptors in context: a matching function for classifications and thesauri (Douglas Tudhope, Ceri Binding, Dorothee Blocks, Daniel Cunliffe) - Structuring keyword-based queries for web databases (Rodrigo C. Vieira, Pavel Calado, Altigran S. da Silva, Alberto H. F. Laender, Berthier A. Ribeiro-Neto) - An approach to automatic classification of text for information retrieval (Hong Cui, P. Bryan Heidorn, Hong Zhang)
SESSION: Digital libraries for education Middle school children's use of the ARTEMIS digital library (June Abbas, Cathleen Norris, Elliott Soloway) - Partnership reviewing: a cooperative approach for peer review of complex educational resources (John Weatherley, Tamara Sumner, Michael Khoo, Michael Wright, Marcel Hoffmann) - A digital library for geography examination resources (Lian-Heong Chua, Dion Hoe-Lian Goh, Ee-Peng Lim, Zehua Liu, Rebecca Pei-Hui Ang) - Digital library services for authors of learning materials (Flora McMartin, Youki Terada) SESSION: Novel search environments Integration of simultaneous searching and reference linking across bibliographic resources on the web (William H. Mischo, Thomas G. Habing, Timothy W. Cole) - Exploring discussion lists: steps and directions (Paula S. Newman) - Comparison of two approaches to building a vertical search tool: a case study in the nanotechnology domain (Michael Chau, Hsinchun Chen, Jialun Qin, Yilu Zhou, Yi Qin, Wai-Ki Sung, Daniel McDonald) SESSION: Video and multimedia digital libraries A multilingual, multimodal digital video library system (Michael R. Lyu, Edward Yau, Sam Sze) - A digital library data model for music (Natalia Minibayeva, Jon W. Dunn) - Video-cuebik: adapting image search to video shots (Alexander G. Hauptmann, Norman D. Papernick) - Virtual multimedia libraries built from the web (Neil C. Rowe) - Multi-modal information retrieval from broadcast video using OCR and speech recognition (Alexander G. Hauptmann, Rong Jin, Tobun Dorbin Ng) SESSION: OAI application Extending SDARTS: extracting metadata from web databases and interfacing with the open archives initiative (Panagiotis G. Ipeirotis, Tom Barry, Luis Gravano) - Using the open archives initiative protocols with EAD (Christopher J. Prom, Thomas G. Habing) - Preservation and transition of NCSTRL using an OAI-based architecture (H. Anan, X. Liu, K. Maly, M. Nelson, M. Zubair, J. C. French, E. Fox, P. Shivakumar) - Integrating harvesting into digital library content (David A. Smith, Anne Mahoney, Gregory Crane) SESSION: Searching across language, time, and space Harvesting translingual vocabulary mappings for multilingual digital libraries (Ray R. Larson, Fredric Gey, Aitao Chen) - Detecting events with date and place information in unstructured text (David A. Smith) - Using sharable ontology to retrieve historical images (Von-Wun Soo, Chen-Yu Lee, Jaw Jium Yeh, Ching-chih Chen) - Towards an electronic variorum edition of Cervantes' Don Quixote: visualizations that support preparation (Rajiv Kochumman, Carlos Monroy, Richard Furuta, Arpita Goenka, Eduardo Urbina, Erendira Melgoza)
SESSION: Federating and harvesting metadata DP9: an OAI gateway service for web crawlers (Xiaoming Liu, Kurt Maly, Mohammad Zubair, Michael L. Nelson) - The Greenstone plugin architecture (Ian H. Witten, David Bainbridge, Gordon Paynter, Stefan Boddie) - Building FLOW: federating libraries on the web (Anna Keller Gold, Karen S. Baker, Jean-Yves LeMeur, Kim Baldridge) - JAFER ToolKit project: interfacing Z39.50 and XML (Antony Corfield, Matthew Dovey, Richard Mawby, Colin Tatham) - Schema extraction from XML collections (Boris Chidlovskii) - Mirroring an OAI archive on the I2-DSI channel (Ashwini Pande, Malini Kothapalli, Ryan Richardson, Edward A. Fox) SESSION: Music digital libraries HMM-based musical query retrieval (Jonah Shifrin, Bryan Pardo, Colin Meek, William Birmingham) - A comparison of melodic database retrieval techniques using sung queries (Ning Hu, Roger B. Dannenberg) - Enhancing access to the levy sheet music collection: reconstructing full-text lyrics from syllables (Brian Wingenroth, Mark Patton, Tim DiLauro) - Evaluating automatic melody segmentation aimed at music information retrieval (Massimo Melucci, Nicola Orio) SESSION: Preserving, securing, and assessing digital libraries A methodology and system for preserving digital data (Raymond A. Lorie) - Modeling web data (James C. French) - An evaluation model for a digital library services tool (Jim Dorward, Derek Reinke, Mimi Recker) - Why watermark?: the copyright need for an engineering solution (Michael Seadle, J. R. Deller, Jr., Aparna Gurijala) SESSION: Image and cultural digital libraries Time as essence for photo browsing through personal digital libraries (Adrian Graham, Hector Garcia-Molina, Andreas Paepcke, Terry Winograd) - Toward a distributed terabyte text retrieval system in China-US million book digital library (Bin Liu, Wen Gao, Ling Zhang, Tie-jun Huang, Xiao-ming Zhang, Jun Cheng) - Enhanced perspectives for historical and cultural documentaries using informedia technologies (Howard D. Wactlar, Ching-chih Chen) - Interfaces for palmtop image search (Mark Derthick)
SESSION: Digital libraries for spatial data The ADEPT digital library architecture (Greg Janée, James Frew) - G-Portal: a map-based digital library for distributed geospatial and georeferenced resources (Ee-Peng Lim, Dion Hoe-Lian Goh, Zehua Liu, Wee-Keong Ng, Christopher Soo-Guan Khoo, Susan Ellen Higgins) PANEL SESSION: Panels You mean I have to do what with whom: statewide museum/library DIGI collaborative digitization projects---the experiences of California, Colorado & North Carolina (Nancy Allen, Liz Bishoff, Robin Chandler, Kevin Cherry) - Overcoming impediments to effective health and biomedical digital libraries (William Hersh, Jan Velterop, Alexa McCray, Gunther Eysenbach, Mark Boguski) - The challenges of statistical digital libraries (Cathryn Dippo, Patricia Cruse, Ann Green, Carol Hert) - Biodiversity and biocomplexity informatics: policy and implementation science versus citizen science (P. Bryan Heidorn) - Panel on digital preservation (Joyce Ray, Robin Dale, Reagan Moore, Vicky Reich, William Underwood, Alexa T. McCray) - NSDL: from prototype to production to transformational national resource (William Y. Arms, Edward Fox, Jeanne Narum, Ellen Hoffman) - How important is metadata? (Hector Garcia-Molina, Diane Hillmann, Carl Lagoze, Elizabeth Liddy, Stuart Weibel) - Planning for future digital libraries programs (Stephen M. Griffin) DEMONSTRATION SESSION: Demonstrations (selection): FACET: thesaurus retrieval with semantic term expansion (Douglas Tudhope, Ceri Binding, Dorothee Blocks, Daniel Cunliffe) - MedTextus: an intelligent web-based medical meta-search system (Bin Zhu, Gondy Leroy, Hsinchun Chen, Yongchi Chen) POSTER SESSION: Posters TUTORIAL SESSION: Tutorials (selection): Thesauri and ontologies in digital libraries: 1. structure and use in knowledge-based assistance to users (Dagobert Soergel) - How to build a digital library using open-source software (Ian H. Witten) - Thesauri and ontologies in digital libraries: 2. design, evaluation, and development (Dagobert Soergel) WORKSHOP SESSION: Workshops Document search interface design for large-scale collections and intelligent access (Javed Mostafa) - Visual interfaces to digital libraries (Katy Börner, Chaomei Chen) - Text retrieval conference (TREC) genomics pre-track workshop (William Hersh)

Survey of text mining : clustering, classification, and retrieval (2004) 0.01
```
0.011183213 = product of:
  0.03354964 = sum of:
    0.03354964 = weight(_text_:search in 804) [ClassicSimilarity], result of:
      0.03354964 = score(doc=804,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19200584 = fieldWeight in 804, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=804)
  0.33333334 = coord(1/3)
```
Abstract

Extracting content from text continues to be an important research problem for information processing and management. Approaches to capture the semantics of text-based document collections may be based on Bayesian models, probability theory, vector space models, statistical models, or even graph theory. As the volume of digitized textual media continues to grow, so does the need for designing robust, scalable indexing and search strategies (software) to meet a variety of user needs. Knowledge extraction or creation from text requires systematic yet reliable processing that can be codified and adapted for changing needs and environments. This book will draw upon experts in both academia and industry to recommend practical approaches to the purification, indexing, and mining of textual information. It will address document identification, clustering and categorizing documents, cleaning text, and visualizing semantic models of text.
