Search (314 results, page 1 of 16)

Rasmussen, E.M.: Indexing and retrieval for the Web (2002) 0.05
```
0.045482628 = product of:
  0.14554441 = sum of:
    0.04767549 = weight(_text_:wide in 4285) [ClassicSimilarity], result of:
      0.04767549 = score(doc=4285,freq=8.0), product of:
        0.13912784 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.031400457 = queryNorm
        0.342674 = fieldWeight in 4285, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4285)
    0.048388597 = weight(_text_:web in 4285) [ClassicSimilarity], result of:
      0.048388597 = score(doc=4285,freq=28.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.47219574 = fieldWeight in 4285, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4285)
    0.0055920132 = product of:
      0.0111840265 = sum of:
        0.0111840265 = weight(_text_:online in 4285) [ClassicSimilarity], result of:
          0.0111840265 = score(doc=4285,freq=2.0), product of:
            0.09529729 = queryWeight, product of:
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.031400457 = queryNorm
            0.11735933 = fieldWeight in 4285, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
      0.5 = coord(1/2)
    0.0144925695 = weight(_text_:information in 4285) [ClassicSimilarity], result of:
      0.0144925695 = score(doc=4285,freq=30.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.2629142 = fieldWeight in 4285, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4285)
    0.02939574 = weight(_text_:retrieval in 4285) [ClassicSimilarity], result of:
      0.02939574 = score(doc=4285,freq=14.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.30948192 = fieldWeight in 4285, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4285)
  0.3125 = coord(5/16)
```
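The per-term lines in the explanation above follow the TF-IDF arithmetic of Lucene's ClassicSimilarity: tf is the square root of the term frequency, and idf is 1 + ln(maxDocs / (docFreq + 1)). As a rough sketch, the Python below recomputes the `wide` term score for doc 4285 from the numbers shown. `queryNorm` and `fieldNorm` are copied from the listing rather than derived, and the function names `idf` and `term_score` are illustrative only, not part of any library. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                          # tf(freq)
    term_idf = idf(doc_freq, max_docs)            # idf(docFreq, maxDocs)
    query_weight = term_idf * query_norm          # queryWeight
    field_weight = tf * term_idf * field_norm     # fieldWeight
    return query_weight * field_weight


# Values copied from the listing (assumed inputs, not derived here).
QUERY_NORM = 0.031400457
MAX_DOCS = 44218

# Term "wide" in doc 4285: freq=8, docFreq=1430, fieldNorm=0.02734375.
wide_score = term_score(8.0, 1430, MAX_DOCS, QUERY_NORM, 0.02734375)
print(wide_score)  # compare with 0.04767549 in the listing
```

Note that idf enters twice, once through the query weight and once through the field weight, which is why rare terms such as `wide` can outweigh much more frequent terms such as `information`.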
Abstract

The introduction and growth of the World Wide Web (WWW, or Web) have resulted in a profound change in the way individuals and organizations access information. In terms of volume, nature, and accessibility, the characteristics of electronic information are significantly different from those of even five or six years ago. Control of, and access to, this flood of information rely heavily on automated techniques for indexing and retrieval. According to Gudivada, Raghavan, Grosky, and Kasanagottu (1997, p. 58), "The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential." Almost 93 percent of those surveyed consider the Web an "indispensable" Internet technology, second only to e-mail (Graphic, Visualization & Usability Center, 1998). Although there are other ways of locating information on the Web (browsing or following directory structures), 85 percent of users identify Web pages by means of a search engine (Graphic, Visualization & Usability Center, 1998). A more recent study conducted by the Stanford Institute for the Quantitative Study of Society confirms the finding that searching for information is second only to e-mail as an Internet activity (Nie & Ebring, 2000, online). In fact, Nie and Ebring conclude, "... the Internet today is a giant public library with a decidedly commercial tilt. The most widespread use of the Internet today is as an information search utility for products, travel, hobbies, and general information. Virtually all users interviewed responded that they engaged in one or more of these information gathering activities."
Techniques for automated indexing and information retrieval (IR) have been developed, tested, and refined over the past 40 years, and are well documented (see, for example, Agosti & Smeaton, 1996; Baeza-Yates & Ribeiro-Neto, 1999a; Frakes & Baeza-Yates, 1992; Korfhage, 1997; Salton, 1989; Witten, Moffat, & Bell, 1999). With the introduction of the Web, and the capability to index and retrieve via search engines, these techniques have been extended to a new environment. They have been adopted, altered, and in some cases extended to include new methods. "In short, search engines are indispensable for searching the Web, they employ a variety of relatively advanced IR techniques, and there are some peculiar aspects of search engines that make searching the Web different than more conventional information retrieval" (Gordon & Pathak, 1999, p. 145). The environment for information retrieval on the World Wide Web differs from that of "conventional" information retrieval in a number of fundamental ways. The collection is very large and changes continuously, with pages being added, deleted, and altered. Wide variability between the size, structure, focus, quality, and usefulness of documents makes Web documents much more heterogeneous than a typical electronic document collection. The wide variety of document types includes images, video, audio, and scripts, as well as many different document languages. Duplication of documents and sites is common. Documents are interconnected through networks of hyperlinks. Because of the size and dynamic nature of the Web, preprocessing all documents requires considerable resources and is often not feasible, certainly not on the frequent basis required to ensure currency. Query length is usually much shorter than in other environments (only a few words), and user behavior differs from that in other environments.
These differences make the Web a novel environment for information retrieval (Baeza-Yates & Ribeiro-Neto, 1999b; Bharat & Henzinger, 1998; Huang, 2000).

Source

Annual review of information science and technology. 37(2003), S.91-126

Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.03

0.031523272 = product of:
  0.12609309 = sum of:
    0.06742332 = weight(_text_:wide in 7209) [ClassicSimilarity], result of:
      0.06742332 = score(doc=7209,freq=4.0), product of:
        0.13912784 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.031400457 = queryNorm
        0.4846142 = fieldWeight in 7209, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
    0.025864797 = weight(_text_:web in 7209) [ClassicSimilarity], result of:
      0.025864797 = score(doc=7209,freq=2.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.25239927 = fieldWeight in 7209, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
    0.010583877 = weight(_text_:information in 7209) [ClassicSimilarity], result of:
      0.010583877 = score(doc=7209,freq=4.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.1920054 = fieldWeight in 7209, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
    0.022221092 = weight(_text_:retrieval in 7209) [ClassicSimilarity], result of:
      0.022221092 = score(doc=7209,freq=2.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.23394634 = fieldWeight in 7209, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7209)
  0.25 = coord(4/16)
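
The top-level figure in the explanation above is the sum of the matching term weights scaled by the coordination factor, coord(overlap/maxOverlap). A minimal sketch in Python, using the four term weights listed for doc 7209; the helper name `combine` is hypothetical and the inputs are copied from the listing:

```python
import math


def combine(term_weights: list[float], matched: int, total: int) -> float:
    """Sum the per-term weights, then scale by coord(matched/total)."""
    coord = matched / total
    return sum(term_weights) * coord


# Weights for "wide", "web", "information", "retrieval" in doc 7209.
weights_7209 = [0.06742332, 0.025864797, 0.010583877, 0.022221092]

# Four of the sixteen query clauses matched, hence coord(4/16) = 0.25.
score_7209 = combine(weights_7209, matched=4, total=16)
print(score_7209)  # compare with 0.031523272 in the listing
```

Because of the coord factor, a document matching more of the sixteen clauses is rewarded even when its individual term weights are modest.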

Abstract: The Nordic WAIS/WWW project, sponsored by NORDINFO, is a joint project between Lund University Library and the National Technological Library of Denmark. It aims to improve the existing networked information discovery and retrieval tools, Wide Area Information Servers (WAIS) and the World Wide Web (WWW), and to move towards unifying WWW and WAIS. Details current results, focusing on the WAIS side of the project. Describes research into automatic indexing and classification of WAIS sources, development of an orientation tool for WAIS, and development of a WAIS index of WWW resources.

Hauer, M: Silicon Valley Vorarlberg : Maschinelle Indexierung und semantisches Retrieval verbessert den Katalog der Vorarlberger Landesbibliothek (2004) 0.03

0.0313765 = product of:
  0.125506 = sum of:
    0.031999387 = weight(_text_:web in 2489) [ClassicSimilarity], result of:
      0.031999387 = score(doc=2489,freq=6.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.3122631 = fieldWeight in 2489, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2489)
    0.056416538 = weight(_text_:benutzer in 2489) [ClassicSimilarity], result of:
      0.056416538 = score(doc=2489,freq=2.0), product of:
        0.17907447 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.031400457 = queryNorm
        0.31504512 = fieldWeight in 2489, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2489)
    0.005345665 = weight(_text_:information in 2489) [ClassicSimilarity], result of:
      0.005345665 = score(doc=2489,freq=2.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.09697737 = fieldWeight in 2489, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2489)
    0.031744417 = weight(_text_:retrieval in 2489) [ClassicSimilarity], result of:
      0.031744417 = score(doc=2489,freq=8.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.33420905 = fieldWeight in 2489, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2489)
  0.25 = coord(4/16)

Abstract: Ten years of the Internet have greatly changed the world around libraries. The Web OPAC was the libraries' answer. But is a Web OPAC still enough in the age of the Internet? Apart from the Web interface, it is still the old catalogue. About 90% of all library searches by users are subject searches. A share of these searches yields no result; it is easy to measure that zero items were found. The reasons have been examined repeatedly: plural instead of singular forms, overly specific search terms, and typing or operating errors. Too little studied, however, are the searches that do not end in a loan, because in many of those cases a retrieval deficiency can also be assumed. Finally, many librarians estimate that 80% of borrowed books are read no further than the table of contents (except in reference libraries) and are returned only weeks later. A politician would call this, in fashionable jargon, "a communication problem"; a financial controller would call it insufficient use of capital. More and more students and researchers take an easier route: their exchange of knowledge increasingly takes place elsewhere. Libraries (as a function) are indispensable for scholarly communication. The task is therefore to find, and also to follow, ways of bringing the treasures of libraries (as institutions) to their target group more efficiently. The use of information retrieval technology, new indexing methods, and new content are approaches to this. Yet the existing library network structures and dependencies have in no way fostered the innovative project presented here. As innovation research shows, innovation almost always arises at the periphery: it began in Bregenz.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Search Engines and Beyond : Developing efficient knowledge management systems, April 19-20 1999, Boston, Mass (1999) 0.03

0.030720102 = product of:
  0.098304324 = sum of:
    0.027243135 = weight(_text_:wide in 2596) [ClassicSimilarity], result of:
      0.027243135 = score(doc=2596,freq=2.0), product of:
        0.13912784 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.031400457 = queryNorm
        0.1958137 = fieldWeight in 2596, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.03125 = fieldNorm(doc=2596)
    0.0147798825 = weight(_text_:web in 2596) [ClassicSimilarity], result of:
      0.0147798825 = score(doc=2596,freq=2.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.14422815 = fieldWeight in 2596, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=2596)
    0.0060479296 = weight(_text_:information in 2596) [ClassicSimilarity], result of:
      0.0060479296 = score(doc=2596,freq=4.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.10971737 = fieldWeight in 2596, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=2596)
    0.02839307 = weight(_text_:retrieval in 2596) [ClassicSimilarity], result of:
      0.02839307 = score(doc=2596,freq=10.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.29892567 = fieldWeight in 2596, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=2596)
    0.021840302 = weight(_text_:software in 2596) [ClassicSimilarity], result of:
      0.021840302 = score(doc=2596,freq=2.0), product of:
        0.124570385 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.031400457 = queryNorm
        0.17532499 = fieldWeight in 2596, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.03125 = fieldNorm(doc=2596)
  0.3125 = coord(5/16)

Content: Ramana Rao (Inxight, Palo Alto, CA): 7 ± 2 Insights on achieving Effective Information Access.

Session One: Updates and a twelve month perspective. Danny Sullivan (Search Engine Watch, US / England): Portalization and other search trends; Carol Tenopir (University of Tennessee): Search realities faced by end users and professional searchers.

Session Two: Today's search engines and beyond. Daniel Hoogterp (Retrieval Technologies, McLean, VA): Effective presentation and utilization of search techniques; Rick Kenny (Fulcrum Technologies, Ontario, Canada): Beyond document clustering: The knowledge impact statement; Gary Stock (Ingenius, Kalamazoo, MI): Automated change monitoring; Gary Culliss (Direct Hit, Wellesley Hills, MA): User popularity ranked search engines; Byron Dom (IBM, CA): Automatically finding the best pages on the World Wide Web (CLEVER); Peter Tomassi (LookSmart, San Francisco, CA): Adding human intellect to search technology.

Session Three: Panel discussion: Human v automated categorization and editing. Ev Brenner (New York, NY), Chairman; James Callan (University of Massachusetts, MA); Marc Krellenstein (Northern Light Technology, Cambridge, MA); Dan Miller (Ask Jeeves, Berkeley, CA).

Session Four: Updates and a twelve month perspective. Steve Arnold (AIT, Harrods Creek, KY): Review: The leading edge in search and retrieval software; Ellen Voorhees (NIST, Gaithersburg, MD): TREC update.

Session Five: Search engines now and beyond. Intelligent agents. John Snyder (Muscat, Cambridge, England): Practical issues behind intelligent agents. Text summarization. Therese Firmin (Dept of Defense, Ft George G. Meade, MD): The TIPSTER/SUMMAC evaluation of automatic text summarization systems. Cross language searching. Elizabeth Liddy (TextWise, Syracuse, NY): A conceptual interlingua approach to cross-language retrieval. Video search and retrieval. Armon Amir (IBM, Almaden, CA): CueVideo: Modular system for automatic indexing and browsing of video/audio. Speech recognition. Michael Witbrock (Lycos, Waltham, MA): Retrieval of spoken documents. Visualization. James A. Wise (Integral Visuals, Richland, WA): Information visualization in the new millennium: Emerging science or passing fashion? Text mining. David Evans (Claritech, Pittsburgh, PA): Text mining - towards decision support.

Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.03

0.028712178 = product of:
  0.11484871 = sum of:
    0.04767549 = weight(_text_:wide in 2673) [ClassicSimilarity], result of:
      0.04767549 = score(doc=2673,freq=2.0), product of:
        0.13912784 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.031400457 = queryNorm
        0.342674 = fieldWeight in 2673, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2673)
    0.04479914 = weight(_text_:web in 2673) [ClassicSimilarity], result of:
      0.04479914 = score(doc=2673,freq=6.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.43716836 = fieldWeight in 2673, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2673)
    0.0074839313 = weight(_text_:information in 2673) [ClassicSimilarity], result of:
      0.0074839313 = score(doc=2673,freq=2.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.13576832 = fieldWeight in 2673, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2673)
    0.014890149 = product of:
      0.029780298 = sum of:
        0.029780298 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
          0.029780298 = score(doc=2673,freq=2.0), product of:
            0.10995905 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.031400457 = queryNorm
            0.2708308 = fieldWeight in 2673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2673)
      0.5 = coord(1/2)
  0.25 = coord(4/16)

Abstract: Examines techniques that discover features in sets of pre-categorized documents, such that similar documents can be found on the WWW. Examines techniques which will classify training examples with high accuracy, then explains why this is not necessarily useful. Describes a method for extracting word clusters from the raw document features. Results show that the clustering technique is successful in discovering word groups in personal Web pages which can be used to find similar information on the WWW.
Date: 1. 8.1996 22:08:06
Footnote: Contribution to a special issue of papers from the 6th International World Wide Web conference, held 7-11 Apr 1997, Santa Clara, California

Rädler, K.: In Bibliothekskatalogen "googlen" : Integration von Inhaltsverzeichnissen, Volltexten und WEB-Ressourcen in Bibliothekskataloge (2004) 0.02
```
0.024027316 = product of:
  0.09610926 = sum of:
    0.018474855 = weight(_text_:web in 2432) [ClassicSimilarity], result of:
      0.018474855 = score(doc=2432,freq=2.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.18028519 = fieldWeight in 2432, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2432)
    0.056416538 = weight(_text_:benutzer in 2432) [ClassicSimilarity], result of:
      0.056416538 = score(doc=2432,freq=2.0), product of:
        0.17907447 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.031400457 = queryNorm
        0.31504512 = fieldWeight in 2432, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2432)
    0.005345665 = weight(_text_:information in 2432) [ClassicSimilarity], result of:
      0.005345665 = score(doc=2432,freq=2.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.09697737 = fieldWeight in 2432, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2432)
    0.015872208 = weight(_text_:retrieval in 2432) [ClassicSimilarity], result of:
      0.015872208 = score(doc=2432,freq=2.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.16710453 = fieldWeight in 2432, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2432)
  0.25 = coord(4/16)
```
Abstract

Starting point: Catalogue searches over the Internet, that is, from outside the library, are increasing sharply, as expected, and have by now become the norm. This has naturally increased the desire and the need to obtain, beyond the title, additional information about content that makes it possible to judge a work's suitability much better before placing an order or perhaps driving 50 km to the library to borrow a book. This information deficit is increasingly perceived as a serious shortcoming. Tables of contents summarize the content briefly and concisely. They are the first place consulted when judging relevance. Almost all the relevant terms of a specialist book can already be found there. On the other hand, it is becoming ever clearer that intellectual indexing of individual documentary units with the narrowest encompassing terms of a documentation language (subject headings, classes), in keeping with the library paradigm, is a necessary but by no means sufficient method for activating the expensively acquired library asset, information, for users with their specific problems, and for offering it as an information service. Information on very specific questions, often discussed only in shorter sections (chapters), can currently be found only indirectly, with great expenditure of time, and often not at all. It lies fallow, so to speak. Extending the depth of intellectual indexing down to individual details of content is not justifiable in terms of staffing and therefore of cost. Libraries are therefore falling further and further behind in the perception of information seekers. The enormous diversity of information lies beyond the information or search horizon of the bibliographic records in the catalogue.

Theme

Semantisches Umfeld in Indexierung u. Retrieval

Salton, G.; McGill, M. J.: Information Retrieval: Grundlegendes für Informationswissenschaftler (1987) 0.02

0.024019053 = product of:
  0.12810162 = sum of:
    0.018517928 = weight(_text_:information in 8648) [ClassicSimilarity], result of:
      0.018517928 = score(doc=8648,freq=6.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.3359395 = fieldWeight in 8648, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=8648)
    0.054982945 = weight(_text_:retrieval in 8648) [ClassicSimilarity], result of:
      0.054982945 = score(doc=8648,freq=6.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.5788671 = fieldWeight in 8648, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=8648)
    0.054600753 = weight(_text_:software in 8648) [ClassicSimilarity], result of:
      0.054600753 = score(doc=8648,freq=2.0), product of:
        0.124570385 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.031400457 = queryNorm
        0.43831247 = fieldWeight in 8648, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.078125 = fieldNorm(doc=8648)
  0.1875 = coord(3/16)

Content: Contains the chapters: Information retrieval: an introduction; Inverted file systems; Text analysis and automatic indexing; The experimental retrieval systems SMART and SIRE; The evaluation of retrieval systems; Advanced retrieval techniques; Natural language processing; Information technology: hardware and software; Database management systems; Future developments in information retrieval

Pritchard, J.: Information retrieval : smarter indexing (1991) 0.02

0.02149012 = product of:
  0.114613965 = sum of:
    0.015119824 = weight(_text_:information in 4890) [ClassicSimilarity], result of:
      0.015119824 = score(doc=4890,freq=4.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.27429342 = fieldWeight in 4890, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=4890)
    0.044893384 = weight(_text_:retrieval in 4890) [ClassicSimilarity], result of:
      0.044893384 = score(doc=4890,freq=4.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.47264296 = fieldWeight in 4890, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=4890)
    0.054600753 = weight(_text_:software in 4890) [ClassicSimilarity], result of:
      0.054600753 = score(doc=4890,freq=2.0), product of:
        0.124570385 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.031400457 = queryNorm
        0.43831247 = fieldWeight in 4890, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.078125 = fieldNorm(doc=4890)
  0.1875 = coord(3/16)

Abstract: Describes full text retrieval (FTR), which indexes every occurrence of every word except defined 'stop' words. This permits much more sophisticated searching than keyword indexing. Also discusses document image processing (DIP). Lists suppliers and users of the software and describes the experiences of ESOO's Planning Division with Computer Intertrade Ltd. (CIL) ImagePro DIP and their operational practices.
Source: Advanced information report. 1991, S.7-9
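
Full text retrieval as described in the abstract above, indexing every occurrence of every word except stop words, can be illustrated with a small positional inverted index. This is only a sketch: the stop list, the tokenizer, the function name `build_index`, and the sample documents are all assumptions made for illustration and do not come from the article.

```python
import re
from collections import defaultdict

# Hypothetical stop list; a real system would use a larger, defined list.
STOP_WORDS = {"the", "of", "and", "a", "is"}


def build_index(docs: dict[int, str]) -> dict[str, dict[int, list[int]]]:
    """Map each non-stop word to {doc_id: [positions where it occurs]}."""
    index: dict[str, dict[int, list[int]]] = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        tokens = re.findall(r"\w+", text.lower())
        for position, token in enumerate(tokens):
            if token not in STOP_WORDS:
                index[token][doc_id].append(position)
    return index


# Illustrative documents, not taken from the source.
sample_docs = {
    1: "The retrieval of the text",
    2: "Text indexing and retrieval",
}
sample_index = build_index(sample_docs)

# Every occurrence is recorded with its position, so phrase and
# proximity searches become possible, unlike with plain keyword indexing.
print(dict(sample_index["retrieval"]))
```

Storing positions rather than just document identifiers is what allows the "much more sophisticated searching" the abstract mentions, at the cost of a larger index.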

Renz, M.: Automatische Inhaltserschließung im Zeichen von Wissensmanagement (2001) 0.02

0.020703925 = product of:
  0.0828157 = sum of:
    0.0074839313 = weight(_text_:information in 5671) [ClassicSimilarity], result of:
      0.0074839313 = score(doc=5671,freq=2.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.13576832 = fieldWeight in 5671, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5671)
    0.022221092 = weight(_text_:retrieval in 5671) [ClassicSimilarity], result of:
      0.022221092 = score(doc=5671,freq=2.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.23394634 = fieldWeight in 5671, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5671)
    0.038220525 = weight(_text_:software in 5671) [ClassicSimilarity], result of:
      0.038220525 = score(doc=5671,freq=2.0), product of:
        0.124570385 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.031400457 = queryNorm
        0.30681872 = fieldWeight in 5671, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5671)
    0.014890149 = product of:
      0.029780298 = sum of:
        0.029780298 = weight(_text_:22 in 5671) [ClassicSimilarity], result of:
          0.029780298 = score(doc=5671,freq=2.0), product of:
            0.10995905 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.031400457 = queryNorm
            0.2708308 = fieldWeight in 5671, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5671)
      0.5 = coord(1/2)
  0.25 = coord(4/16)
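
The arithmetic in the explanation above can be re-derived by hand. The sketch below is an illustration, not part of the search system. It assumes the formulas of Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and takes queryNorm and fieldNorm as constants copied from the output rather than recomputing them. The helper names `idf` and `term_score` are hypothetical. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.031400457  # copied from the explain output, not recomputed
FIELD_NORM = 0.0546875    # fieldNorm(doc=5671), copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """score = queryWeight * fieldWeight for a single query term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, inner coord) as listed for doc 5671
terms = [
    ("information", 2.0, 20772, 1.0),
    ("retrieval", 2.0, 5836, 1.0),
    ("software", 2.0, 2274, 1.0),
    ("22", 2.0, 3622, 0.5),  # nested clause with coord(1/2)
]

total = sum(term_score(f, df, FIELD_NORM) * c for _, f, df, c in terms)
score = total * (4 / 16)  # outer coord(4/16): 4 of 16 query clauses matched

print(f"sum = {total:.7f}, score = {score:.7f}")
```

The result should agree with the listed sum (0.0828157) and final score (0.020703925) to about six decimal places.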

Abstract: Methods of automatic subject indexing have been under development for more than 30 years without gaining noticeable acceptance in information and documentation (IuD) circles. At present, however, the rising flood of information and the need for efficient access methods in information and knowledge management are producing growing interest in these methods among a broad range of users, intensified research and development efforts, and new products. This article discusses various approaches to intelligent, content-based retrieval and to automatic subject indexing, and presents commercially available software tools and solutions. It concludes that increasing automation of certain components of information and knowledge management can be expected in the near future, as software tools for automatic subject indexing are integrated into the workflow.
Date: 22. 3.2001 13:14:48
Source: nfd Information - Wissenschaft und Praxis. 52(2001) H.2, S.69-78

Volk, M.; Mittermaier, H.; Schurig, A.; Biedassek, T.: Halbautomatische Volltextanalyse, Datenbankaufbau und Document Retrieval (1992) 0.02

0.020379033 = product of:
  0.108688176 = sum of:
    0.07898315 = weight(_text_:benutzer in 2571) [ClassicSimilarity], result of:
      0.07898315 = score(doc=2571,freq=2.0), product of:
        0.17907447 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.031400457 = queryNorm
        0.44106317 = fieldWeight in 2571, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2571)
    0.0074839313 = weight(_text_:information in 2571) [ClassicSimilarity], result of:
      0.0074839313 = score(doc=2571,freq=2.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.13576832 = fieldWeight in 2571, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2571)
    0.022221092 = weight(_text_:retrieval in 2571) [ClassicSimilarity], result of:
      0.022221092 = score(doc=2571,freq=2.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.23394634 = fieldWeight in 2571, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2571)
  0.1875 = coord(3/16)

Abstract: In this paper we describe a system for analysing short articles. The system works semi-automatically: the article is first analysed by the system and then presented to the user for post-editing. The information obtained in this way is stored in a database record. The database, implemented in dBase IV, then allows efficient queries and access to the original texts. The core of the paper concerns the semi-automatic analysis. We describe our method for parameterised pattern matching as well as linguistic heuristics for identifying noun phrases and prepositional phrases. The system was developed for practical use in the Bonn office of the 'Forum InformatikerInnen für Frieden und gesellschaftliche Verantwortung e.V. (FIFF)'.

Kaiser, A.: Computer-unterstütztes Indexieren in Intelligenten Information Retrieval Systemen : Ein Relevanz-Feedback orientierter Ansatz zur Informationserschließung in unformatierten Datenbanken (1993) 0.02
```
0.020300776 = product of:
  0.1082708 = sum of:
    0.06769984 = weight(_text_:benutzer in 4284) [ClassicSimilarity], result of:
      0.06769984 = score(doc=4284,freq=8.0), product of:
        0.17907447 = queryWeight, product of:
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.031400457 = queryNorm
        0.37805414 = fieldWeight in 4284, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          5.7029257 = idf(docFreq=400, maxDocs=44218)
          0.0234375 = fieldNorm(doc=4284)
    0.012000988 = weight(_text_:information in 4284) [ClassicSimilarity], result of:
      0.012000988 = score(doc=4284,freq=28.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.21771365 = fieldWeight in 4284, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0234375 = fieldNorm(doc=4284)
    0.028569974 = weight(_text_:retrieval in 4284) [ClassicSimilarity], result of:
      0.028569974 = score(doc=4284,freq=18.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.30078813 = fieldWeight in 4284, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0234375 = fieldNorm(doc=4284)
  0.1875 = coord(3/16)
```
Abstract: Information has become a very important commodity in our time. It is the basis of all serious decision-making. The flood of information has risen sharply in recent years, and the volume of information will continue to grow for the foreseeable future. It is therefore becoming ever more important to organise "information about information". It is not possible to be informed in full detail about every area one encounters; what is necessary and important is to know where one can find information. Relevant information must be findable as quickly as possible. In practical, computer-supported use, information systems of many kinds serve this purpose. The spectrum ranges from management information systems and expert systems to database systems and information retrieval systems (IR systems). Although these types of information-processing systems are designed for different user groups and different kinds of tasks, their design nevertheless raises very similar problem areas and questions:
* The representation and organisation of existing knowledge and known facts in the information system (information indexing, "Informationserschließung").
* The finding (or re-finding) of relevant information in the information system and the guidance of the user through it.
An information retrieval system contains unstructured bibliographic or textual documents and thereby differs substantially from database systems, which usually contain structured data.
Conventional, formatted databases are already widespread in practice today, not least because the standardised query language SQL exists and because research is working intensively on improving the structure and performance of relational database systems in particular. The spread and acceptance of unformatted databases, that is, information retrieval systems, has by contrast not progressed nearly as far. One reason lies in the poor user-friendliness of IR systems and in inadequate methods of information indexing. The present work aims to develop a method of information indexing in information retrieval systems that places the needs of the user at the centre and thus contributes to increasing the acceptance and spread of information retrieval systems, particularly in the office domain. The research question is therefore: is it possible to involve the user to a greater degree already at the stage of indexing documents, without completely forgoing machine support, as is the case with manual indexing? Any retrieval system can be described as a system that consists of a set of documents and a set of queries and that contains a mechanism which determines the documents relevant to a query.
The following parts of an IR system are required for this:
* Information indexing: a component for indexing and representing the stored information. This part serves to describe the content of the documents and to represent it in such a way that a document can be found on the basis of these features. One way of doing so is to assign content-describing descriptors to the documents. Through the process of indexing, the documents are translated into an indexing language.
* Query language: a component for formulating the user's queries. This part processes the user's query so that the information it yields about the user's needs can be used to find the matching documents.
* Information output and presentation: a component for outputting the information found in response to the query. This part makes the result of the query available to the user.
Examining all components of an information retrieval system would go beyond the scope of this work. The focus is therefore placed on information indexing. The (semi-)automatic indexing of documents for the purpose of information retrieval, that is, the process of translating documents into an indexing language, is treated in more detail. This focus was chosen partly because, in my view, the observable lack of acceptance of information retrieval systems can also be attributed to the fact that the indexing components used in practice do not yet deliver the performance that the user expects from an "intelligent information retrieval system". The aim of the work is to develop, step by step, a model for automatic indexing that involves the user in indexing to a greater extent than the methods described in the literature and in practice.

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02

0.019112218 = product of:
  0.101931825 = sum of:
    0.017106129 = weight(_text_:information in 402) [ClassicSimilarity], result of:
      0.017106129 = score(doc=402,freq=2.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.3103276 = fieldWeight in 402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.125 = fieldNorm(doc=402)
    0.050791066 = weight(_text_:retrieval in 402) [ClassicSimilarity], result of:
      0.050791066 = score(doc=402,freq=2.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.5347345 = fieldWeight in 402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.125 = fieldNorm(doc=402)
    0.03403463 = product of:
      0.06806926 = sum of:
        0.06806926 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.06806926 = score(doc=402,freq=2.0), product of:
            0.10995905 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.031400457 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.1875 = coord(3/16)
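
The term "retrieval" occurs twice both in this record (doc 402) and in doc 5671 above, yet it contributes 0.050791066 here and only 0.022221092 there. The only input that differs is fieldNorm, which in ClassicSimilarity acts as a length normalisation favouring shorter fields. A minimal sketch of that comparison follows, assuming the same formulas as the explain output. The function name `retrieval_score` is hypothetical, and queryNorm and both fieldNorm values are copied from the listings rather than recomputed.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.031400457  # copied from the explain output


def retrieval_score(field_norm, freq=2.0, doc_freq=5836):
    """Score of the term 'retrieval' for a given fieldNorm."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


short_record = retrieval_score(0.125)     # fieldNorm(doc=402)
long_record = retrieval_score(0.0546875)  # fieldNorm(doc=5671)

# With tf and idf equal, the score ratio is just the fieldNorm ratio.
ratio = short_record / long_record
print(f"{short_record:.7f} / {long_record:.7f} = {ratio:.4f}")
```

Because the score is linear in fieldNorm, the ratio reduces to 0.125 / 0.0546875, roughly 2.29.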

Source: Information processing and management. 22(1986) no.6, S.465-476

Krüger, C.: Evaluation des WWW-Suchdienstes GERHARD unter besonderer Beachtung automatischer Indexierung (1999) 0.02
```
0.01813755 = product of:
  0.0967336 = sum of:
    0.04815952 = weight(_text_:wide in 1777) [ClassicSimilarity], result of:
      0.04815952 = score(doc=1777,freq=4.0), product of:
        0.13912784 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.031400457 = queryNorm
        0.34615302 = fieldWeight in 1777, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1777)
    0.026127389 = weight(_text_:web in 1777) [ClassicSimilarity], result of:
      0.026127389 = score(doc=1777,freq=4.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.25496176 = fieldWeight in 1777, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1777)
    0.022446692 = weight(_text_:retrieval in 1777) [ClassicSimilarity], result of:
      0.022446692 = score(doc=1777,freq=4.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.23632148 = fieldWeight in 1777, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1777)
  0.1875 = coord(3/16)
```
Abstract: This thesis describes and evaluates the WWW search service GERHARD (German Harvest Automated Retrieval and Directory). GERHARD is a search and navigation system for the German World Wide Web that collects only scientifically relevant documents and classifies them automatically, on the basis of computational-linguistic and statistical methods, using a library classification system. The DFG project GERHARD was an attempt to develop, with a World Wide Web service based on an automatic classification procedure, an alternative to conventional methods of indexing the Internet. In the German-speaking world, GERHARD is the only directory of Internet resources that is created and updated fully automatically (that is, by machine). GERHARD restricts itself to documents on academic WWW servers. The basic idea was to replace cost-intensive intellectual indexing and classification of Internet pages with computational-linguistic and statistical methods, so that the recorded Internet resources are mapped automatically onto the vocabulary of a library classification system. The WWW address (URL) of GERHARD is http://www.gerhard.de. This diploma thesis describes the service, with particular emphasis on the underlying indexing and classification system, and then tests the effectiveness of GERHARD by means of a small retrieval test.

Gábor, K.; Zargayouna, H.; Tellier, I.; Buscaldi, D.; Charnois, T.: A typology of semantic relations dedicated to scientific literature analysis (2016) 0.02

0.017955258 = product of:
  0.09576137 = sum of:
    0.04767549 = weight(_text_:wide in 2933) [ClassicSimilarity], result of:
      0.04767549 = score(doc=2933,freq=2.0), product of:
        0.13912784 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.031400457 = queryNorm
        0.342674 = fieldWeight in 2933, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2933)
    0.025864797 = weight(_text_:web in 2933) [ClassicSimilarity], result of:
      0.025864797 = score(doc=2933,freq=2.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.25239927 = fieldWeight in 2933, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2933)
    0.022221092 = weight(_text_:retrieval in 2933) [ClassicSimilarity], result of:
      0.022221092 = score(doc=2933,freq=2.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.23394634 = fieldWeight in 2933, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2933)
  0.1875 = coord(3/16)

Content: Presentation, "Semantics, Analytics, Visualisation: Enhancing Scholarly Data Workshop co-located with the 25th International World Wide Web Conference April 11, 2016 - Montreal, Canada", Montreal 2016.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Fauzi, F.; Belkhatir, M.: Multifaceted conceptual image indexing on the world wide web (2013) 0.02

0.01755148 = product of:
  0.093607895 = sum of:
    0.040864702 = weight(_text_:wide in 2721) [ClassicSimilarity], result of:
      0.040864702 = score(doc=2721,freq=2.0), product of:
        0.13912784 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.031400457 = queryNorm
        0.29372054 = fieldWeight in 2721, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.046875 = fieldNorm(doc=2721)
    0.038399264 = weight(_text_:web in 2721) [ClassicSimilarity], result of:
      0.038399264 = score(doc=2721,freq=6.0), product of:
        0.10247572 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.031400457 = queryNorm
        0.37471575 = fieldWeight in 2721, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=2721)
    0.014343925 = weight(_text_:information in 2721) [ClassicSimilarity], result of:
      0.014343925 = score(doc=2721,freq=10.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.2602176 = fieldWeight in 2721, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=2721)
  0.1875 = coord(3/16)

Abstract: In this paper, we describe a user-centered design of an automated multifaceted concept-based indexing framework which analyzes the semantics of the Web image contextual information and classifies it into five broad semantic concept facets: signal, object, abstract, scene, and relational; and identifies the semantic relationships between the concepts. An important aspect of our indexing model is that it relates to the users' levels of image descriptions. Also, a major contribution relies on the fact that the classification is performed automatically with the raw image contextual information extracted from any general webpage and is not solely based on image tags like state-of-the-art solutions. Human Language Technology techniques and an external knowledge base are used to analyze the information both syntactically and semantically. Experimental results on a human-annotated Web image collection and corresponding contextual information indicate that our method outperforms empirical frameworks employing tf-idf and location-based tf-idf weighting schemes as well as n-gram indexing in a recall/precision based evaluation framework.
Source: Information processing and management. 49(2013) no.2, S.420-440

Pritchard-Schoch, T.: Natural language comes of age (1993) 0.02

0.017320696 = product of:
  0.09237705 = sum of:
    0.012781745 = product of:
      0.02556349 = sum of:
        0.02556349 = weight(_text_:online in 2570) [ClassicSimilarity], result of:
          0.02556349 = score(doc=2570,freq=2.0), product of:
            0.09529729 = queryWeight, product of:
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.031400457 = queryNorm
            0.2682499 = fieldWeight in 2570, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.0625 = fieldNorm(doc=2570)
      0.5 = coord(1/2)
    0.035914708 = weight(_text_:retrieval in 2570) [ClassicSimilarity], result of:
      0.035914708 = score(doc=2570,freq=4.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.37811437 = fieldWeight in 2570, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=2570)
    0.043680605 = weight(_text_:software in 2570) [ClassicSimilarity], result of:
      0.043680605 = score(doc=2570,freq=2.0), product of:
        0.124570385 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.031400457 = queryNorm
        0.35064998 = fieldWeight in 2570, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.0625 = fieldNorm(doc=2570)
  0.1875 = coord(3/16)

Abstract: Discusses natural languages and the natural language implementations of Westlaw's full-text legal documents, Westlaw Is Natural (WIN). Natural language is not artificial intelligence but a hybrid of linguistics, mathematics and statistics. Provides 3 classes of retrieval models. Explains how Westlaw processes an English query. Assesses WIN. Covers WIN enhancements; the natural language features of Congressional Quarterly's Washington Alert using a document for a query; the personal librarian front end search software and Dowquest from Dow Jones news/retrieval. Considers whether natural language encourages fuzzy thinking and whether Boolean logic will still be needed
Source: Online. 17(1993) no.3, S.33-43

Alexander, M.: Automatic indexing of document images using Excalibur EFS (1995) 0.02

0.01652782 = product of:
  0.08814838 = sum of:
    0.008553064 = weight(_text_:information in 1911) [ClassicSimilarity], result of:
      0.008553064 = score(doc=1911,freq=2.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.1551638 = fieldWeight in 1911, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=1911)
    0.035914708 = weight(_text_:retrieval in 1911) [ClassicSimilarity], result of:
      0.035914708 = score(doc=1911,freq=4.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.37811437 = fieldWeight in 1911, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=1911)
    0.043680605 = weight(_text_:software in 1911) [ClassicSimilarity], result of:
      0.043680605 = score(doc=1911,freq=2.0), product of:
        0.124570385 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.031400457 = queryNorm
        0.35064998 = fieldWeight in 1911, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.0625 = fieldNorm(doc=1911)
  0.1875 = coord(3/16)

Abstract: Discusses research into the application of adaptive pattern recognition technology to enable effective retrieval from scanned document images. Describes application at the British Library of Excalibur EFS software which uses adaptive pattern recognition technology to provide access to digital information in its native forms, fuzzy searching retrieval and automatic indexing capabilities. It was used to make specialist printed catalogues and indexes accessible on computer via content based indexes

Hlava, M.M.K.: Machine aided indexing (MAI) in a multilingual environment (1993) 0.02

0.015912723 = product of:
  0.08486786 = sum of:
    0.0111840265 = product of:
      0.022368053 = sum of:
        0.022368053 = weight(_text_:online in 7405) [ClassicSimilarity], result of:
          0.022368053 = score(doc=7405,freq=2.0), product of:
            0.09529729 = queryWeight, product of:
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.031400457 = queryNorm
            0.23471867 = fieldWeight in 7405, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7405)
      0.5 = coord(1/2)
    0.0074839313 = weight(_text_:information in 7405) [ClassicSimilarity], result of:
      0.0074839313 = score(doc=7405,freq=2.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.13576832 = fieldWeight in 7405, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7405)
    0.0661999 = weight(_text_:software in 7405) [ClassicSimilarity], result of:
      0.0661999 = score(doc=7405,freq=6.0), product of:
        0.124570385 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.031400457 = queryNorm
        0.53142565 = fieldWeight in 7405, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7405)
  0.1875 = coord(3/16)

Abstract: The machine aided indexing (MAI) software developed by Access Innovations, Inc., is a semantic based, Boolean statement, rule interpreting application with 3 modules: the MA engine which accepts input files, matches terms in the knowledge base, interprets rules, and outputs a text file with suggested indexing terms; a rule building application allowing each Boolean style rule in the knowledge base to be created or modified; and a statistical computation module which analyzes performance of the MA software against text manually indexed by professional human indexers. The MA software can be applied across multiple languages and can be used where the text to be searched is in one language and the indexes to be output are in another
Imprint: Medford, NJ : Learned Information
Source: Proceedings of the 14th National Online Meeting 1993, New York, 4-6 May 1993. Ed.: M.E. Williams

Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.02

0.0152409095 = product of:
  0.08128485 = sum of:
    0.015119824 = weight(_text_:information in 1952) [ClassicSimilarity], result of:
      0.015119824 = score(doc=1952,freq=4.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.27429342 = fieldWeight in 1952, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=1952)
    0.044893384 = weight(_text_:retrieval in 1952) [ClassicSimilarity], result of:
      0.044893384 = score(doc=1952,freq=4.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.47264296 = fieldWeight in 1952, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=1952)
    0.021271642 = product of:
      0.042543285 = sum of:
        0.042543285 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
          0.042543285 = score(doc=1952,freq=2.0), product of:
            0.10995905 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.031400457 = queryNorm
            0.38690117 = fieldWeight in 1952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1952)
      0.5 = coord(1/2)
  0.1875 = coord(3/16)

Date: 16. 8.1998 12:51:22
Footnote: Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. S.513-517.
Source: Proceedings of the 11th annual conference on research and development in information retrieval. Ed.: Y. Chiaramella

Samstag-Schnock, U.; Meadow, C.T.: PBS: an economical natural language query interpreter (1993) 0.02

0.01521975 = product of:
  0.081172 = sum of:
    0.012095859 = weight(_text_:information in 5091) [ClassicSimilarity], result of:
      0.012095859 = score(doc=5091,freq=4.0), product of:
        0.055122808 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.031400457 = queryNorm
        0.21943474 = fieldWeight in 5091, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=5091)
    0.025395533 = weight(_text_:retrieval in 5091) [ClassicSimilarity], result of:
      0.025395533 = score(doc=5091,freq=2.0), product of:
        0.09498371 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.031400457 = queryNorm
        0.26736724 = fieldWeight in 5091, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=5091)
    0.043680605 = weight(_text_:software in 5091) [ClassicSimilarity], result of:
      0.043680605 = score(doc=5091,freq=2.0), product of:
        0.124570385 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.031400457 = queryNorm
        0.35064998 = fieldWeight in 5091, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.0625 = fieldNorm(doc=5091)
  0.1875 = coord(3/16)

Abstract: Reports on the design and implementation of the information searching and retrieval software, PBS (Parsing, Boolean recognition, Stemming) for the front end OAK 2, a new version of OAK developed at Toronto Univ. OAK 2 is a research tool for user behaviour studies. PBS receives natural language search statements from an end user and identifies search facets and implied Boolean logic operators
Source: Journal of the American Society for Information Science. 44(1993) no.5, S.265-272
