Search (594 results, page 1 of 30)

Srimurugan, A.: ¬An expert systems for generation of UDC class numbers : an investigation (2000) 0.09

0.08760825 = product of:
  0.1752165 = sum of:
    0.1752165 = sum of:
      0.07623341 = weight(_text_:systems in 1013) [ClassicSimilarity], result of:
        0.07623341 = score(doc=1013,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.47535738 = fieldWeight in 1013, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.109375 = fieldNorm(doc=1013)
      0.09898309 = weight(_text_:22 in 1013) [ClassicSimilarity], result of:
        0.09898309 = score(doc=1013,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.5416616 = fieldWeight in 1013, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.109375 = fieldNorm(doc=1013)
  0.5 = coord(1/2)

Source: Extensions and corrections to the UDC. 22(2000), S.25-30
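
The score breakdown above follows the TF-IDF arithmetic of Lucene's ClassicSimilarity, and its numbers can be reproduced by hand. The following is a minimal Python sketch, not the search engine's own code. The function names (`idf`, `tf`, `term_score`) are illustrative. The formulas `idf = 1 + ln(maxDocs / (docFreq + 1))` and `tf = sqrt(freq)` are assumed to be the ones in use here; they agree with the values printed above. `queryNorm`, `fieldNorm`, and `coord` are copied from the explanation rather than derived.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Assumed ClassicSimilarity tf: square root of the term frequency
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explanation tree
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm            # idf * queryNorm
    field_weight = tf(freq) * i * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explanation for doc 1013
QUERY_NORM = 0.052184064
MAX_DOCS = 44218
FIELD_NORM = 0.109375

score_systems = term_score(2.0, 5561, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(1/2): one of two top-level query clauses matched
total = 0.5 * (score_systems + score_22)
print(score_systems, score_22, total)
```

Lucene works in 32-bit floats, so the last digits printed by this sketch may differ slightly from the listing.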

Schrodt, R.: Tiefen und Untiefen im wissenschaftlichen Sprachgebrauch (2008) 0.06

0.055254754 = product of:
  0.11050951 = sum of:
    0.11050951 = product of:
      0.3315285 = sum of:
        0.3315285 = weight(_text_:3a in 140) [ClassicSimilarity], result of:
          0.3315285 = score(doc=140,freq=2.0), product of:
            0.4424171 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.052184064 = queryNorm
            0.7493574 = fieldWeight in 140, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0625 = fieldNorm(doc=140)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Content: See also: https://studylibde.com/doc/13053640/richard-schrodt. See also: http%3A%2F%2Fwww.univie.ac.at%2FGermanistik%2Fschrodt%2Fvorlesung%2Fwissenschaftssprache.doc&usg=AOvVaw1lDLDR6NFf1W0-oC9mEUJf.
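
In this record the only matching term sits inside a nested clause, so two coordination factors apply: coord(1/3) for the inner clause and coord(1/2) for the outer one. The Python sketch below reproduces that arithmetic under the same assumptions as the ClassicSimilarity formulas (`idf = 1 + ln(maxDocs / (docFreq + 1))`, `tf = sqrt(freq)`). The helper names `classic_term_score` and `apply_coords` are hypothetical, and all constants are copied from the explanation for doc 140.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # Assumed ClassicSimilarity formulas; constants come from the listing
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def apply_coords(matched_sum, coords):
    # Multiply the summed clause score by each coord(matched/total) factor,
    # innermost clause first
    score = matched_sum
    for matched, total in coords:
        score *= matched / total
    return score

# Term "3a" in doc 140: freq=2.0, docFreq=24, fieldNorm=0.0625
term_3a = classic_term_score(2.0, 24, 44218, 0.052184064, 0.0625)

# Inner clause matched 1 of 3 terms, outer clause matched 1 of 2
doc_140 = apply_coords(term_3a, [(1, 3), (1, 2)])
print(term_3a, doc_140)
```

The very high idf of the rare term (docFreq=24) is largely cancelled by the two coord factors, which is why this record still ranks below the first one.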

Lepsky, K.; Vorhauer, J.: Lingo - ein open source System für die Automatische Indexierung deutschsprachiger Dokumente (2006) 0.05

0.050061855 = product of:
  0.10012371 = sum of:
    0.10012371 = sum of:
      0.043561947 = weight(_text_:systems in 3581) [ClassicSimilarity], result of:
        0.043561947 = score(doc=3581,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.2716328 = fieldWeight in 3581, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.0625 = fieldNorm(doc=3581)
      0.056561764 = weight(_text_:22 in 3581) [ClassicSimilarity], result of:
        0.056561764 = score(doc=3581,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.30952093 = fieldWeight in 3581, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0625 = fieldNorm(doc=3581)
  0.5 = coord(1/2)

Abstract: Lingo is a freely available (open source) system for the automatic indexing of German-language text. Its development prioritized high configurability and flexibility for a range of uses. The article shows the benefit of linguistically based automatic indexing for information retrieval. The linguistic functionality that lingo offers for improving retrieval is presented and illustrated with examples: base-form recognition, compound recognition and compound decomposition, word relationing, lexical and algorithmic multi-word group recognition, and OCR error correction. The open system architecture of lingo is described, and possible application scenarios and limits of applicability are identified.
Date: 24. 3.2006 12:22:02

Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.04

0.043804124 = product of:
  0.08760825 = sum of:
    0.08760825 = sum of:
      0.038116705 = weight(_text_:systems in 3276) [ClassicSimilarity], result of:
        0.038116705 = score(doc=3276,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.23767869 = fieldWeight in 3276, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.0546875 = fieldNorm(doc=3276)
      0.049491543 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
        0.049491543 = score(doc=3276,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.2708308 = fieldWeight in 3276, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0546875 = fieldNorm(doc=3276)
  0.5 = coord(1/2)

Abstract: Classical information retrieval developed various methods for ranking and for searching in a homogeneous, unstructured document collection. The success of the Google search engine has shown that searching an inhomogeneous but connected document collection such as the Internet can be very effective when the connections between documents (links) are taken into account. Among the concepts implemented by Google is a method for ranking search results (PageRank), which this article explains briefly. The article also covers the concepts of a system called CiteSeer, which automatically indexes bibliographic references (Autonomous Citation Indexing, ACI). CiteSeer turns a set of unlinked scholarly documents into a connected document collection and thereby enables the use of ranking methods based on those used by Google.
Date: 20. 3.2005 16:23:22

RAK-NBM : Interpretationshilfe zu NBM 3b,3 (2000) 0.04

0.03999521 = product of:
  0.07999042 = sum of:
    0.07999042 = product of:
      0.15998083 = sum of:
        0.15998083 = weight(_text_:22 in 4362) [ClassicSimilarity], result of:
          0.15998083 = score(doc=4362,freq=4.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.8754574 = fieldWeight in 4362, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=4362)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 1.2000 19:22:27

Diederichs, A.: Wissensmanagement ist Macht : Effektiv und kostenbewußt arbeiten im Informationszeitalter (2005) 0.04

0.03999521 = product of:
  0.07999042 = sum of:
    0.07999042 = product of:
      0.15998083 = sum of:
        0.15998083 = weight(_text_:22 in 3211) [ClassicSimilarity], result of:
          0.15998083 = score(doc=3211,freq=4.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.8754574 = fieldWeight in 3211, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=3211)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 2.2005 9:16:22

Pesch, K.: ¬Eine gigantische Informationsfülle : "Brockhaus multimedial 2004" kann jedoch nicht rundum überzeugen (2003) 0.03

0.034995805 = product of:
  0.06999161 = sum of:
    0.06999161 = product of:
      0.13998322 = sum of:
        0.13998322 = weight(_text_:22 in 502) [ClassicSimilarity], result of:
          0.13998322 = score(doc=502,freq=4.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.76602525 = fieldWeight in 502, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=502)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 3. 5.1997 8:44:22
22. 9.2003 10:02:00

Donsbach, W.: Wahrheit in den Medien : über den Sinn eines methodischen Objektivitätsbegriffes (2001) 0.03

0.034534223 = product of:
  0.06906845 = sum of:
    0.06906845 = product of:
      0.20720533 = sum of:
        0.20720533 = weight(_text_:3a in 5895) [ClassicSimilarity], result of:
          0.20720533 = score(doc=5895,freq=2.0), product of:
            0.4424171 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.052184064 = queryNorm
            0.46834838 = fieldWeight in 5895, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5895)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Source: Politische Meinung. 381(2001) Nr.1, S.65-74 [https%3A%2F%2Fwww.dgfe.de%2Ffileadmin%2FOrdnerRedakteure%2FSektionen%2FSek02_AEW%2FKWF%2FPublikationen_Reihe_1989-2003%2FBand_17%2FBd_17_1994_355-406_A.pdf&usg=AOvVaw2KcbRsHy5UQ9QRIUyuOLNi]

Diedrichs, R.; Sandholzer, U.: ¬Der Gemeinsame Bibliotheksverbund GBV (2001) 0.03
```
0.03128866 = product of:
  0.06257732 = sum of:
    0.06257732 = sum of:
      0.027226217 = weight(_text_:systems in 1775) [ClassicSimilarity], result of:
        0.027226217 = score(doc=1775,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.1697705 = fieldWeight in 1775, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1775)
      0.0353511 = weight(_text_:22 in 1775) [ClassicSimilarity], result of:
        0.0353511 = score(doc=1775,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.19345059 = fieldWeight in 1775, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1775)
  0.5 = coord(1/2)
```
Abstract: The Gemeinsamer Bibliotheksverbund (GBV, Common Library Network) is today supported by the seven German federal states of Bremen, Hamburg, Mecklenburg-Vorpommern, Lower Saxony, Saxony-Anhalt, Schleswig-Holstein, and Thuringia. The GBV head office (Verbundzentrale) is located in Göttingen. The GBV comprises the state, regional, and university libraries of the participating states as well as the Staatsbibliothek zu Berlin - Preußischer Kulturbesitz, numerous public libraries, and special libraries. In total, more than 400 libraries take an active part in the network. The GBV's path has been shaped by the continual adaptation of the network's tasks and perspectives to developments in the library environment. Starting from pure union cataloguing, its remit was very quickly extended to include online interlibrary loan, support for local library systems, and end-user services. With the introduction of the Pica system in Lower Saxony, international cooperation became a matter of course as early as the beginning of the 1990s. When seven federal states merged into a single network, the network's structure and organization also had to be adapted to the requirements of cooperation across state borders. This article describes that development and the current state of the network's work. On that basis, it attempts to assess the future prospects of the GBV with regard to traditional union work and to the emerging trend toward global cooperation among all institutions and organizations involved in supplying information for research and teaching.
Date: 22. 3.2008 13:54:49

Mönch, C.; Aalberg, T.: Automatic conversion from MARC to FRBR (2003) 0.03
```
0.03128866 = product of:
  0.06257732 = sum of:
    0.06257732 = sum of:
      0.027226217 = weight(_text_:systems in 2422) [ClassicSimilarity], result of:
        0.027226217 = score(doc=2422,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.1697705 = fieldWeight in 2422, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2422)
      0.0353511 = weight(_text_:22 in 2422) [ClassicSimilarity], result of:
        0.0353511 = score(doc=2422,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.19345059 = fieldWeight in 2422, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2422)
  0.5 = coord(1/2)
```
Abstract: Catalogs have for centuries been the main tool that enabled users to search for items in a library by author, title, or subject. A catalog can be interpreted as a set of bibliographic records, where each record acts as a surrogate for a publication. Every record describes a specific publication and contains the data that is used to create the indexes of search systems and the information that is presented to the user. Bibliographic records are often captured and exchanged by the use of the MARC format. Although there are numerous "dialects" of the MARC format in use, they are usually crafted on the same basis and are interoperable with each other to a certain extent. The data model of a MARC-based catalog, however, is "[...] extremely non-normalized with excessive replication of data" [1]. For instance, a literary work that exists in numerous editions and translations is likely to yield a large result set because each edition or translation is represented by an individual record that is unrelated to other records that describe the same work.
Source: Research and advanced technology for digital libraries : 7th European Conference, proceedings / ECDL 2003, Trondheim, Norway, August 17-22, 2003

dpa: Struktur des Denkorgans wird bald entschlüsselt sein (2000) 0.03

0.029996406 = product of:
  0.059992813 = sum of:
    0.059992813 = product of:
      0.119985625 = sum of:
        0.119985625 = weight(_text_:22 in 3952) [ClassicSimilarity], result of:
          0.119985625 = score(doc=3952,freq=4.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.6565931 = fieldWeight in 3952, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=3952)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 17. 7.1996 9:33:22
22. 7.2000 19:05:41

IST 99 Helsinki : Gestaltung der Informationsgesellschaft für Europa (2000) 0.03

0.029996406 = product of:
  0.059992813 = sum of:
    0.059992813 = product of:
      0.119985625 = sum of:
        0.119985625 = weight(_text_:22 in 4363) [ClassicSimilarity], result of:
          0.119985625 = score(doc=4363,freq=4.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.6565931 = fieldWeight in 4363, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=4363)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Report on the 'European Conference on Information Society Technology, 22.-24.11.1999, Helsinki'
Date: 22. 1.2000 19:26:00

Zschunke, P.; Svensson, P.: Bücherbrett für alle Fälle : Geräte-Speicher fassen Tausende von Seiten (2000) 0.03

0.029996406 = product of:
  0.059992813 = sum of:
    0.059992813 = product of:
      0.119985625 = sum of:
        0.119985625 = weight(_text_:22 in 4823) [ClassicSimilarity], result of:
          0.119985625 = score(doc=4823,freq=4.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.6565931 = fieldWeight in 4823, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=4823)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 3. 5.1997 8:44:22
18. 6.2000 9:11:22

Riesthuis, G.J.A.: Some thoughts about the format of the Master Reference File database (2000) 0.03

0.029996406 = product of:
  0.059992813 = sum of:
    0.059992813 = product of:
      0.119985625 = sum of:
        0.119985625 = weight(_text_:22 in 6405) [ClassicSimilarity], result of:
          0.119985625 = score(doc=6405,freq=4.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.6565931 = fieldWeight in 6405, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=6405)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Extensions and corrections to the UDC. 22(2000), S.15-22

Franken, R.: Unternehmensintelligenz? : Unternehmensführung und Wissen (2005) 0.03

0.028280882 = product of:
  0.056561764 = sum of:
    0.056561764 = product of:
      0.11312353 = sum of:
        0.11312353 = weight(_text_:22 in 2638) [ClassicSimilarity], result of:
          0.11312353 = score(doc=2638,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.61904186 = fieldWeight in 2638, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=2638)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 2.2005 9:13:10

Ulrich, P.S.: Collaborative Digital Reference Service : Weltweites Projekt (2001) 0.03

0.028280882 = product of:
  0.056561764 = sum of:
    0.056561764 = product of:
      0.11312353 = sum of:
        0.11312353 = weight(_text_:22 in 5649) [ClassicSimilarity], result of:
          0.11312353 = score(doc=5649,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.61904186 = fieldWeight in 5649, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=5649)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 20. 4.2002 17:30:22

Franken, S.: Fördernd oder hemmend? : Wissenskultur als Voraussetzung für Lernen und Innovation (2005) 0.03

0.028280882 = product of:
  0.056561764 = sum of:
    0.056561764 = product of:
      0.11312353 = sum of:
        0.11312353 = weight(_text_:22 in 6468) [ClassicSimilarity], result of:
          0.11312353 = score(doc=6468,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.61904186 = fieldWeight in 6468, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=6468)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 2.2005 9:14:13

Fank, M.: Wissensbewahrung : Wissensverlust bedroht Geschäftserfolg (2005) 0.03

0.028280882 = product of:
  0.056561764 = sum of:
    0.056561764 = product of:
      0.11312353 = sum of:
        0.11312353 = weight(_text_:22 in 6746) [ClassicSimilarity], result of:
          0.11312353 = score(doc=6746,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.61904186 = fieldWeight in 6746, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=6746)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 2.2005 9:14:55

Tochtermann, K.; Granitzer, M.: Wissenserschließung : Pfade durch den digitalen Informationsdschungel (2005) 0.03

0.028280882 = product of:
  0.056561764 = sum of:
    0.056561764 = product of:
      0.11312353 = sum of:
        0.11312353 = weight(_text_:22 in 4390) [ClassicSimilarity], result of:
          0.11312353 = score(doc=4390,freq=2.0), product of:
            0.1827397 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052184064 = queryNorm
            0.61904186 = fieldWeight in 4390, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=4390)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 5.11.2005 19:18:22

Peters, G.; Gaese, V.: ¬Das DocCat-System in der Textdokumentation von G+J (2003) 0.03
```
0.025030928 = product of:
  0.050061855 = sum of:
    0.050061855 = sum of:
      0.021780973 = weight(_text_:systems in 1507) [ClassicSimilarity], result of:
        0.021780973 = score(doc=1507,freq=2.0), product of:
          0.16037072 = queryWeight, product of:
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.052184064 = queryNorm
          0.1358164 = fieldWeight in 1507, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0731742 = idf(docFreq=5561, maxDocs=44218)
            0.03125 = fieldNorm(doc=1507)
      0.028280882 = weight(_text_:22 in 1507) [ClassicSimilarity], result of:
        0.028280882 = score(doc=1507,freq=2.0), product of:
          0.1827397 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052184064 = queryNorm
          0.15476047 = fieldWeight in 1507, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.03125 = fieldNorm(doc=1507)
  0.5 = coord(1/2)
```
Abstract: We will first present the foundations of the text mining system at IBM, and then describe the project in somewhat more depth and detail, since that is what we know well. Accordingly, there are two parts: one for Heidelberg, one for Hamburg. First, once more, the technology. Text mining is a technology developed by IBM that was assembled for us in a particular configuration and implementation. For a long time our project was called DocText Miner; for some time now, at IBM's suggestion, it has been called DocCat, which is meant as an abbreviation of Document Categoriser and is, after all, a pleasant and descriptive name. We begin with text mining as developed at IBM in Heidelberg. There, automatic indexing is understood as one instance, that is, one part, of text mining. Problems are shown along the way, and text mining is a method for structuring and searching large document collections and for extracting information and, this being the ambitious claim, implicit relationships. The latter claim may be left open. IBM does this quantitatively, empirically, approximately, and quickly; that really has to be said. The goal, and this was very important for our project, is not to understand the text. Rather, the result of these methods is what is called a bundle of words or a bag of words: a set of meaning-bearing terms extracted from a text by algorithms, that is, essentially by computation. There are a good many preliminary linguistic studies, and a little linguistics is involved, but it is not the foundation of the whole thing. What IBM did for us, then, is the annotation of press texts for our press database. For those who are not yet familiar with it: Gruner + Jahr maintains a text documentation department whose database has been running since the early 1970s; it currently holds about 6.5 million documents, of which somewhat more than 1 million are full texts from 1993 onward.
For a long time the principle was that we assigned subject keywords to the documents stored in the database, and we continued this principle in reduced form when full text was introduced. These 6.5 million documents are accompanied by roughly 10 million facsimile pages, because we also keep the facsimiles as a matter of routine.
Date: 22. 4.2003 11:45:36
