Document (#27932)

Author
Keim, D.A.
Title
Datenvisualisierung und Data Mining
Source
Grundlagen der praktischen Information und Dokumentation. 5., völlig neu gefaßte Ausgabe. 2 Bde. Hrsg. von R. Kuhlen, Th. Seeger u. D. Strauch. Begründet von Klaus Laisiepen, Ernst Lutterbeck, Karl-Heinrich Meyer-Uhlenried. Bd.1: Handbuch zur Einführung in die Informationswissenschaft und -praxis
Imprint
München : Saur
Year
2004
Pages
pp. 363-370
Abstract
The rapid technological development of the past two decades now allows computers to store huge volumes of data persistently. Researchers at the University of California, Berkeley have calculated that about 1 exabyte (= 1 million terabytes) of data is generated every year, a large part of it in digital form. This means, however, that more data will be generated in the next three years than in all of human history before. The data are often recorded automatically by sensors and monitoring systems. Everyday activities such as paying by credit card or using the telephone, for example, are recorded by computers. Usually all available parameters are stored, which produces high-dimensional data sets. The data are collected because they contain valuable information that can offer a competitive advantage. Finding this valuable information in the large volumes of data is no easy task, however. Today's database management systems can display only small subsets of these huge volumes of data. If the data are output in textual form, for example, at most a few hundred lines fit on the screen. With millions of records, that is merely a drop in the ocean.
Theme
Search interfaces
Data Mining

Similar documents (content)

  1. Iglezakis, D.; Schembera, B.: Anforderungen der Ingenieurwissenschaften an das Forschungsdatenmanagement der Universität Stuttgart : Ergebnisse der Bedarfsanalyse des Projektes DIPL-ING (2018) 0.17
    0.16533498 = sum of:
      0.16533498 = product of:
        0.68889576 = sum of:
          0.022853693 = weight(abstract_txt:dass in 4488) [ClassicSimilarity], result of:
            0.022853693 = score(doc=4488,freq=1.0), product of:
              0.08087302 = queryWeight, product of:
                1.0060028 = boost
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.01778 = queryNorm
              0.28258735 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.046169072 = weight(abstract_txt:können in 4488) [ClassicSimilarity], result of:
            0.046169072 = score(doc=4488,freq=2.0), product of:
              0.1174232 = queryWeight, product of:
                1.4846358 = boost
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.01778 = queryNorm
              0.39318526 = fieldWeight in 4488, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.13205102 = weight(abstract_txt:generiert in 4488) [ClassicSimilarity], result of:
            0.13205102 = score(doc=4488,freq=1.0), product of:
              0.26041174 = queryWeight, product of:
                1.8052096 = boost
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.01778 = queryNorm
              0.5070855 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.194143 = weight(abstract_txt:datenmengen in 4488) [ClassicSimilarity], result of:
            0.194143 = score(doc=4488,freq=1.0), product of:
              0.38542894 = queryWeight, product of:
                2.68977 = boost
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.01778 = queryNorm
              0.50370634 = fieldWeight in 4488, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.21983929 = weight(abstract_txt:daten in 4488) [ClassicSimilarity], result of:
            0.21983929 = score(doc=4488,freq=6.0), product of:
              0.27321163 = queryWeight, product of:
                2.9235933 = boost
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.01778 = queryNorm
              0.80464834 = fieldWeight in 4488, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
          0.07383964 = weight(abstract_txt:werden in 4488) [ClassicSimilarity], result of:
            0.07383964 = score(doc=4488,freq=3.0), product of:
              0.19453841 = queryWeight, product of:
                3.120542 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.01778 = queryNorm
              0.3795633 = fieldWeight in 4488, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.0625 = fieldNorm(doc=4488)
        0.24 = coord(6/25)
    
  2. Unser, M.; Wäckerlin, D.: Dienstleistung "Abstract/Index" im NEBIS-Katalog (2006) 0.14
    0.14214335 = sum of:
      0.14214335 = product of:
        0.50765485 = sum of:
          0.01714027 = weight(abstract_txt:dass in 5030) [ClassicSimilarity], result of:
            0.01714027 = score(doc=5030,freq=1.0), product of:
              0.08087302 = queryWeight, product of:
                1.0060028 = boost
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.01778 = queryNorm
              0.21194051 = fieldWeight in 5030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.045713577 = weight(abstract_txt:informationen in 5030) [ClassicSimilarity], result of:
            0.045713577 = score(doc=5030,freq=4.0), product of:
              0.09797954 = queryWeight, product of:
                1.1072993 = boost
                4.976667 = idf(docFreq=828, maxDocs=44218)
                0.01778 = queryNorm
              0.4665625 = fieldWeight in 5030, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.976667 = idf(docFreq=828, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.042409 = weight(abstract_txt:können in 5030) [ClassicSimilarity], result of:
            0.042409 = score(doc=5030,freq=3.0), product of:
              0.1174232 = queryWeight, product of:
                1.4846358 = boost
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.01778 = queryNorm
              0.3611637 = fieldWeight in 5030, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.09903826 = weight(abstract_txt:generiert in 5030) [ClassicSimilarity], result of:
            0.09903826 = score(doc=5030,freq=1.0), product of:
              0.26041174 = queryWeight, product of:
                1.8052096 = boost
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.01778 = queryNorm
              0.3803141 = fieldWeight in 5030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.14560725 = weight(abstract_txt:datenmengen in 5030) [ClassicSimilarity], result of:
            0.14560725 = score(doc=5030,freq=1.0), product of:
              0.38542894 = queryWeight, product of:
                2.68977 = boost
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.01778 = queryNorm
              0.37777975 = fieldWeight in 5030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.06731176 = weight(abstract_txt:daten in 5030) [ClassicSimilarity], result of:
            0.06731176 = score(doc=5030,freq=1.0), product of:
              0.27321163 = queryWeight, product of:
                2.9235933 = boost
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.01778 = queryNorm
              0.24637222 = fieldWeight in 5030, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
          0.09043472 = weight(abstract_txt:werden in 5030) [ClassicSimilarity], result of:
            0.09043472 = score(doc=5030,freq=8.0), product of:
              0.19453841 = queryWeight, product of:
                3.120542 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.01778 = queryNorm
              0.46486822 = fieldWeight in 5030, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.046875 = fieldNorm(doc=5030)
        0.28 = coord(7/25)
    
  3. Schwartz, D.: Graphische Datenanalyse für digitale Bibliotheken : Leistungs- und Funktionsumfang moderner Analyse- und Visualisierungsinstrumente (2006) 0.13
    0.13347025 = sum of:
      0.13347025 = product of:
        0.8341891 = sum of:
          0.048969693 = weight(abstract_txt:können in 30) [ClassicSimilarity], result of:
            0.048969693 = score(doc=30,freq=1.0), product of:
              0.1174232 = queryWeight, product of:
                1.4846358 = boost
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.01778 = queryNorm
              0.41703594 = fieldWeight in 30, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.09375 = fieldNorm(doc=30)
          0.5043983 = weight(abstract_txt:datenmengen in 30) [ClassicSimilarity], result of:
            0.5043983 = score(doc=30,freq=3.0), product of:
              0.38542894 = queryWeight, product of:
                2.68977 = boost
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.01778 = queryNorm
              1.3086674 = fieldWeight in 30, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.09375 = fieldNorm(doc=30)
          0.1903864 = weight(abstract_txt:daten in 30) [ClassicSimilarity], result of:
            0.1903864 = score(doc=30,freq=2.0), product of:
              0.27321163 = queryWeight, product of:
                2.9235933 = boost
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.01778 = queryNorm
              0.6968459 = fieldWeight in 30, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.09375 = fieldNorm(doc=30)
          0.09043472 = weight(abstract_txt:werden in 30) [ClassicSimilarity], result of:
            0.09043472 = score(doc=30,freq=2.0), product of:
              0.19453841 = queryWeight, product of:
                3.120542 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.01778 = queryNorm
              0.46486822 = fieldWeight in 30, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.09375 = fieldNorm(doc=30)
        0.16 = coord(4/25)
    
  4. Richter, S.: Die formale Beschreibung von Dokumenten in Archiven und Bibliotheken : Perspektiven des Datenaustauschs (2004) 0.13
    0.12607743 = sum of:
      0.12607743 = product of:
        0.45027652 = sum of:
          0.02424 = weight(abstract_txt:dass in 4982) [ClassicSimilarity], result of:
            0.02424 = score(doc=4982,freq=2.0), product of:
              0.08087302 = queryWeight, product of:
                1.0060028 = boost
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.01778 = queryNorm
              0.29972914 = fieldWeight in 4982, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.046875 = fieldNorm(doc=4982)
          0.022856789 = weight(abstract_txt:informationen in 4982) [ClassicSimilarity], result of:
            0.022856789 = score(doc=4982,freq=1.0), product of:
              0.09797954 = queryWeight, product of:
                1.1072993 = boost
                4.976667 = idf(docFreq=828, maxDocs=44218)
                0.01778 = queryNorm
              0.23328125 = fieldWeight in 4982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.976667 = idf(docFreq=828, maxDocs=44218)
                0.046875 = fieldNorm(doc=4982)
          0.054749783 = weight(abstract_txt:können in 4982) [ClassicSimilarity], result of:
            0.054749783 = score(doc=4982,freq=5.0), product of:
              0.1174232 = queryWeight, product of:
                1.4846358 = boost
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.01778 = queryNorm
              0.46626037 = fieldWeight in 4982, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.046875 = fieldNorm(doc=4982)
          0.02676021 = weight(abstract_txt:aber in 4982) [ClassicSimilarity], result of:
            0.02676021 = score(doc=4982,freq=1.0), product of:
              0.12458965 = queryWeight, product of:
                1.5292693 = boost
                4.5821176 = idf(docFreq=1229, maxDocs=44218)
                0.01778 = queryNorm
              0.21478677 = fieldWeight in 4982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5821176 = idf(docFreq=1229, maxDocs=44218)
                0.046875 = fieldNorm(doc=4982)
          0.09903826 = weight(abstract_txt:generiert in 4982) [ClassicSimilarity], result of:
            0.09903826 = score(doc=4982,freq=1.0), product of:
              0.26041174 = queryWeight, product of:
                1.8052096 = boost
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.01778 = queryNorm
              0.3803141 = fieldWeight in 4982, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.046875 = fieldNorm(doc=4982)
          0.11658738 = weight(abstract_txt:daten in 4982) [ClassicSimilarity], result of:
            0.11658738 = score(doc=4982,freq=3.0), product of:
              0.27321163 = queryWeight, product of:
                2.9235933 = boost
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.01778 = queryNorm
              0.4267292 = fieldWeight in 4982, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.046875 = fieldNorm(doc=4982)
          0.10604411 = weight(abstract_txt:werden in 4982) [ClassicSimilarity], result of:
            0.10604411 = score(doc=4982,freq=11.0), product of:
              0.19453841 = queryWeight, product of:
                3.120542 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.01778 = queryNorm
              0.5451063 = fieldWeight in 4982, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.046875 = fieldNorm(doc=4982)
        0.28 = coord(7/25)
    
  5. Renker, L.: Exploration von Textkorpora : Topic Models als Grundlage der Interaktion (2015) 0.12
    0.11649197 = sum of:
      0.11649197 = product of:
        0.48538324 = sum of:
          0.039583758 = weight(abstract_txt:dass in 2380) [ClassicSimilarity], result of:
            0.039583758 = score(doc=2380,freq=3.0), product of:
              0.08087302 = queryWeight, product of:
                1.0060028 = boost
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.01778 = queryNorm
              0.48945564 = fieldWeight in 2380, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5213976 = idf(docFreq=1306, maxDocs=44218)
                0.0625 = fieldNorm(doc=2380)
          0.03047572 = weight(abstract_txt:informationen in 2380) [ClassicSimilarity], result of:
            0.03047572 = score(doc=2380,freq=1.0), product of:
              0.09797954 = queryWeight, product of:
                1.1072993 = boost
                4.976667 = idf(docFreq=828, maxDocs=44218)
                0.01778 = queryNorm
              0.31104168 = fieldWeight in 2380, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.976667 = idf(docFreq=828, maxDocs=44218)
                0.0625 = fieldNorm(doc=2380)
          0.046169072 = weight(abstract_txt:können in 2380) [ClassicSimilarity], result of:
            0.046169072 = score(doc=2380,freq=2.0), product of:
              0.1174232 = queryWeight, product of:
                1.4846358 = boost
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.01778 = queryNorm
              0.39318526 = fieldWeight in 2380, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4483833 = idf(docFreq=1405, maxDocs=44218)
                0.0625 = fieldNorm(doc=2380)
          0.194143 = weight(abstract_txt:datenmengen in 2380) [ClassicSimilarity], result of:
            0.194143 = score(doc=2380,freq=1.0), product of:
              0.38542894 = queryWeight, product of:
                2.68977 = boost
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.01778 = queryNorm
              0.50370634 = fieldWeight in 2380, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.0625 = fieldNorm(doc=2380)
          0.08974901 = weight(abstract_txt:daten in 2380) [ClassicSimilarity], result of:
            0.08974901 = score(doc=2380,freq=1.0), product of:
              0.27321163 = queryWeight, product of:
                2.9235933 = boost
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.01778 = queryNorm
              0.3284963 = fieldWeight in 2380, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.0625 = fieldNorm(doc=2380)
          0.08526268 = weight(abstract_txt:werden in 2380) [ClassicSimilarity], result of:
            0.08526268 = score(doc=2380,freq=4.0), product of:
              0.19453841 = queryWeight, product of:
                3.120542 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.01778 = queryNorm
              0.43828195 = fieldWeight in 2380, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.0625 = fieldNorm(doc=2380)
        0.24 = coord(6/25)
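
The score trees above all follow the same arithmetic, which can be checked by hand. The sketch below recomputes the score of the first result (doc 4488) from the values printed in its tree. The formulas are inferred from the listing itself (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1)), and a coord factor applied to the sum); the function and variable names are illustrative assumptions and not part of any library API.

```python
import math

# Constants copied from the score tree of doc 4488 in the listing above.
MAX_DOCS = 44218
QUERY_NORM = 0.01778
FIELD_NORM = 0.0625      # fieldNorm(doc=4488)
COORD = 6 / 25           # 6 of the 25 query terms matched


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as it appears in the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf * term_idf * FIELD_NORM
    return query_weight * field_weight


# (term, freq in doc 4488, docFreq, boost), all taken from the listing.
terms = [
    ("dass", 1.0, 1306, 1.0060028),
    ("können", 2.0, 1405, 1.4846358),
    ("generiert", 1.0, 35, 1.8052096),
    ("datenmengen", 1.0, 37, 2.68977),
    ("daten", 6.0, 626, 2.9235933),
    ("werden", 3.0, 3606, 3.120542),
]

# Sum the per-term scores, then scale by the coordination factor.
partial = sum(term_score(freq, df, boost) for _, freq, df, boost in terms)
score = partial * COORD

print(f"sum of term scores: {partial:.6f}")
print(f"final score:        {score:.6f}")
```

The listing reports 0.68889576 for the sum and 0.16533498 for the final score of doc 4488. Small deviations in the last digits are to be expected, because the listing rounds queryNorm and was presumably computed in single precision.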