Document (#43776)

Author
Mandl, T.
Title
Text Mining und Data Mining
Source
Grundlagen der Informationswissenschaft. Ed. by Rainer Kuhlen, Dirk Lewandowski, Wolfgang Semar and Christa Womser-Hacker. 7th, completely revised ed.
Imprint
Berlin : De Gruyter
Year
2023
Pages
pp. 327-338
Abstract
Text mining and data mining are a bundle of technologies closely tied to statistics, machine learning, and pattern recognition. The usual definitions cover a wide variety of methods without drawing an exact boundary. Data mining denotes the search for patterns, regularities, or anomalies in highly structured and, above all, numerical data. "Any algorithm that enumerates patterns from, or fits models to, data is a data mining algorithm." Numerical data and database contents are referred to as structured data. By contrast, text documents in natural language are considered unstructured data.
Footnote
Cf. https://doi.org/10.1515/9783110769043.
Theme
Data Mining

Similar documents (author)

  1. Mandl, T.: Einsatz neuronaler Netze als Transferkomponenten beim Retrieval in heterogenen Dokumentbeständen (2000) 5.14
    5.1444697 = sum of:
      5.1444697 = weight(author_txt:mandl in 6563) [ClassicSimilarity], result of:
        5.1444697 = fieldWeight in 6563, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.231152 = idf(docFreq=31, maxDocs=44218)
          0.625 = fieldNorm(doc=6563)
    
  2. Mandl, T.: Web- und Multimedia-Dokumente : Neuere Entwicklungen bei der Evaluierung von Information Retrieval Systemen (2003) 5.14
    5.1444697 = sum of:
      5.1444697 = weight(author_txt:mandl in 1734) [ClassicSimilarity], result of:
        5.1444697 = fieldWeight in 1734, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.231152 = idf(docFreq=31, maxDocs=44218)
          0.625 = fieldNorm(doc=1734)
    
  3. Mandl, T.: Qualität als neue Dimension im Information Retrieval : Das AQUAINT Projekt (2005) 5.14
    5.1444697 = sum of:
      5.1444697 = weight(author_txt:mandl in 3184) [ClassicSimilarity], result of:
        5.1444697 = fieldWeight in 3184, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.231152 = idf(docFreq=31, maxDocs=44218)
          0.625 = fieldNorm(doc=3184)
    
  4. Mandl, T.: Tolerantes Information Retrieval : Neuronale Netze zur Erhöhung der Adaptivität und Flexibilität bei der Informationssuche (2001) 5.14
    5.1444697 = sum of:
      5.1444697 = weight(author_txt:mandl in 5965) [ClassicSimilarity], result of:
        5.1444697 = fieldWeight in 5965, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.231152 = idf(docFreq=31, maxDocs=44218)
          0.625 = fieldNorm(doc=5965)
    
  5. Mandl, T.: Neue Entwicklungen bei den Evaluierungsinitiativen im Information Retrieval (2006) 5.14
    5.1444697 = sum of:
      5.1444697 = weight(author_txt:mandl in 5975) [ClassicSimilarity], result of:
        5.1444697 = fieldWeight in 5975, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.231152 = idf(docFreq=31, maxDocs=44218)
          0.625 = fieldNorm(doc=5975)
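
The author-match explanations above all follow the same arithmetic: fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq). The sketch below reproduces that calculation. It assumes the idf form 1 + ln(maxDocs / (docFreq + 1)), which is the form that matches the figures listed here; the function names are illustrative and not part of any library, and fieldNorm is simply taken from the listing rather than recomputed.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency in the form that matches the listing
    (assumed: 1 + ln(maxDocs / (docFreq + 1)))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(term_freq: float, doc_freq: int, max_docs: int,
                 field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)."""
    tf = math.sqrt(term_freq)
    return tf * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explanation for author_txt:mandl.
weight = field_weight(term_freq=1.0, doc_freq=31, max_docs=44218,
                      field_norm=0.625)
print(f"idf         = {classic_idf(31, 44218):.6f}")
print(f"fieldWeight = {weight:.7f}")
```

Because every author hit has termFreq=1.0 and the same fieldNorm, all five entries end up with the same score, which is why the list shows 5.14 throughout. Small differences in the last digits are to be expected, since the listing was computed in single precision.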
    

Similar documents (content)

  1. Keim, D.A.: Data Mining mit bloßem Auge (2002) 0.13
    0.12891766 = sum of:
      0.12891766 = product of:
        1.0743139 = sum of:
          0.102617174 = weight(abstract_txt:data in 1086) [ClassicSimilarity], result of:
            0.102617174 = score(doc=1086,freq=1.0), product of:
              0.123029344 = queryWeight, product of:
                2.1642382 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017038537 = queryNorm
              0.83408695 = fieldWeight in 1086, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.25 = fieldNorm(doc=1086)
          0.32095745 = weight(abstract_txt:daten in 1086) [ClassicSimilarity], result of:
            0.32095745 = score(doc=1086,freq=1.0), product of:
              0.2442626 = queryWeight, product of:
                2.7275593 = boost
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.017038537 = queryNorm
              1.3139852 = fieldWeight in 1086, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.25 = fieldNorm(doc=1086)
          0.6507392 = weight(abstract_txt:mining in 1086) [ClassicSimilarity], result of:
            0.6507392 = score(doc=1086,freq=1.0), product of:
              0.42150235 = queryWeight, product of:
                4.0059056 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017038537 = queryNorm
              1.5438566 = fieldWeight in 1086, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.25 = fieldNorm(doc=1086)
        0.12 = coord(3/25)
    
  2. Witschel, H.F.: Text, Wörter, Morpheme : Möglichkeiten einer automatischen Terminologie-Extraktion (2004) 0.10
    0.10235454 = sum of:
      0.10235454 = product of:
        0.5117727 = sum of:
          0.03322282 = weight(abstract_txt:sind in 126) [ClassicSimilarity], result of:
            0.03322282 = score(doc=126,freq=4.0), product of:
              0.06784633 = queryWeight, product of:
                1.0164684 = boost
                3.9174201 = idf(docFreq=2390, maxDocs=44218)
                0.017038537 = queryNorm
              0.48967752 = fieldWeight in 126, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9174201 = idf(docFreq=2390, maxDocs=44218)
                0.0625 = fieldNorm(doc=126)
          0.031648792 = weight(abstract_txt:text in 126) [ClassicSimilarity], result of:
            0.031648792 = score(doc=126,freq=3.0), product of:
              0.0722969 = queryWeight, product of:
                1.049278 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.017038537 = queryNorm
              0.4377614 = fieldWeight in 126, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=126)
          0.07887394 = weight(abstract_txt:natürlicher in 126) [ClassicSimilarity], result of:
            0.07887394 = score(doc=126,freq=1.0), product of:
              0.15212515 = queryWeight, product of:
                1.0762576 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.017038537 = queryNorm
              0.5184806 = fieldWeight in 126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.0625 = fieldNorm(doc=126)
          0.20534235 = weight(abstract_txt:mustern in 126) [ClassicSimilarity], result of:
            0.20534235 = score(doc=126,freq=1.0), product of:
              0.3627224 = queryWeight, product of:
                2.3502707 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.017038537 = queryNorm
              0.56611437 = fieldWeight in 126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.0625 = fieldNorm(doc=126)
          0.1626848 = weight(abstract_txt:mining in 126) [ClassicSimilarity], result of:
            0.1626848 = score(doc=126,freq=1.0), product of:
              0.42150235 = queryWeight, product of:
                4.0059056 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017038537 = queryNorm
              0.38596416 = fieldWeight in 126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=126)
        0.2 = coord(5/25)
    
  3. Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.08
    0.08015717 = sum of:
      0.08015717 = product of:
        0.66797644 = sum of:
          0.036544878 = weight(abstract_txt:text in 354) [ClassicSimilarity], result of:
            0.036544878 = score(doc=354,freq=4.0), product of:
              0.0722969 = queryWeight, product of:
                1.049278 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.017038537 = queryNorm
              0.5054833 = fieldWeight in 354, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=354)
          0.06787488 = weight(abstract_txt:data in 354) [ClassicSimilarity], result of:
            0.06787488 = score(doc=354,freq=7.0), product of:
              0.123029344 = queryWeight, product of:
                2.1642382 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017038537 = queryNorm
              0.55169666 = fieldWeight in 354, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=354)
          0.5635567 = weight(abstract_txt:mining in 354) [ClassicSimilarity], result of:
            0.5635567 = score(doc=354,freq=12.0), product of:
              0.42150235 = queryWeight, product of:
                4.0059056 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017038537 = queryNorm
              1.3370191 = fieldWeight in 354, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=354)
        0.12 = coord(3/25)
    
  4. Narock, T.; Zhou, L.; Yoon, V.: Semantic similarity of ontology instances using polarity mining (2013) 0.08
    0.079082794 = sum of:
      0.079082794 = product of:
        0.49426746 = sum of:
          0.02284055 = weight(abstract_txt:text in 620) [ClassicSimilarity], result of:
            0.02284055 = score(doc=620,freq=1.0), product of:
              0.0722969 = queryWeight, product of:
                1.049278 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.017038537 = queryNorm
              0.3159271 = fieldWeight in 620, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=620)
          0.12829496 = weight(abstract_txt:algorithm in 620) [ClassicSimilarity], result of:
            0.12829496 = score(doc=620,freq=4.0), product of:
              0.14391357 = queryWeight, product of:
                1.4804085 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.017038537 = queryNorm
              0.89147234 = fieldWeight in 620, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.078125 = fieldNorm(doc=620)
          0.055543173 = weight(abstract_txt:data in 620) [ClassicSimilarity], result of:
            0.055543173 = score(doc=620,freq=3.0), product of:
              0.123029344 = queryWeight, product of:
                2.1642382 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017038537 = queryNorm
              0.4514628 = fieldWeight in 620, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.078125 = fieldNorm(doc=620)
          0.2875888 = weight(abstract_txt:mining in 620) [ClassicSimilarity], result of:
            0.2875888 = score(doc=620,freq=2.0), product of:
              0.42150235 = queryWeight, product of:
                4.0059056 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.017038537 = queryNorm
              0.68229467 = fieldWeight in 620, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.078125 = fieldNorm(doc=620)
        0.16 = coord(4/25)
    
  5. Jetter, H.-C.: Informationsvisualisierung und Visual Analytics (2023) 0.07
    0.07479247 = sum of:
      0.07479247 = product of:
        0.46745294 = sum of:
          0.028771807 = weight(abstract_txt:sind in 791) [ClassicSimilarity], result of:
            0.028771807 = score(doc=791,freq=3.0), product of:
              0.06784633 = queryWeight, product of:
                1.0164684 = boost
                3.9174201 = idf(docFreq=2390, maxDocs=44218)
                0.017038537 = queryNorm
              0.42407316 = fieldWeight in 791, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9174201 = idf(docFreq=2390, maxDocs=44218)
                0.0625 = fieldNorm(doc=791)
          0.094360106 = weight(abstract_txt:statistik in 791) [ClassicSimilarity], result of:
            0.094360106 = score(doc=791,freq=1.0), product of:
              0.17143689 = queryWeight, product of:
                1.1425307 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.017038537 = queryNorm
              0.55040723 = fieldWeight in 791, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0625 = fieldNorm(doc=791)
          0.20534235 = weight(abstract_txt:mustern in 791) [ClassicSimilarity], result of:
            0.20534235 = score(doc=791,freq=1.0), product of:
              0.3627224 = queryWeight, product of:
                2.3502707 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.017038537 = queryNorm
              0.56611437 = fieldWeight in 791, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.0625 = fieldNorm(doc=791)
          0.13897866 = weight(abstract_txt:daten in 791) [ClassicSimilarity], result of:
            0.13897866 = score(doc=791,freq=3.0), product of:
              0.2442626 = queryWeight, product of:
                2.7275593 = boost
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.017038537 = queryNorm
              0.5689723 = fieldWeight in 791, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.255941 = idf(docFreq=626, maxDocs=44218)
                0.0625 = fieldNorm(doc=791)
        0.16 = coord(4/25)
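
The content-based explanations add two factors to the per-term arithmetic: a queryWeight (boost * idf * queryNorm) and a coordination factor coord(matched/total) applied to the sum over all matching terms. The sketch below recomputes the first entry (doc=1086) from the numbers in the listing. It assumes the same idf form as the figures above imply, 1 + ln(maxDocs / (docFreq + 1)); boost, queryNorm, and fieldNorm are copied from the listing, and all function names are illustrative only.

```python
import math

QUERY_NORM = 0.017038537  # copied from the listing
MAX_DOCS = 44218          # copied from the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed idf form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, term_freq: float,
               field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(term_freq) * idf * field_norm
    return query_weight * field_weight


def document_score(terms, matched: int, total: int) -> float:
    """Sum of the term scores, scaled by coord(matched/total)."""
    return sum(term_score(*t) for t in terms) * (matched / total)


# (boost, docFreq, termFreq, fieldNorm) for doc=1086, from the listing.
doc_1086_terms = [
    (2.1642382, 4274, 1.0, 0.25),  # abstract_txt:data
    (2.7275593, 626, 1.0, 0.25),   # abstract_txt:daten
    (4.0059056, 249, 1.0, 0.25),   # abstract_txt:mining
]
score = document_score(doc_1086_terms, matched=3, total=25)
print(f"score(doc=1086) = {score:.8f}")
```

The coord factor explains why a document that matches only 3 of the 25 query terms drops to roughly 0.13 even though its summed term scores exceed 1.0. The terms themselves (data, daten, mining, mustern, and so on) are index tokens from the abstract_txt field.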