Search (34 results, page 1 of 2)

  • theme_ss:"Data Mining"
  1. Peters, G.; Gaese, V.: ¬Das DocCat-System in der Textdokumentation von G+J (2003) 0.02
    0.017031772 = product of:
      0.045418058 = sum of:
        0.02999498 = weight(_text_:geschichte in 1507) [ClassicSimilarity], result of:
          0.02999498 = score(doc=1507,freq=2.0), product of:
            0.14280191 = queryWeight, product of:
              4.7528 = idf(docFreq=1036, maxDocs=44218)
              0.030045848 = queryNorm
            0.21004607 = fieldWeight in 1507, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7528 = idf(docFreq=1036, maxDocs=44218)
              0.03125 = fieldNorm(doc=1507)
        0.00999535 = product of:
          0.0199907 = sum of:
            0.0199907 = weight(_text_:1993 in 1507) [ClassicSimilarity], result of:
              0.0199907 = score(doc=1507,freq=2.0), product of:
                0.11657991 = queryWeight, product of:
                  3.8800673 = idf(docFreq=2481, maxDocs=44218)
                  0.030045848 = queryNorm
                0.17147636 = fieldWeight in 1507, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8800673 = idf(docFreq=2481, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1507)
          0.5 = coord(1/2)
        0.0054277307 = product of:
          0.016283192 = sum of:
            0.016283192 = weight(_text_:22 in 1507) [ClassicSimilarity], result of:
              0.016283192 = score(doc=1507,freq=2.0), product of:
                0.105215445 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030045848 = queryNorm
                0.15476047 = fieldWeight in 1507, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1507)
          0.33333334 = coord(1/3)
      0.375 = coord(3/8)
    
    Abstract
    We will first outline the fundamentals of IBM's text mining system and then describe the project itself in more detail, since that is where our expertise lies. The talk therefore has two parts, one on Heidelberg and one on Hamburg. First, a word on the technology. Text mining is a technology developed by IBM that was configured and programmed in a particular form for us. For a long time the project was known internally as DocText Miner; for some time now, at IBM's suggestion, it has been called DocCat, short for Document Categoriser, which is a pleasant and descriptive name. We begin with text mining as developed at IBM in Heidelberg. IBM regards automatic indexing as one instance, that is, one part, of text mining. Problems are shown along the way. Text mining is a method for structuring and searching large document collections, for extracting information and, this being the ambitious claim, for extracting implicit relationships. We leave that last claim open. IBM's approach is quantitative, empirical, approximate, and fast, which really must be acknowledged. The goal, and this was very important for our project, is not to understand the text. Rather, the result of these procedures is what is called "a bundle of words" or "a bag of words": a set of meaning-bearing terms extracted from a text by algorithms, that is, essentially by computation. There is a good deal of preliminary linguistic work, and a little linguistics is involved, but it is not the foundation of the whole approach. What IBM did for us, then, is the annotation of press texts for our press database. For those not yet familiar with it: Gruner + Jahr maintains a text documentation department that has run a database since the early 1970s. It currently holds about 6.5 million documents, of which slightly more than 1 million are full texts from 1993 onwards.
    For a long time the principle was that we assigned subject keywords to the documents stored in the database, and we continued this practice in reduced form after full text was introduced. These 6.5 million documents are accompanied by roughly 10 million facsimile pages, because we also keep the facsimiles as standard practice.
    Date
    22. 4.2003 11:45:36
  2. Amir, A.; Feldman, R.; Kashi, R.: ¬A new and versatile method for association generation (1997) 0.01
    0.0054523647 = product of:
      0.043618917 = sum of:
        0.043618917 = product of:
          0.065428376 = sum of:
            0.032861996 = weight(_text_:29 in 1270) [ClassicSimilarity], result of:
              0.032861996 = score(doc=1270,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.31092256 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
            0.032566383 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
              0.032566383 = score(doc=1270,freq=2.0), product of:
                0.105215445 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030045848 = queryNorm
                0.30952093 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
          0.6666667 = coord(2/3)
      0.125 = coord(1/8)
    
    Date
    5. 4.1996 15:29:15
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
  3. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.00
    0.004770819 = product of:
      0.038166553 = sum of:
        0.038166553 = product of:
          0.05724983 = sum of:
            0.028754247 = weight(_text_:29 in 2908) [ClassicSimilarity], result of:
              0.028754247 = score(doc=2908,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.27205724 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
            0.028495584 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
              0.028495584 = score(doc=2908,freq=2.0), product of:
                0.105215445 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030045848 = queryNorm
                0.2708308 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
          0.6666667 = coord(2/3)
      0.125 = coord(1/8)
    
    Date
    5. 4.1996 15:29:15
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
  4. Budzik, J.; Hammond, K.J.; Birnbaum, L.: Information access in context (2001) 0.00
    0.0023961873 = product of:
      0.019169498 = sum of:
        0.019169498 = product of:
          0.057508495 = sum of:
            0.057508495 = weight(_text_:29 in 3835) [ClassicSimilarity], result of:
              0.057508495 = score(doc=3835,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.5441145 = fieldWeight in 3835, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3835)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    29. 3.2002 17:31:17
  5. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.00
    0.002374632 = product of:
      0.018997056 = sum of:
        0.018997056 = product of:
          0.056991167 = sum of:
            0.056991167 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
              0.056991167 = score(doc=4577,freq=2.0), product of:
                0.105215445 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030045848 = queryNorm
                0.5416616 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    2. 4.2000 18:01:22
  6. Witten, I.H.; Frank, E.: Data Mining : Praktische Werkzeuge und Techniken für das maschinelle Lernen (2000) 0.00
    0.002053875 = product of:
      0.016431 = sum of:
        0.016431 = product of:
          0.049292997 = sum of:
            0.049292997 = weight(_text_:29 in 6833) [ClassicSimilarity], result of:
              0.049292997 = score(doc=6833,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.46638384 = fieldWeight in 6833, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6833)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    27. 1.1996 10:29:55
  7. Keim, D.A.: Data Mining mit bloßem Auge (2002) 0.00
    0.002053875 = product of:
      0.016431 = sum of:
        0.016431 = product of:
          0.049292997 = sum of:
            0.049292997 = weight(_text_:29 in 1086) [ClassicSimilarity], result of:
              0.049292997 = score(doc=1086,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.46638384 = fieldWeight in 1086, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1086)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    31.12.1996 19:29:41
  8. Kruse, R.; Borgelt, C.: Suche im Datendschungel (2002) 0.00
    0.002053875 = product of:
      0.016431 = sum of:
        0.016431 = product of:
          0.049292997 = sum of:
            0.049292997 = weight(_text_:29 in 1087) [ClassicSimilarity], result of:
              0.049292997 = score(doc=1087,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.46638384 = fieldWeight in 1087, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1087)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    31.12.1996 19:29:41
  9. Wrobel, S.: Lern- und Entdeckungsverfahren (2002) 0.00
    0.002053875 = product of:
      0.016431 = sum of:
        0.016431 = product of:
          0.049292997 = sum of:
            0.049292997 = weight(_text_:29 in 1105) [ClassicSimilarity], result of:
              0.049292997 = score(doc=1105,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.46638384 = fieldWeight in 1105, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1105)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    31.12.1996 19:29:41
  10. KDD : techniques and applications (1998) 0.00
    0.002035399 = product of:
      0.016283192 = sum of:
        0.016283192 = product of:
          0.04884957 = sum of:
            0.04884957 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
              0.04884957 = score(doc=6783,freq=2.0), product of:
                0.105215445 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030045848 = queryNorm
                0.46428138 = fieldWeight in 6783, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6783)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held in Singapore, 22-23 Feb 1997
  11. Borgelt, C.; Kruse, R.: Unsicheres Wissen nutzen (2002) 0.00
    0.0017115625 = product of:
      0.0136925 = sum of:
        0.0136925 = product of:
          0.0410775 = sum of:
            0.0410775 = weight(_text_:29 in 1104) [ClassicSimilarity], result of:
              0.0410775 = score(doc=1104,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.38865322 = fieldWeight in 1104, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1104)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    31.12.1996 19:29:41
  12. Cardie, C.: Empirical methods in information extraction (1997) 0.00
    0.0013692499 = product of:
      0.010953999 = sum of:
        0.010953999 = product of:
          0.032861996 = sum of:
            0.032861996 = weight(_text_:29 in 3246) [ClassicSimilarity], result of:
              0.032861996 = score(doc=3246,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.31092256 = fieldWeight in 3246, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3246)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    6. 3.1999 13:50:29
  13. Tiefschürfen in Datenbanken (2002) 0.00
    0.0013692499 = product of:
      0.010953999 = sum of:
        0.010953999 = product of:
          0.032861996 = sum of:
            0.032861996 = weight(_text_:29 in 996) [ClassicSimilarity], result of:
              0.032861996 = score(doc=996,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.31092256 = fieldWeight in 996, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=996)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    31.12.1996 19:29:41
  14. Bath, P.A.: Data mining in health and medical information (2003) 0.00
    0.0013692499 = product of:
      0.010953999 = sum of:
        0.010953999 = product of:
          0.032861996 = sum of:
            0.032861996 = weight(_text_:29 in 4263) [ClassicSimilarity], result of:
              0.032861996 = score(doc=4263,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.31092256 = fieldWeight in 4263, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4263)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    23.10.2005 18:29:03
  15. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.00
    0.0013569327 = product of:
      0.0108554615 = sum of:
        0.0108554615 = product of:
          0.032566383 = sum of:
            0.032566383 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
              0.032566383 = score(doc=1737,freq=2.0), product of:
                0.105215445 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030045848 = queryNorm
                0.30952093 = fieldWeight in 1737, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    22.11.1998 18:57:22
  16. Lusti, M.: Data Warehousing and Data Mining : Eine Einführung in entscheidungsunterstützende Systeme (1999) 0.00
    0.0013569327 = product of:
      0.0108554615 = sum of:
        0.0108554615 = product of:
          0.032566383 = sum of:
            0.032566383 = weight(_text_:22 in 4261) [ClassicSimilarity], result of:
              0.032566383 = score(doc=4261,freq=2.0), product of:
                0.105215445 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030045848 = queryNorm
                0.30952093 = fieldWeight in 4261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4261)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    17. 7.2002 19:22:06
  17. Methodologies for knowledge discovery and data mining : Third Pacific-Asia Conference, PAKDD'99, Beijing, China, April 26-28, 1999, Proceedings (1999) 0.00
    0.0011980936 = product of:
      0.009584749 = sum of:
        0.009584749 = product of:
          0.028754247 = sum of:
            0.028754247 = weight(_text_:29 in 3821) [ClassicSimilarity], result of:
              0.028754247 = score(doc=3821,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.27205724 = fieldWeight in 3821, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3821)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Abstract
    The 29 revised full papers presented together with 37 short papers were carefully selected from a total of 158 submissions. The book is divided into sections on emerging KDD technology; association rules; feature selection and generation; mining in semi-unstructured data; interestingness, surprisingness, and exceptions; rough sets, fuzzy logic, and neural networks; induction, classification, and clustering; visualization, causal models and graph-based methods; agent-based and distributed data mining; and advanced topics and new methodologies.
  18. Wiegmann, S.: Hättest du die Titanic überlebt? : Eine kurze Einführung in das Data Mining mit freier Software (2023) 0.00
    0.0011980936 = product of:
      0.009584749 = sum of:
        0.009584749 = product of:
          0.028754247 = sum of:
            0.028754247 = weight(_text_:29 in 876) [ClassicSimilarity], result of:
              0.028754247 = score(doc=876,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.27205724 = fieldWeight in 876, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=876)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    28. 1.2022 11:05:29
  19. Srinivasan, P.: Text mining in biomedicine : challenges and opportunities (2006) 0.00
    0.0010269375 = product of:
      0.0082155 = sum of:
        0.0082155 = product of:
          0.024646498 = sum of:
            0.024646498 = weight(_text_:29 in 1497) [ClassicSimilarity], result of:
              0.024646498 = score(doc=1497,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.23319192 = fieldWeight in 1497, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1497)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    29. 2.2008 17:14:09
  20. Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.00
    0.0010269375 = product of:
      0.0082155 = sum of:
        0.0082155 = product of:
          0.024646498 = sum of:
            0.024646498 = weight(_text_:29 in 3464) [ClassicSimilarity], result of:
              0.024646498 = score(doc=3464,freq=2.0), product of:
                0.1056919 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.030045848 = queryNorm
                0.23319192 = fieldWeight in 3464, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3464)
          0.33333334 = coord(1/3)
      0.125 = coord(1/8)
    
    Date
    1. 6.2010 9:29:57
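
The score breakdowns in this listing are labelled ClassicSimilarity, Lucene's TF-IDF scoring. As a minimal sketch, assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), both of which agree with the figures printed above, the following Python reconstructs the score of the first hit (doc 1507). The function names are illustrative and not part of any library, and queryNorm, fieldNorm, and the coord factors are copied from the listing rather than derived.

```python
import math

MAX_DOCS = 44218          # maxDocs, as printed in the listing
QUERY_NORM = 0.030045848  # queryNorm, copied from the listing (not derived here)


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency; assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float,
               query_norm: float = QUERY_NORM) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                          # tf(freq=2.0) -> 1.4142135
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm          # idf * queryNorm
    field_weight = tf * term_idf * field_norm     # tf * idf * fieldNorm
    return query_weight * field_weight


# The three matching clauses of hit 1 (doc 1507), each with fieldNorm 0.03125.
geschichte = term_score(2.0, 1036, 0.03125)              # top-level term clause
year_1993 = term_score(2.0, 2481, 0.03125) * (1 / 2)     # nested clause, coord(1/2)
number_22 = term_score(2.0, 3622, 0.03125) * (1 / 3)     # nested clause, coord(1/3)

# The outer coord(3/8) scales the sum of the matching clauses.
total = (geschichte + year_1993 + number_22) * (3 / 8)

print(f"idf(geschichte) = {idf(1036):.4f}")
print(f"score(doc 1507) = {total:.9f}")
```

Lucene computes these values in 32-bit floats, so the last digits of a double-precision result may differ slightly from the listing. The same functions apply to the other hits; only docFreq, fieldNorm, and the coord factors change.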

Languages

  • e 20
  • d 14
