Document (#43131)

Author
Giesselbach, S.
Estler-Ziegler, T.
Title
Dokumente schneller analysieren mit Künstlicher Intelligenz
Source
E-mail to Inetbib dated 06.02.2021, from Tania Estler-Ziegler
Year
2021
Abstract
Artificial intelligence (AI) and natural language understanding (NLU) are changing many aspects of our everyday lives and the way we work. NLU gained particular prominence through voice assistants such as Siri, Alexa, and Google Now. NLU offers companies and institutions the potential to make processes more efficient and to derive added value from textual content. NLU solutions are thus able to index the content of complex, unstructured documents. For semantic text analysis, the NLU team at IAIS has developed language models that are trained with deep learning methods. The NLU suite analyzes documents, extracts key data, and, if required, even produces a structured summary. These results, as well as the content of the documents themselves, can be used to compare documents or to find texts containing similar information. AI-based language models are clearly superior to classical keyword indexing, because they not only find texts containing predefined keywords but also search intelligently for terms that appear in a similar context or are used as synonyms. The talk places the terms "artificial intelligence" and "natural language understanding" in context and outlines possibilities, limitations, current research directions, and methods. Practical examples then demonstrate how NLU can be used for automated voucher and receipt processing, for cataloguing large data collections such as news and patents, and for automated thematic grouping of social media posts and publications.
Content
Cf.: https://www.iais.fraunhofer.de/.
Footnote
Talk given at the Berliner Arbeitskreis Information (BAK) on 25.02.2021.
Theme
Computational linguistics
Automatic indexing
Field
Computer science
Linguistics

Similar documents (author)

  1. Ziegler, R.A.; Ziegler, R.S.: ¬The National Film Registry : a videography (1995) 6.39
    6.3892665 = sum of:
      6.3892665 = weight(author_txt:ziegler in 3445) [ClassicSimilarity], result of:
        6.3892665 = fieldWeight in 3445, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          9.035788 = idf(docFreq=13, maxDocs=43254)
          0.5 = fieldNorm(doc=3445)
    
  2. Ziegler, J.: ¬Der Auskunftsbibliothekar : ein Zauberlehrling? (1991) 5.65
    5.6473675 = sum of:
      5.6473675 = weight(author_txt:ziegler in 4325) [ClassicSimilarity], result of:
        5.6473675 = fieldWeight in 4325, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.035788 = idf(docFreq=13, maxDocs=43254)
          0.625 = fieldNorm(doc=4325)
    
  3. Ziegler, B.: ESS: ein schneller Algorithmus zur Mustersuche in Zeichenfolgen (1996) 5.65
    5.6473675 = sum of:
      5.6473675 = weight(author_txt:ziegler in 613) [ClassicSimilarity], result of:
        5.6473675 = fieldWeight in 613, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.035788 = idf(docFreq=13, maxDocs=43254)
          0.625 = fieldNorm(doc=613)
    
  4. Ziegler, C.: Smartes Chaos : Web 2.0 versus Semantic Web (2006) 5.65
    5.6473675 = sum of:
      5.6473675 = weight(author_txt:ziegler in 6869) [ClassicSimilarity], result of:
        5.6473675 = fieldWeight in 6869, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.035788 = idf(docFreq=13, maxDocs=43254)
          0.625 = fieldNorm(doc=6869)
    
  5. Ziegler, C.: Weltendämmerung : XML und Datenbanken: Einblick in Tamino (2001) 5.65
    5.6473675 = sum of:
      5.6473675 = weight(author_txt:ziegler in 803) [ClassicSimilarity], result of:
        5.6473675 = fieldWeight in 803, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.035788 = idf(docFreq=13, maxDocs=43254)
          0.625 = fieldNorm(doc=803)
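
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); these formulas are an assumption chosen because they agree with the figures listed, and the function names are illustrative only. The fieldNorm values are taken from the listing as given rather than recomputed.

```python
import math

def classic_tf(freq):
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency (assumed variant, matches the listing).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Result 1: "ziegler" occurs twice in the author field, fieldNorm 0.5.
w_two_authors = field_weight(2.0, 13, 43254, 0.5)
# Results 2-5: "ziegler" occurs once, fieldNorm 0.625.
w_one_author = field_weight(1.0, 13, 43254, 0.625)

print(round(w_two_authors, 4), round(w_one_author, 4))
```

Under these assumptions the first record outranks the others because its doubled term frequency outweighs its smaller fieldNorm.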
    

Similar documents (content)

  1. Nohr, H.: Theorie des Information Retrieval II : Automatische Indexierung (2004) 0.11
    0.109781735 = sum of:
      0.109781735 = product of:
        0.54890865 = sum of:
          0.029122567 = weight(abstract_txt:oder in 1473) [ClassicSimilarity], result of:
            0.029122567 = score(doc=1473,freq=2.0), product of:
              0.07714136 = queryWeight, product of:
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.018060923 = queryNorm
              0.37752208 = fieldWeight in 1473, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.0625 = fieldNorm(doc=1473)
          0.10256057 = weight(abstract_txt:unstrukturierte in 1473) [ClassicSimilarity], result of:
            0.10256057 = score(doc=1473,freq=1.0), product of:
              0.1785615 = queryWeight, product of:
                1.0758092 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.018060923 = queryNorm
              0.57437116 = fieldWeight in 1473, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.0625 = fieldNorm(doc=1473)
          0.105501406 = weight(abstract_txt:textanalyse in 1473) [ClassicSimilarity], result of:
            0.105501406 = score(doc=1473,freq=1.0), product of:
              0.18195878 = queryWeight, product of:
                1.0859951 = boost
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.018060923 = queryNorm
              0.57980937 = fieldWeight in 1473, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.27695 = idf(docFreq=10, maxDocs=43254)
                0.0625 = fieldNorm(doc=1473)
          0.029955737 = weight(abstract_txt:werden in 1473) [ClassicSimilarity], result of:
            0.029955737 = score(doc=1473,freq=3.0), product of:
              0.07860573 = queryWeight, product of:
                1.2363149 = boost
                3.5203447 = idf(docFreq=3478, maxDocs=43254)
                0.018060923 = queryNorm
              0.3810885 = fieldWeight in 1473, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5203447 = idf(docFreq=3478, maxDocs=43254)
                0.0625 = fieldNorm(doc=1473)
          0.28176835 = weight(abstract_txt:dokumente in 1473) [ClassicSimilarity], result of:
            0.28176835 = score(doc=1473,freq=3.0), product of:
              0.41528407 = queryWeight, product of:
                3.6685884 = boost
                6.267673 = idf(docFreq=222, maxDocs=43254)
                0.018060923 = queryNorm
              0.67849547 = fieldWeight in 1473, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.267673 = idf(docFreq=222, maxDocs=43254)
                0.0625 = fieldNorm(doc=1473)
        0.2 = coord(5/25)
    
  2. Ehrmann, S.: ¬Die Nadel im Bytehaufen : Finden statt suchen: Text Retrieval, Multimediadatenbanken, Dokumentenmanagement (2000) 0.10
    0.10094394 = sum of:
      0.10094394 = product of:
        0.6308996 = sum of:
          0.041185528 = weight(abstract_txt:oder in 318) [ClassicSimilarity], result of:
            0.041185528 = score(doc=318,freq=1.0), product of:
              0.07714136 = queryWeight, product of:
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.018060923 = queryNorm
              0.53389686 = fieldWeight in 318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.125 = fieldNorm(doc=318)
          0.09589854 = weight(abstract_txt:finden in 318) [ClassicSimilarity], result of:
            0.09589854 = score(doc=318,freq=1.0), product of:
              0.13551858 = queryWeight, product of:
                1.3254268 = boost
                5.66113 = idf(docFreq=408, maxDocs=43254)
                0.018060923 = queryNorm
              0.70764124 = fieldWeight in 318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.66113 = idf(docFreq=408, maxDocs=43254)
                0.125 = fieldNorm(doc=318)
          0.16845745 = weight(abstract_txt:texte in 318) [ClassicSimilarity], result of:
            0.16845745 = score(doc=318,freq=1.0), product of:
              0.19729573 = queryWeight, product of:
                1.5992457 = boost
                6.830658 = idf(docFreq=126, maxDocs=43254)
                0.018060923 = queryNorm
              0.85383224 = fieldWeight in 318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.830658 = idf(docFreq=126, maxDocs=43254)
                0.125 = fieldNorm(doc=318)
          0.3253581 = weight(abstract_txt:dokumente in 318) [ClassicSimilarity], result of:
            0.3253581 = score(doc=318,freq=1.0), product of:
              0.41528407 = queryWeight, product of:
                3.6685884 = boost
                6.267673 = idf(docFreq=222, maxDocs=43254)
                0.018060923 = queryNorm
              0.7834591 = fieldWeight in 318, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.267673 = idf(docFreq=222, maxDocs=43254)
                0.125 = fieldNorm(doc=318)
        0.16 = coord(4/25)
    
  3. Kasprzik, A.: Automatisierte und semiautomatisierte Klassifizierung : eine Analyse aktueller Projekte (2014) 0.09
    0.09167705 = sum of:
      0.09167705 = product of:
        0.57298154 = sum of:
          0.025740957 = weight(abstract_txt:oder in 3935) [ClassicSimilarity], result of:
            0.025740957 = score(doc=3935,freq=1.0), product of:
              0.07714136 = queryWeight, product of:
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.018060923 = queryNorm
              0.33368555 = fieldWeight in 3935, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.078125 = fieldNorm(doc=3935)
          0.20571083 = weight(abstract_txt:automatisierten in 3935) [ClassicSimilarity], result of:
            0.20571083 = score(doc=3935,freq=1.0), product of:
              0.3083488 = queryWeight, product of:
                1.9992977 = boost
                8.5393505 = idf(docFreq=22, maxDocs=43254)
                0.018060923 = queryNorm
              0.6671368 = fieldWeight in 3935, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.5393505 = idf(docFreq=22, maxDocs=43254)
                0.078125 = fieldNorm(doc=3935)
          0.13818093 = weight(abstract_txt:intelligenz in 3935) [ClassicSimilarity], result of:
            0.13818093 = score(doc=3935,freq=1.0), product of:
              0.27072808 = queryWeight, product of:
                2.2943974 = boost
                6.5331817 = idf(docFreq=170, maxDocs=43254)
                0.018060923 = queryNorm
              0.5104048 = fieldWeight in 3935, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5331817 = idf(docFreq=170, maxDocs=43254)
                0.078125 = fieldNorm(doc=3935)
          0.20334882 = weight(abstract_txt:dokumente in 3935) [ClassicSimilarity], result of:
            0.20334882 = score(doc=3935,freq=1.0), product of:
              0.41528407 = queryWeight, product of:
                3.6685884 = boost
                6.267673 = idf(docFreq=222, maxDocs=43254)
                0.018060923 = queryNorm
              0.48966196 = fieldWeight in 3935, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.267673 = idf(docFreq=222, maxDocs=43254)
                0.078125 = fieldNorm(doc=3935)
        0.16 = coord(4/25)
    
  4. Heyer, G.; Läuter, M.; Quasthoff, U.; Wolff, C.: Texttechnologische Anwendungen am Beispiel Text Mining (2000) 0.09
    0.08649597 = sum of:
      0.08649597 = product of:
        0.43247986 = sum of:
          0.04368385 = weight(abstract_txt:oder in 566) [ClassicSimilarity], result of:
            0.04368385 = score(doc=566,freq=2.0), product of:
              0.07714136 = queryWeight, product of:
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.018060923 = queryNorm
              0.5662831 = fieldWeight in 566, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.09375 = fieldNorm(doc=566)
          0.15384087 = weight(abstract_txt:unstrukturierte in 566) [ClassicSimilarity], result of:
            0.15384087 = score(doc=566,freq=1.0), product of:
              0.1785615 = queryWeight, product of:
                1.0758092 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.018060923 = queryNorm
              0.86155677 = fieldWeight in 566, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.09375 = fieldNorm(doc=566)
          0.036688134 = weight(abstract_txt:werden in 566) [ClassicSimilarity], result of:
            0.036688134 = score(doc=566,freq=2.0), product of:
              0.07860573 = queryWeight, product of:
                1.2363149 = boost
                3.5203447 = idf(docFreq=3478, maxDocs=43254)
                0.018060923 = queryNorm
              0.46673614 = fieldWeight in 566, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5203447 = idf(docFreq=3478, maxDocs=43254)
                0.09375 = fieldNorm(doc=566)
          0.071923904 = weight(abstract_txt:finden in 566) [ClassicSimilarity], result of:
            0.071923904 = score(doc=566,freq=1.0), product of:
              0.13551858 = queryWeight, product of:
                1.3254268 = boost
                5.66113 = idf(docFreq=408, maxDocs=43254)
                0.018060923 = queryNorm
              0.53073096 = fieldWeight in 566, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.66113 = idf(docFreq=408, maxDocs=43254)
                0.09375 = fieldNorm(doc=566)
          0.12634309 = weight(abstract_txt:texte in 566) [ClassicSimilarity], result of:
            0.12634309 = score(doc=566,freq=1.0), product of:
              0.19729573 = queryWeight, product of:
                1.5992457 = boost
                6.830658 = idf(docFreq=126, maxDocs=43254)
                0.018060923 = queryNorm
              0.6403742 = fieldWeight in 566, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.830658 = idf(docFreq=126, maxDocs=43254)
                0.09375 = fieldNorm(doc=566)
        0.2 = coord(5/25)
    
  5. Blittkowsky, R.: ¬Das World Wide Web gleicht einer Fliege : Studien versuchen zu erklären, warum Suchmaschinen nicht immer fündig werden (2001) 0.07
    0.07356979 = sum of:
      0.07356979 = product of:
        0.3065408 = sum of:
          0.02302341 = weight(abstract_txt:oder in 3091) [ClassicSimilarity], result of:
            0.02302341 = score(doc=3091,freq=5.0), product of:
              0.07714136 = queryWeight, product of:
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.018060923 = queryNorm
              0.2984574 = fieldWeight in 3091, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.271175 = idf(docFreq=1641, maxDocs=43254)
                0.03125 = fieldNorm(doc=3091)
          0.05439333 = weight(abstract_txt:trainiert in 3091) [ClassicSimilarity], result of:
            0.05439333 = score(doc=3091,freq=1.0), product of:
              0.18571684 = queryWeight, product of:
                1.0971525 = boost
                9.37226 = idf(docFreq=9, maxDocs=43254)
                0.018060923 = queryNorm
              0.29288313 = fieldWeight in 3091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.37226 = idf(docFreq=9, maxDocs=43254)
                0.03125 = fieldNorm(doc=3091)
          0.017294953 = weight(abstract_txt:werden in 3091) [ClassicSimilarity], result of:
            0.017294953 = score(doc=3091,freq=4.0), product of:
              0.07860573 = queryWeight, product of:
                1.2363149 = boost
                3.5203447 = idf(docFreq=3478, maxDocs=43254)
                0.018060923 = queryNorm
              0.22002155 = fieldWeight in 3091, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5203447 = idf(docFreq=3478, maxDocs=43254)
                0.03125 = fieldNorm(doc=3091)
          0.041525286 = weight(abstract_txt:finden in 3091) [ClassicSimilarity], result of:
            0.041525286 = score(doc=3091,freq=3.0), product of:
              0.13551858 = queryWeight, product of:
                1.3254268 = boost
                5.66113 = idf(docFreq=408, maxDocs=43254)
                0.018060923 = queryNorm
              0.30641764 = fieldWeight in 3091, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.66113 = idf(docFreq=408, maxDocs=43254)
                0.03125 = fieldNorm(doc=3091)
          0.055272367 = weight(abstract_txt:intelligenz in 3091) [ClassicSimilarity], result of:
            0.055272367 = score(doc=3091,freq=1.0), product of:
              0.27072808 = queryWeight, product of:
                2.2943974 = boost
                6.5331817 = idf(docFreq=170, maxDocs=43254)
                0.018060923 = queryNorm
              0.20416193 = fieldWeight in 3091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5331817 = idf(docFreq=170, maxDocs=43254)
                0.03125 = fieldNorm(doc=3091)
          0.11503145 = weight(abstract_txt:dokumente in 3091) [ClassicSimilarity], result of:
            0.11503145 = score(doc=3091,freq=2.0), product of:
              0.41528407 = queryWeight, product of:
                3.6685884 = boost
                6.267673 = idf(docFreq=222, maxDocs=43254)
                0.018060923 = queryNorm
              0.27699462 = fieldWeight in 3091, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.267673 = idf(docFreq=222, maxDocs=43254)
                0.03125 = fieldNorm(doc=3091)
        0.24 = coord(6/25)
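
The content scores combine several matching terms. As a rough sketch, assuming each term contributes queryWeight * fieldWeight (with queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm) and that the sum is scaled by coord = matched terms / query terms, the score of the first content result (doc 1473) can be recomputed from the values in its listing. The idf formula and all function names are assumptions for illustration; a boost of 1.0 is assumed where the listing shows none.

```python
import math

MAX_DOCS = 43254
QUERY_NORM = 0.018060923   # taken from the listing
FIELD_NORM = 0.0625        # fieldNorm(doc=1473), taken from the listing

def classic_idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed idf variant; it agrees with the idf values shown above.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, boost=1.0):
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

# (term, freq in doc 1473, docFreq, boost) copied from the listing.
terms = [
    ("oder",            2.0, 1641, 1.0),
    ("unstrukturierte", 1.0,   11, 1.0758092),
    ("textanalyse",     1.0,   10, 1.0859951),
    ("werden",          3.0, 3478, 1.2363149),
    ("dokumente",       3.0,  222, 3.6685884),
]

raw_sum = sum(term_score(f, df, b) for _, f, df, b in terms)
coord = len(terms) / 25    # 5 of 25 query terms matched
total = raw_sum * coord

print(round(raw_sum, 4), round(total, 4))
```

The coord factor explains why a record matching more query terms can still rank lower: the fifth result matches 6 of 25 terms, but its small fieldNorm (0.03125) shrinks every per-term contribution.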