Document (#39381)

Author
Renker, L.
Title
Exploration von Textkorpora : Topic Models als Grundlage der Interaktion
Imprint
Gummersbach : Fakultät für Informatik und Ingenieurswissenschaften
Year
2015
Pages
117 pp.
Abstract
The Internet holds a seemingly endless supply of information. A central problem today is making that information accessible. Formulating the right queries in a full-text search requires solid domain knowledge. Users often lack this knowledge, so they must spend considerable time gaining an overview of the topic at hand. In such situations the user is engaged in an exploratory search process, approaching a topic step by step. Machine learning methods are now used as a matter of course to organize data, but in most cases they remain invisible to the user. Using them interactively in exploratory search processes could tie human judgment more closely to the machine processing of large volumes of data. Topic models are one such class of methods. They find hidden topics in a text corpus that humans can interpret relatively well, which makes them promising for use in exploratory search processes. They can support users in making sense of unfamiliar sources. A review of the relevant research showed that topic models are used predominantly to produce static visualizations. Sensemaking is an essential component of exploratory search, yet it is drawn on only to a very small extent to motivate algorithmic innovations and place them in a broader context. This leads to the hypothesis that applying models of sensemaking and designing exploratory search in a user-centered way can yield new functions for interacting with topic models and provide a context for related research.
Content
Cf.: urn:nbn:de:hbz:832-epub4-6686.
Footnote
Master's thesis submitted for the academic degree of Master of Science (M.Sc.) at Fachhochschule Köln / Fakultät für Informatik und Ingenieurswissenschaften, in the degree programme Medieninformatik.
Theme
Computerlinguistik
Semantisches Umfeld in Indexierung u. Retrieval
RSWK
Explorative Suche
Mensch-Maschine-Kommunikation
Sinnkonstitution

Similar documents (content)

  1. Reichert, S.; Mayr, P.: Untersuchung von Relevanzeigenschaften in einem kontrollierten Eyetracking-Experiment (2012) 0.17
    0.16940841 = sum of:
      0.16940841 = product of:
        0.70586836 = sum of:
          0.044293173 = weight(abstract_txt:suche in 328) [ClassicSimilarity], result of:
            0.044293173 = score(doc=328,freq=1.0), product of:
              0.10105904 = queryWeight, product of:
                1.2180035 = boost
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.014789553 = queryNorm
              0.43829006 = fieldWeight in 328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.078125 = fieldNorm(doc=328)
          0.048937988 = weight(abstract_txt:nutzer in 328) [ClassicSimilarity], result of:
            0.048937988 = score(doc=328,freq=1.0), product of:
              0.108006045 = queryWeight, product of:
                1.2591718 = boost
                5.799733 = idf(docFreq=363, maxDocs=44218)
                0.014789553 = queryNorm
              0.45310414 = fieldWeight in 328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.799733 = idf(docFreq=363, maxDocs=44218)
                0.078125 = fieldNorm(doc=328)
          0.07498429 = weight(abstract_txt:verwendet in 328) [ClassicSimilarity], result of:
            0.07498429 = score(doc=328,freq=1.0), product of:
              0.14354813 = queryWeight, product of:
                1.4516426 = boost
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.014789553 = queryNorm
              0.5223634 = fieldWeight in 328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.078125 = fieldNorm(doc=328)
          0.0374579 = weight(abstract_txt:werden in 328) [ClassicSimilarity], result of:
            0.0374579 = score(doc=328,freq=3.0), product of:
              0.07894946 = queryWeight, product of:
                1.5224763 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014789553 = queryNorm
              0.47445413 = fieldWeight in 328, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.078125 = fieldNorm(doc=328)
          0.022609703 = weight(abstract_txt:sich in 328) [ClassicSimilarity], result of:
            0.022609703 = score(doc=328,freq=1.0), product of:
              0.08132497 = queryWeight, product of:
                1.5452114 = boost
                3.5586145 = idf(docFreq=3422, maxDocs=44218)
                0.014789553 = queryNorm
              0.27801675 = fieldWeight in 328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5586145 = idf(docFreq=3422, maxDocs=44218)
                0.078125 = fieldNorm(doc=328)
          0.4775853 = weight(abstract_txt:explorativen in 328) [ClassicSimilarity], result of:
            0.4775853 = score(doc=328,freq=1.0), product of:
              0.6694189 = queryWeight, product of:
                4.9565554 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.014789553 = queryNorm
              0.71343267 = fieldWeight in 328, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.078125 = fieldNorm(doc=328)
        0.24 = coord(6/25)
    
  2. Boteram, F.: Typisierung semantischer Relationen in integrierten Systemen der Wissensorganisation (2013) 0.15
    0.14914817 = sum of:
      0.14914817 = product of:
        0.6214507 = sum of:
          0.03915039 = weight(abstract_txt:nutzer in 919) [ClassicSimilarity], result of:
            0.03915039 = score(doc=919,freq=1.0), product of:
              0.108006045 = queryWeight, product of:
                1.2591718 = boost
                5.799733 = idf(docFreq=363, maxDocs=44218)
                0.014789553 = queryNorm
              0.36248332 = fieldWeight in 919, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.799733 = idf(docFreq=363, maxDocs=44218)
                0.0625 = fieldNorm(doc=919)
          0.04753982 = weight(abstract_txt:kontext in 919) [ClassicSimilarity], result of:
            0.04753982 = score(doc=919,freq=1.0), product of:
              0.122931264 = queryWeight, product of:
                1.3433591 = boost
                6.187499 = idf(docFreq=246, maxDocs=44218)
                0.014789553 = queryNorm
              0.3867187 = fieldWeight in 919, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.187499 = idf(docFreq=246, maxDocs=44218)
                0.0625 = fieldNorm(doc=919)
          0.05810275 = weight(abstract_txt:verwendung in 919) [ClassicSimilarity], result of:
            0.05810275 = score(doc=919,freq=1.0), product of:
              0.14052549 = queryWeight, product of:
                1.436278 = boost
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.014789553 = queryNorm
              0.41346768 = fieldWeight in 919, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.0625 = fieldNorm(doc=919)
          0.059987437 = weight(abstract_txt:verwendet in 919) [ClassicSimilarity], result of:
            0.059987437 = score(doc=919,freq=1.0), product of:
              0.14354813 = queryWeight, product of:
                1.4516426 = boost
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.014789553 = queryNorm
              0.41789076 = fieldWeight in 919, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.0625 = fieldNorm(doc=919)
          0.034602124 = weight(abstract_txt:werden in 919) [ClassicSimilarity], result of:
            0.034602124 = score(doc=919,freq=4.0), product of:
              0.07894946 = queryWeight, product of:
                1.5224763 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014789553 = queryNorm
              0.43828195 = fieldWeight in 919, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.0625 = fieldNorm(doc=919)
          0.38206822 = weight(abstract_txt:explorativen in 919) [ClassicSimilarity], result of:
            0.38206822 = score(doc=919,freq=1.0), product of:
              0.6694189 = queryWeight, product of:
                4.9565554 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.014789553 = queryNorm
              0.5707461 = fieldWeight in 919, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.0625 = fieldNorm(doc=919)
        0.24 = coord(6/25)
    
  3. Hubrich, J.: Vom Stringmatching zur Begriffsexploration : das Potential integrierter begrifflicher Interoperabilität (2013) 0.14
    0.14158475 = sum of:
      0.14158475 = product of:
        0.88490474 = sum of:
          0.030584246 = weight(abstract_txt:werden in 793) [ClassicSimilarity], result of:
            0.030584246 = score(doc=793,freq=2.0), product of:
              0.07894946 = queryWeight, product of:
                1.5224763 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014789553 = queryNorm
              0.3873902 = fieldWeight in 793, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.078125 = fieldNorm(doc=793)
          0.031974953 = weight(abstract_txt:sich in 793) [ClassicSimilarity], result of:
            0.031974953 = score(doc=793,freq=2.0), product of:
              0.08132497 = queryWeight, product of:
                1.5452114 = boost
                3.5586145 = idf(docFreq=3422, maxDocs=44218)
                0.014789553 = queryNorm
              0.39317507 = fieldWeight in 793, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5586145 = idf(docFreq=3422, maxDocs=44218)
                0.078125 = fieldNorm(doc=793)
          0.3447603 = weight(abstract_txt:suchprozessen in 793) [ClassicSimilarity], result of:
            0.3447603 = score(doc=793,freq=2.0), product of:
              0.3150302 = queryWeight, product of:
                2.1504881 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.014789553 = queryNorm
              1.0943723 = fieldWeight in 793, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.078125 = fieldNorm(doc=793)
          0.4775853 = weight(abstract_txt:explorativen in 793) [ClassicSimilarity], result of:
            0.4775853 = score(doc=793,freq=1.0), product of:
              0.6694189 = queryWeight, product of:
                4.9565554 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.014789553 = queryNorm
              0.71343267 = fieldWeight in 793, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.078125 = fieldNorm(doc=793)
        0.16 = coord(4/25)
    
  4. Ihm, P.: Numerische Taxonomie und Datenbanken (1982) 0.13
    0.12964405 = sum of:
      0.12964405 = product of:
        0.8102753 = sum of:
          0.067184374 = weight(abstract_txt:verfahren in 101) [ClassicSimilarity], result of:
            0.067184374 = score(doc=101,freq=1.0), product of:
              0.10660498 = queryWeight, product of:
                1.2509781 = boost
                5.761993 = idf(docFreq=377, maxDocs=44218)
                0.014789553 = queryNorm
              0.63021797 = fieldWeight in 101, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.761993 = idf(docFreq=377, maxDocs=44218)
                0.109375 = fieldNorm(doc=101)
          0.042817943 = weight(abstract_txt:werden in 101) [ClassicSimilarity], result of:
            0.042817943 = score(doc=101,freq=2.0), product of:
              0.07894946 = queryWeight, product of:
                1.5224763 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014789553 = queryNorm
              0.54234624 = fieldWeight in 101, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.109375 = fieldNorm(doc=101)
          0.031653587 = weight(abstract_txt:sich in 101) [ClassicSimilarity], result of:
            0.031653587 = score(doc=101,freq=1.0), product of:
              0.08132497 = queryWeight, product of:
                1.5452114 = boost
                3.5586145 = idf(docFreq=3422, maxDocs=44218)
                0.014789553 = queryNorm
              0.38922346 = fieldWeight in 101, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5586145 = idf(docFreq=3422, maxDocs=44218)
                0.109375 = fieldNorm(doc=101)
          0.6686194 = weight(abstract_txt:explorativen in 101) [ClassicSimilarity], result of:
            0.6686194 = score(doc=101,freq=1.0), product of:
              0.6694189 = queryWeight, product of:
                4.9565554 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.014789553 = queryNorm
              0.9988057 = fieldWeight in 101, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.109375 = fieldNorm(doc=101)
        0.16 = coord(4/25)
    
  5. Otto, A.: Ordnungssysteme als Wissensbasis für die Suche in textbasierten Datenbeständen : dargestellt am Beispiel einer soziologischen Bibliographie (1998) 0.10
    0.09633232 = sum of:
      0.09633232 = product of:
        0.344044 = sum of:
          0.07594697 = weight(abstract_txt:suche in 6625) [ClassicSimilarity], result of:
            0.07594697 = score(doc=6625,freq=6.0), product of:
              0.10105904 = queryWeight, product of:
                1.2180035 = boost
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.014789553 = queryNorm
              0.7515109 = fieldWeight in 6625, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.6101127 = idf(docFreq=439, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.047506522 = weight(abstract_txt:verfahren in 6625) [ClassicSimilarity], result of:
            0.047506522 = score(doc=6625,freq=2.0), product of:
              0.10660498 = queryWeight, product of:
                1.2509781 = boost
                5.761993 = idf(docFreq=377, maxDocs=44218)
                0.014789553 = queryNorm
              0.44563138 = fieldWeight in 6625, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.761993 = idf(docFreq=377, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.034256592 = weight(abstract_txt:nutzer in 6625) [ClassicSimilarity], result of:
            0.034256592 = score(doc=6625,freq=1.0), product of:
              0.108006045 = queryWeight, product of:
                1.2591718 = boost
                5.799733 = idf(docFreq=363, maxDocs=44218)
                0.014789553 = queryNorm
              0.3171729 = fieldWeight in 6625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.799733 = idf(docFreq=363, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.050839905 = weight(abstract_txt:verwendung in 6625) [ClassicSimilarity], result of:
            0.050839905 = score(doc=6625,freq=1.0), product of:
              0.14052549 = queryWeight, product of:
                1.436278 = boost
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.014789553 = queryNorm
              0.36178422 = fieldWeight in 6625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.615483 = idf(docFreq=160, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.07423067 = weight(abstract_txt:verwendet in 6625) [ClassicSimilarity], result of:
            0.07423067 = score(doc=6625,freq=2.0), product of:
              0.14354813 = queryWeight, product of:
                1.4516426 = boost
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.014789553 = queryNorm
              0.51711345 = fieldWeight in 6625, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.686252 = idf(docFreq=149, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.033850558 = weight(abstract_txt:werden in 6625) [ClassicSimilarity], result of:
            0.033850558 = score(doc=6625,freq=5.0), product of:
              0.07894946 = queryWeight, product of:
                1.5224763 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.014789553 = queryNorm
              0.42876238 = fieldWeight in 6625, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
          0.027412811 = weight(abstract_txt:sich in 6625) [ClassicSimilarity], result of:
            0.027412811 = score(doc=6625,freq=3.0), product of:
              0.08132497 = queryWeight, product of:
                1.5452114 = boost
                3.5586145 = idf(docFreq=3422, maxDocs=44218)
                0.014789553 = queryNorm
              0.3370774 = fieldWeight in 6625, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5586145 = idf(docFreq=3422, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6625)
        0.28 = coord(7/25)
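
The score breakdowns in the list above have the shape of Lucene's ClassicSimilarity (TF-IDF) explain output. The following is a minimal sketch of how the figures for one entry appear to fit together, not the catalogue's actual code. The function names (`classic_idf`, `term_score`, `document_score`) are hypothetical. The formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumptions that agree with the values printed in the listing. The inputs are copied from entry 1 (doc 328).

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; reproduces e.g. idf(docFreq=439, maxDocs=44218) ~ 5.6101
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight

def document_score(terms, matched, total, max_docs, query_norm, field_norm):
    # Sum of the matching term scores, scaled by coord(matched/total)
    summed = sum(
        term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm)
        for freq, doc_freq, boost in terms
    )
    return summed * (matched / total)

MAX_DOCS = 44218
QUERY_NORM = 0.014789553

# (termFreq, docFreq, boost) for the six matching terms of doc 328
DOC_328_TERMS = [
    (1.0, 439, 1.2180035),   # suche
    (1.0, 363, 1.2591718),   # nutzer
    (1.0, 149, 1.4516426),   # verwendet
    (3.0, 3606, 1.5224763),  # werden
    (1.0, 3422, 1.5452114),  # sich
    (1.0, 12, 4.9565554),    # explorativen
]

score_328 = document_score(
    DOC_328_TERMS, matched=6, total=25,
    max_docs=MAX_DOCS, query_norm=QUERY_NORM, field_norm=0.078125,
)
print(round(score_328, 4))
```

Under these assumptions the result agrees with the 0.16940841 listed for entry 1 to within floating-point rounding. Lucene computes in 32-bit floats, so the last digits may differ slightly from a 64-bit Python calculation.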