Document (#29332)

Author
Westermeyer, D.
Title
Adaptive Techniken zur Informationsgewinnung : der Webcrawler InfoSpiders
Imprint
Münster : Institut für Wirtschaftsinformatik der Westfälische Wilhelms-Universität Münster
Year
2005
Pages
22 pp.
Abstract
Searching for information on the Internet usually leads the user straight to a search engine. Some of the results returned, however, sometimes do not contain what the user was looking for. This is where so-called adaptive agents come in: they try to learn their user's habits so that they can later make decisions on that basis independently, without having to consult the user. The foundations section first discusses adaptive techniques for information retrieval and the basic properties of web crawlers. The main section then explains the web crawler InfoSpiders. This program works with several adaptive agents that search the Internet for information in parallel, starting from a set of seed links. The agents draw on a variety of techniques. These include, for example, statistical methods that analyze the content of web pages, and neural networks that are used to evaluate that content. A further technique is the genetic algorithm, with which the agents can produce offspring carrying new mutations. A concrete implementation of the InfoSpiders algorithm is then illustrated using MySpiders. Following that, the InfoSpiders algorithm and MySpiders are evaluated with respect to their added benefit over conventional search engines. A summary with an outlook on further developments in the field of adaptive agents for Internet search concludes the paper.
Content
Paper written for the seminar "Suchmaschinen und Suchalgorithmen", Institut für Wirtschaftsinformatik, Praktische Informatik in der Wirtschaft, Westfälische Wilhelms-Universität Münster. - Cf.: http://www-wi.uni-muenster.de/pi/lehre/ss05/seminarSuchen/Ausarbeitungen/DenisWestermeyer.pdf
Theme
Search engines
Object
InfoSpider

Similar documents (content)

  1. Lahme, N.: Information Retrieval im Wissensmanagement : ein am Vorwissen orientierter Ansatz zur Komposition von Informationsressourcen (2004) 0.18
    0.17723228 = sum of:
      0.17723228 = product of:
        0.5538509 = sum of:
          0.029345281 = weight(abstract_txt:nach in 940) [ClassicSimilarity], result of:
            0.029345281 = score(doc=940,freq=2.0), product of:
              0.06093728 = queryWeight, product of:
                1.0092603 = boost
                4.358632 = idf(docFreq=1514, maxDocs=43556)
                0.013852548 = queryNorm
              0.48156533 = fieldWeight in 940, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.358632 = idf(docFreq=1514, maxDocs=43556)
                0.078125 = fieldNorm(doc=940)
          0.030878535 = weight(abstract_txt:informationen in 940) [ClassicSimilarity], result of:
            0.030878535 = score(doc=940,freq=1.0), product of:
              0.0794277 = queryWeight, product of:
                1.1522524 = boost
                4.976164 = idf(docFreq=816, maxDocs=43556)
                0.013852548 = queryNorm
              0.3887628 = fieldWeight in 940, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.976164 = idf(docFreq=816, maxDocs=43556)
                0.078125 = fieldNorm(doc=940)
          0.028023861 = weight(abstract_txt:eine in 940) [ClassicSimilarity], result of:
            0.028023861 = score(doc=940,freq=3.0), product of:
              0.05909393 = queryWeight, product of:
                1.217247 = boost
                3.5045679 = idf(docFreq=3558, maxDocs=43556)
                0.013852548 = queryNorm
              0.47422573 = fieldWeight in 940, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.5045679 = idf(docFreq=3558, maxDocs=43556)
                0.078125 = fieldNorm(doc=940)
          0.044434667 = weight(abstract_txt:suche in 940) [ClassicSimilarity], result of:
            0.044434667 = score(doc=940,freq=1.0), product of:
              0.10123923 = queryWeight, product of:
                1.3008765 = boost
                5.6180177 = idf(docFreq=429, maxDocs=43556)
                0.013852548 = queryNorm
              0.43890762 = fieldWeight in 940, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6180177 = idf(docFreq=429, maxDocs=43556)
                0.078125 = fieldNorm(doc=940)
          0.06061731 = weight(abstract_txt:dessen in 940) [ClassicSimilarity], result of:
            0.06061731 = score(doc=940,freq=1.0), product of:
              0.12452751 = queryWeight, product of:
                1.4427607 = boost
                6.2307644 = idf(docFreq=232, maxDocs=43556)
                0.013852548 = queryNorm
              0.48677847 = fieldWeight in 940, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2307644 = idf(docFreq=232, maxDocs=43556)
                0.078125 = fieldNorm(doc=940)
          0.05267164 = weight(abstract_txt:sowie in 940) [ClassicSimilarity], result of:
            0.05267164 = score(doc=940,freq=2.0), product of:
              0.10302418 = queryWeight, product of:
                1.6072257 = boost
                4.627353 = idf(docFreq=1157, maxDocs=43556)
                0.013852548 = queryNorm
              0.51125515 = fieldWeight in 940, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.627353 = idf(docFreq=1157, maxDocs=43556)
                0.078125 = fieldNorm(doc=940)
          0.033838186 = weight(abstract_txt:wird in 940) [ClassicSimilarity], result of:
            0.033838186 = score(doc=940,freq=1.0), product of:
              0.114582665 = queryWeight, product of:
                2.1882207 = boost
                3.7800553 = idf(docFreq=2701, maxDocs=43556)
                0.013852548 = queryNorm
              0.29531682 = fieldWeight in 940, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7800553 = idf(docFreq=2701, maxDocs=43556)
                0.078125 = fieldNorm(doc=940)
          0.2740414 = weight(abstract_txt:algorithmus in 940) [ClassicSimilarity], result of:
            0.2740414 = score(doc=940,freq=2.0), product of:
              0.30933714 = queryWeight, product of:
                2.784988 = boost
                8.018241 = idf(docFreq=38, maxDocs=43556)
                0.013852548 = queryNorm
              0.8858988 = fieldWeight in 940, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.018241 = idf(docFreq=38, maxDocs=43556)
                0.078125 = fieldNorm(doc=940)
        0.32 = coord(8/25)
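
The per-term weights in the tree above follow the TF-IDF form reported by ClassicSimilarity: each term score is queryWeight (boost × idf × queryNorm) multiplied by fieldWeight (tf × idf × fieldNorm), with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The sketch below recomputes the "algorithmus" weight for doc 940 from the values listed. The function names are illustrative and not part of any Lucene API, and because the engine works in 32-bit floats the last digits may differ slightly.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as shown in the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# Values copied from the "abstract_txt:algorithmus in 940" branch.
algorithmus_940 = term_score(
    freq=2.0,
    doc_freq=38,
    max_docs=43556,
    boost=2.784988,
    query_norm=0.013852548,
    field_norm=0.078125,
)
print(round(algorithmus_940, 5))  # close to the listed 0.2740414
```

Because idf enters both queryWeight and fieldWeight, a rare term such as "algorithmus" (docFreq=38) contributes far more than a frequent one such as "eine" (docFreq=3558).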
    
  2. Voregger, M.: Angriff der Heinzelmännchen : Steuern in den Informationsfluten des Webs - mit Hilfe digitaler Agenten (1997) 0.17
    0.17408772 = sum of:
      0.17408772 = product of:
        2.1760964 = sum of:
          0.092680395 = weight(abstract_txt:internet in 377) [ClassicSimilarity], result of:
            0.092680395 = score(doc=377,freq=1.0), product of:
              0.06648582 = queryWeight, product of:
                1.2911354 = boost
                3.7172995 = idf(docFreq=2876, maxDocs=43556)
                0.013852548 = queryNorm
              1.3939873 = fieldWeight in 377, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7172995 = idf(docFreq=2876, maxDocs=43556)
                0.375 = fieldNorm(doc=377)
          2.083416 = weight(abstract_txt:agenten in 377) [ClassicSimilarity], result of:
            2.083416 = score(doc=377,freq=1.0), product of:
              0.6278714 = queryWeight, product of:
                5.1223235 = boost
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.013852548 = queryNorm
              3.3182209 = fieldWeight in 377, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.375 = fieldNorm(doc=377)
        0.08 = coord(2/25)
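
The final score of each entry is the sum of its matching term weights multiplied by the coordination factor coord(matched/total), here with 25 query terms. The sketch below recomputes the totals for doc 377 and doc 940 from the term weights listed in their trees. The helper name `document_score` is an assumption made for illustration, and small differences in the last digits are expected because the engine uses 32-bit floats.

```python
def document_score(term_scores, query_term_count):
    """Sum of matching term weights, scaled by coord = matched / total."""
    coord = len(term_scores) / query_term_count
    return sum(term_scores) * coord


# Term weights copied from the explain trees of the first two entries.
doc_377 = document_score([0.092680395, 2.083416], query_term_count=25)
doc_940 = document_score(
    [
        0.029345281,  # nach
        0.030878535,  # informationen
        0.028023861,  # eine
        0.044434667,  # suche
        0.06061731,   # dessen
        0.05267164,   # sowie
        0.033838186,  # wird
        0.2740414,    # algorithmus
    ],
    query_term_count=25,
)
print(round(doc_377, 5), round(doc_940, 5))
```

Doc 377 has a much larger raw sum (2.1760964, driven by "agenten" in a short field with fieldNorm 0.375), but it matches only 2 of 25 query terms, so coord scales it to 0.174, just below doc 940, which matches 8 terms.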
    
  3. Weiner, M.: Die Agenten kommen (2002) 0.17
    0.170819 = sum of:
      0.170819 = product of:
        0.854095 = sum of:
          0.035940487 = weight(abstract_txt:nach in 732) [ClassicSimilarity], result of:
            0.035940487 = score(doc=732,freq=3.0), product of:
              0.06093728 = queryWeight, product of:
                1.0092603 = boost
                4.358632 = idf(docFreq=1514, maxDocs=43556)
                0.013852548 = queryNorm
              0.5897947 = fieldWeight in 732, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.358632 = idf(docFreq=1514, maxDocs=43556)
                0.078125 = fieldNorm(doc=732)
          0.030878535 = weight(abstract_txt:informationen in 732) [ClassicSimilarity], result of:
            0.030878535 = score(doc=732,freq=1.0), product of:
              0.0794277 = queryWeight, product of:
                1.1522524 = boost
                4.976164 = idf(docFreq=816, maxDocs=43556)
                0.013852548 = queryNorm
              0.3887628 = fieldWeight in 732, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.976164 = idf(docFreq=816, maxDocs=43556)
                0.078125 = fieldNorm(doc=732)
          0.016179584 = weight(abstract_txt:eine in 732) [ClassicSimilarity], result of:
            0.016179584 = score(doc=732,freq=1.0), product of:
              0.05909393 = queryWeight, product of:
                1.217247 = boost
                3.5045679 = idf(docFreq=3558, maxDocs=43556)
                0.013852548 = queryNorm
              0.27379435 = fieldWeight in 732, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5045679 = idf(docFreq=3558, maxDocs=43556)
                0.078125 = fieldNorm(doc=732)
          0.019308416 = weight(abstract_txt:internet in 732) [ClassicSimilarity], result of:
            0.019308416 = score(doc=732,freq=1.0), product of:
              0.06648582 = queryWeight, product of:
                1.2911354 = boost
                3.7172995 = idf(docFreq=2876, maxDocs=43556)
                0.013852548 = queryNorm
              0.29041404 = fieldWeight in 732, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7172995 = idf(docFreq=2876, maxDocs=43556)
                0.078125 = fieldNorm(doc=732)
          0.75178796 = weight(abstract_txt:agenten in 732) [ClassicSimilarity], result of:
            0.75178796 = score(doc=732,freq=3.0), product of:
              0.6278714 = queryWeight, product of:
                5.1223235 = boost
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.013852548 = queryNorm
              1.1973598 = fieldWeight in 732, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.078125 = fieldNorm(doc=732)
        0.2 = coord(5/25)
    
  4. Röscheisen, E.: Fin de such (2001) 0.17
    0.17058177 = sum of:
      0.17058177 = product of:
        1.0661361 = sum of:
          0.041500498 = weight(abstract_txt:nach in 494) [ClassicSimilarity], result of:
            0.041500498 = score(doc=494,freq=1.0), product of:
              0.06093728 = queryWeight, product of:
                1.0092603 = boost
                4.358632 = idf(docFreq=1514, maxDocs=43556)
                0.013852548 = queryNorm
              0.68103623 = fieldWeight in 494, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.358632 = idf(docFreq=1514, maxDocs=43556)
                0.15625 = fieldNorm(doc=494)
          0.08886933 = weight(abstract_txt:suche in 494) [ClassicSimilarity], result of:
            0.08886933 = score(doc=494,freq=1.0), product of:
              0.10123923 = queryWeight, product of:
                1.3008765 = boost
                5.6180177 = idf(docFreq=429, maxDocs=43556)
                0.013852548 = queryNorm
              0.87781525 = fieldWeight in 494, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6180177 = idf(docFreq=429, maxDocs=43556)
                0.15625 = fieldNorm(doc=494)
          0.06767637 = weight(abstract_txt:wird in 494) [ClassicSimilarity], result of:
            0.06767637 = score(doc=494,freq=1.0), product of:
              0.114582665 = queryWeight, product of:
                2.1882207 = boost
                3.7800553 = idf(docFreq=2701, maxDocs=43556)
                0.013852548 = queryNorm
              0.59063363 = fieldWeight in 494, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7800553 = idf(docFreq=2701, maxDocs=43556)
                0.15625 = fieldNorm(doc=494)
          0.8680899 = weight(abstract_txt:agenten in 494) [ClassicSimilarity], result of:
            0.8680899 = score(doc=494,freq=1.0), product of:
              0.6278714 = queryWeight, product of:
                5.1223235 = boost
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.013852548 = queryNorm
              1.382592 = fieldWeight in 494, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.15625 = fieldNorm(doc=494)
        0.16 = coord(4/25)
    
  5. Göbel, R.: Semantic Web & Linked Data für professionelle Informationsangebote : Hoffnungsträger oder "alter Hut" - Eine Praxisbetrachtung für die Wirtschaftsinformationen (2010) 0.14
    0.1353472 = sum of:
      0.1353472 = product of:
        0.676736 = sum of:
          0.024900299 = weight(abstract_txt:nach in 1256) [ClassicSimilarity], result of:
            0.024900299 = score(doc=1256,freq=1.0), product of:
              0.06093728 = queryWeight, product of:
                1.0092603 = boost
                4.358632 = idf(docFreq=1514, maxDocs=43556)
                0.013852548 = queryNorm
              0.40862176 = fieldWeight in 1256, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.358632 = idf(docFreq=1514, maxDocs=43556)
                0.09375 = fieldNorm(doc=1256)
          0.03705424 = weight(abstract_txt:informationen in 1256) [ClassicSimilarity], result of:
            0.03705424 = score(doc=1256,freq=1.0), product of:
              0.0794277 = queryWeight, product of:
                1.1522524 = boost
                4.976164 = idf(docFreq=816, maxDocs=43556)
                0.013852548 = queryNorm
              0.46651536 = fieldWeight in 1256, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.976164 = idf(docFreq=816, maxDocs=43556)
                0.09375 = fieldNorm(doc=1256)
          0.053321604 = weight(abstract_txt:suche in 1256) [ClassicSimilarity], result of:
            0.053321604 = score(doc=1256,freq=1.0), product of:
              0.10123923 = queryWeight, product of:
                1.3008765 = boost
                5.6180177 = idf(docFreq=429, maxDocs=43556)
                0.013852548 = queryNorm
              0.5266892 = fieldWeight in 1256, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6180177 = idf(docFreq=429, maxDocs=43556)
                0.09375 = fieldNorm(doc=1256)
          0.04060583 = weight(abstract_txt:wird in 1256) [ClassicSimilarity], result of:
            0.04060583 = score(doc=1256,freq=1.0), product of:
              0.114582665 = queryWeight, product of:
                2.1882207 = boost
                3.7800553 = idf(docFreq=2701, maxDocs=43556)
                0.013852548 = queryNorm
              0.3543802 = fieldWeight in 1256, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7800553 = idf(docFreq=2701, maxDocs=43556)
                0.09375 = fieldNorm(doc=1256)
          0.520854 = weight(abstract_txt:agenten in 1256) [ClassicSimilarity], result of:
            0.520854 = score(doc=1256,freq=1.0), product of:
              0.6278714 = queryWeight, product of:
                5.1223235 = boost
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.013852548 = queryNorm
              0.8295552 = fieldWeight in 1256, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.848589 = idf(docFreq=16, maxDocs=43556)
                0.09375 = fieldNorm(doc=1256)
        0.2 = coord(5/25)