Search (7 results, page 1 of 1)

  • × theme_ss:"Data Mining"
  • × theme_ss:"Internet"
  • × type_ss:"a"
  1. Huvila, I.: Mining qualitative data on human information behaviour from the Web (2010) 0.05
    0.054016102 = product of:
      0.108032204 = sum of:
        0.108032204 = product of:
          0.21606441 = sum of:
            0.21606441 = weight(_text_:mining in 4676) [ClassicSimilarity], result of:
              0.21606441 = score(doc=4676,freq=6.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.75584245 = fieldWeight in 4676, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4676)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper discusses an approach of collecting qualitative data on human information behaviour that is based on mining web data using search engines. The approach is technically the same that has been used for some time in webometric research to make statistical inferences on web data, but the present paper shows how the same tools and data collecting methods can be used to gather data for qualitative data analysis on human information behaviour.
    Theme
    Data Mining
  2. Fenstermacher, K.D.; Ginsburg, M.: Client-side monitoring for Web mining (2003) 0.05
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = product of:
          0.2138483 = sum of:
            0.2138483 = weight(_text_:mining in 1611) [ClassicSimilarity], result of:
              0.2138483 = score(doc=1611,freq=8.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.74808997 = fieldWeight in 1611, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1611)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    "Garbage in, garbage out" is a well-known phrase in computer analysis, and one that comes to mind when mining Web data to draw conclusions about Web users. The challenge is that data analysts wish to infer patterns of client-side behavior from server-side data. However, because only a fraction of the user's actions ever reaches the Web server, analysts must rely an incomplete data. In this paper, we propose a client-side monitoring system that is unobtrusive and supports flexible data collection. Moreover, the proposed framework encompasses client-side applications beyond the Web browser. Expanding monitoring beyond the browser to incorporate standard office productivity tools enables analysts to derive a much richer and more accurate picture of user behavior an the Web.
    Footnote
    Teil eines Themenheftes: "Web retrieval and mining: A machine learning perspective"
    Theme
    Data Mining
  3. Klein, H.: Web Content Mining (2004) 0.05
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = product of:
          0.2138483 = sum of:
            0.2138483 = weight(_text_:mining in 3154) [ClassicSimilarity], result of:
              0.2138483 = score(doc=3154,freq=18.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.74808997 = fieldWeight in 3154, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3154)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Web Mining - ein Schlagwort, das mit der Verbreitung des Internets immer öfter zu lesen und zu hören ist. Die gegenwärtige Forschung beschäftigt sich aber eher mit dem Nutzungsverhalten der Internetnutzer, und ein Blick in Tagungsprogramme einschlägiger Konferenzen (z.B. GOR - German Online Research) zeigt, dass die Analyse der Inhalte kaum Thema ist. Auf der GOR wurden 1999 zwei Vorträge zu diesem Thema gehalten, auf der Folgekonferenz 2001 kein einziger. Web Mining ist der Oberbegriff für zwei Typen von Web Mining: Web Usage Mining und Web Content Mining. Unter Web Usage Mining versteht man das Analysieren von Daten, wie sie bei der Nutzung des WWW anfallen und von den Servern protokolliert wenden. Man kann ermitteln, welche Seiten wie oft aufgerufen wurden, wie lange auf den Seiten verweilt wurde und vieles andere mehr. Beim Web Content Mining wird der Inhalt der Webseiten untersucht, der nicht nur Text, sondern auf Bilder, Video- und Audioinhalte enthalten kann. Die Software für die Analyse von Webseiten ist in den Grundzügen vorhanden, doch müssen die meisten Webseiten für die entsprechende Analysesoftware erst aufbereitet werden. Zuerst müssen die relevanten Websites ermittelt werden, die die gesuchten Inhalte enthalten. Das geschieht meist mit Suchmaschinen, von denen es mittlerweile Hunderte gibt. Allerdings kann man nicht davon ausgehen, dass die Suchmaschinen alle existierende Webseiten erfassen. Das ist unmöglich, denn durch das schnelle Wachstum des Internets kommen täglich Tausende von Webseiten hinzu, und bereits bestehende ändern sich der werden gelöscht. Oft weiß man auch nicht, wie die Suchmaschinen arbeiten, denn das gehört zu den Geschäftsgeheimnissen der Betreiber. Man muss also davon ausgehen, dass die Suchmaschinen nicht alle relevanten Websites finden (können). Der nächste Schritt ist das Herunterladen der Websites, dafür gibt es Software, die unter den Bezeichnungen OfflineReader oder Webspider zu finden ist. Das Ziel dieser Programme ist, die Website in einer Form herunterzuladen, die es erlaubt, die Website offline zu betrachten. Die Struktur der Website wird in der Regel beibehalten. Wer die Inhalte einer Website analysieren will, muss also alle Dateien mit seiner Analysesoftware verarbeiten können. Software für Inhaltsanalyse geht davon aus, dass nur Textinformationen in einer einzigen Datei verarbeitet werden. QDA Software (qualitative data analysis) verarbeitet dagegen auch Audiound Videoinhalte sowie internetspezifische Kommunikation wie z.B. Chats.
    Theme
    Data Mining
  4. Chen, H.; Chau, M.: Web mining : machine learning for Web applications (2003) 0.04
    0.037803393 = product of:
      0.075606786 = sum of:
        0.075606786 = product of:
          0.15121357 = sum of:
            0.15121357 = weight(_text_:mining in 4242) [ClassicSimilarity], result of:
              0.15121357 = score(doc=4242,freq=4.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.5289795 = fieldWeight in 4242, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4242)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Theme
    Data Mining
  5. Derek Doran, D.; Gokhale, S.S.: ¬A classification framework for web robots (2012) 0.04
    0.03564138 = product of:
      0.07128276 = sum of:
        0.07128276 = product of:
          0.14256552 = sum of:
            0.14256552 = weight(_text_:mining in 505) [ClassicSimilarity], result of:
              0.14256552 = score(doc=505,freq=2.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.49872664 = fieldWeight in 505, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0625 = fieldNorm(doc=505)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Theme
    Data Mining
  6. Raan, A.F.J. van; Noyons, E.C.M.: Discovery of patterns of scientific and technological development and knowledge transfer (2002) 0.03
    0.031502828 = product of:
      0.063005656 = sum of:
        0.063005656 = product of:
          0.12601131 = sum of:
            0.12601131 = weight(_text_:mining in 3603) [ClassicSimilarity], result of:
              0.12601131 = score(doc=3603,freq=4.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.44081625 = fieldWeight in 3603, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3603)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper addresses a bibliometric methodology to discover the structure of the scientific 'landscape' in order to gain detailed insight into the development of MD fields, their interaction, and the transfer of knowledge between them. This methodology is appropriate to visualize the position of MD activities in relation to interdisciplinary MD developments, and particularly in relation to socio-economic problems. Furthermore, it allows the identification of the major actors. It even provides the possibility of foresight. We describe a first approach to apply bibliometric mapping as an instrument to investigate characteristics of knowledge transfer. In this paper we discuss the creation of 'maps of science' with help of advanced bibliometric methods. This 'bibliometric cartography' can be seen as a specific type of data-mining, applied to large amounts of scientific publications. As an example we describe the mapping of the field neuroscience, one of the largest and fast growing fields in the life sciences. The number of publications covered by this database is about 80,000 per year, the period covered is 1995-1998. Current research is going an to update the mapping for the years 1999-2002. This paper addresses the main lines of the methodology and its application in the study of knowledge transfer.
    Theme
    Data Mining
  7. Kong, S.; Ye, F.; Feng, L.; Zhao, Z.: Towards the prediction problems of bursting hashtags on Twitter (2015) 0.03
    0.031186208 = product of:
      0.062372416 = sum of:
        0.062372416 = product of:
          0.12474483 = sum of:
            0.12474483 = weight(_text_:mining in 2338) [ClassicSimilarity], result of:
              0.12474483 = score(doc=2338,freq=2.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.4363858 = fieldWeight in 2338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2338)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Theme
    Data Mining