Search (12 results, page 1 of 1)

  Active filters:
  • year_i:[2010 TO 2020}
  • theme_ss:"Data Mining"
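
  Both filters are Solr filter queries: year_i:[2010 TO 2020} keeps publication years from 2010 (inclusive) up to, but excluding, 2020, and theme_ss restricts results to the "Data Mining" theme facet. As a minimal sketch, the underlying request might look as follows, assuming a standard Solr /select endpoint; the host, core name, and free-text query string are hypothetical, since the page does not show them (the explain trees below only imply a 14-clause query):

  import requests

  params = {
      "q": "big data daten datenschutz media",  # placeholder; the real query is not shown
      "fq": [
          "year_i:[2010 TO 2020}",              # inclusive lower, exclusive upper bound
          'theme_ss:"Data Mining"',
      ],
      "debugQuery": "true",  # produces the per-document 'explain' scoring trees below
      "rows": 12,
  }
  r = requests.get("http://localhost:8983/solr/literature/select", params=params)
  print(r.json()["response"]["numFound"])  # -> 12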
  1. Jäger, L.: Von Big Data zu Big Brother (2018) 0.02
    0.021983063 = product of:
      0.102587625 = sum of:
        0.028240517 = weight(_text_:daten in 5234) [ClassicSimilarity], result of:
          0.028240517 = score(doc=5234,freq=2.0), product of:
            0.13425784 = queryWeight, product of:
              4.759573 = idf(docFreq=1029, maxDocs=44218)
              0.02820796 = queryNorm
            0.21034539 = fieldWeight in 5234, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.759573 = idf(docFreq=1029, maxDocs=44218)
              0.03125 = fieldNorm(doc=5234)
        0.06670353 = weight(_text_:datenschutz in 5234) [ClassicSimilarity], result of:
          0.06670353 = score(doc=5234,freq=2.0), product of:
            0.2063373 = queryWeight, product of:
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.02820796 = queryNorm
            0.32327422 = fieldWeight in 5234, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.03125 = fieldNorm(doc=5234)
        0.007643578 = product of:
          0.015287156 = sum of:
            0.015287156 = weight(_text_:22 in 5234) [ClassicSimilarity], result of:
              0.015287156 = score(doc=5234,freq=2.0), product of:
                0.09877947 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02820796 = queryNorm
                0.15476047 = fieldWeight in 5234, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5234)
          0.5 = coord(1/2)
      0.21428572 = coord(3/14)
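
    The tree above is Lucene's "explain" output for ClassicSimilarity, i.e. classic TF-IDF scoring with a coordination factor. As a sanity check, here is a minimal Python sketch reproducing the "daten" leaf and the final document score; every constant is copied from the tree:

    import math

    # ClassicSimilarity building blocks for the term 'daten' in doc 5234:
    idf = 1 + math.log(44218 / (1029 + 1))  # idf(docFreq=1029, maxDocs=44218) -> 4.759573
    query_norm = 0.02820796                 # global query normalization (from the tree)
    tf = math.sqrt(2.0)                     # tf(freq) = sqrt(freq) -> 1.4142135
    field_norm = 0.03125                    # byte-encoded length norm (from the tree)

    query_weight = idf * query_norm         # -> 0.13425784
    field_weight = tf * idf * field_norm    # -> 0.21034539
    daten = query_weight * field_weight     # -> 0.028240517

    # The '_text_:22' clause carries its own coord(1/2): 0.015287156 * 0.5 = 0.007643578.
    # The document score is the sum of the three matching leaves, scaled by coord(3/14)
    # because 3 of the 14 query clauses matched this document:
    score = (daten + 0.06670353 + 0.007643578) * 3 / 14  # -> 0.021983063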
    
    Abstract
    In 1983, a single issue stirred the entire Federal Republic of Germany: the planned national census. Every household in West Germany was to fill out a questionnaire with 36 questions on its housing situation, the people living in the household, and their income. Massive resistance arose, and hundreds of citizens' initiatives formed across the country against the survey. People did not want to be "registered"; privacy was sacred. There was a (justified) concern that the answers on the nominally anonymized questionnaires would allow the respondents' identities to be inferred. The Federal Constitutional Court ruled in favor of the plaintiffs against the census: the planned census violated data protection and thus the constitution, and it was stopped. Only a generation later, we thoughtlessly hand over the supermarket chain's loyalty card every time we shop, to collect a few points toward a gift or a discount on the next purchase. And we know perfectly well that the supermarket thereby learns our consumption behavior down to the last detail. What we do not know is who else gains access to these data. Their buyers not only obtain our purchase histories but can also use them to infer our habits, personal preferences, and income. Just as carefree, we surf the Internet, google and shop, e-mail and chat. Google, Facebook, and Microsoft do not merely look on: they store, for all time, everything we post, buy, and search for, and use it for their own purposes. They comb through our e-mails, know our personal schedules, track our current location, and know our political, religious, and sexual preferences (who doesn't know the "interested in men" or "interested in women" button?), the closest friends we are connected with online, our relationship status, which school we attend or attended, and much more.
    Date
    22. 1.2018 11:33:49
  2. Nohr, H.: Big Data im Lichte der EU-Datenschutz-Grundverordnung (2017) 0.01
    0.013476149 = product of:
      0.18866608 = sum of:
        0.18866608 = weight(_text_:datenschutz in 4076) [ClassicSimilarity], result of:
          0.18866608 = score(doc=4076,freq=4.0), product of:
            0.2063373 = queryWeight, product of:
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.02820796 = queryNorm
            0.9143576 = fieldWeight in 4076, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.0625 = fieldNorm(doc=4076)
      0.071428575 = coord(1/14)
    
    Abstract
    This article deals with the framework for analytical applications such as Big Data created by the new European data protection law, in particular the EU General Data Protection Regulation (GDPR). It presents the key changes and examines the specific data protection provisions with regard to the use of Big Data, as well as the requirements the Regulation imposes.
  3. Bauckhage, C.: Moderne Textanalyse : neues Wissen für intelligente Lösungen (2016) 0.01
    0.0057054465 = product of:
      0.079876244 = sum of:
        0.079876244 = weight(_text_:daten in 2568) [ClassicSimilarity], result of:
          0.079876244 = score(doc=2568,freq=4.0), product of:
            0.13425784 = queryWeight, product of:
              4.759573 = idf(docFreq=1029, maxDocs=44218)
              0.02820796 = queryNorm
            0.5949466 = fieldWeight in 2568, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.759573 = idf(docFreq=1029, maxDocs=44218)
              0.0625 = fieldNorm(doc=2568)
      0.071428575 = coord(1/14)
    
    Abstract
    With the ever-greater availability of data (Big Data) and rapid advances in data-driven machine learning, we have witnessed breakthroughs in artificial intelligence in recent years. This talk examines these developments with particular regard to the automatic analysis of text data. Using simple examples, we illustrate how modern text analysis works and, again by example, show which practical applications arise today in industries such as publishing, finance, and consulting.
  4. Loonus, Y.: Einsatzbereiche der KI und ihre Relevanz für Information Professionals (2017) 0.01
    0.0052407878 = product of:
      0.07337102 = sum of:
        0.07337102 = weight(_text_:daten in 5668) [ClassicSimilarity], result of:
          0.07337102 = score(doc=5668,freq=6.0), product of:
            0.13425784 = queryWeight, product of:
              4.759573 = idf(docFreq=1029, maxDocs=44218)
              0.02820796 = queryNorm
            0.5464934 = fieldWeight in 5668, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.759573 = idf(docFreq=1029, maxDocs=44218)
              0.046875 = fieldNorm(doc=5668)
      0.071428575 = coord(1/14)
    
    Abstract
    It is human nature to want to share experiences and ideas with others in speech and writing. We therefore produce gigantic amounts of text every day, shared and stored in digital form. The Radicati Group estimates that in 2017, 269 billion e-mails were sent and received daily. On top of this come largely unstructured data such as social media, press coverage, websites, and company-internal systems, for example in the form of CRM software or PDF documents. The worldwide stock of unstructured data is growing so rapidly that its volume can hardly be quantified. Any attempt to pin down a reliable figure inevitably leads to various articles estimating the share of unstructured text at 80% of all data. Even if it can no longer be reliably traced where this figure comes from, a critical look at our daily routines leaves little doubt that these data are of great economic relevance.
  5. Ebrahimi, M.; ShafieiBavani, E.; Wong, R.; Chen, F.: Twitter user geolocation by filtering of highly mentioned users (2018) 0.00
    0.004144049 = product of:
      0.05801668 = sum of:
        0.05801668 = weight(_text_:media in 4286) [ClassicSimilarity], result of:
          0.05801668 = score(doc=4286,freq=4.0), product of:
            0.13212246 = queryWeight, product of:
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.02820796 = queryNorm
            0.43911293 = fieldWeight in 4286, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.046875 = fieldNorm(doc=4286)
      0.071428575 = coord(1/14)
    
    Abstract
    Geolocated social media data provide a powerful source of information about places and regional human behavior. Because only a small fraction of social media data is geolocation-annotated, inference techniques play a substantial role in increasing the volume of annotated data. Conventional research in this area has been based on the text content of posts from a given user or on the social network of the user, with some recent crossovers between the text- and network-based approaches. This paper proposes a novel approach that categorizes highly mentioned users (celebrities) into Local and Global types and consequently uses Local celebrities as location indicators. A label propagation algorithm is then run over the refined social network for geolocation inference. Finally, we propose a hybrid approach that merges a text-based method into our network-based approach as a back-off strategy. Empirical experiments over three standard Twitter benchmark data sets demonstrate that our approach outperforms state-of-the-art user geolocation methods.
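
    The label propagation step mentioned above can be sketched generically; this is not the authors' implementation, just the textbook iteration over a mention graph, and the function name and toy data are hypothetical. Seed users (e.g. Local celebrities) keep their known locations, and every other user repeatedly adopts the majority location among its neighbors:

    from collections import Counter

    def propagate_locations(graph, seeds, iterations=10):
        """Label propagation: unlabeled users adopt the most common
        location among their neighbors; seeded locations stay fixed."""
        labels = dict(seeds)
        for _ in range(iterations):
            updates = {}
            for user, neighbors in graph.items():
                if user in seeds:
                    continue  # keep known locations fixed
                votes = Counter(labels[n] for n in neighbors if n in labels)
                if votes:
                    updates[user] = votes.most_common(1)[0][0]
            labels.update(updates)
        return labels

    # Toy mention graph: users a and b mention a Local celebrity from Sydney.
    graph = {"a": ["celeb_syd"], "b": ["celeb_syd", "a"], "celeb_syd": []}
    print(propagate_locations(graph, {"celeb_syd": "Sydney"}))
    # -> {'celeb_syd': 'Sydney', 'a': 'Sydney', 'b': 'Sydney'}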
  6. Bella, A. La; Fronzetti Colladon, A.; Battistoni, E.; Castellan, S.; Francucci, M.: Assessing perceived organizational leadership styles through Twitter text mining (2018) 0.00
    0.002930285 = product of:
      0.04102399 = sum of:
        0.04102399 = weight(_text_:media in 2400) [ClassicSimilarity], result of:
          0.04102399 = score(doc=2400,freq=2.0), product of:
            0.13212246 = queryWeight, product of:
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.02820796 = queryNorm
            0.31049973 = fieldWeight in 2400, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.046875 = fieldNorm(doc=2400)
      0.071428575 = coord(1/14)
    
    Abstract
    We propose a text classification tool based on support vector machines for the assessment of organizational leadership styles as they appear to Twitter users. We collected Twitter data over 51 days for the first 30 Italian organizations in the 2015 Forbes Global 2000 ranking, out of which we selected the five with the most relevant volumes of tweets. We analyzed the communication of the company leaders, together with the dialogue among the stakeholders of each company, to understand the association with perceived leadership styles and dimensions. To assess leadership profiles, we referred to the 10-factor model developed by Barchiesi and La Bella in 2007. We maintain the distinctiveness of the approach we propose, as it allows a rapid assessment of the perceived leadership capabilities of an enterprise as they emerge from its social media interactions. It can also be used to show how companies respond and manage their communication when specific events take place, and to assess their stakeholders' reactions.
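
    The support-vector classification described above can be sketched with a standard scikit-learn pipeline. This is a generic illustration rather than the authors' model: the two labels below are placeholders, whereas the paper scores tweets against the 10-factor leadership model of Barchiesi and La Bella:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Toy training tweets; real labels would be the ten leadership factors.
    tweets = ["we empower every employee", "follow the rules, no exceptions",
              "proud of our team's creativity", "compliance reports are due Friday"]
    labels = ["transformational", "transactional",
              "transformational", "transactional"]

    clf = make_pipeline(TfidfVectorizer(), LinearSVC())
    clf.fit(tweets, labels)
    print(clf.predict(["our people drive innovation"]))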
  7. Wongthongtham, P.; Abu-Salih, B.: Ontology-based approach for semantic data extraction from social big data : state-of-the-art and research directions (2018) 0.00
    0.002930285 = product of:
      0.04102399 = sum of:
        0.04102399 = weight(_text_:media in 4097) [ClassicSimilarity], result of:
          0.04102399 = score(doc=4097,freq=2.0), product of:
            0.13212246 = queryWeight, product of:
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.02820796 = queryNorm
            0.31049973 = fieldWeight in 4097, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.046875 = fieldNorm(doc=4097)
      0.071428575 = coord(1/14)
    
    Abstract
    Managing and extracting useful knowledge from social media data sources is a challenge that has attracted much attention from academia and industry. To address this challenge, this paper focuses on the semantic analysis of textual data. We propose an ontology-based approach to extract the semantics of textual data and define the domain of the data; in other words, we semantically analyse the social data at two levels, i.e. the entity level and the domain level. We chose Twitter as the social channel for a proof of concept. Domain knowledge is captured in ontologies, which are then used to enrich the semantics of tweets with specific semantic conceptual representations of the entities that appear in them. Case studies are used to demonstrate this approach. We experimentally evaluate the proposed approach on a public dataset collected from Twitter in the politics domain. The ontology-based approach leverages entity extraction and concept mappings in terms of quantity and accuracy of concept identification.
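
    The two-level enrichment described above (entity level plus domain level) can be sketched as a toy concept lookup; the mini-ontology, surface forms, and output below are entirely hypothetical:

    # Hypothetical mini-ontology: surface form -> (entity concept, domain concept).
    ontology = {
        "potus": ("Person", "Politics"),
        "senate": ("Organization", "Politics"),
    }

    def enrich(tweet):
        """Attach entity- and domain-level concepts to known mentions in a tweet."""
        return {tok: ontology[tok] for tok in tweet.lower().split() if tok in ontology}

    print(enrich("POTUS addressed the Senate today"))
    # -> {'potus': ('Person', 'Politics'), 'senate': ('Organization', 'Politics')}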
  8. Tonkin, E.L.; Tourte, G.J.L.: Working with text : tools, techniques and approaches for text mining (2016) 0.00
    0.0024419043 = product of:
      0.034186658 = sum of:
        0.034186658 = weight(_text_:media in 4019) [ClassicSimilarity], result of:
          0.034186658 = score(doc=4019,freq=2.0), product of:
            0.13212246 = queryWeight, product of:
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.02820796 = queryNorm
            0.25874978 = fieldWeight in 4019, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4019)
      0.071428575 = coord(1/14)
    
    Abstract
    What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.
  9. Mining text data (2012) 0.00
    0.0019535234 = product of:
      0.027349325 = sum of:
        0.027349325 = weight(_text_:media in 362) [ClassicSimilarity], result of:
          0.027349325 = score(doc=362,freq=2.0), product of:
            0.13212246 = queryWeight, product of:
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.02820796 = queryNorm
            0.20699982 = fieldWeight in 362, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.6838713 = idf(docFreq=1110, maxDocs=44218)
              0.03125 = fieldNorm(doc=362)
      0.071428575 = coord(1/14)
    
    Content
    Contents: An Introduction to Text Mining.- Information Extraction from Text.- A Survey of Text Summarization Techniques.- A Survey of Text Clustering Algorithms.- Dimensionality Reduction and Topic Modeling.- A Survey of Text Classification Algorithms.- Transfer Learning for Text Mining.- Probabilistic Models for Text Mining.- Mining Text Streams.- Translingual Mining from Text Data.- Text Mining in Multimedia.- Text Analytics in Social Media.- A Survey of Opinion Mining and Sentiment Analysis.- Biomedical Text Mining: A Survey of Recent Progress.- Index.
  10. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.00
    6.8246236E-4 = product of:
      0.009554473 = sum of:
        0.009554473 = product of:
          0.019108946 = sum of:
            0.019108946 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.019108946 = score(doc=668,freq=2.0), product of:
                0.09877947 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02820796 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    22. 3.2013 19:43:01
  11. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.00
    6.8246236E-4 = product of:
      0.009554473 = sum of:
        0.009554473 = product of:
          0.019108946 = sum of:
            0.019108946 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.019108946 = score(doc=1605,freq=2.0), product of:
                0.09877947 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02820796 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
  12. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.00
    6.8246236E-4 = product of:
      0.009554473 = sum of:
        0.009554473 = product of:
          0.019108946 = sum of:
            0.019108946 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.019108946 = score(doc=5011,freq=2.0), product of:
                0.09877947 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02820796 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    7. 3.2019 16:32:22