Search (8 results, page 1 of 1)

  • × year_i:[2020 TO 2030}
  • × theme_ss:"Data Mining"
  1. Borgman, C.L.; Wofford, M.F.; Golshan, M.S.; Darch, P.T.: Collaborative qualitative research at scale : reflections on 20 years of acquiring global data and making data global (2021) 0.02
    0.024194509 = product of:
      0.096778035 = sum of:
        0.096778035 = weight(_text_:data in 239) [ClassicSimilarity], result of:
          0.096778035 = score(doc=239,freq=28.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.65359366 = fieldWeight in 239, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=239)
      0.25 = coord(1/4)
    
    Abstract
    A 5-year project to study scientific data uses in geography, starting in 1999, evolved into 20 years of research on data practices in sensor networks, environmental sciences, biology, seismology, undersea science, biomedicine, astronomy, and other fields. By emulating the "team science" approaches of the scientists studied, the UCLA Center for Knowledge Infrastructures accumulated a comprehensive collection of qualitative data about how scientists generate, manage, use, and reuse data across domains. Building upon Paul N. Edwards's model of "making global data"-collecting signals via consistent methods, technologies, and policies-to "make data global"-comparing and integrating those data, the research team has managed and exploited these data as a collaborative resource. This article reflects on the social, technical, organizational, economic, and policy challenges the team has encountered in creating new knowledge from data old and new. We reflect on continuity over generations of students and staff, transitions between grants, transfer of legacy data between software tools, research methods, and the role of professional data managers in the social sciences.
    Theme
    Data Mining
  2. Mandl, T.: Text Mining und Data Mining (2023) 0.02
    0.022174634 = product of:
      0.088698536 = sum of:
        0.088698536 = weight(_text_:data in 774) [ClassicSimilarity], result of:
          0.088698536 = score(doc=774,freq=12.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.59902847 = fieldWeight in 774, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=774)
      0.25 = coord(1/4)
    
    Abstract
    Text und Data Mining sind ein Bündel von Technologien, die eng mit den Themenfeldern Statistik, Maschinelles Lernen und dem Erkennen von Mustern verbunden sind. Die üblichen Definitionen beziehen eine Vielzahl von verschiedenen Verfahren mit ein, ohne eine exakte Grenze zu ziehen. Data Mining bezeichnet die Suche nach Mustern, Regelmäßigkeiten oder Auffälligkeiten in stark strukturierten und vor allem numerischen Daten. "Any algorithm that enumerates patterns from, or fits models to, data is a data mining algorithm." Numerische Daten und Datenbankinhalte werden als strukturierte Daten bezeichnet. Dagegen gelten Textdokumente in natürlicher Sprache als unstrukturierte Daten.
    Theme
    Data Mining
  3. Jones, K.M.L.; Rubel, A.; LeClere, E.: ¬A matter of trust : higher education institutions as information fiduciaries in an age of educational data mining and learning analytics (2020) 0.02
    0.017108103 = product of:
      0.06843241 = sum of:
        0.06843241 = weight(_text_:data in 5968) [ClassicSimilarity], result of:
          0.06843241 = score(doc=5968,freq=14.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.46216056 = fieldWeight in 5968, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5968)
      0.25 = coord(1/4)
    
    Abstract
    Higher education institutions are mining and analyzing student data to effect educational, political, and managerial outcomes. Done under the banner of "learning analytics," this work can-and often does-surface sensitive data and information about, inter alia, a student's demographics, academic performance, offline and online movements, physical fitness, mental wellbeing, and social network. With these data, institutions and third parties are able to describe student life, predict future behaviors, and intervene to address academic or other barriers to student success (however defined). Learning analytics, consequently, raise serious issues concerning student privacy, autonomy, and the appropriate flow of student data. We argue that issues around privacy lead to valid questions about the degree to which students should trust their institution to use learning analytics data and other artifacts (algorithms, predictive scores) with their interests in mind. We argue that higher education institutions are paradigms of information fiduciaries. As such, colleges and universities have a special responsibility to their students. In this article, we use the information fiduciary concept to analyze cases when learning analytics violate an institution's responsibility to its students.
    Theme
    Data Mining
  4. Wiegmann, S.: Hättest du die Titanic überlebt? : Eine kurze Einführung in das Data Mining mit freier Software (2023) 0.02
    0.015679834 = product of:
      0.06271934 = sum of:
        0.06271934 = weight(_text_:data in 876) [ClassicSimilarity], result of:
          0.06271934 = score(doc=876,freq=6.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.42357713 = fieldWeight in 876, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=876)
      0.25 = coord(1/4)
    
    Abstract
    Am 10. April 1912 ging Elisabeth Walton Allen an Bord der "Titanic", um ihr Hab und Gut nach England zu holen. Eines Nachts wurde sie von ihrer aufgelösten Tante geweckt, deren Kajüte unter Wasser stand. Wie steht es um Elisabeths Chancen und hätte man selbst das Unglück damals überlebt? Das Titanic-Orakel ist eine algorithmusbasierte App, die entsprechende Prognosen aufstellt und im Rahmen des Kurses "Data Science" am Department Information der HAW Hamburg entstanden ist. Dieser Beitrag zeigt Schritt für Schritt, wie die App unter Verwendung freier Software entwickelt wurde. Code und Daten werden zur Nachnutzung bereitgestellt.
    Theme
    Data Mining
  5. Organisciak, P.; Schmidt, B.M.; Downie, J.S.: Giving shape to large digital libraries through exploratory data analysis (2022) 0.02
    0.015519011 = product of:
      0.062076043 = sum of:
        0.062076043 = weight(_text_:data in 473) [ClassicSimilarity], result of:
          0.062076043 = score(doc=473,freq=8.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.4192326 = fieldWeight in 473, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=473)
      0.25 = coord(1/4)
    
    Abstract
    The emergence of large multi-institutional digital libraries has opened the door to aggregate-level examinations of the published word. Such large-scale analysis offers a new way to pursue traditional problems in the humanities and social sciences, using digital methods to ask routine questions of large corpora. However, inquiry into multiple centuries of books is constrained by the burdens of scale, where statistical inference is technically complex and limited by hurdles to access and flexibility. This work examines the role that exploratory data analysis and visualization tools may play in understanding large bibliographic datasets. We present one such tool, HathiTrust+Bookworm, which allows multifaceted exploration of the multimillion work HathiTrust Digital Library, and center it in the broader space of scholarly tools for exploratory data analysis.
    Theme
    Data Mining
  6. Datentracking in der Wissenschaft : Aggregation und Verwendung bzw. Verkauf von Nutzungsdaten durch Wissenschaftsverlage. Ein Informationspapier des Ausschusses für Wissenschaftliche Bibliotheken und Informationssysteme der Deutschen Forschungsgemeinschaft (2021) 0.01
    0.010973599 = product of:
      0.043894395 = sum of:
        0.043894395 = weight(_text_:data in 248) [ClassicSimilarity], result of:
          0.043894395 = score(doc=248,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.29644224 = fieldWeight in 248, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=248)
      0.25 = coord(1/4)
    
    Abstract
    Das Informationspapier beschreibt die digitale Nachverfolgung von wissenschaftlichen Aktivitäten. Wissenschaftlerinnen und Wissenschaftler nutzen täglich eine Vielzahl von digitalen Informationsressourcen wie zum Beispiel Literatur- und Volltextdatenbanken. Häufig fallen dabei Nutzungsspuren an, die Aufschluss geben über gesuchte und genutzte Inhalte, Verweildauern und andere Arten der wissenschaftlichen Aktivität. Diese Nutzungsspuren können von den Anbietenden der Informationsressourcen festgehalten, aggregiert und weiterverwendet oder verkauft werden. Das Informationspapier legt die Transformation von Wissenschaftsverlagen hin zu Data Analytics Businesses dar, verweist auf die Konsequenzen daraus für die Wissenschaft und deren Einrichtungen und benennt die zum Einsatz kommenden Typen der Datengewinnung. Damit dient es vor allem der Darstellung gegenwärtiger Praktiken und soll zu Diskussionen über deren Konsequenzen für die Wissenschaft anregen. Es richtet sich an alle Wissenschaftlerinnen und Wissenschaftler sowie alle Akteure in der Wissenschaftslandschaft.
    Theme
    Data Mining
  7. Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.01
    0.0077595054 = product of:
      0.031038022 = sum of:
        0.031038022 = weight(_text_:data in 720) [ClassicSimilarity], result of:
          0.031038022 = score(doc=720,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2096163 = fieldWeight in 720, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=720)
      0.25 = coord(1/4)
    
    Theme
    Data Mining
  8. Goldberg, D.M.; Zaman, N.; Brahma, A.; Aloiso, M.: Are mortgage loan closing delay risks predictable? : A predictive analysis using text mining on discussion threads (2022) 0.01
    0.006466255 = product of:
      0.02586502 = sum of:
        0.02586502 = weight(_text_:data in 501) [ClassicSimilarity], result of:
          0.02586502 = score(doc=501,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.17468026 = fieldWeight in 501, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=501)
      0.25 = coord(1/4)
    
    Theme
    Data Mining