Search (137 results, page 1 of 7)

  • year_i:[2020 TO 2030}
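  The trailing "}" in the filter above is not a typo: Lucene range syntax uses square brackets for inclusive bounds and curly braces for exclusive ones, so this facet keeps publication years 2020 through 2029. A minimal sketch of the three variants (assuming this catalog's parser follows standard Lucene query-parser semantics; the field name year_i is taken from the facet itself):

    # Lucene range-filter bracket semantics (assumption: this catalog's
    # parser follows the standard Lucene query parser).
    filters = {
        "year_i:[2020 TO 2030]": "inclusive on both ends",
        "year_i:[2020 TO 2030}": "includes 2020, excludes 2030 (as used here)",
        "year_i:{2020 TO 2030}": "excludes both endpoints",
    }
    for query, meaning in filters.items():
        print(f"{query:24} -> {meaning}")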
  1. Luo, L.; Ju, J.; Li, Y.-F.; Haffari, G.; Xiong, B.; Pan, S.: ChatRule: mining logical rules with large language models for knowledge graph reasoning (2023) 0.09
    0.092265114 = product of:
      0.18453023 = sum of:
        0.18453023 = sum of:
          0.14503014 = weight(_text_:mining in 1171) [ClassicSimilarity], result of:
            0.14503014 = score(doc=1171,freq=4.0), product of:
              0.3290036 = queryWeight, product of:
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.058308665 = queryNorm
              0.44081625 = fieldWeight in 1171, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1171)
          0.039500095 = weight(_text_:22 in 1171) [ClassicSimilarity], result of:
            0.039500095 = score(doc=1171,freq=2.0), product of:
              0.204187 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.058308665 = queryNorm
              0.19345059 = fieldWeight in 1171, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1171)
      0.5 = coord(1/2)
    
    Abstract
    Logical rules are essential for uncovering the logical connections between relations, which can improve reasoning performance and provide interpretable results on knowledge graphs (KGs). Although there have been many efforts to mine meaningful logical rules over KGs, existing methods suffer from computationally intensive searches over the rule space and a lack of scalability to large-scale KGs. Moreover, they often ignore the semantics of relations, which is crucial for uncovering logical connections. Recently, large language models (LLMs) have shown impressive performance in natural language processing and various applications, owing to their emergent abilities and generalizability. In this paper, we propose ChatRule, a novel framework that unleashes the power of large language models for mining logical rules over knowledge graphs. Specifically, the framework is initiated with an LLM-based rule generator that leverages both the semantic and structural information of KGs to prompt LLMs to generate logical rules. To refine the generated rules, a rule-ranking module estimates rule quality by incorporating facts from existing KGs. Finally, a rule validator harnesses the reasoning ability of LLMs to validate the logical correctness of the ranked rules through chain-of-thought reasoning. ChatRule is evaluated on four large-scale KGs with respect to different rule-quality metrics and downstream tasks, showing the effectiveness and scalability of the method.
    Date
    23.11.2023 19:07:22
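    The explain tree for entry 1 can be reproduced step by step. A minimal sketch, assuming the engine uses Lucene's classic TF-IDF similarity (ClassicSimilarity), whose factor names (tf, idf, queryNorm, fieldNorm, coord) every tree in this list matches; the constants are copied verbatim from the tree above:

      import math

      def clause_score(freq, idf, query_norm, field_norm):
          """One term clause: score = queryWeight * fieldWeight."""
          tf = math.sqrt(freq)                  # tf(freq) = sqrt(termFreq)
          query_weight = idf * query_norm       # queryWeight = idf * queryNorm
          field_weight = tf * idf * field_norm  # fieldWeight = tf * idf * fieldNorm
          return query_weight * field_weight

      # Constants copied from the explain tree for doc 1171.
      mining = clause_score(freq=4.0, idf=5.642448, query_norm=0.058308665,
                            field_norm=0.0390625)
      term22 = clause_score(freq=2.0, idf=3.5018296, query_norm=0.058308665,
                            field_norm=0.0390625)
      coord = 1 / 2  # coord(1/2): one of two top-level query clauses matched

      print(mining)                     # ~0.14503014, as in the tree
      print(term22)                     # ~0.039500095 (up to float32 rounding)
      print((mining + term22) * coord)  # ~0.092265114, the entry-1 total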
  2. Al-Khatib, K.; Ghosal, T.; Hou, Y.; Waard, A. de; Freitag, D.: Argument mining for scholarly document processing : taking stock and looking ahead (2021) 0.09
    0.08791984 = product of:
      0.17583968 = sum of:
        0.17583968 = product of:
          0.35167935 = sum of:
            0.35167935 = weight(_text_:mining in 568) [ClassicSimilarity], result of:
              0.35167935 = score(doc=568,freq=12.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                1.0689225 = fieldWeight in 568, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=568)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Argument mining targets structures in natural language related to interpretation and persuasion. Most scholarly discourse involves interpreting experimental evidence and attempting to persuade other scientists to adopt the same conclusions, and it could therefore benefit from argument mining techniques. However, while various argument mining studies have addressed student essays and news articles, those that target scientific discourse are still scarce. This paper surveys existing work in argument mining of scholarly discourse and provides an overview of current models, data, tasks, and applications. We identify a number of key challenges confronting argument mining in the scientific domain and suggest some possible solutions and future directions.
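    The idf and tf factors recurring throughout these explain trees follow the classic Lucene TF-IDF formulas (an assumption, but one every ClassicSimilarity line in this list is consistent with):

      \[
        \mathrm{idf}(t) = 1 + \ln\frac{\mathrm{maxDocs}}{\mathrm{docFreq}(t)+1},
        \qquad
        \mathrm{tf}(t,d) = \sqrt{\mathrm{termFreq}(t,d)}
      \]

    Worked against the constants above: 1 + ln(44218/426) ≈ 5.642448 ("mining"), 1 + ln(44218/3623) ≈ 3.5018296 ("22"), 1 + ln(44218/194) ≈ 6.429029 ("themes"), and 1 + ln(44218/25) ≈ 8.478011 ("3a"); likewise for tf, sqrt(12) ≈ 3.4641016 and sqrt(4) = 2.0.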
  3. Mandl, T.: Text Mining und Data Mining (2023) 0.09
    0.08791984 = product of:
      0.17583968 = sum of:
        0.17583968 = product of:
          0.35167935 = sum of:
            0.35167935 = weight(_text_:mining in 774) [ClassicSimilarity], result of:
              0.35167935 = score(doc=774,freq=12.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                1.0689225 = fieldWeight in 774, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=774)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Text and data mining are a bundle of technologies closely tied to the fields of statistics, machine learning, and pattern recognition. The usual definitions encompass a wide variety of different methods without drawing an exact boundary. Data mining refers to the search for patterns, regularities, or anomalies in highly structured and, above all, numerical data. "Any algorithm that enumerates patterns from, or fits models to, data is a data mining algorithm." Numerical data and database contents are referred to as structured data, whereas text documents in natural language count as unstructured data.
    Theme
    Data Mining
  4. Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.07
    0.06879383 = product of:
      0.13758767 = sum of:
        0.13758767 = product of:
          0.27517533 = sum of:
            0.27517533 = weight(_text_:mining in 720) [ClassicSimilarity], result of:
              0.27517533 = score(doc=720,freq=10.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.83639 = fieldWeight in 720, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=720)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This project brought together undergraduate students in Computer Science with librarians to mine abstracts of articles from the Texas A&M University Libraries' institutional repository, OAKTrust, in order to probe the creation of new metadata to improve discovery and use. The mining task consisted simply of classifying the articles into two categories of research type: basic research ("for understanding," "curiosity-based," or "knowledge-based") and applied research ("use-based"). These categories are fundamental especially for funders but are also important to researchers. The mining-to-classification steps took several iterations, but ultimately we achieved good results with the BERT toolkit (Bidirectional Encoder Representations from Transformers). The project and its workflows represent a preview of what may lie ahead in the future of crafting metadata using text mining techniques to enhance discoverability.
    Theme
    Data Mining
  5. Thelwall, M.; Thelwall, S.: ¬A thematic analysis of highly retweeted early COVID-19 tweets : consensus, information, dissent and lockdown life (2020) 0.06
    0.064129055 = sum of:
      0.04437901 = product of:
        0.13313703 = sum of:
          0.13313703 = weight(_text_:themes in 178) [ClassicSimilarity], result of:
            0.13313703 = score(doc=178,freq=2.0), product of:
              0.3748681 = queryWeight, product of:
                6.429029 = idf(docFreq=193, maxDocs=44218)
                0.058308665 = queryNorm
              0.35515702 = fieldWeight in 178, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.429029 = idf(docFreq=193, maxDocs=44218)
                0.0390625 = fieldNorm(doc=178)
        0.33333334 = coord(1/3)
      0.019750047 = product of:
        0.039500095 = sum of:
          0.039500095 = weight(_text_:22 in 178) [ClassicSimilarity], result of:
            0.039500095 = score(doc=178,freq=2.0), product of:
              0.204187 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.058308665 = queryNorm
              0.19345059 = fieldWeight in 178, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=178)
        0.5 = coord(1/2)
    
    Abstract
    Purpose: Public attitudes towards COVID-19 and social distancing are critical in reducing its spread. It is therefore important to understand public reactions and information dissemination in all major forms, including on social media. This article investigates important issues reflected on Twitter in the early stages of the public reaction to COVID-19.
    Design/methodology/approach: A thematic analysis of the most retweeted English-language tweets mentioning COVID-19 during March 10-29, 2020.
    Findings: The main themes identified for the 87 qualifying tweets accounting for 14 million retweets were: lockdown life; attitude towards social restrictions; politics; safety messages; people with COVID-19; support for key workers; work; and COVID-19 facts/news.
    Research limitations/implications: Twitter played many positive roles, mainly through unofficial tweets. Users shared social distancing information, helped build support for social distancing, criticised government responses, expressed support for key workers and helped each other cope with social isolation. A few popular tweets not supporting social distancing show that government messages sometimes failed.
    Practical implications: Public health campaigns in future may consider encouraging grass-roots social web activity to support campaign goals. At a methodological level, analysing retweet counts emphasised politics and ignored practical implementation issues.
    Originality/value: This is the first qualitative analysis of general COVID-19-related retweeting.
    Date
    20. 1.2015 18:30:22
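    Entry 5 shows the other tree shape in this list: two subquery branches, each damped by its own coord() factor (matched clauses over total clauses in that subquery) before the branch scores are summed. A sketch with the branch weights copied from the tree above:

      # Branch weights copied from the entry-5 tree (doc 178); coord(k/n) = k/n.
      themes_branch = 0.13313703 * (1 / 3)   # coord(1/3): 1 of 3 clauses matched
      term22_branch = 0.039500095 * (1 / 2)  # coord(1/2): 1 of 2 clauses matched
      print(themes_branch + term22_branch)   # ~0.064129055, the entry-5 total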
  6. Belabbes, M.A.; Ruthven, I.; Moshfeghi, Y.; Rasmussen Pennington, D.: Information overload : a concept analysis (2023) 0.06
    0.064129055 = sum of:
      0.04437901 = product of:
        0.13313703 = sum of:
          0.13313703 = weight(_text_:themes in 950) [ClassicSimilarity], result of:
            0.13313703 = score(doc=950,freq=2.0), product of:
              0.3748681 = queryWeight, product of:
                6.429029 = idf(docFreq=193, maxDocs=44218)
                0.058308665 = queryNorm
              0.35515702 = fieldWeight in 950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.429029 = idf(docFreq=193, maxDocs=44218)
                0.0390625 = fieldNorm(doc=950)
        0.33333334 = coord(1/3)
      0.019750047 = product of:
        0.039500095 = sum of:
          0.039500095 = weight(_text_:22 in 950) [ClassicSimilarity], result of:
            0.039500095 = score(doc=950,freq=2.0), product of:
              0.204187 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.058308665 = queryNorm
              0.19345059 = fieldWeight in 950, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=950)
        0.5 = coord(1/2)
    
    Abstract
    Purpose: With the shift to an information-based society and the decentralisation of information, information overload has attracted growing interest in the computer and information science research communities. However, there is no clear understanding of the meaning of the term, and while many definitions have been proposed, there is no consensus. The goal of this work was to define the concept of "information overload". To do so, a concept analysis using Rodgers' approach was performed.
    Design/methodology/approach: A concept analysis using Rodgers' approach, based on a corpus of documents published between 2010 and September 2020, was conducted. One surrogate for "information overload", namely "cognitive overload", was identified. The corpus consisted of 151 documents for information overload and ten for cognitive overload. All documents were from the fields of computer science and information science and were retrieved from three databases: the Association for Computing Machinery (ACM) Digital Library, SCOPUS, and Library and Information Science Abstracts (LISA).
    Findings: The themes identified in the concept analysis allowed the authors to extract the triggers, manifestations and consequences of information overload. They found triggers related to information characteristics, information need, the working environment, the cognitive abilities of individuals and the information environment. In terms of manifestations, information overload manifests itself both emotionally and cognitively. The consequences of information overload were both internal and external. These findings allowed the authors to provide a definition of information overload.
    Originality/value: Through the concept analysis, the authors were able to clarify the components of information overload and provide a definition of the concept.
    Date
    22. 4.2023 19:27:56
  7. Heesen, H.; Jüngels, L.: ¬Der Regierungsentwurf der Text und Data Mining-Schranken (§§ 44b, 60d UrhG-E) : ein Überblick zu den geplanten Regelungen für Kultur- und Wissenschaftseinrichtungen (2021) 0.06
    0.061531074 = product of:
      0.12306215 = sum of:
        0.12306215 = product of:
          0.2461243 = sum of:
            0.2461243 = weight(_text_:mining in 190) [ClassicSimilarity], result of:
              0.2461243 = score(doc=190,freq=2.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.74808997 = fieldWeight in 190, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.09375 = fieldNorm(doc=190)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
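    The fieldNorm factors seen across these trees (0.0390625, 0.046875, 0.0546875, 0.09375) are length normalizations: in ClassicSimilarity, lengthNorm = 1/sqrt(fieldLength), byte-quantized at index time, which is presumably why this entry, a record without an abstract, carries the largest value. A sketch of the approximate field lengths those norms imply:

      # Implied field lengths, assuming ClassicSimilarity's lengthNorm =
      # 1/sqrt(fieldLength); norms are byte-quantized, so these are rough.
      for norm in (0.0390625, 0.046875, 0.0546875, 0.09375):
          print(f"fieldNorm={norm:<9} -> ~{1 / norm**2:.0f} terms")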
    
  8. Moulaison-Sandy, H.; Adkins, D.; Bossaller, J.; Cho, H.: ¬An automated approach to describing fiction : a methodology to use book reviews to identify affect (2021) 0.05
    0.050760545 = product of:
      0.10152109 = sum of:
        0.10152109 = product of:
          0.20304218 = sum of:
            0.20304218 = weight(_text_:mining in 710) [ClassicSimilarity], result of:
              0.20304218 = score(doc=710,freq=4.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.61714274 = fieldWeight in 710, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=710)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Subject headings and genre terms are notoriously difficult to apply, yet are important for fiction. The current project functions as a proof of concept, using a text-mining methodology to identify affective information (emotion and tone) about fiction titles from professional book reviews as a potential first step in automating the subject analysis process. Findings are presented and discussed, comparing results to the range of aboutness and isness information in library cataloging records. The methodology is likewise presented, and how future work might expand on the current project to enhance catalog records through text-mining is explored.
  9. Wiegmann, S.: Hättest du die Titanic überlebt? : Eine kurze Einführung in das Data Mining mit freier Software (2023) 0.05
    0.050760545 = product of:
      0.10152109 = sum of:
        0.10152109 = product of:
          0.20304218 = sum of:
            0.20304218 = weight(_text_:mining in 876) [ClassicSimilarity], result of:
              0.20304218 = score(doc=876,freq=4.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.61714274 = fieldWeight in 876, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=876)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Theme
    Data Mining
  10. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.05
    0.046304807 = product of:
      0.092609614 = sum of:
        0.092609614 = product of:
          0.27782884 = sum of:
            0.27782884 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.27782884 = score(doc=862,freq=2.0), product of:
                0.49434152 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.058308665 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    https://arxiv.org/abs/2212.06721
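    The Source field above arrived as a percent-encoded Google-redirect fragment, which is also why the query term "3a" matched this record: it hit the "%3A" sequences. A sketch of recovering the target link with the standard library:

      from urllib.parse import unquote

      # The raw fragment as it appeared in the record, tracking suffix included.
      raw = "https%3A%2F%2Farxiv.org%2Fabs%2F2212.06721&usg=AOvVaw3i_9pZm9y_dQWoHi6uv0EN"
      decoded = unquote(raw.split("&usg=")[0])  # drop the tracking parameter, then decode
      print(decoded)                            # https://arxiv.org/abs/2212.06721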
  11. Kang, X.; Wu, Y.; Ren, W.: Toward action comprehension for searching : mining actionable intents in query entities (2020) 0.04
    0.04440623 = product of:
      0.08881246 = sum of:
        0.08881246 = product of:
          0.17762493 = sum of:
            0.17762493 = weight(_text_:mining in 5613) [ClassicSimilarity], result of:
              0.17762493 = score(doc=5613,freq=6.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.5398875 = fieldWeight in 5613, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5613)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Understanding search engine users' intents has been a popular study in information retrieval, which directly affects the quality of retrieved information. One of the fundamental problems in this field is to find a connection between the entity in a query and the potential intents of the users, the latter of which would further reveal important information for facilitating the users' future actions. In this article, we present a novel research method for mining the actionable intents for search users, by generating a ranked list of the potentially most informative actions based on a massive pool of action samples. We compare different search strategies and their combinations for retrieving the action pool and develop three criteria for measuring the informativeness of the selected action samples, that is, the significance of an action sample within the pool, the representativeness of an action sample for the other candidate samples, and the diverseness of an action sample with respect to the selected actions. Our experiment, based on the Action Mining (AM) query entity data set from the Actionable Knowledge Graph (AKG) task at NTCIR-13, suggests that the proposed approach is effective in generating an informative and early-satisfying ranking of potential actions for search users.
  12. Jones, K.M.L.; Rubel, A.; LeClere, E.: ¬A matter of trust : higher education institutions as information fiduciaries in an age of educational data mining and learning analytics (2020) 0.04
    0.04440623 = product of:
      0.08881246 = sum of:
        0.08881246 = product of:
          0.17762493 = sum of:
            0.17762493 = weight(_text_:mining in 5968) [ClassicSimilarity], result of:
              0.17762493 = score(doc=5968,freq=6.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.5398875 = fieldWeight in 5968, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5968)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Higher education institutions are mining and analyzing student data to effect educational, political, and managerial outcomes. Done under the banner of "learning analytics," this work can, and often does, surface sensitive data and information about, inter alia, a student's demographics, academic performance, offline and online movements, physical fitness, mental wellbeing, and social network. With these data, institutions and third parties are able to describe student life, predict future behaviors, and intervene to address academic or other barriers to student success (however defined). Learning analytics, consequently, raise serious issues concerning student privacy, autonomy, and the appropriate flow of student data. We argue that issues around privacy lead to valid questions about the degree to which students should trust their institution to use learning analytics data and other artifacts (algorithms, predictive scores) with their interests in mind. We argue that higher education institutions are paradigms of information fiduciaries. As such, colleges and universities have a special responsibility to their students. In this article, we use the information fiduciary concept to analyze cases when learning analytics violate an institution's responsibility to its students.
    Theme
    Data Mining
  13. Goldberg, D.M.; Zaman, N.; Brahma, A.; Aloiso, M.: Are mortgage loan closing delay risks predictable? : A predictive analysis using text mining on discussion threads (2022) 0.04
    0.04440623 = product of:
      0.08881246 = sum of:
        0.08881246 = product of:
          0.17762493 = sum of:
            0.17762493 = weight(_text_:mining in 501) [ClassicSimilarity], result of:
              0.17762493 = score(doc=501,freq=6.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.5398875 = fieldWeight in 501, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=501)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Loan processors and underwriters at mortgage firms seek to gather substantial supporting documentation to properly understand and model loan risks. In doing so, loan originations become prone to closing delays, risking client dissatisfaction and consequent revenue losses. We collaborate with a large national mortgage firm to examine the extent to which these delays are predictable, using internal discussion threads to prioritize interventions for loans most at risk. Substantial work experience is required to predict delays, and we find that even highly trained employees have difficulty predicting delays by reviewing discussion threads. We develop an array of methods to predict loan delays. We apply four modern out-of-the-box sentiment analysis techniques, two dictionary-based and two rule-based, to predict delays. We contrast these approaches with domain-specific approaches, including firm-provided keyword searches and "smoke terms" derived using machine learning. Performance varies widely across sentiment approaches; while some sentiment approaches prioritize the top-ranking records well, performance quickly declines thereafter. The firm-provided keyword searches perform at the rate of random chance. We observe that the domain-specific smoke term approaches consistently outperform other approaches and offer better prediction than loan and borrower characteristics. We conclude that text mining solutions would greatly assist mortgage firms in delay prevention.
    Theme
    Data Mining
  14. Dietz, K.: en.wikipedia.org > 6 Mio. Artikel (2020) 0.04
    0.038587343 = product of:
      0.077174686 = sum of:
        0.077174686 = product of:
          0.23152405 = sum of:
            0.23152405 = weight(_text_:3a in 5669) [ClassicSimilarity], result of:
              0.23152405 = score(doc=5669,freq=2.0), product of:
                0.49434152 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.058308665 = queryNorm
                0.46834838 = fieldWeight in 5669, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5669)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Content
    "Die Englischsprachige Wikipedia verfügt jetzt über mehr als 6 Millionen Artikel. An zweiter Stelle kommt die deutschsprachige Wikipedia mit 2.3 Millionen Artikeln, an dritter Stelle steht die französischsprachige Wikipedia mit 2.1 Millionen Artikeln (via Researchbuzz: Firehose <https://rbfirehose.com/2020/01/24/techcrunch-wikipedia-now-has-more-than-6-million-articles-in-english/> und Techcrunch <https://techcrunch.com/2020/01/23/wikipedia-english-six-million-articles/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+Techcrunch+%28TechCrunch%29&guccounter=1&guce_referrer=aHR0cHM6Ly9yYmZpcmVob3NlLmNvbS8yMDIwLzAxLzI0L3RlY2hjcnVuY2gtd2lraXBlZGlhLW5vdy1oYXMtbW9yZS10aGFuLTYtbWlsbGlvbi1hcnRpY2xlcy1pbi1lbmdsaXNoLw&guce_referrer_sig=AQAAAK0zHfjdDZ_spFZBF_z-zDjtL5iWvuKDumFTzm4HvQzkUfE2pLXQzGS6FGB_y-VISdMEsUSvkNsg2U_NWQ4lwWSvOo3jvXo1I3GtgHpP8exukVxYAnn5mJspqX50VHIWFADHhs5AerkRn3hMRtf_R3F1qmEbo8EROZXp328HMC-o>). 250120 via digithek ch = #fineBlog s.a.: Angesichts der Veröffentlichung des 6-millionsten Artikels vergangene Woche in der englischsprachigen Wikipedia hat die Community-Zeitungsseite "Wikipedia Signpost" ein Moratorium bei der Veröffentlichung von Unternehmensartikeln gefordert. Das sei kein Vorwurf gegen die Wikimedia Foundation, aber die derzeitigen Maßnahmen, um die Enzyklopädie gegen missbräuchliches undeklariertes Paid Editing zu schützen, funktionierten ganz klar nicht. *"Da die ehrenamtlichen Autoren derzeit von Werbung in Gestalt von Wikipedia-Artikeln überwältigt werden, und da die WMF nicht in der Lage zu sein scheint, dem irgendetwas entgegenzusetzen, wäre der einzige gangbare Weg für die Autoren, fürs erste die Neuanlage von Artikeln über Unternehmen zu untersagen"*, schreibt der Benutzer Smallbones in seinem Editorial <https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2020-01-27/From_the_editor> zur heutigen Ausgabe."
  15. Gabler, S.: Vergabe von DDC-Sachgruppen mittels eines Schlagwort-Thesaurus (2021) 0.04
    0.038587343 = product of:
      0.077174686 = sum of:
        0.077174686 = product of:
          0.23152405 = sum of:
            0.23152405 = weight(_text_:3a in 1000) [ClassicSimilarity], result of:
              0.23152405 = score(doc=1000,freq=2.0), product of:
                0.49434152 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.058308665 = queryNorm
                0.46834838 = fieldWeight in 1000, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1000)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Content
    Master's thesis, Master of Science (Library and Information Studies) (MSc), Universität Wien. Advisor: Christoph Steiner. Cf.: https://www.researchgate.net/publication/371680244_Vergabe_von_DDC-Sachgruppen_mittels_eines_Schlagwort-Thesaurus. DOI: 10.25365/thesis.70030. Cf. also the presentation at: https://wiki.dnb.de/download/attachments/252121510/DA3%20Workshop-Gabler.pdf?version=1&modificationDate=1671093170000&api=v2.
  16. Koya, K.; Chowdhury, G.: Cultural heritage information practices and iSchools education for achieving sustainable development (2020) 0.04
    0.038433354 = product of:
      0.07686671 = sum of:
        0.07686671 = product of:
          0.23060012 = sum of:
            0.23060012 = weight(_text_:themes in 5877) [ClassicSimilarity], result of:
              0.23060012 = score(doc=5877,freq=6.0), product of:
                0.3748681 = queryWeight, product of:
                  6.429029 = idf(docFreq=193, maxDocs=44218)
                  0.058308665 = queryNorm
                0.61515003 = fieldWeight in 5877, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  6.429029 = idf(docFreq=193, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5877)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    In 2015, the United Nations Educational, Scientific and Cultural Organization (UNESCO) began the process of inculcating culture as part of the United Nations' (UN) post-2015 Sustainable (formerly Millennium) Development Goals, which member countries agreed to achieve by 2030. By conducting a thematic analysis of the 25 UN-commissioned reports and policy documents, this research identifies 14 broad cultural heritage information themes that need to be put into practice in order to achieve cultural sustainability, of which information platforms, information sharing, information broadcast, information quality, information usage training, information access, information collection, and contribution appear to be the most significant. An investigation of education on cultural heritage informatics and digital humanities at iSchools (www.ischools.org) using a gap analysis framework demonstrates the core information science skills required for cultural heritage education. The research demonstrates that: (i) a thematic analysis of cultural heritage policy documents can be used to explore the key themes for cultural informatics education and research that can lead to sustainable development; and (ii) cultural heritage information education should cover a series of skills that can be categorized in five key areas, viz. information, technology, leadership, application, and people and user skills.
  17. Almeida, P. de; Gnoli, C.: Fiction in a phenomenon-based classification (2021) 0.04
    0.037656844 = product of:
      0.07531369 = sum of:
        0.07531369 = product of:
          0.22594105 = sum of:
            0.22594105 = weight(_text_:themes in 712) [ClassicSimilarity], result of:
              0.22594105 = score(doc=712,freq=4.0), product of:
                0.3748681 = queryWeight, product of:
                  6.429029 = idf(docFreq=193, maxDocs=44218)
                  0.058308665 = queryNorm
                0.60272145 = fieldWeight in 712, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  6.429029 = idf(docFreq=193, maxDocs=44218)
                  0.046875 = fieldNorm(doc=712)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    In traditional classification, fictional works are indexed only by their form, genre, and language, while their subject content is believed to be irrelevant. However, recent research suggests that this may not be the best approach. We tested indexing of a small sample of selected fictional works with the Integrative Levels Classification (ILC2), a freely faceted system based on phenomena instead of disciplines, and considered the structure of the resulting classmarks. Issues in the process of subject analysis, such as the selection of relevant vs. non-relevant themes and the citation order of relevant ones, are identified and discussed. Some phenomena that are covered in scholarly literature can also be identified as relevant themes in fictional literature and expressed in classmarks. This can allow for hybrid search and retrieval systems covering both fiction and nonfiction, which will result in better leveraging of the knowledge contained in fictional works.
  18. Jones, K.M.L.; Asher, A.; Goben, A.; Perry, M.R.; Salo, D.; Briney, K.A.; Robertshaw, M.B.: "We're being tracked at all times" : student perspectives of their privacy in relation to learning analytics in higher education (2020) 0.04
    0.036257535 = product of:
      0.07251507 = sum of:
        0.07251507 = product of:
          0.14503014 = sum of:
            0.14503014 = weight(_text_:mining in 5936) [ClassicSimilarity], result of:
              0.14503014 = score(doc=5936,freq=4.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.44081625 = fieldWeight in 5936, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5936)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Higher education institutions are continuing to develop their capacity for learning analytics (LA), which is a sociotechnical data-mining and analytic practice. Institutions rarely inform their students about LA practices, and there exist significant privacy concerns. Without a clear student voice in the design of LA, institutions put themselves in an ethical gray area. To help fill this gap in practice and add to the growing literature on students' privacy perspectives, this study reports findings from over 100 interviews with undergraduate students at eight U.S. higher education institutions. Findings demonstrate that students lacked awareness of educational data-mining and analytic practices, as well as the data on which they rely. Students see potential in LA, but they presented nuanced arguments about when and with whom data should be shared; they also expressed why informed consent was valuable and necessary. The study uncovered perspectives on institutional trust that were heretofore unknown, as well as what actions might violate that trust. Institutions must balance their desire to implement LA with their obligation to educate students about their analytic practices and treat them as partners in the design of analytic strategies reliant on student data in order to protect their intellectual privacy.
  19. Wang, F.; Wang, X.: Tracing theory diffusion : a text mining and citation-based analysis of TAM (2020) 0.04
    0.036257535 = product of:
      0.07251507 = sum of:
        0.07251507 = product of:
          0.14503014 = sum of:
            0.14503014 = weight(_text_:mining in 5980) [ClassicSimilarity], result of:
              0.14503014 = score(doc=5980,freq=4.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.44081625 = fieldWeight in 5980, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5980)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Theory is a kind of condensed human knowledge. This paper examines the mechanism of interdisciplinary diffusion of theoretical knowledge by tracing the diffusion of a representative theory, the Technology Acceptance Model (TAM).
    Design/methodology/approach: Based on the full-scale dataset of the Web of Science (WoS), the citations of Davis's original work on TAM were analysed and the interdisciplinary diffusion paths of TAM were delineated; a supervised machine learning method was used to extract theory incidents, and a content analysis was used to categorize the patterns of theory evolution.
    Findings: It is found that the diffusion of a theory is intertwined with its evolution. In the process, the role that a participating discipline plays is related to its knowledge distance from the original disciplines of TAM. As the distance increases, the capacity to support theory development and innovation weakens, while the capacity to supply analytical tools for practical problems increases. During diffusion, a theory evolves into new extensions through four patterns of theoretical construction: elaboration, proliferation, competition and integration.
    Research limitations/implications: The study not only deepens the understanding of the trajectory of a theory but also enriches research on knowledge diffusion and innovation.
    Originality/value: The study elaborates the relationship between theory diffusion and theory development, reveals the roles that the participating disciplines play in theory diffusion and vice versa, interprets four patterns of theory evolution, and uses text mining techniques to extract theory incidents, which makes up for the shortcomings of the citation analysis and content analysis used in previous studies.
  20. Urs, S.R.; Minhaj, M.: Evolution of data science and its education in iSchools : an impressionistic study using curriculum analysis (2023) 0.04
    0.036257535 = product of:
      0.07251507 = sum of:
        0.07251507 = product of:
          0.14503014 = sum of:
            0.14503014 = weight(_text_:mining in 960) [ClassicSimilarity], result of:
              0.14503014 = score(doc=960,freq=4.0), product of:
                0.3290036 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.058308665 = queryNorm
                0.44081625 = fieldWeight in 960, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=960)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Data Science (DS) has emerged from the shadows of its parents, statistics and computer science, into an independent field since its origin nearly six decades ago. Its evolution and education have taken many sharp turns. We present an impressionistic study of the evolution of DS anchored to Kuhn's four stages of paradigm shifts. First, we construct the landscape of DS based on a curriculum analysis of the 32 iSchools across the world offering graduate-level DS programs. Second, we paint the "field" as it emerges from the word-frequency patterns, ranking, and clustering of course titles based on text mining. Third, we map the curriculum to the landscape of DS and project the same onto the EDISON Data Science Framework (2017) and the ACM Data Science Knowledge Areas (2021). Our study shows that the DS programs of iSchools align well with the field and correspond to the Knowledge Areas and skillsets. iSchools' DS curricula exhibit a bias toward "data visualization" along with machine learning, data mining, natural language processing, and artificial intelligence; go light on statistics; are slanted toward ontologies and health informatics; and give surprisingly minimal thrust to eScience/research data management, which we believe would add a distinctive iSchool flavor to DS.

Languages

  • e 103
  • d 34

Types

  • a 129
  • el 22
  • m 3
  • p 2
  • s 1
  • x 1