Search (127 results, page 1 of 7)

Luo, L.; Ju, J.; Li, Y.-F.; Haffari, G.; Xiong, B.; Pan, S.: ChatRule: mining logical rules with large language models for knowledge graph reasoning (2023) 0.08
```
0.080165744 = product of:
  0.16033149 = sum of:
    0.16033149 = sum of:
      0.12601131 = weight(_text_:mining in 1171) [ClassicSimilarity], result of:
        0.12601131 = score(doc=1171,freq=4.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.44081625 = fieldWeight in 1171, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1171)
      0.034320172 = weight(_text_:22 in 1171) [ClassicSimilarity], result of:
        0.034320172 = score(doc=1171,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 1171, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1171)
  0.5 = coord(1/2)
```
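The explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library, and the input values are copied from the explain output for doc 1171.

```python
import math

def classic_tf(freq: float) -> float:
    # ClassicSimilarity term frequency: square root of the raw count
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                   # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from the explain output for doc 1171
QUERY_NORM = 0.05066224
FIELD_NORM = 0.0390625
MAX_DOCS = 44218

mining = term_score(4.0, 425, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The two term weights are summed, then scaled by coord(1/2)
total = (mining + term_22) * 0.5
print(round(mining, 6), round(term_22, 6), round(total, 6))
```

Lucene computes these values in 32-bit floats, so the results agree with the listed scores only to about six significant digits.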
Abstract

Logical rules are essential for uncovering the logical connections between relations, which could improve the reasoning performance and provide interpretable results on knowledge graphs (KGs). Although there have been many efforts to mine meaningful logical rules over KGs, existing methods suffer from the computationally intensive searches over the rule space and a lack of scalability for large-scale KGs. Besides, they often ignore the semantics of relations which is crucial for uncovering logical connections. Recently, large language models (LLMs) have shown impressive performance in the field of natural language processing and various applications, owing to their emergent ability and generalizability. In this paper, we propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs. Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs to prompt LLMs to generate logical rules. To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs. Last, a rule validator harnesses the reasoning ability of LLMs to validate the logical correctness of ranked rules through chain-of-thought reasoning. ChatRule is evaluated on four large-scale KGs, w.r.t. different rule quality metrics and downstream tasks, showing the effectiveness and scalability of our method.

Date

23.11.2023 19:07:22
Al-Khatib, K.; Ghosa, T.; Hou, Y.; Waard, A. de; Freitag, D.: Argument mining for scholarly document processing : taking stock and looking ahead (2021) 0.08
```
0.076390296 = product of:
  0.15278059 = sum of:
    0.15278059 = product of:
      0.30556118 = sum of:
        0.30556118 = weight(_text_:mining in 568) [ClassicSimilarity], result of:
          0.30556118 = score(doc=568,freq=12.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.0689225 = fieldWeight in 568, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=568)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Argument mining targets structures in natural language related to interpretation and persuasion. Most scholarly discourse involves interpreting experimental evidence and attempting to persuade other scientists to adopt the same conclusions, which could benefit from argument mining techniques. However, while various argument mining studies have addressed student essays and news articles, those that target scientific discourse are still scarce. This paper surveys existing work in argument mining of scholarly discourse, and provides an overview of current models, data, tasks, and applications. We identify a number of key challenges confronting argument mining in the scientific domain, and suggest some possible solutions and future directions.
Mandl, T.: Text Mining und Data Mining (2023) 0.08
```
0.076390296 = product of:
  0.15278059 = sum of:
    0.15278059 = product of:
      0.30556118 = sum of:
        0.30556118 = weight(_text_:mining in 774) [ClassicSimilarity], result of:
          0.30556118 = score(doc=774,freq=12.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.0689225 = fieldWeight in 774, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=774)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Text and data mining are a bundle of technologies closely tied to the fields of statistics, machine learning, and pattern recognition. The usual definitions encompass a wide variety of methods without drawing an exact boundary. Data mining denotes the search for patterns, regularities, or anomalies in highly structured and, above all, numerical data. "Any algorithm that enumerates patterns from, or fits models to, data is a data mining algorithm." Numerical data and database contents are referred to as structured data. Text documents in natural language, by contrast, are regarded as unstructured data.

Theme

Data Mining
Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.06
```
0.059772413 = product of:
  0.11954483 = sum of:
    0.11954483 = product of:
      0.23908965 = sum of:
        0.23908965 = weight(_text_:mining in 720) [ClassicSimilarity], result of:
          0.23908965 = score(doc=720,freq=10.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.83639 = fieldWeight in 720, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=720)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This project brought together undergraduate students in Computer Science with librarians to mine abstracts of articles from the Texas A&M University Libraries' institutional repository, OAKTrust, in order to probe the creation of new metadata to improve discovery and use. The mining operation task consisted simply of classifying the articles into two categories of research type: basic research ("for understanding," "curiosity-based," or "knowledge-based") and applied research ("use-based"). These categories are fundamental especially for funders but are also important to researchers. The mining-to-classification steps took several iterations, but ultimately, we achieved good results with the toolkit BERT (Bidirectional Encoder Representations from Transformers). The project and its workflows represent a preview of what may lie ahead in the future of crafting metadata using text mining techniques to enhance discoverability.

Theme

Data Mining

Heesen, H.; Jüngels, L.: ¬Der Regierungsentwurf der Text und Data Mining-Schranken (§§ 44b, 60d UrhG-E) : ein Überblick zu den geplanten Regelungen für Kultur- und Wissenschaftseinrichtungen (2021) 0.05

```
0.053462073 = product of:
  0.10692415 = sum of:
    0.10692415 = product of:
      0.2138483 = sum of:
        0.2138483 = weight(_text_:mining in 190) [ClassicSimilarity], result of:
          0.2138483 = score(doc=190,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.74808997 = fieldWeight in 190, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.09375 = fieldNorm(doc=190)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Moulaison-Sandy, H.; Adkins, D.; Bossaller, J.; Cho, H.: ¬An automated approach to describing fiction : a methodology to use book reviews to identify affect (2021) 0.04
```
0.044103958 = product of:
  0.088207915 = sum of:
    0.088207915 = product of:
      0.17641583 = sum of:
        0.17641583 = weight(_text_:mining in 710) [ClassicSimilarity], result of:
          0.17641583 = score(doc=710,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.61714274 = fieldWeight in 710, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=710)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Subject headings and genre terms are notoriously difficult to apply, yet are important for fiction. The current project functions as a proof of concept, using a text-mining methodology to identify affective information (emotion and tone) about fiction titles from professional book reviews as a potential first step in automating the subject analysis process. Findings are presented and discussed, comparing results to the range of aboutness and isness information in library cataloging records. The methodology is likewise presented, and how future work might expand on the current project to enhance catalog records through text-mining is explored.

Wiegmann, S.: Hättest du die Titanic überlebt? : Eine kurze Einführung in das Data Mining mit freier Software (2023) 0.04

```
0.044103958 = product of:
  0.088207915 = sum of:
    0.088207915 = product of:
      0.17641583 = sum of:
        0.17641583 = weight(_text_:mining in 876) [ClassicSimilarity], result of:
          0.17641583 = score(doc=876,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.61714274 = fieldWeight in 876, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=876)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Theme: Data Mining

Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.04

```
0.04023253 = product of:
  0.08046506 = sum of:
    0.08046506 = product of:
      0.24139518 = sum of:
        0.24139518 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
          0.24139518 = score(doc=862,freq=2.0), product of:
            0.429515 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.05066224 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)
```

Source: https://arxiv.org/abs/2212.06721

Kang, X.; Wu, Y.; Ren, W.: Toward action comprehension for searching : mining actionable intents in query entities (2020) 0.04
```
0.038582932 = product of:
  0.077165864 = sum of:
    0.077165864 = product of:
      0.15433173 = sum of:
        0.15433173 = weight(_text_:mining in 5613) [ClassicSimilarity], result of:
          0.15433173 = score(doc=5613,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.5398875 = fieldWeight in 5613, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5613)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Understanding search engine users' intents has been a popular study in information retrieval, which directly affects the quality of retrieved information. One of the fundamental problems in this field is to find a connection between the entity in a query and the potential intents of the users, the latter of which would further reveal important information for facilitating the users' future actions. In this article, we present a novel research method for mining the actionable intents for search users, by generating a ranked list of the potentially most informative actions based on a massive pool of action samples. We compare different search strategies and their combinations for retrieving the action pool and develop three criteria for measuring the informativeness of the selected action samples, that is, the significance of an action sample within the pool, the representativeness of an action sample for the other candidate samples, and the diverseness of an action sample with respect to the selected actions. Our experiment, based on the Action Mining (AM) query entity data set from the Actionable Knowledge Graph (AKG) task at NTCIR-13, suggests that the proposed approach is effective in generating an informative and early-satisfying ranking of potential actions for search users.
Jones, K.M.L.; Rubel, A.; LeClere, E.: ¬A matter of trust : higher education institutions as information fiduciaries in an age of educational data mining and learning analytics (2020) 0.04
```
0.038582932 = product of:
  0.077165864 = sum of:
    0.077165864 = product of:
      0.15433173 = sum of:
        0.15433173 = weight(_text_:mining in 5968) [ClassicSimilarity], result of:
          0.15433173 = score(doc=5968,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.5398875 = fieldWeight in 5968, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5968)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Higher education institutions are mining and analyzing student data to effect educational, political, and managerial outcomes. Done under the banner of "learning analytics," this work can, and often does, surface sensitive data and information about, inter alia, a student's demographics, academic performance, offline and online movements, physical fitness, mental wellbeing, and social network. With these data, institutions and third parties are able to describe student life, predict future behaviors, and intervene to address academic or other barriers to student success (however defined). Learning analytics, consequently, raise serious issues concerning student privacy, autonomy, and the appropriate flow of student data. We argue that issues around privacy lead to valid questions about the degree to which students should trust their institution to use learning analytics data and other artifacts (algorithms, predictive scores) with their interests in mind. We argue that higher education institutions are paradigms of information fiduciaries. As such, colleges and universities have a special responsibility to their students. In this article, we use the information fiduciary concept to analyze cases when learning analytics violate an institution's responsibility to its students.

Theme

Data Mining
Goldberg, D.M.; Zaman, N.; Brahma, A.; Aloiso, M.: Are mortgage loan closing delay risks predictable? : A predictive analysis using text mining on discussion threads (2022) 0.04
```
0.038582932 = product of:
  0.077165864 = sum of:
    0.077165864 = product of:
      0.15433173 = sum of:
        0.15433173 = weight(_text_:mining in 501) [ClassicSimilarity], result of:
          0.15433173 = score(doc=501,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.5398875 = fieldWeight in 501, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=501)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Loan processors and underwriters at mortgage firms seek to gather substantial supporting documentation to properly understand and model loan risks. In doing so, loan originations become prone to closing delays, risking client dissatisfaction and consequent revenue losses. We collaborate with a large national mortgage firm to examine the extent to which these delays are predictable, using internal discussion threads to prioritize interventions for loans most at risk. Substantial work experience is required to predict delays, and we find that even highly trained employees have difficulty predicting delays by reviewing discussion threads. We develop an array of methods to predict loan delays. We apply four modern out-of-the-box sentiment analysis techniques, two dictionary-based and two rule-based, to predict delays. We contrast these approaches with domain-specific approaches, including firm-provided keyword searches and "smoke terms" derived using machine learning. Performance varies widely across sentiment approaches; while some sentiment approaches prioritize the top-ranking records well, performance quickly declines thereafter. The firm-provided keyword searches perform at the rate of random chance. We observe that the domain-specific smoke term approaches consistently outperform other approaches and offer better prediction than loan and borrower characteristics. We conclude that text mining solutions would greatly assist mortgage firms in delay prevention.

Theme

Data Mining
Dietz, K.: en.wikipedia.org > 6 Mio. Artikel (2020) 0.03
```
0.03352711 = product of:
  0.06705422 = sum of:
    0.06705422 = product of:
      0.20116265 = sum of:
        0.20116265 = weight(_text_:3a in 5669) [ClassicSimilarity], result of:
          0.20116265 = score(doc=5669,freq=2.0), product of:
            0.429515 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.05066224 = queryNorm
            0.46834838 = fieldWeight in 5669, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5669)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)
```
Content

"The English-language Wikipedia now has more than 6 million articles. The German-language Wikipedia comes second with 2.3 million articles, and the French-language Wikipedia third with 2.1 million articles (via Researchbuzz: Firehose <https://rbfirehose.com/2020/01/24/techcrunch-wikipedia-now-has-more-than-6-million-articles-in-english/> and Techcrunch <https://techcrunch.com/2020/01/23/wikipedia-english-six-million-articles/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+Techcrunch+%28TechCrunch%29&guccounter=1&guce_referrer=aHR0cHM6Ly9yYmZpcmVob3NlLmNvbS8yMDIwLzAxLzI0L3RlY2hjcnVuY2gtd2lraXBlZGlhLW5vdy1oYXMtbW9yZS10aGFuLTYtbWlsbGlvbi1hcnRpY2xlcy1pbi1lbmdsaXNoLw&guce_referrer_sig=AQAAAK0zHfjdDZ_spFZBF_z-zDjtL5iWvuKDumFTzm4HvQzkUfE2pLXQzGS6FGB_y-VISdMEsUSvkNsg2U_NWQ4lwWSvOo3jvXo1I3GtgHpP8exukVxYAnn5mJspqX50VHIWFADHhs5AerkRn3hMRtf_R3F1qmEbo8EROZXp328HMC-o>). 250120 via digithek ch = #fineBlog See also: In light of the publication of the 6-millionth article in the English-language Wikipedia last week, the community newspaper page "Wikipedia Signpost" has called for a moratorium on the publication of articles about companies. This is not meant as a reproach to the Wikimedia Foundation, it says, but the current measures for protecting the encyclopedia against abusive undeclared paid editing are clearly not working. *"Since the volunteer authors are currently being overwhelmed by advertising in the guise of Wikipedia articles, and since the WMF does not seem able to do anything to counter it, the only viable path for the authors would be to prohibit, for the time being, the creation of new articles about companies"*, writes the user Smallbones in his editorial <https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2020-01-27/From_the_editor> for today's issue."
Gabler, S.: Vergabe von DDC-Sachgruppen mittels eines Schlagwort-Thesaurus (2021) 0.03
```
0.03352711 = product of:
  0.06705422 = sum of:
    0.06705422 = product of:
      0.20116265 = sum of:
        0.20116265 = weight(_text_:3a in 1000) [ClassicSimilarity], result of:
          0.20116265 = score(doc=1000,freq=2.0), product of:
            0.429515 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.05066224 = queryNorm
            0.46834838 = fieldWeight in 1000, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1000)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)
```
Content

Master's thesis, Master of Science (Library and Information Studies) (MSc), Universität Wien. Advisor: Christoph Steiner. Cf.: https://www.researchgate.net/publication/371680244_Vergabe_von_DDC-Sachgruppen_mittels_eines_Schlagwort-Thesaurus. DOI: 10.25365/thesis.70030. See also the presentation at: https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=web&cd=&ved=0CAIQw7AJahcKEwjwoZzzytz_AhUAAAAAHQAAAAAQAg&url=https%3A%2F%2Fwiki.dnb.de%2Fdownload%2Fattachments%2F252121510%2FDA3%2520Workshop-Gabler.pdf%3Fversion%3D1%26modificationDate%3D1671093170000%26api%3Dv2&psig=AOvVaw0szwENK1or3HevgvIDOfjx&ust=1687719410889597&opi=89978449.
Jones, K.M.L.; Asher, A.; Goben, A.; Perry, M.R.; Salo, D.; Briney, K.A.; Robertshaw, M.B.: "We're being tracked at all times" : student perspectives of their privacy in relation to learning analytics in higher education (2020) 0.03
```
0.031502828 = product of:
  0.063005656 = sum of:
    0.063005656 = product of:
      0.12601131 = sum of:
        0.12601131 = weight(_text_:mining in 5936) [ClassicSimilarity], result of:
          0.12601131 = score(doc=5936,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.44081625 = fieldWeight in 5936, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5936)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Higher education institutions are continuing to develop their capacity for learning analytics (LA), which is a sociotechnical data-mining and analytic practice. Institutions rarely inform their students about LA practices, and there exist significant privacy concerns. Without a clear student voice in the design of LA, institutions put themselves in an ethical gray area. To help fill this gap in practice and add to the growing literature on students' privacy perspectives, this study reports findings from over 100 interviews with undergraduate students at eight U.S. higher education institutions. Findings demonstrate that students lacked awareness of educational data-mining and analytic practices, as well as the data on which they rely. Students see potential in LA, but they presented nuanced arguments about when and with whom data should be shared; they also expressed why informed consent was valuable and necessary. The study uncovered perspectives on institutional trust that were heretofore unknown, as well as what actions might violate that trust. Institutions must balance their desire to implement LA with their obligation to educate students about their analytic practices and treat them as partners in the design of analytic strategies reliant on student data in order to protect their intellectual privacy.
Wang, F.; Wang, X.: Tracing theory diffusion : a text mining and citation-based analysis of TAM (2020) 0.03
```
0.031502828 = product of:
  0.063005656 = sum of:
    0.063005656 = product of:
      0.12601131 = sum of:
        0.12601131 = weight(_text_:mining in 5980) [ClassicSimilarity], result of:
          0.12601131 = score(doc=5980,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.44081625 = fieldWeight in 5980, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5980)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Theory is a kind of condensed human knowledge. This paper examines the mechanism of interdisciplinary diffusion of theoretical knowledge by tracing the diffusion of a representative theory, the Technology Acceptance Model (TAM). Design/methodology/approach: Based on the full-scale dataset of Web of Science (WoS), the citations of Davis's original work about TAM were analysed and the interdisciplinary diffusion paths of TAM were delineated; a supervised machine learning method was used to extract theory incidents, and a content analysis was used to categorize the patterns of theory evolution. Findings: The diffusion of a theory is intertwined with its evolution. In the process, the role that a participating discipline plays is related to its knowledge distance from the original disciplines of TAM. As the distance increases, the capacity to support theory development and innovation weakens, while the capacity to assume analytical tools for practical problems increases. During the diffusion, a theory evolves into new extensions in four theoretical construction patterns: elaboration, proliferation, competition and integration. Research limitations/implications: The study not only deepens the understanding of the trajectory of a theory but also enriches the research of knowledge diffusion and innovation. Originality/value: The study elaborates the relationship between theory diffusion and theory development, reveals the roles that the participating disciplines played in theory diffusion and vice versa, interprets four patterns of theory evolution and uses a text mining technique to extract theory incidents, which makes up for the shortcomings of citation analysis and content analysis used in previous studies.
Urs, S.R.; Minhaj, M.: Evolution of data science and its education in iSchools : an impressionistic study using curriculum analysis (2023) 0.03
```
0.031502828 = product of:
  0.063005656 = sum of:
    0.063005656 = product of:
      0.12601131 = sum of:
        0.12601131 = weight(_text_:mining in 960) [ClassicSimilarity], result of:
          0.12601131 = score(doc=960,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.44081625 = fieldWeight in 960, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=960)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Data Science (DS) has emerged from the shadows of its parents, statistics and computer science, into an independent field since its origin nearly six decades ago. Its evolution and education have taken many sharp turns. We present an impressionistic study of the evolution of DS anchored to Kuhn's four stages of paradigm shifts. First, we construct the landscape of DS based on curriculum analysis of the 32 iSchools across the world offering graduate-level DS programs. Second, we paint the "field" as it emerges from the word frequency patterns, ranking, and clustering of course titles based on text mining. Third, we map the curriculum to the landscape of DS and project the same onto the Edison Data Science Framework (2017) and ACM Data Science Knowledge Areas (2021). Our study shows that the DS programs of iSchools align well with the field and correspond to the Knowledge Areas and skillsets. iSchools' DS curricula exhibit a bias toward "data visualization" along with machine learning, data mining, natural language processing, and artificial intelligence; go light on statistics; are slanted toward ontologies and health informatics; and show surprisingly minimal thrust toward eScience/research data management, which we believe would add a distinctive iSchool flavor to the DS.
Wei, W.; Liu, Y.-P.; Wei, L-R.: Feature-level sentiment analysis based on rules and fine-grained domain ontology (2020) 0.03
```
0.026731037 = product of:
  0.053462073 = sum of:
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = weight(_text_:mining in 5876) [ClassicSimilarity], result of:
          0.10692415 = score(doc=5876,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.37404498 = fieldWeight in 5876, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=5876)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Mining product reviews and sentiment analysis are of great significance, whether for academic research purposes or optimizing business strategies. We propose a feature-level sentiment analysis framework based on rules parsing and fine-grained domain ontology for Chinese reviews. Fine-grained ontology is used to describe synonymous expressions of product features, which are reflected in word changes in online reviews. First, a semiautomatic construction method is developed by using Word2Vec for fine-grained ontology. Then, feature-level sentiment analysis that combines rules parsing and the fine-grained domain ontology is conducted to extract explicit and implicit features from product reviews. Finally, the domain sentiment dictionary and context sentiment dictionary are established to identify sentiment polarities for the extracted feature-sentiment combinations. An experiment is conducted on the basis of product reviews crawled from Chinese e-commerce websites. The results demonstrate the effectiveness of our approach.
Wang, X.; High, A.; Wang, X.; Zhao, K.: Predicting users' continued engagement in online health communities from the quantity and quality of received support (2021) 0.03
```
0.026731037 = product of:
  0.053462073 = sum of:
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = weight(_text_:mining in 242) [ClassicSimilarity], result of:
          0.10692415 = score(doc=242,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.37404498 = fieldWeight in 242, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=242)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Online health communities (OHCs) have been major resources for people with similar health concerns to interact with each other. They offer easily accessible platforms for users to seek, receive, and provide support by posting. Taking advantage of text mining and machine learning techniques, we identified social support type(s) in each post and a new user's support needs in an OHC. We examined a user's first-time support-seeking experience by measuring both quantity and quality of received support. Our results revealed that the amount and match of received support are positive and significant predictors of new users' continued engagement. Our outcomes can provide insight for designing and managing a sustainable OHC by retaining users.

Datentracking in der Wissenschaft : Aggregation und Verwendung bzw. Verkauf von Nutzungsdaten durch Wissenschaftsverlage. Ein Informationspapier des Ausschusses für Wissenschaftliche Bibliotheken und Informationssysteme der Deutschen Forschungsgemeinschaft (2021) 0.03

```
0.026731037 = product of:
  0.053462073 = sum of:
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = weight(_text_:mining in 248) [ClassicSimilarity], result of:
          0.10692415 = score(doc=248,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.37404498 = fieldWeight in 248, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=248)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Theme: Data Mining

Organisciak, P.; Schmidt, B.M.; Downie, J.S.: Giving shape to large digital libraries through exploratory data analysis (2022) 0.03

```
0.026731037 = product of:
  0.053462073 = sum of:
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = weight(_text_:mining in 473) [ClassicSimilarity], result of:
          0.10692415 = score(doc=473,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.37404498 = fieldWeight in 473, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=473)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Theme: Data Mining
