Search (93 results, page 1 of 5)

  • language_ss:"e"
  • year_i:[2020 TO 2030}
  1. Luo, L.; Ju, J.; Li, Y.-F.; Haffari, G.; Xiong, B.; Pan, S.: ChatRule: mining logical rules with large language models for knowledge graph reasoning (2023) 0.08
    0.080165744 = product of:
      0.16033149 = sum of:
        0.16033149 = sum of:
          0.12601131 = weight(_text_:mining in 1171) [ClassicSimilarity], result of:
            0.12601131 = score(doc=1171,freq=4.0), product of:
              0.28585905 = queryWeight, product of:
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.05066224 = queryNorm
              0.44081625 = fieldWeight in 1171, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1171)
          0.034320172 = weight(_text_:22 in 1171) [ClassicSimilarity], result of:
            0.034320172 = score(doc=1171,freq=2.0), product of:
              0.17741053 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05066224 = queryNorm
              0.19345059 = fieldWeight in 1171, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1171)
      0.5 = coord(1/2)
    
    Abstract
     Logical rules are essential for uncovering the logical connections between relations, which can improve reasoning performance and provide interpretable results on knowledge graphs (KGs). Although there have been many efforts to mine meaningful logical rules over KGs, existing methods suffer from computationally intensive search over the rule space and a lack of scalability to large-scale KGs. Moreover, they often ignore the semantics of relations, which is crucial for uncovering logical connections. Recently, large language models (LLMs) have shown impressive performance in natural language processing and various applications, owing to their emergent abilities and generalizability. In this paper, we propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs. Specifically, the framework is initiated with an LLM-based rule generator that leverages both the semantic and structural information of KGs to prompt LLMs to generate logical rules. To refine the generated rules, a rule ranking module estimates rule quality by incorporating facts from existing KGs. Finally, a rule validator harnesses the reasoning ability of LLMs to validate the logical correctness of the ranked rules through chain-of-thought reasoning. ChatRule is evaluated on four large-scale KGs with respect to different rule quality metrics and downstream tasks, showing the effectiveness and scalability of our method.
    Date
    23.11.2023 19:07:22
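The score explanation above follows Lucene's ClassicSimilarity (TF-IDF). As a sanity check, the tree for result 1 can be reproduced from its listed factors; the only formulas assumed here are Lucene's defaults, idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq) — a minimal sketch:

```python
import math

def classic_leaf_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """One weight(...) leaf of a Lucene ClassicSimilarity explain tree."""
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq, maxDocs)
    tf = math.sqrt(freq)                             # tf(freq)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

QUERY_NORM = 0.05066224   # queryNorm from the tree
FIELD_NORM = 0.0390625    # fieldNorm(doc=1171)

# leaf for _text_:mining (freq=4, docFreq=425, maxDocs=44218) -> 0.12601131
mining = classic_leaf_score(4.0, 425, 44218, QUERY_NORM, FIELD_NORM)
# leaf for _text_:22 (freq=2, docFreq=3622) -> 0.034320172
t22 = classic_leaf_score(2.0, 3622, 44218, QUERY_NORM, FIELD_NORM)
# sum of the leaves, scaled by coord(1/2) -> 0.080165744
final = (mining + t22) * 0.5
```

Multiplying the two leaf scores out and applying the coord factor reproduces the hit's overall 0.080165744 to the printed precision.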
  2. Al-Khatib, K.; Ghosal, T.; Hou, Y.; Waard, A. de; Freitag, D.: Argument mining for scholarly document processing : taking stock and looking ahead (2021) 0.08
    Abstract
     Argument mining targets structures in natural language related to interpretation and persuasion. Most scholarly discourse involves interpreting experimental evidence and attempting to persuade other scientists to adopt the same conclusions, which could benefit from argument mining techniques. However, while various argument mining studies have addressed student essays and news articles, those that target scientific discourse are still scarce. This paper surveys existing work in argument mining of scholarly discourse and provides an overview of current models, data, tasks, and applications. We identify a number of key challenges confronting argument mining in the scientific domain and suggest some possible solutions and future directions.
  3. Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.06
    Abstract
     This project brought together undergraduate students in Computer Science with librarians to mine abstracts of articles from the Texas A&M University Libraries' institutional repository, OAKTrust, in order to probe the creation of new metadata to improve discovery and use. The mining task consisted simply of classifying the articles into two categories of research type: basic research ("for understanding," "curiosity-based," or "knowledge-based") and applied research ("use-based"). These categories are especially fundamental for funders but are also important to researchers. The mining-to-classification steps took several iterations, but ultimately we achieved good results with the BERT (Bidirectional Encoder Representations from Transformers) toolkit. The project and its workflows represent a preview of what may lie ahead in crafting metadata using text mining techniques to enhance discoverability.
    Theme
    Data Mining
  4. Moulaison-Sandy, H.; Adkins, D.; Bossaller, J.; Cho, H.: ¬An automated approach to describing fiction : a methodology to use book reviews to identify affect (2021) 0.04
    Abstract
     Subject headings and genre terms are notoriously difficult to apply, yet are important for fiction. The current project functions as a proof of concept, using a text-mining methodology to identify affective information (emotion and tone) about fiction titles from professional book reviews as a potential first step in automating the subject analysis process. Findings are presented and discussed, comparing results to the range of aboutness and isness information in library cataloging records. The methodology is likewise presented, and we explore how future work might expand on the current project to enhance catalog records through text mining.
  5. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.04
    Source
     https://arxiv.org/abs/2212.06721
  6. Kang, X.; Wu, Y.; Ren, W.: Toward action comprehension for searching : mining actionable intents in query entities (2020) 0.04
    Abstract
    Understanding search engine users' intents has been a popular study in information retrieval, which directly affects the quality of retrieved information. One of the fundamental problems in this field is to find a connection between the entity in a query and the potential intents of the users, the latter of which would further reveal important information for facilitating the users' future actions. In this article, we present a novel research method for mining the actionable intents for search users, by generating a ranked list of the potentially most informative actions based on a massive pool of action samples. We compare different search strategies and their combinations for retrieving the action pool and develop three criteria for measuring the informativeness of the selected action samples, that is, the significance of an action sample within the pool, the representativeness of an action sample for the other candidate samples, and the diverseness of an action sample with respect to the selected actions. Our experiment, based on the Action Mining (AM) query entity data set from the Actionable Knowledge Graph (AKG) task at NTCIR-13, suggests that the proposed approach is effective in generating an informative and early-satisfying ranking of potential actions for search users.
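The three informativeness criteria in this abstract (significance within the pool, representativeness for the remaining candidates, diverseness with respect to already selected actions) read like a greedy, MMR-style selection. The sketch below is an illustration under that reading, not the paper's algorithm: the toy action pool, the Jaccard similarity, and the weights are all hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity between two action strings (toy stand-in)."""
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_actions(pool, k=3, w=(0.4, 0.3, 0.3)):
    """Greedily rank actions by significance, representativeness, diverseness.

    pool maps each candidate action to a (hypothetical) significance score.
    """
    selected = []
    candidates = list(pool)
    while candidates and len(selected) < k:
        def score(a):
            sig = pool[a]  # significance of the sample within the pool
            # representativeness: mean similarity to the other candidates
            rep = sum(jaccard(a, b) for b in candidates if b != a) / max(len(candidates) - 1, 1)
            # diverseness: distance to the closest already-selected action
            div = 1.0 - max((jaccard(a, s) for s in selected), default=0.0)
            return w[0] * sig + w[1] * rep + w[2] * div
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

pool = {"book flight": 0.9, "book a flight": 0.8, "check weather": 0.5, "rent car": 0.4}
print(rank_actions(pool))
```

With these made-up numbers the diverseness term suppresses the near-duplicate "book a flight" in favor of dissimilar actions, which is the behavior the third criterion is meant to produce.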
  7. Jones, K.M.L.; Rubel, A.; LeClere, E.: ¬A matter of trust : higher education institutions as information fiduciaries in an age of educational data mining and learning analytics (2020) 0.04
    Abstract
    Higher education institutions are mining and analyzing student data to effect educational, political, and managerial outcomes. Done under the banner of "learning analytics," this work can-and often does-surface sensitive data and information about, inter alia, a student's demographics, academic performance, offline and online movements, physical fitness, mental wellbeing, and social network. With these data, institutions and third parties are able to describe student life, predict future behaviors, and intervene to address academic or other barriers to student success (however defined). Learning analytics, consequently, raise serious issues concerning student privacy, autonomy, and the appropriate flow of student data. We argue that issues around privacy lead to valid questions about the degree to which students should trust their institution to use learning analytics data and other artifacts (algorithms, predictive scores) with their interests in mind. We argue that higher education institutions are paradigms of information fiduciaries. As such, colleges and universities have a special responsibility to their students. In this article, we use the information fiduciary concept to analyze cases when learning analytics violate an institution's responsibility to its students.
    Theme
    Data Mining
  8. Goldberg, D.M.; Zaman, N.; Brahma, A.; Aloiso, M.: Are mortgage loan closing delay risks predictable? : A predictive analysis using text mining on discussion threads (2022) 0.04
    Abstract
    Loan processors and underwriters at mortgage firms seek to gather substantial supporting documentation to properly understand and model loan risks. In doing so, loan originations become prone to closing delays, risking client dissatisfaction and consequent revenue losses. We collaborate with a large national mortgage firm to examine the extent to which these delays are predictable, using internal discussion threads to prioritize interventions for loans most at risk. Substantial work experience is required to predict delays, and we find that even highly trained employees have difficulty predicting delays by reviewing discussion threads. We develop an array of methods to predict loan delays. We apply four modern out-of-the-box sentiment analysis techniques, two dictionary-based and two rule-based, to predict delays. We contrast these approaches with domain-specific approaches, including firm-provided keyword searches and "smoke terms" derived using machine learning. Performance varies widely across sentiment approaches; while some sentiment approaches prioritize the top-ranking records well, performance quickly declines thereafter. The firm-provided keyword searches perform at the rate of random chance. We observe that the domain-specific smoke term approaches consistently outperform other approaches and offer better prediction than loan and borrower characteristics. We conclude that text mining solutions would greatly assist mortgage firms in delay prevention.
    Theme
    Data Mining
  9. Jones, K.M.L.; Asher, A.; Goben, A.; Perry, M.R.; Salo, D.; Briney, K.A.; Robertshaw, M.B.: "We're being tracked at all times" : student perspectives of their privacy in relation to learning analytics in higher education (2020) 0.03
    Abstract
    Higher education institutions are continuing to develop their capacity for learning analytics (LA), which is a sociotechnical data-mining and analytic practice. Institutions rarely inform their students about LA practices, and there exist significant privacy concerns. Without a clear student voice in the design of LA, institutions put themselves in an ethical gray area. To help fill this gap in practice and add to the growing literature on students' privacy perspectives, this study reports findings from over 100 interviews with undergraduate students at eight U.S. higher education institutions. Findings demonstrate that students lacked awareness of educational data-mining and analytic practices, as well as the data on which they rely. Students see potential in LA, but they presented nuanced arguments about when and with whom data should be shared; they also expressed why informed consent was valuable and necessary. The study uncovered perspectives on institutional trust that were heretofore unknown, as well as what actions might violate that trust. Institutions must balance their desire to implement LA with their obligation to educate students about their analytic practices and treat them as partners in the design of analytic strategies reliant on student data in order to protect their intellectual privacy.
  10. Wang, F.; Wang, X.: Tracing theory diffusion : a text mining and citation-based analysis of TAM (2020) 0.03
    Abstract
     Theory is a kind of condensed human knowledge. This paper examines the mechanism of interdisciplinary diffusion of theoretical knowledge by tracing the diffusion of a representative theory, the Technology Acceptance Model (TAM).
     Design/methodology/approach: Based on the full-scale dataset of Web of Science (WoS), the citations of Davis's original work on TAM were analysed and the interdisciplinary diffusion paths of TAM delineated; a supervised machine learning method was used to extract theory incidents, and a content analysis was used to categorize the patterns of theory evolution.
     Findings: The diffusion of a theory is intertwined with its evolution. In the process, the role a participating discipline plays is related to its knowledge distance from the original disciplines of TAM. As the distance increases, the capacity to support theory development and innovation weakens, while the capacity to supply analytical tools for practical problems increases. During the diffusion, a theory evolves into new extensions through four theoretical construction patterns: elaboration, proliferation, competition, and integration.
     Research limitations/implications: The study not only deepens the understanding of the trajectory of a theory but also enriches the research of knowledge diffusion and innovation.
     Originality/value: The study elaborates the relationship between theory diffusion and theory development, reveals the roles the participating disciplines play in theory diffusion and vice versa, interprets four patterns of theory evolution, and uses a text mining technique to extract theory incidents, which makes up for the shortcomings of the citation analysis and content analysis used in previous studies.
  11. Urs, S.R.; Minhaj, M.: Evolution of data science and its education in iSchools : an impressionistic study using curriculum analysis (2023) 0.03
    Abstract
     Data Science (DS) has emerged from the shadows of its parents - statistics and computer science - into an independent field since its origin nearly six decades ago. Its evolution and education have taken many sharp turns. We present an impressionistic study of the evolution of DS anchored to Kuhn's four stages of paradigm shifts. First, we construct the landscape of DS based on curriculum analysis of the 32 iSchools across the world offering graduate-level DS programs. Second, we paint the "field" as it emerges from the word frequency patterns, ranking, and clustering of course titles based on text mining. Third, we map the curriculum to the landscape of DS and project the same onto the Edison Data Science Framework (2017) and ACM Data Science Knowledge Areas (2021). Our study shows that the DS programs of iSchools align well with the field and correspond to the Knowledge Areas and skillsets. iSchools' DS curricula exhibit a bias toward "data visualization," along with machine learning, data mining, natural language processing, and artificial intelligence; go light on statistics; slant toward ontologies and health informatics; and give surprisingly little thrust to eScience/research data management, which we believe would add a distinctive iSchool flavor to DS.
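The "word frequency patterns" and "ranking of course titles" step admits a very small sketch; the course titles below are invented placeholders, not drawn from the 32 iSchool curricula analysed:

```python
from collections import Counter

# Hypothetical course titles standing in for a scraped curriculum list
titles = [
    "Introduction to Data Mining",
    "Machine Learning for Data Science",
    "Data Visualization",
    "Natural Language Processing",
    "Data Mining and Knowledge Discovery",
]
stopwords = {"to", "for", "and", "introduction"}

# Rank terms by how often they appear across course titles
terms = Counter(
    w for t in titles for w in t.lower().split() if w not in stopwords
)
print(terms.most_common(3))
```

Running the count over real course titles would surface exactly the kind of ranking the study projects onto the Edison and ACM knowledge areas.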
  12. Wei, W.; Liu, Y.-P.; Wei, L-R.: Feature-level sentiment analysis based on rules and fine-grained domain ontology (2020) 0.03
    Abstract
     Mining product reviews and sentiment analysis are of great significance, whether for academic research or for optimizing business strategies. We propose a feature-level sentiment analysis framework based on rules parsing and fine-grained domain ontology for Chinese reviews. Fine-grained ontology is used to describe synonymous expressions of product features, which are reflected in word changes in online reviews. First, a semiautomatic construction method is developed by using Word2Vec for fine-grained ontology. Then, feature-level sentiment analysis that combines rules parsing and the fine-grained domain ontology is conducted to extract explicit and implicit features from product reviews. Finally, the domain sentiment dictionary and context sentiment dictionary are established to identify sentiment polarities for the extracted feature-sentiment combinations. An experiment is conducted on the basis of product reviews crawled from Chinese e-commerce websites. The results demonstrate the effectiveness of our approach.
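A minimal sketch of the rules-plus-ontology idea described here, with a toy English example. The ontology entries, sentiment dictionary, and window rule are all hypothetical; the paper's ontology is built semi-automatically with Word2Vec and targets Chinese reviews.

```python
# Hypothetical fine-grained ontology: surface forms -> canonical product feature
ontology = {
    "battery": "battery life", "battery life": "battery life", "charge": "battery life",
    "screen": "display", "display": "display", "resolution": "display",
}
# Hypothetical domain sentiment dictionary (polarity per opinion word)
sentiment = {"great": 1, "long": 1, "sharp": 1, "poor": -1, "dim": -1, "short": -1}

def extract_feature_sentiments(review):
    """Pair each recognized feature mention with a nearby opinion word."""
    tokens = review.lower().replace(",", " ").split()
    pairs = []
    for i, tok in enumerate(tokens):
        if tok in ontology:
            # toy rule: scan a +-2 token window for the first opinion word
            window = tokens[max(0, i - 2): i + 3]
            for w in window:
                if w in sentiment:
                    polarity = "positive" if sentiment[w] > 0 else "negative"
                    pairs.append((ontology[tok], polarity))
                    break
    return pairs

print(extract_feature_sentiments("Long battery but the screen is dim"))
```

The ontology lookup is what lets the synonym "screen" resolve to the canonical feature "display" before a polarity is assigned.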
  13. Wang, X.; High, A.; Wang, X.; Zhao, K.: Predicting users' continued engagement in online health communities from the quantity and quality of received support (2021) 0.03
    Abstract
     Online health communities (OHCs) have been major resources for people with similar health concerns to interact with each other. They offer easily accessible platforms for users to seek, receive, and provide support by posting. Taking advantage of text mining and machine learning techniques, we identified the social support type(s) in each post and a new user's support needs in an OHC. We examined a user's first-time support-seeking experience by measuring both the quantity and quality of received support. Our results revealed that the amount and match of received support are positive and significant predictors of new users' continued engagement. Our outcomes can provide insight for designing and managing a sustainable OHC by retaining users.
  14. Organisciak, P.; Schmidt, B.M.; Downie, J.S.: Giving shape to large digital libraries through exploratory data analysis (2022) 0.03
    Theme
    Data Mining
  15. Bi, Y.: Sentiment classification in social media data by combining triplet belief functions (2022) 0.03
    Abstract
     Sentiment analysis is an emerging technique that caters for semantic orientation and opinion mining. It is increasingly used to analyze online reviews and posts in order to identify people's opinions and attitudes toward products and events, to improve the business performance of companies, and to help devise better strategies for organizing events. This paper presents an innovative approach to combining the outputs of sentiment classifiers under the framework of belief functions. It consists of the formulation of sentiment classifier outputs in the triplet evidence structure and the development of general formulas for combining triplet functions derived from sentiment classification results via three evidential combination rules, along with comparative analyses. Empirical studies have been conducted to examine the effectiveness of our method for sentiment classification individually and in combination, and the results demonstrate that the best classifiers combined by our method outperform the best individual classifiers over five review datasets.
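The triplet evidence structure maps naturally onto mass functions over {positive}, {negative}, and the whole frame (uncertainty). Below is a minimal sketch of one of the three evidential combination rules, Dempster's rule, for two such triplets; the input masses are illustrative, not taken from the paper.

```python
def dempster_triplet(m1, m2):
    """Combine two triplet mass functions over {P}, {N}, and Theta = {P, N}.

    Keys: "P" = positive, "N" = negative, "T" = Theta (uncertain).
    """
    # conflict mass: one source says positive while the other says negative
    conflict = m1["P"] * m2["N"] + m1["N"] * m2["P"]
    norm = 1.0 - conflict
    return {
        "P": (m1["P"] * m2["P"] + m1["P"] * m2["T"] + m1["T"] * m2["P"]) / norm,
        "N": (m1["N"] * m2["N"] + m1["N"] * m2["T"] + m1["T"] * m2["N"]) / norm,
        "T": (m1["T"] * m2["T"]) / norm,  # residual uncertainty
    }

# illustrative outputs of two sentiment classifiers as (P, N, uncertain) masses
c1 = {"P": 0.6, "N": 0.2, "T": 0.2}
c2 = {"P": 0.5, "N": 0.3, "T": 0.2}
combined = dempster_triplet(c1, c2)
```

When both classifiers lean positive, the combined mass on "P" exceeds either individual mass and the residual uncertainty shrinks, which is the reinforcement effect the combination is after.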
  16. Wang, H.; Song, Y.-Q.; Wang, L.-T.: Memory model for web ad effect based on multimodal features (2020) 0.02
    Abstract
     Web ad effect evaluation is a challenging problem in web marketing research. Although the analysis of web ad effectiveness has achieved excellent results, there are still some deficiencies. First, there is a lack of in-depth study of the relevance between advertisements and web content. Second, there is no thorough analysis of the impact of user and advertising features on user browsing behaviors. Finally, the evaluation indexes of web advertisement effect are not adequate. Given the above problems, we conducted our work by studying the observer's behavioral pattern based on multimodal features. First, we analyze the correlation between ads and links with different search results and assess the influence of relevance on the observer's attention to web ads using eye-movement features. Then we investigate the user's behavioral sequence and propose the directional frequent-browsing pattern algorithm for mining the user's most commonly used browsing patterns. Finally, we propose "memory" as a new measure of advertising effectiveness and build an advertising memory model with integrated multimodal features for predicting the efficacy of web ads. A large number of experiments have proved the superiority of our method.
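The directional frequent-browsing pattern algorithm itself is not specified in the abstract; below is a generic sketch of the underlying idea, counting direction-preserving (ordered, contiguous) event n-grams across browsing sessions, with hypothetical event names.

```python
from collections import Counter

def frequent_patterns(sessions, length=2, min_support=2):
    """Count contiguous, direction-preserving event n-grams across sessions.

    A pattern is kept only if it occurs at least min_support times overall.
    """
    counts = Counter()
    for events in sessions:
        for i in range(len(events) - length + 1):
            counts[tuple(events[i:i + length])] += 1
    return {p: c for p, c in counts.items() if c >= min_support}

# hypothetical eye-tracked browsing sessions (event sequences)
sessions = [
    ["search", "result", "ad", "result"],
    ["search", "result", "ad"],
    ["search", "ad", "result"],
]
print(frequent_patterns(sessions))
```

Because the tuples preserve order, ("search", "result") and ("result", "search") are distinct patterns, which is what "directional" suggests.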
  17. Borgman, C.L.; Wofford, M.F.; Golshan, M.S.; Darch, P.T.: Collaborative qualitative research at scale : reflections on 20 years of acquiring global data and making data global (2021) 0.02
    0.022275863 = product of:
      0.044551726 = sum of:
        0.044551726 = product of:
          0.08910345 = sum of:
            0.08910345 = weight(_text_:mining in 239) [ClassicSimilarity], result of:
              0.08910345 = score(doc=239,freq=2.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.31170416 = fieldWeight in 239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=239)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Theme
    Data Mining
  18. Wang, X.; Song, N.; Zhou, H.; Cheng, H.: The representation of argumentation in scientific papers : a comparative analysis of two research areas (2022) 0.02
    0.022275863 = product of:
      0.044551726 = sum of:
        0.044551726 = product of:
          0.08910345 = sum of:
            0.08910345 = weight(_text_:mining in 567) [ClassicSimilarity], result of:
              0.08910345 = score(doc=567,freq=2.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.31170416 = fieldWeight in 567, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=567)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Scientific papers are essential manifestations of evolving scientific knowledge, and arguments are an important avenue to communicate research results. This study aims to understand how the argumentation process is represented in scientific papers, which is important for knowledge representation, discovery, and retrieval. First, based on fundamental argument theory and scientific discourse ontologies, a coding schema, including 17 categories was constructed. Thereafter, annotation experiments were conducted with 40 scientific articles randomly selected from two different research areas (library and information science and biomedical sciences). Statistical analysis and the sequential pattern mining method were then employed; the ratio of different argumentation units and evidence types were calculated, the argumentation semantics of figures and tables analyzed, and the argumentation structures extracted. A correlation analysis between argumentation and rhetorical structures was also performed to further reveal how argumentation was represented within scientific discourses. The results indicated a difference in the proportion of the argumentation units in the two types of scientific papers, as well as a similar linear construction with differences in the specific argument structures of each knowledge domain and a clear correlation between argumentation and rhetorical structure.
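The abstract above applies sequential pattern mining to sequences of annotated argumentation units. As a rough illustration (not the authors' actual method, which may use a full algorithm such as PrefixSpan), a simplified version counts contiguous label subsequences and keeps those meeting a minimum support threshold; the unit labels below are hypothetical:

```python
from collections import Counter

def frequent_patterns(sequences, length, min_support):
    """Count contiguous subsequences of the given length and keep those
    appearing in at least `min_support` sequences (a simplified stand-in
    for full sequential-pattern mining)."""
    counts = Counter()
    for seq in sequences:
        # count each pattern at most once per sequence (document support)
        seen = {tuple(seq[i:i + length]) for i in range(len(seq) - length + 1)}
        counts.update(seen)
    return {p: c for p, c in counts.items() if c >= min_support}

# Toy annotation sequences of argumentation units
seqs = [
    ["claim", "evidence", "claim", "conclusion"],
    ["claim", "evidence", "conclusion"],
    ["evidence", "claim", "evidence", "conclusion"],
]
patterns = frequent_patterns(seqs, 2, min_support=2)
```

Here the pair ("claim", "evidence") occurs in all three toy sequences, so it survives the support threshold, while ("claim", "conclusion") appears only once and is dropped.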
  19. Liu, J.; Zhou, Z.; Gao, M.; Tang, J.; Fan, W.: Aspect sentiment mining of short bullet screen comments from online TV series (2023) 0.02
    0.022275863 = product of:
      0.044551726 = sum of:
        0.044551726 = product of:
          0.08910345 = sum of:
            0.08910345 = weight(_text_:mining in 1018) [ClassicSimilarity], result of:
              0.08910345 = score(doc=1018,freq=2.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.31170416 = fieldWeight in 1018, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1018)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  20. Safder, I.; Ali, M.; Aljohani, N.R.; Nawaz, R.; Hassan, S.-U.: Neural machine translation for in-text citation classification (2023) 0.02
    0.022275863 = product of:
      0.044551726 = sum of:
        0.044551726 = product of:
          0.08910345 = sum of:
            0.08910345 = weight(_text_:mining in 1053) [ClassicSimilarity], result of:
              0.08910345 = score(doc=1053,freq=2.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.31170416 = fieldWeight in 1053, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1053)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The quality of scientific publications can be measured by quantitative indices such as the h-index, Source Normalized Impact per Paper, or g-index. However, these measures fail to explain the function of or reasons for citations, or the context of citations from citing to cited publication. We argue that citation context should be considered when calculating the impact of research work. However, mining citation context from unstructured full-text publications is a challenging task. In this paper, we compiled a data set comprising 9,518 citation contexts. We developed a deep learning-based architecture for citation context classification. Unlike feature-based state-of-the-art models, our proposed focal-loss and class-weight-aware BiLSTM model with pretrained GloVe embedding vectors uses citation context as input to outperform them in multiclass citation context classification tasks. Our model improves on the baseline state of the art by achieving an F1 score of 0.80 with an accuracy of 0.81 for citation context classification. Moreover, we examine the effects of different word embeddings on classification performance, comparing fastText, GloVe, and spaCy pretrained word embeddings.
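The abstract above mentions a focal-loss-aware model. Focal loss (Lin et al., 2017) rescales cross-entropy by (1 - p)^gamma so that confidently classified examples contribute little loss and training focuses on hard, often minority-class examples; a minimal single-example sketch, with default alpha and gamma chosen for illustration only:

```python
import math

def focal_loss(p_true, gamma=2.0, alpha=0.25):
    """Focal loss for one example, where `p_true` is the model's
    predicted probability for the correct class. The (1 - p)^gamma
    factor down-weights easy, well-classified examples."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

# A confident correct prediction contributes far less loss than an uncertain one
easy = focal_loss(0.9)
hard = focal_loss(0.3)
```

With gamma = 2, the easy example's loss is scaled by 0.01 while the hard example's is scaled by 0.49, which is what lets the classifier keep learning from rare citation-context classes.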

Types

  • a 90
  • el 2
  • m 2
  • p 2