Search (55 results, page 1 of 3)

  • year_i:[2010 TO 2020}
  • theme_ss:"Data Mining"
  1. Blake, C.: Text mining (2011) 0.09
    0.088207915 = product of:
      0.17641583 = sum of:
        0.17641583 = product of:
          0.35283166 = sum of:
            0.35283166 = weight(_text_:mining in 1599) [ClassicSimilarity], result of:
              0.35283166 = score(doc=1599,freq=4.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                1.2342855 = fieldWeight in 1599, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1599)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Theme
    Data Mining
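The score tree for result 1 can be reproduced by hand. The sketch below assumes the standard Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), both of which agree with the values printed in the explain output. The function names are illustrative and not part of any library, and queryNorm, fieldNorm, and the two coord factors are simply copied from the listing.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight, as in the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explain output for doc 1599 (result 1).
idf_mining = classic_idf(doc_freq=425, max_docs=44218)
score_1599 = term_score(freq=4.0, doc_freq=425, max_docs=44218,
                        query_norm=0.05066224, field_norm=0.109375)

# Two nested coord(1/2) factors scale the term score down to the final value.
final_1599 = score_1599 * 0.5 * 0.5

print(f"idf={idf_mining:.6f} score={score_1599:.6f} final={final_1599:.6f}")
```

Small differences in the last digits are expected, because Lucene computes these values in 32-bit floats while Python uses 64-bit floats.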
  2. Tonkin, E.L.; Tourte, G.J.L.: Working with text. tools, techniques and approaches for text mining (2016) 0.09
    0.08627405 = product of:
      0.1725481 = sum of:
        0.1725481 = product of:
          0.3450962 = sum of:
            0.3450962 = weight(_text_:mining in 4019) [ClassicSimilarity], result of:
              0.3450962 = score(doc=4019,freq=30.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                1.2072251 = fieldWeight in 4019, product of:
                  5.477226 = tf(freq=30.0), with freq of:
                    30.0 = termFreq=30.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4019)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.
    LCSH
    Data mining
    RSWK
    Text Mining / Aufsatzsammlung
    Subject
    Text Mining / Aufsatzsammlung
    Data mining
    Theme
    Data Mining
  3. Mining text data (2012) 0.09
    0.08546503 = product of:
      0.17093006 = sum of:
        0.17093006 = product of:
          0.34186012 = sum of:
            0.34186012 = weight(_text_:mining in 362) [ClassicSimilarity], result of:
              0.34186012 = score(doc=362,freq=46.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                1.1959045 = fieldWeight in 362, product of:
                  6.78233 = tf(freq=46.0), with freq of:
                    46.0 = termFreq=46.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.03125 = fieldNorm(doc=362)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Text mining applications have experienced tremendous advances because of web 2.0 and social networking applications. Recent advances in hardware and software technology have led to a number of unique scenarios where text mining algorithms are learned. Mining Text Data introduces an important niche in the text analytics field, and is an edited volume contributed by leading international researchers and practitioners focused on social networks & data mining. This book contains a wide swath of topics across social networks & data mining. Each chapter contains a comprehensive survey including the key research content on the topic, and the future directions of research in the field. There is a special focus on Text Embedded with Heterogeneous and Multimedia Data which makes the mining process much more challenging. A number of methods have been designed such as transfer learning and cross-lingual mining for such cases. Mining Text Data simplifies the content, so that advanced-level students, practitioners and researchers in computer science can benefit from this book. Academic and corporate libraries, as well as ACM, IEEE, and Management Science focused on information security, electronic commerce, databases, data mining, machine learning, and statistics are the primary buyers for this reference book.
    Content
    Contents: An Introduction to Text Mining.- Information Extraction from Text.- A Survey of Text Summarization Techniques.- A Survey of Text Clustering Algorithms.- Dimensionality Reduction and Topic Modeling.- A Survey of Text Classification Algorithms.- Transfer Learning for Text Mining.- Probabilistic Models for Text Mining.- Mining Text Streams.- Translingual Mining from Text Data.- Text Mining in Multimedia.- Text Analytics in Social Media.- A Survey of Opinion Mining and Sentiment Analysis.- Biomedical Text Mining: A Survey of Recent Progress.- Index.
    LCSH
    Data mining
    RSWK
    Text Mining / Aufsatzsammlung
    Subject
    Text Mining / Aufsatzsammlung
    Data mining
    Theme
    Data Mining
  4. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.08
    0.080165744 = product of:
      0.16033149 = sum of:
        0.16033149 = sum of:
          0.12601131 = weight(_text_:mining in 1605) [ClassicSimilarity], result of:
            0.12601131 = score(doc=1605,freq=4.0), product of:
              0.28585905 = queryWeight, product of:
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.05066224 = queryNorm
              0.44081625 = fieldWeight in 1605, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1605)
          0.034320172 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
            0.034320172 = score(doc=1605,freq=2.0), product of:
              0.17741053 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05066224 = queryNorm
              0.19345059 = fieldWeight in 1605, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1605)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, pp. 13-22
    Theme
    Data Mining
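Result 4 differs from the single-term results: two terms match (`mining` and `22`), their scores are summed, and one coord(1/2) factor is applied. The sketch below reproduces that sum under the same assumed ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The `TermMatch` class and `term_score` helper are hypothetical names introduced only for illustration, and all numeric inputs are copied from the explain output for doc 1605.

```python
import math
from dataclasses import dataclass


@dataclass
class TermMatch:
    """One matching term clause, with the inputs shown in the explain tree."""
    term: str
    freq: float        # termFreq in the document field
    doc_freq: int      # number of documents containing the term
    field_norm: float  # length normalization stored for the field


def term_score(m: TermMatch, max_docs: int, query_norm: float) -> float:
    """queryWeight * fieldWeight for a single term."""
    idf = 1.0 + math.log(max_docs / (m.doc_freq + 1))
    query_weight = idf * query_norm
    field_weight = math.sqrt(m.freq) * idf * m.field_norm
    return query_weight * field_weight


MAX_DOCS = 44218
QUERY_NORM = 0.05066224

# Both clauses matched in doc 1605; values copied from the listing.
matches = [
    TermMatch("mining", freq=4.0, doc_freq=425, field_norm=0.0390625),
    TermMatch("22", freq=2.0, doc_freq=3622, field_norm=0.0390625),
]

per_term = {m.term: term_score(m, MAX_DOCS, QUERY_NORM) for m in matches}
inner_sum = sum(per_term.values())

# Only the outer coord(1/2) applies here; the inner node is a plain sum.
final_1605 = inner_sum * 0.5

for term, value in per_term.items():
    print(f"{term}: {value:.6f}")
print(f"sum={inner_sum:.6f} final={final_1605:.6f}")
```

The `22` clause adds only about a fifth of the summed score, yet that is enough to lift this record above result 5, whose `mining` frequency is much higher.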
  5. Varathan, K.D.; Giachanou, A.; Crestani, F.: Comparative opinion mining : a review (2017) 0.07
    0.07388068 = product of:
      0.14776136 = sum of:
        0.14776136 = product of:
          0.29552272 = sum of:
            0.29552272 = weight(_text_:mining in 3540) [ClassicSimilarity], result of:
              0.29552272 = score(doc=3540,freq=22.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                1.0338057 = fieldWeight in 3540, product of:
                  4.690416 = tf(freq=22.0), with freq of:
                    22.0 = termFreq=22.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3540)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Opinion mining refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information in textual material. Opinion mining, also known as sentiment analysis, has received a lot of attention in recent times, as it provides a number of tools to analyze public opinion on a number of different topics. Comparative opinion mining is a subfield of opinion mining which deals with identifying and extracting information that is expressed in a comparative form (e.g., "paper X is better than the Y"). Comparative opinion mining plays a very important role when one tries to evaluate something because it provides a reference point for the comparison. This paper provides a review of the area of comparative opinion mining. It is the first review that specifically covers this topic, as all previous reviews dealt mostly with general opinion mining. This survey covers comparative opinion mining from two different angles: one from the perspective of techniques and the other from the perspective of comparative opinion elements. It also incorporates preprocessing tools as well as data sets that were used by past researchers and that can be useful to future researchers in the field of comparative opinion mining.
    Theme
    Data Mining
  6. Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.07
    0.07128276 = product of:
      0.14256552 = sum of:
        0.14256552 = product of:
          0.28513104 = sum of:
            0.28513104 = weight(_text_:mining in 354) [ClassicSimilarity], result of:
              0.28513104 = score(doc=354,freq=32.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.9974533 = fieldWeight in 354, product of:
                  5.656854 = tf(freq=32.0), with freq of:
                    32.0 = termFreq=32.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.03125 = fieldNorm(doc=354)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Web mining aims to discover useful information and knowledge from the Web hyperlink structure, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining due to the semistructured and unstructured nature of the Web data and its heterogeneity. It has also developed many of its own algorithms and techniques. Liu has written a comprehensive text on Web data mining. Key topics of structure mining, content mining, and usage mining are covered both in breadth and in depth. His book brings together all the essential concepts and algorithms from related areas such as data mining, machine learning, and text processing to form an authoritative and coherent text. The book offers a rich blend of theory and practice, addressing seminal research ideas, as well as examining the technology from a practical point of view. It is suitable for students, researchers and practitioners interested in Web mining both as a learning text and a reference book. Lecturers can readily use it for classes on data mining, Web mining, and Web search. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
    RSWK
    World Wide Web / Data Mining
    Subject
    World Wide Web / Data Mining
    Theme
    Data Mining
  7. Mandl, T.: Text mining und data mining (2013) 0.06
    0.063005656 = product of:
      0.12601131 = sum of:
        0.12601131 = product of:
          0.25202262 = sum of:
            0.25202262 = weight(_text_:mining in 713) [ClassicSimilarity], result of:
              0.25202262 = score(doc=713,freq=4.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.8816325 = fieldWeight in 713, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.078125 = fieldNorm(doc=713)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Theme
    Data Mining
  8. Miao, Q.; Li, Q.; Zeng, D.: Fine-grained opinion mining by integrating multiple review sources (2010) 0.06
    0.062372416 = product of:
      0.12474483 = sum of:
        0.12474483 = product of:
          0.24948967 = sum of:
            0.24948967 = weight(_text_:mining in 4104) [ClassicSimilarity], result of:
              0.24948967 = score(doc=4104,freq=8.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.8727716 = fieldWeight in 4104, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4104)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    With the rapid development of Web 2.0, online reviews have become extremely valuable sources for mining customers' opinions. Fine-grained opinion mining has attracted increasing attention in both applied and theoretical research. In this article, the authors study how to automatically mine product features and opinions from multiple review sources. Specifically, they propose an integration strategy to solve the issue. Within the integration strategy, the authors mine domain knowledge from semistructured reviews and then exploit the domain knowledge to assist product feature extraction and sentiment orientation identification from unstructured reviews. Finally, feature-opinion tuples are generated. Experimental results on real-world datasets show that the proposed approach is effective.
    Theme
    Data Mining
  9. Winterhalter, C.: Licence to mine : ein Überblick über Rahmenbedingungen von Text and Data Mining und den aktuellen Stand der Diskussion (2016) 0.06
    0.061732687 = product of:
      0.123465374 = sum of:
        0.123465374 = product of:
          0.24693075 = sum of:
            0.24693075 = weight(_text_:mining in 673) [ClassicSimilarity], result of:
              0.24693075 = score(doc=673,freq=6.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.86381996 = fieldWeight in 673, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0625 = fieldNorm(doc=673)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The article gives an overview of the possibilities for applying text and data mining (TDM) and similar methods on the basis of existing provisions in licence agreements for fee-based electronic resources, the debate over additional licences for TDM, using Elsevier's TDM policy as an example, and the state of the discussion on introducing copyright exceptions for TDM for non-commercial scientific purposes.
    Theme
    Data Mining
  10. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.06
    0.06171181 = product of:
      0.12342362 = sum of:
        0.12342362 = sum of:
          0.08910345 = weight(_text_:mining in 668) [ClassicSimilarity], result of:
            0.08910345 = score(doc=668,freq=2.0), product of:
              0.28585905 = queryWeight, product of:
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.05066224 = queryNorm
              0.31170416 = fieldWeight in 668, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.0390625 = fieldNorm(doc=668)
          0.034320172 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
            0.034320172 = score(doc=668,freq=2.0), product of:
              0.17741053 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05066224 = queryNorm
              0.19345059 = fieldWeight in 668, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=668)
      0.5 = coord(1/2)
    
    Date
    22. 3.2013 19:43:01
    Theme
    Data Mining
  11. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.06
    0.06171181 = product of:
      0.12342362 = sum of:
        0.12342362 = sum of:
          0.08910345 = weight(_text_:mining in 5011) [ClassicSimilarity], result of:
            0.08910345 = score(doc=5011,freq=2.0), product of:
              0.28585905 = queryWeight, product of:
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.05066224 = queryNorm
              0.31170416 = fieldWeight in 5011, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5011)
          0.034320172 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
            0.034320172 = score(doc=5011,freq=2.0), product of:
              0.17741053 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05066224 = queryNorm
              0.19345059 = fieldWeight in 5011, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5011)
      0.5 = coord(1/2)
    
    Date
    7. 3.2019 16:32:22
    Theme
    Data Mining
  12. Huvila, I.: Mining qualitative data on human information behaviour from the Web (2010) 0.05
    0.054016102 = product of:
      0.108032204 = sum of:
        0.108032204 = product of:
          0.21606441 = sum of:
            0.21606441 = weight(_text_:mining in 4676) [ClassicSimilarity], result of:
              0.21606441 = score(doc=4676,freq=6.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.75584245 = fieldWeight in 4676, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4676)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper discusses an approach to collecting qualitative data on human information behaviour that is based on mining web data using search engines. The approach is technically the same as that used for some time in webometric research to make statistical inferences on web data, but the present paper shows how the same tools and data collecting methods can be used to gather data for qualitative data analysis on human information behaviour.
    Theme
    Data Mining
  13. Short, M.: Text mining and subject analysis for fiction; or, using machine learning and information extraction to assign subject headings to dime novels (2019) 0.05
    0.054016102 = product of:
      0.108032204 = sum of:
        0.108032204 = product of:
          0.21606441 = sum of:
            0.21606441 = weight(_text_:mining in 5481) [ClassicSimilarity], result of:
              0.21606441 = score(doc=5481,freq=6.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.75584245 = fieldWeight in 5481, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5481)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between 1860 and 1915. NIU holds more than 55,000 dime novels in its collections, which it is in the process of comprehensively digitizing. Classification, keyword extraction, named-entity recognition, clustering, and topic modeling are discussed as means of assigning subject headings to improve their discoverability by researchers and to increase the productivity of digitization workflows.
    Theme
    Data Mining
  14. Chen, Y.-L.; Liu, Y.-H.; Ho, W.-L.: A text mining approach to assist the general public in the retrieval of legal documents (2013) 0.05
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = product of:
          0.2138483 = sum of:
            0.2138483 = weight(_text_:mining in 521) [ClassicSimilarity], result of:
              0.2138483 = score(doc=521,freq=8.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.74808997 = fieldWeight in 521, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=521)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Applying text mining techniques to legal issues has been an emerging research topic in recent years. Although some previous studies focused on assisting professionals in the retrieval of related legal documents, they did not take into account the general public and their difficulty in describing legal problems in professional legal terms. Because this problem has not been addressed by previous research, this study aims to design a text-mining-based method that allows the general public to use everyday vocabulary to search for and retrieve criminal judgments. The experimental results indicate that our method can help the general public, who are not familiar with professional legal terms, to acquire relevant criminal judgments more accurately and effectively.
    Theme
    Data Mining
  15. Qiu, X.Y.; Srinivasan, P.; Hu, Y.: Supervised learning models to predict firm performance with annual reports : an empirical study (2014) 0.05
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = product of:
          0.2138483 = sum of:
            0.2138483 = weight(_text_:mining in 1205) [ClassicSimilarity], result of:
              0.2138483 = score(doc=1205,freq=8.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.74808997 = fieldWeight in 1205, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1205)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Text mining and machine learning methodologies have been applied toward knowledge discovery in several domains, such as biomedicine and business. Interestingly, in the business domain, the text mining and machine learning community has minimally explored company annual reports with their mandatory disclosures. In this study, we explore the question "How can annual reports be used to predict change in company performance from one year to the next?" from a text mining perspective. Our article contributes a systematic study of the potential of company mandatory disclosures using a computational viewpoint in the following aspects: (a) We characterize our research problem along distinct dimensions to gain a reasonably comprehensive understanding of the capacity of supervised learning methods in predicting change in company performance using annual reports, and (b) our findings from unbiased systematic experiments provide further evidence about the economic incentives faced by analysts in their stock recommendations and speculations on analysts having access to more information in producing earnings forecasts.
    Theme
    Data Mining
  16. Drees, B.: Text und data mining : Herausforderungen und Möglichkeiten für Bibliotheken (2016) 0.05
    0.053462073 = product of:
      0.10692415 = sum of:
        0.10692415 = product of:
          0.2138483 = sum of:
            0.2138483 = weight(_text_:mining in 3952) [ClassicSimilarity], result of:
              0.2138483 = score(doc=3952,freq=8.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.74808997 = fieldWeight in 3952, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3952)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Text and data mining (TDM) is becoming increasingly important as a scientific method and thus presents academic libraries with new challenges, while at the same time offering new opportunities. This article gives an overview of TDM from a library perspective. To this end, the term text and data mining is discussed in the context of related terms, and the goals, tasks, and methods of TDM are explained. These are illustrated by exemplary TDM applications in science and research. Technical and legal problems and obstacles in the TDM context are also set out. Finally, the relevance of TDM for libraries is shown, both in their role as information intermediaries and providers and as users of TDM methods. In addition, as part of this work, a survey of operators of document servers at libraries in Germany on their current handling of TDM was conducted, which shows that there is still much room for development. The research data underlying the article are published under DOI 10.11588/data/10090.
    Theme
    Data Mining
  17. Chardonnens, A.; Hengchen, S.: Text mining for cultural heritage institutions : a 5-step method for cultural heritage institutions (2017) 0.05
    0.050404526 = product of:
      0.10080905 = sum of:
        0.10080905 = product of:
          0.2016181 = sum of:
            0.2016181 = weight(_text_:mining in 646) [ClassicSimilarity], result of:
              0.2016181 = score(doc=646,freq=4.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.705306 = fieldWeight in 646, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0625 = fieldNorm(doc=646)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Theme
    Data Mining
  18. Tu, Y.-N.; Hsu, S.-L.: Constructing conceptual trajectory maps to trace the development of research fields (2016) 0.05
    0.049810346 = product of:
      0.09962069 = sum of:
        0.09962069 = product of:
          0.19924138 = sum of:
            0.19924138 = weight(_text_:mining in 3059) [ClassicSimilarity], result of:
              0.19924138 = score(doc=3059,freq=10.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.6969917 = fieldWeight in 3059, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3059)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This study proposes a new method to construct and trace the trajectory of conceptual development of a research field by combining main path analysis, citation analysis, and text-mining techniques. Main path analysis, a method used commonly to trace the most critical path in a citation network, helps describe the developmental trajectory of a research field. This study extends the main path analysis method and applies text-mining techniques in the new method, which reflects the trajectory of conceptual development in an academic research field more accurately than citation frequency, which represents only the articles examined. Articles can be merged based on similarity of concepts, and by merging concepts the history of a research field can be described more precisely. The new method was applied to the "h-index" and "text mining" fields. The precision, recall, and F-measures of the h-index were 0.738, 0.652, and 0.658 and those of text-mining were 0.501, 0.653, and 0.551, respectively. Last, this study not only establishes the conceptual trajectory map of a research field, but also recommends keywords that are more precise than those used currently by researchers. These precise keywords could enable researchers to gather related works more quickly than before.
    Theme
    Data Mining
  19. Jäger, L.: Von Big Data zu Big Brother (2018) 0.05
    0.049369447 = product of:
      0.098738894 = sum of:
        0.098738894 = sum of:
          0.07128276 = weight(_text_:mining in 5234) [ClassicSimilarity], result of:
            0.07128276 = score(doc=5234,freq=2.0), product of:
              0.28585905 = queryWeight, product of:
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.05066224 = queryNorm
              0.24936332 = fieldWeight in 5234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.03125 = fieldNorm(doc=5234)
          0.027456136 = weight(_text_:22 in 5234) [ClassicSimilarity], result of:
            0.027456136 = score(doc=5234,freq=2.0), product of:
              0.17741053 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05066224 = queryNorm
              0.15476047 = fieldWeight in 5234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=5234)
      0.5 = coord(1/2)
    
    Date
    22. 1.2018 11:33:49
    Theme
    Data Mining
  20. Liu, X.; Yu, S.; Janssens, F.; Glänzel, W.; Moreau, Y.; Moor, B.de: Weighted hybrid clustering by combining text mining and bibliometrics on a large-scale journal database (2010) 0.05
    0.046299513 = product of:
      0.09259903 = sum of:
        0.09259903 = product of:
          0.18519805 = sum of:
            0.18519805 = weight(_text_:mining in 3464) [ClassicSimilarity], result of:
              0.18519805 = score(doc=3464,freq=6.0), product of:
                0.28585905 = queryWeight, product of:
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.05066224 = queryNorm
                0.64786494 = fieldWeight in 3464, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.642448 = idf(docFreq=425, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3464)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We propose a new hybrid clustering framework to incorporate text mining with bibliometrics in journal set analysis. The framework integrates two different approaches: clustering ensemble and kernel-fusion clustering. To improve the flexibility and the efficiency of processing large-scale data, we propose an information-based weighting scheme to leverage the effect of multiple data sources in hybrid clustering. Three different algorithms are extended by the proposed weighting scheme and they are employed on a large journal set retrieved from the Web of Science (WoS) database. The clustering performance of the proposed algorithms is systematically evaluated using multiple evaluation methods, and they were cross-compared with alternative methods. Experimental results demonstrate that the proposed weighted hybrid clustering strategy is superior to other methods in clustering performance and efficiency. The proposed approach also provides a more refined structural mapping of journal sets, which is useful for monitoring and detecting new trends in different scientific fields.
    Theme
    Data Mining

Languages

  • e 47
  • d 8

Types

  • a 52
  • el 12
  • m 2
  • p 1
  • s 1