Search (683 results, page 1 of 35)

Blake, C.: Text mining (2011) 0.09

0.088207915 = product of:
  0.17641583 = sum of:
    0.17641583 = product of:
      0.35283166 = sum of:
        0.35283166 = weight(_text_:mining in 1599) [ClassicSimilarity], result of:
          0.35283166 = score(doc=1599,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.2342855 = fieldWeight in 1599, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.109375 = fieldNorm(doc=1599)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Theme: Data Mining
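
Note on the score breakdown: the explain tree for the Blake record can be reproduced by hand. The sketch below is a minimal illustration, not the search engine's own code. It assumes the older ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which match the idf value listed above. The function names are hypothetical. queryNorm, fieldNorm and the two coord factors are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed older ClassicSimilarity variant; it reproduces idf=5.642448 above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    tf = math.sqrt(freq)                   # 2.0 for freq=4.0
    idf = classic_idf(doc_freq, max_docs)  # about 5.642448
    query_weight = idf * query_norm        # "queryWeight" line in the tree
    field_weight = tf * idf * field_norm   # "fieldWeight" line in the tree
    return query_weight * field_weight

# Values copied from the explain tree for doc 1599.
term = classic_term_score(freq=4.0, doc_freq=425, max_docs=44218,
                          query_norm=0.05066224, field_norm=0.109375)
final = term * 0.5 * 0.5  # the two coord(1/2) factors from the listing
print(f"term weight = {term:.6f}, final score = {final:.6f}")
```

The computation runs in double precision, whereas the listing shows single-precision floats, so the last digits may differ slightly.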

Tonkin, E.L.; Tourte, G.J.L.: Working with text. tools, techniques and approaches for text mining (2016) 0.09
```
0.08627405 = product of:
  0.1725481 = sum of:
    0.1725481 = product of:
      0.3450962 = sum of:
        0.3450962 = weight(_text_:mining in 4019) [ClassicSimilarity], result of:
          0.3450962 = score(doc=4019,freq=30.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.2072251 = fieldWeight in 4019, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4019)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.

LCSH

Data mining

RSWK

Text Mining / Aufsatzsammlung

Subject

Text Mining / Aufsatzsammlung
Data mining

Theme

Data Mining
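
Note on ranking: the Tonkin/Tourte record contains "mining" 30 times yet ranks just below the Blake record, which contains it 4 times. Within the explain trees, idf and queryNorm are identical for both hits, so the difference comes down to sqrt(termFreq) * fieldNorm. The sketch below only compares that product, using values copied from the two trees. The helper name is hypothetical. As general background, fieldNorm in ClassicSimilarity is a length normalization that shrinks as the indexed field gets longer.

```python
import math

# (label, termFreq, fieldNorm) copied from the two explain trees above
hits = [
    ("Blake 2011, doc 1599", 4.0, 0.109375),
    ("Tonkin/Tourte 2016, doc 4019", 30.0, 0.0390625),
]

def tf_times_norm(freq: float, field_norm: float) -> float:
    # The only part of fieldWeight that differs between the two hits.
    return math.sqrt(freq) * field_norm

factors = {label: tf_times_norm(freq, norm) for label, freq, norm in hits}
for label, value in factors.items():
    print(f"{label}: sqrt(freq) * fieldNorm = {value:.6f}")
```

The larger fieldNorm of the shorter Blake record slightly outweighs the higher term frequency of the longer record, which is consistent with the order shown in the result list.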

Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.08

0.080165744 = product of:
  0.16033149 = sum of:
    0.16033149 = sum of:
      0.12601131 = weight(_text_:mining in 1605) [ClassicSimilarity], result of:
        0.12601131 = score(doc=1605,freq=4.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.44081625 = fieldWeight in 1605, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1605)
      0.034320172 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
        0.034320172 = score(doc=1605,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 1605, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1605)
  0.5 = coord(1/2)

Source: Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
Theme: Data Mining
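
Note on multi-term hits: in the Vaughan/Chen record two query terms match ("mining" and "22"), and the tree sums their weights before applying a single coord(1/2). The sketch below retraces that sum under the same assumed ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The names are hypothetical and all inputs are copied from the listing.

```python
import math

MAX_DOCS = 44218         # maxDocs from the explain tree
QUERY_NORM = 0.05066224  # queryNorm from the explain tree

def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# (term, termFreq, docFreq, fieldNorm) for doc 1605
matched = [
    ("mining", 4.0, 425, 0.0390625),
    ("22", 2.0, 3622, 0.0390625),
]

parts = {t: term_score(freq, df, norm) for t, freq, df, norm in matched}
total = sum(parts.values()) * 0.5  # coord(1/2) copied from the listing
for t, value in parts.items():
    print(f"{t}: {value:.6f}")
print(f"total: {total:.6f}")
```

The term "22" is common in the index (docFreq=3622), so its idf and its share of the total are much smaller than those of "mining".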

Arbelaitz, O.; Martínez-Otzeta, J.M.; Muguerza, J.: User modeling in a social network for cognitively disabled people (2016) 0.07

0.074054174 = product of:
  0.14810835 = sum of:
    0.14810835 = sum of:
      0.10692415 = weight(_text_:mining in 2639) [ClassicSimilarity], result of:
        0.10692415 = score(doc=2639,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.37404498 = fieldWeight in 2639, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.046875 = fieldNorm(doc=2639)
      0.0411842 = weight(_text_:22 in 2639) [ClassicSimilarity], result of:
        0.0411842 = score(doc=2639,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.23214069 = fieldWeight in 2639, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2639)
  0.5 = coord(1/2)

Abstract: Online communities are becoming an important tool in the communication and participation processes in our society. However, the most widespread applications are difficult to use for people with disabilities, or may involve some risks if no previous training has been undertaken. This work describes a novel social network for cognitively disabled people along with a clustering-based method for modeling activity and socialization processes of its users in a noninvasive way. This closed social network, called Guremintza, is specifically designed for people with cognitive disabilities and provides the network administrators (e.g., social workers) with two types of reports: summary statistics of the network usage and behavior patterns discovered by a data mining process. Experiments made in an initial stage of the network show that the discovered patterns are meaningful to the social workers, who find them useful in monitoring the progress of the users.
Date: 22. 1.2016 12:02:26

Varathan, K.D.; Giachanou, A.; Crestani, F.: Comparative opinion mining : a review (2017) 0.07
```
0.07388068 = product of:
  0.14776136 = sum of:
    0.14776136 = product of:
      0.29552272 = sum of:
        0.29552272 = weight(_text_:mining in 3540) [ClassicSimilarity], result of:
          0.29552272 = score(doc=3540,freq=22.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.0338057 = fieldWeight in 3540, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3540)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Opinion mining refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information in textual material. Opinion mining, also known as sentiment analysis, has received a lot of attention in recent times, as it provides a number of tools to analyze public opinion on a number of different topics. Comparative opinion mining is a subfield of opinion mining which deals with identifying and extracting information that is expressed in a comparative form (e.g., "paper X is better than the Y"). Comparative opinion mining plays a very important role when one tries to evaluate something because it provides a reference point for the comparison. This paper provides a review of the area of comparative opinion mining. It is the first review that specifically covers this topic, as all previous reviews dealt mostly with general opinion mining. This survey covers comparative opinion mining from two different angles: one from the perspective of techniques and the other from the perspective of comparative opinion elements. It also incorporates preprocessing tools as well as data sets that were used by past researchers and that can be useful to future researchers in the field of comparative opinion mining.

Theme

Data Mining

Mandl, T.: Text mining und data mining (2013) 0.06

0.063005656 = product of:
  0.12601131 = sum of:
    0.12601131 = product of:
      0.25202262 = sum of:
        0.25202262 = weight(_text_:mining in 713) [ClassicSimilarity], result of:
          0.25202262 = score(doc=713,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.8816325 = fieldWeight in 713, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.078125 = fieldNorm(doc=713)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Theme: Data Mining

Miao, Q.; Li, Q.; Zeng, D.: Fine-grained opinion mining by integrating multiple review sources (2010) 0.06
```
0.062372416 = product of:
  0.12474483 = sum of:
    0.12474483 = product of:
      0.24948967 = sum of:
        0.24948967 = weight(_text_:mining in 4104) [ClassicSimilarity], result of:
          0.24948967 = score(doc=4104,freq=8.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.8727716 = fieldWeight in 4104, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4104)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

With the rapid development of Web 2.0, online reviews have become extremely valuable sources for mining customers' opinions. Fine-grained opinion mining has attracted more and more attention of both applied and theoretical research. In this article, the authors study how to automatically mine product features and opinions from multiple review sources. Specifically, they propose an integration strategy to solve the issue. Within the integration strategy, the authors mine domain knowledge from semistructured reviews and then exploit the domain knowledge to assist product feature extraction and sentiment orientation identification from unstructured reviews. Finally, feature-opinion tuples are generated. Experimental results on real-world datasets show that the proposed approach is effective.

Theme

Data Mining

Winterhalter, C.: Licence to mine : ein Überblick über Rahmenbedingungen von Text and Data Mining und den aktuellen Stand der Diskussion (2016) 0.06

0.061732687 = product of:
  0.123465374 = sum of:
    0.123465374 = product of:
      0.24693075 = sum of:
        0.24693075 = weight(_text_:mining in 673) [ClassicSimilarity], result of:
          0.24693075 = score(doc=673,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.86381996 = fieldWeight in 673, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0625 = fieldNorm(doc=673)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: The article gives an overview of the options for applying text and data mining (TDM) and similar methods under existing provisions in license agreements for paid electronic resources, of the debate about additional licenses for TDM using Elsevier's TDM policy as an example, and of the state of the discussion on introducing copyright exceptions for TDM for non-commercial scientific purposes.
Theme: Data Mining

Cui, H.: Competency evaluation of plant character ontologies against domain literature (2010) 0.06
```
0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 3466) [ClassicSimilarity], result of:
        0.08910345 = score(doc=3466,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 3466, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3466)
      0.034320172 = weight(_text_:22 in 3466) [ClassicSimilarity], result of:
        0.034320172 = score(doc=3466,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 3466, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3466)
  0.5 = coord(1/2)
```
Abstract

Specimen identification keys are still the most commonly created tools used by systematic biologists to access biodiversity information. Creating identification keys requires analyzing and synthesizing large amounts of information from specimens and their descriptions and is a very labor-intensive and time-consuming activity. Automating the generation of identification keys from text descriptions becomes a highly attractive text mining application in the biodiversity domain. Fine-grained semantic annotation of morphological descriptions of organisms is a necessary first step in generating keys from text. Machine-readable ontologies are needed in this process because most biological characters are only implied (i.e., not stated) in descriptions. The immediate question to ask is: How well do existing ontologies support semantic annotation and automated key generation? With the intention to either select an existing ontology or develop a unified ontology based on existing ones, this paper evaluates the coverage, semantic consistency, and inter-ontology agreement of a biodiversity character ontology and three plant glossaries that may be turned into ontologies. The coverage and semantic consistency of the ontology/glossaries are checked against the authoritative domain literature, namely, Flora of North America and Flora of China. The evaluation results suggest that more work is needed to improve the coverage and interoperability of the ontology/glossaries. More concepts need to be added to the ontology/glossaries and careful work is needed to improve the semantic consistency. The method used in this paper to evaluate the ontology/glossaries can be used to propose new candidate concepts from the domain literature and suggest appropriate definitions.

Date

1. 6.2010 9:55:22

Yi, K.: Harnessing collective intelligence in social tagging using Delicious (2012) 0.06
```
0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 515) [ClassicSimilarity], result of:
        0.08910345 = score(doc=515,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 515, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=515)
      0.034320172 = weight(_text_:22 in 515) [ClassicSimilarity], result of:
        0.034320172 = score(doc=515,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 515, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=515)
  0.5 = coord(1/2)
```
Abstract

A new collaborative approach in information organization and sharing has recently arisen, known as collaborative tagging or social indexing. A key element of collaborative tagging is the concept of collective intelligence (CI), which is a shared intelligence among all participants. This research investigates the phenomenon of social tagging in the context of CI with the aim to serve as a stepping-stone towards the mining of truly valuable social tags for web resources. This study focuses on assessing and evaluating the degree of CI embedded in social tagging over time in terms of two parameter values: the number of participants and the top frequency ranking window. Five different metrics were adopted and utilized for assessing the similarity between ranking lists: overlapList, overlapRank, Footrule, Fagin's measure, and the Inverse Rank measure. The result of this study demonstrates that a substantial degree of CI is most likely to be achieved when somewhere between the first 200 and 400 people have participated in tagging, and that a target degree of CI can be projected by controlling the two factors along with the selection of a similarity metric. The study also tests some experimental conditions for detecting social tags with high CI degree. The results of this study can be applicable to the study of filtering social tags based on CI; filtered social tags may be utilized for the metadata creation of tagged resources and possibly for the retrieval of tagged resources.

Date

25.12.2012 15:22:37

Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.06

0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 668) [ClassicSimilarity], result of:
        0.08910345 = score(doc=668,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 668, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=668)
      0.034320172 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
        0.034320172 = score(doc=668,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 668, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=668)
  0.5 = coord(1/2)

Date: 22. 3.2013 19:43:01
Theme: Data Mining

Díaz-Faes, A.A.; Bordons, M.: Acknowledgments in scientific publications : presence in Spanish science and text patterns across disciplines (2014) 0.06
```
0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 1351) [ClassicSimilarity], result of:
        0.08910345 = score(doc=1351,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 1351, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1351)
      0.034320172 = weight(_text_:22 in 1351) [ClassicSimilarity], result of:
        0.034320172 = score(doc=1351,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 1351, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1351)
  0.5 = coord(1/2)
```
Abstract

The acknowledgments in scientific publications are an important feature in the scholarly communication process. This research analyzes funding acknowledgment presence in scientific publications and introduces a novel approach for discovering text patterns by discipline in the acknowledgment section of papers. First, the presence of acknowledgments in 38,257 English-language papers published by Spanish researchers in 2010 is studied by subject area on the basis of the funding acknowledgment information available in the Web of Science database. Funding acknowledgments are present in two thirds of Spanish articles, with significant differences by subject area, number of authors, impact factor of journals, and, in one specific area, basic/applied nature of research. Second, the existence of specific acknowledgment patterns in English-language papers of Spanish researchers in 4 selected subject categories (cardiac and cardiovascular systems, economics, evolutionary biology, and statistics and probability) is explored through a combination of text mining and multivariate analyses. "Peer interactive communication" predominates in the more theoretical or social-oriented fields (statistics and probability, economics), whereas the recognition of technical assistance is more common in experimental research (evolutionary biology), and the mention of potential conflicts of interest emerges forcefully in the clinical field (cardiac and cardiovascular systems). The systematic inclusion of structured data about acknowledgments in journal articles and bibliographic databases would have a positive impact on the study of collaboration practices in science.

Date

22. 8.2014 17:06:28

Nguyen, T.T.; Tho Thanh Quan, T.T.; Tuoi Thi Phan, T.T.: Sentiment search : an emerging trend on social media monitoring systems (2014) 0.06
```
0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 1625) [ClassicSimilarity], result of:
        0.08910345 = score(doc=1625,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 1625, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1625)
      0.034320172 = weight(_text_:22 in 1625) [ClassicSimilarity], result of:
        0.034320172 = score(doc=1625,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 1625, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1625)
  0.5 = coord(1/2)
```
Abstract

Purpose - The purpose of this paper is to discuss sentiment search, which not only retrieves data related to submitted keywords but also identifies sentiment opinion implied in the retrieved data and the subject targeted by this opinion. Design/methodology/approach - The authors propose a retrieval framework known as Cross-Domain Sentiment Search (CSS), which combines the usage of domain ontologies with specific linguistic rules to handle sentiment terms in textual data. The CSS framework also supports incrementally enriching domain ontologies when applied in new domains. Findings - The authors found that domain ontologies are extremely helpful when CSS is applied in specific domains. In the meantime, the embedded linguistic rules make CSS achieve better performance as compared to data mining techniques. Research limitations/implications - The approach has been initially applied in a real social monitoring system of a professional IT company. Thus, it is proved to be able to handle real data acquired from social media channels such as electronic newspapers or social networks. Originality/value - The authors have placed aspect-based sentiment analysis in the context of semantic search and introduced the CSS framework for the whole sentiment search process. The formal definitions of Sentiment Ontology and aspect-based sentiment analysis are also presented. This distinguishes the work from other related works.

Date

20. 1.2015 18:30:22

McCain, K.W.: Mining full-text journal articles to assess obliteration by incorporation : Herbert A. Simon's concepts of bounded rationality and satisficing in economics, management, and psychology (2015) 0.06

0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 2260) [ClassicSimilarity], result of:
        0.08910345 = score(doc=2260,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 2260, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2260)
      0.034320172 = weight(_text_:22 in 2260) [ClassicSimilarity], result of:
        0.034320172 = score(doc=2260,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 2260, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2260)
  0.5 = coord(1/2)

Date: 15.10.2015 19:22:55

Junger, U.; Schwens, U.: Die inhaltliche Erschließung des schriftlichen kulturellen Erbes auf dem Weg in die Zukunft : Automatische Vergabe von Schlagwörtern in der Deutschen Nationalbibliothek (2017) 0.06
```
0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 3780) [ClassicSimilarity], result of:
        0.08910345 = score(doc=3780,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 3780, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3780)
      0.034320172 = weight(_text_:22 in 3780) [ClassicSimilarity], result of:
        0.034320172 = score(doc=3780,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 3780, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3780)
  0.5 = coord(1/2)
```
Abstract

We live in the 21st century, and much that would have been dismissed as science fiction a hundred or even fifty years ago has now become reality. Space probes fly to Mars, carry out experiments there, and send data back to Earth. Robots are used for routine tasks, for example in industry or medicine. Digitization, artificial intelligence, and automated processes have become an integral part of everyday life. Many processes are based on learning algorithms. The ongoing digital transformation is global and encompasses all areas of life and work: the economy, society, and politics. It opens up new possibilities from which libraries also benefit. The sharp rise in digital publications, which make up an important and proportionally ever larger part of the cultural heritage, should prompt libraries to take up and use these possibilities actively. The analyzability of digital content, for example through text and data mining (TDM), and the development of technical methods for linking content and relating it semantically offer scope to rethink library indexing methods as well. For this reason the Deutsche Nationalbibliothek (DNB) has for some years been examining how the processes for indexing media works can be improved and supported by machine. In doing so it is in regular collegial exchange with other libraries that are also actively working on this question, as well as with European national libraries that are themselves interested in the topic and in the DNB's experience. As a national library with extensive holdings of digital publications, the DNB has also built up expertise in long-term digital preservation and is valued as a competent partner within its network of partners.

Date

19. 8.2017 9:24:22

Song, M.; Kang, K.; An, J.Y.: Investigating drug-disease interactions in drug-symptom-disease triples via citation relations (2018) 0.06
```
0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 4545) [ClassicSimilarity], result of:
        0.08910345 = score(doc=4545,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 4545, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=4545)
      0.034320172 = weight(_text_:22 in 4545) [ClassicSimilarity], result of:
        0.034320172 = score(doc=4545,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 4545, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=4545)
  0.5 = coord(1/2)
```
Abstract

With the growth in biomedical literature, the necessity of extracting useful information from the literature has increased. One approach to extracting biomedical knowledge involves using citation relations to discover entity relations. The assumption is that citation relations between any two articles connect knowledge entities across the articles, enabling the detection of implicit relationships among biomedical entities. The goal of this article is to examine the characteristics of biomedical entities connected via intermediate entities using citation relations aided by text mining. Based on the importance of symptoms as biomedical entities, we created triples connected via citation relations to identify drug-disease pairs with shared symptoms as intermediate entities. Drug-disease interactions built via citation relations were compared with co-occurrence-based interactions. Several types of analyses were adopted to examine the properties of the extracted entity pairs by comparing them with drug-disease interaction databases. We attempted to identify the characteristics of drug-disease pairs through citation relations in association with biomedical entities. The results showed that the citation relation-based approach resulted in diverse types of biomedical entities and preserved topical consistency. In addition, drug-disease pairs identified only via citation relations are interesting for clinical trials when they are examined using BITOLA.

Date

1.11.2018 18:19:22

Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.06

```
0.06171181 = product of:
  0.12342362 = sum of:
    0.12342362 = sum of:
      0.08910345 = weight(_text_:mining in 5011) [ClassicSimilarity], result of:
        0.08910345 = score(doc=5011,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.31170416 = fieldWeight in 5011, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5011)
      0.034320172 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
        0.034320172 = score(doc=5011,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 5011, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=5011)
  0.5 = coord(1/2)
```

Date: 7.3.2019 16:32:22
Theme: Data Mining
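
The score tree for the Fonseca et al. entry can be reproduced by hand. The sketch below is a minimal illustration, assuming the classic Lucene TF-IDF definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are hypothetical and not part of any search API, and Python's double precision will differ from Lucene's single-precision floats in the last digits.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)

def term_weight(freq: float, doc_freq: int, max_docs: int,
                query_norm: float, field_norm: float) -> float:
    """weight = queryWeight * fieldWeight for a single query term."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm               # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 5011.
MAX_DOCS = 44218
QUERY_NORM = 0.05066224
FIELD_NORM = 0.0390625

mining = term_weight(2.0, 425, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_22 = term_weight(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The two term weights are summed, then scaled by the outer coord(1/2) factor.
score = (mining + term_22) * 0.5

print(f"mining={mining:.8f} 22={term_22:.8f} score={score:.8f}")
```

The printed values should agree with the listing (0.08910345, 0.034320172 and 0.06171181) to roughly six decimal places. Entries that match only the term "mining" follow the same arithmetic with a single term weight and two coord(1/2) factors.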

Perovsek, M.; Kranjc, J.; Erjavec, T.; Cestnik, B.; Lavrac, N.: TextFlows : a visual programming platform for text mining and natural language processing (2016) 0.06
```
0.059772413 = product of:
  0.11954483 = sum of:
    0.11954483 = product of:
      0.23908965 = sum of:
        0.23908965 = weight(_text_:mining in 2697) [ClassicSimilarity], result of:
          0.23908965 = score(doc=2697,freq=10.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.83639 = fieldWeight in 2697, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=2697)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Text mining and natural language processing are fast growing areas of research, with numerous applications in business, science and creative industries. This paper presents TextFlows, a web-based text mining and natural language processing platform supporting workflow construction, sharing and execution. The platform enables visual construction of text mining workflows through a web browser, and the execution of the constructed workflows on a processing cloud. This makes TextFlows an adaptable infrastructure for the construction and sharing of text processing workflows, which can be reused in various applications. The paper presents the implemented text mining and language processing modules, and describes some precomposed workflows. Their features are demonstrated on three use cases: comparison of document classifiers and of different part-of-speech taggers on a text categorization problem, and outlier detection in document corpora.

Huvila, I.: Mining qualitative data on human information behaviour from the Web (2010) 0.05

```
0.054016102 = product of:
  0.108032204 = sum of:
    0.108032204 = product of:
      0.21606441 = sum of:
        0.21606441 = weight(_text_:mining in 4676) [ClassicSimilarity], result of:
          0.21606441 = score(doc=4676,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.75584245 = fieldWeight in 4676, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4676)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Abstract: This paper discusses an approach to collecting qualitative data on human information behaviour that is based on mining web data using search engines. The approach is technically the same as the one that has been used for some time in webometric research to make statistical inferences on web data, but the present paper shows how the same tools and data collection methods can be used to gather data for qualitative analysis of human information behaviour.
Theme: Data Mining

Short, M.: Text mining and subject analysis for fiction; or, using machine learning and information extraction to assign subject headings to dime novels (2019) 0.05
```
0.054016102 = product of:
  0.108032204 = sum of:
    0.108032204 = product of:
      0.21606441 = sum of:
        0.21606441 = weight(_text_:mining in 5481) [ClassicSimilarity], result of:
          0.21606441 = score(doc=5481,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.75584245 = fieldWeight in 5481, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5481)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between 1860 and 1915. NIU holds more than 55,000 dime novels in its collections, which it is in the process of comprehensively digitizing. Classification, keyword extraction, named-entity recognition, clustering, and topic modeling are discussed as means of assigning subject headings to improve their discoverability by researchers and to increase the productivity of digitization workflows.

Theme

Data Mining
