Search (3333 results, page 2 of 167)

Cheung, D.W.; Kao, B.; Lee, J.: Discovering user access patterns on the World Wide Web (1998) 0.09

0.08639653 = product of:
  0.17279306 = sum of:
    0.17279306 = sum of:
      0.12474483 = weight(_text_:mining in 332) [ClassicSimilarity], result of:
        0.12474483 = score(doc=332,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.4363858 = fieldWeight in 332, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0546875 = fieldNorm(doc=332)
      0.048048235 = weight(_text_:22 in 332) [ClassicSimilarity], result of:
        0.048048235 = score(doc=332,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.2708308 = fieldWeight in 332, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0546875 = fieldNorm(doc=332)
  0.5 = coord(1/2)

Footnote: Contribution to a special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held in Singapore, 22-23 Feb 1997

Tonkin, E.L.; Tourte, G.J.L.: Working with text. tools, techniques and approaches for text mining (2016) 0.09
```
0.08627405 = product of:
  0.1725481 = sum of:
    0.1725481 = product of:
      0.3450962 = sum of:
        0.3450962 = weight(_text_:mining in 4019) [ClassicSimilarity], result of:
          0.3450962 = score(doc=4019,freq=30.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.2072251 = fieldWeight in 4019, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4019)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
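The arithmetic in the explain tree above can be retraced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical, and queryNorm, fieldNorm, and the two coord factors are simply copied from the tree rather than derived.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, as laid out in the explain tree
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                    # tf(freq=30.0) = 5.477226
    query_weight = idf * query_norm         # 0.28585905 in the tree
    field_weight = tf * idf * field_norm    # 1.2072251 in the tree
    return query_weight * field_weight

# Values copied from the tree for doc 4019 (term "mining")
mining_4019 = classic_term_score(freq=30.0, doc_freq=425, max_docs=44218,
                                 query_norm=0.05066224, field_norm=0.0390625)

# The tree applies coord(1/2) twice: once inside the nested query, once outside
final_4019 = mining_4019 * 0.5 * 0.5

print(f"term score  = {mining_4019:.7f}")
print(f"final score = {final_4019:.7f}")
```

The results agree with the listed values (0.3450962 and 0.08627405) up to the single-precision rounding the search engine uses internally.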
Abstract

What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.

LCSH

Data mining

RSWK

Text Mining / Aufsatzsammlung

Subject

Text Mining / Aufsatzsammlung
Data mining

Theme

Data Mining
Kulathuramaiyer, N.; Maurer, H.: Implications of emerging data mining (2009) 0.08
```
0.084530964 = product of:
  0.16906193 = sum of:
    0.16906193 = product of:
      0.33812386 = sum of:
        0.33812386 = weight(_text_:mining in 3144) [ClassicSimilarity], result of:
          0.33812386 = score(doc=3144,freq=20.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.1828341 = fieldWeight in 3144, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.046875 = fieldNorm(doc=3144)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Data Mining describes a technology that discovers non-trivial hidden patterns in a large collection of data. Although this technology has a tremendous impact on our lives, its invaluable contributions often go unnoticed. This paper discusses advances in data mining while focusing on the emerging data mining capability. Such data mining applications perform multidimensional mining on a wide variety of heterogeneous data sources, providing solutions to many unresolved problems. This paper also highlights the advantages and disadvantages arising from the ever-expanding scope of data mining. Data Mining augments human intelligence by equipping us with a wealth of knowledge and by empowering us to perform our daily tasks better. As the mining scope and capacity increase, users and organizations become more willing to compromise privacy. The huge data stores of the 'master miners' allow them to gain deep insights into individual lifestyles and their social and behavioural patterns. The capability to integrate and analyse data, combining business and financial trends with the ability to deterministically track market changes, will drastically affect our lives.

Theme

Data Mining
Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.08
```
0.082510956 = product of:
  0.16502191 = sum of:
    0.16502191 = product of:
      0.33004382 = sum of:
        0.33004382 = weight(_text_:mining in 2899) [ClassicSimilarity], result of:
          0.33004382 = score(doc=2899,freq=14.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.1545684 = fieldWeight in 2899, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2899)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Defines knowledge discovery and database mining. The challenge for knowledge discovery in databases (KDD) is to automatically process large quantities of raw data, identify the most significant and meaningful patterns, and present these as knowledge appropriate for achieving a user's goals. Data mining is the process of deriving useful knowledge from real world databases through the application of pattern extraction techniques. Explains the goals of, and motivation for, research work on data mining. Discusses the nature of database contents, along with problems within the field of data mining.

Footnote

Contribution to a special issue devoted to knowledge discovery and data mining

Theme

Data Mining
Zhou, L.; Chaovalit, P.: Ontology-supported polarity mining (2008) 0.08
```
0.082510956 = product of:
  0.16502191 = sum of:
    0.16502191 = product of:
      0.33004382 = sum of:
        0.33004382 = weight(_text_:mining in 1343) [ClassicSimilarity], result of:
          0.33004382 = score(doc=1343,freq=14.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.1545684 = fieldWeight in 1343, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1343)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Polarity mining provides an in-depth analysis of semantic orientations of text information. Motivated by its success in the area of topic mining, we propose an ontology-supported polarity mining (OSPM) approach. The approach aims to enhance polarity mining with ontology by providing detailed topic-specific information. OSPM was evaluated in the movie review domain using both supervised and unsupervised techniques. Results revealed that OSPM outperformed the baseline method without ontology support. The findings of this study not only advance the state of polarity mining research but also shed light on future research directions.

Theme

Data Mining
Ku, L.-W.; Ho, H.-W.; Chen, H.-H.: Opinion mining and relationship discovery using CopeOpi opinion analysis system (2009) 0.08
```
0.080165744 = product of:
  0.16033149 = sum of:
    0.16033149 = sum of:
      0.12601131 = weight(_text_:mining in 2938) [ClassicSimilarity], result of:
        0.12601131 = score(doc=2938,freq=4.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.44081625 = fieldWeight in 2938, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2938)
      0.034320172 = weight(_text_:22 in 2938) [ClassicSimilarity], result of:
        0.034320172 = score(doc=2938,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 2938, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2938)
  0.5 = coord(1/2)
```
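When two query terms match, as in the tree above for doc 2938, the per-term scores are summed before the outer coord factor is applied. The sketch below retraces that sum under the same assumed ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The helper names and the tuple layout are hypothetical, and queryNorm, fieldNorm, and coord are copied from the tree.

```python
import math

QUERY_NORM = 0.05066224   # copied from the explain tree
MAX_DOCS = 44218          # copied from the explain tree

def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    # queryWeight * fieldWeight for one matching term
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# (term, freq, docFreq, fieldNorm) for doc 2938, as listed in the tree
clauses_2938 = [
    ("mining", 4.0, 425, 0.0390625),
    ("22", 2.0, 3622, 0.0390625),
]

per_term = {term: term_score(freq, df, norm)
            for term, freq, df, norm in clauses_2938}

# Sum of the matching clauses, then the single outer coord(1/2)
total_2938 = sum(per_term.values()) * 0.5

for term, score in per_term.items():
    print(f"{term:>6}: {score:.8f}")
print(f" total: {total_2938:.8f}")
```

The rarer term "mining" (docFreq=425) contributes far more than the common token "22" (docFreq=3622), because idf enters both the query weight and the field weight.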
Abstract

We present CopeOpi, an opinion-analysis system, which extracts from the Web opinions about specific targets, summarizes the polarity and strength of these opinions, and tracks opinion variations over time. Objects that yield similar opinion tendencies over a certain time period may be correlated due to the latent causal events. CopeOpi discovers relationships among objects based on their opinion-tracking plots and collocations. Event bursts are detected from the tracking plots, and the strength of opinion relationships is determined by the coverage of these plots. To evaluate opinion mining, we use the NTCIR corpus annotated with opinion information at sentence and document levels. CopeOpi achieves sentence- and document-level f-measures of 62% and 74%. For relationship discovery, we collected 1.3M economics-related documents from 93 Web sources over 22 months, and analyzed collocation-based, opinion-based, and hybrid models. We consider as correlated company pairs that demonstrate similar stock-price variations, and selected these as the gold standard for evaluation. Results show that opinion-based and collocation-based models complement each other, and that integrated models perform the best. The top 25, 50, and 100 pairs discovered achieve precision rates of 1, 0.92, and 0.79, respectively.

Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.08

0.080165744 = product of:
  0.16033149 = sum of:
    0.16033149 = sum of:
      0.12601131 = weight(_text_:mining in 1605) [ClassicSimilarity], result of:
        0.12601131 = score(doc=1605,freq=4.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.44081625 = fieldWeight in 1605, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1605)
      0.034320172 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
        0.034320172 = score(doc=1605,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 1605, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1605)
  0.5 = coord(1/2)

Source: Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
Theme: Data Mining

Fayyad, U.M.: Data mining and knowledge discovery : making sense out of data (1996) 0.08

0.077165864 = product of:
  0.15433173 = sum of:
    0.15433173 = product of:
      0.30866346 = sum of:
        0.30866346 = weight(_text_:mining in 7007) [ClassicSimilarity], result of:
          0.30866346 = score(doc=7007,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.079775 = fieldWeight in 7007, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.078125 = fieldNorm(doc=7007)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Defines knowledge discovery and data mining (KDD) as the overall process of extracting high level knowledge from low level data. Outlines the KDD process. Explains how KDD is related to the fields of: statistics, pattern recognition, machine learning, artificial intelligence, databases and data warehouses
Theme: Data Mining

Borgelt, C.; Kruse, R.: Unsicheres Wissen nutzen (2002) 0.08

0.077165864 = product of:
  0.15433173 = sum of:
    0.15433173 = product of:
      0.30866346 = sum of:
        0.30866346 = weight(_text_:mining in 1104) [ClassicSimilarity], result of:
          0.30866346 = score(doc=1104,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.079775 = fieldWeight in 1104, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.078125 = fieldNorm(doc=1104)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Footnote: Part of a special issue on 'Data Mining'
Series: Data Mining
Theme: Data Mining

Garcia Marco, F.J.: ¬El factor humano en los sistemas de información (2003) 0.08

0.077165864 = product of:
  0.15433173 = sum of:
    0.15433173 = product of:
      0.30866346 = sum of:
        0.30866346 = weight(_text_:mining in 929) [ClassicSimilarity], result of:
          0.30866346 = score(doc=929,freq=6.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.079775 = fieldWeight in 929, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.078125 = fieldNorm(doc=929)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: Presents the concept and the techniques identified by the term data mining. Explains the principles and phases of developing a data mining process, and the main types of data mining tools

Al-Khatib, K.; Ghosa, T.; Hou, Y.; Waard, A. de; Freitag, D.: Argument mining for scholarly document processing : taking stock and looking ahead (2021) 0.08
```
0.076390296 = product of:
  0.15278059 = sum of:
    0.15278059 = product of:
      0.30556118 = sum of:
        0.30556118 = weight(_text_:mining in 568) [ClassicSimilarity], result of:
          0.30556118 = score(doc=568,freq=12.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.0689225 = fieldWeight in 568, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=568)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Argument mining targets structures in natural language related to interpretation and persuasion. Most scholarly discourse involves interpreting experimental evidence and attempting to persuade other scientists to adopt the same conclusions, which could benefit from argument mining techniques. However, while various argument mining studies have addressed student essays and news articles, those that target scientific discourse are still scarce. This paper surveys existing work in argument mining of scholarly discourse, and provides an overview of current models, data, tasks, and applications. We identify a number of key challenges confronting argument mining in the scientific domain, and suggest some possible solutions and future directions.
Mandl, T.: Text Mining und Data Mining (2023) 0.08
```
0.076390296 = product of:
  0.15278059 = sum of:
    0.15278059 = product of:
      0.30556118 = sum of:
        0.30556118 = weight(_text_:mining in 774) [ClassicSimilarity], result of:
          0.30556118 = score(doc=774,freq=12.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.0689225 = fieldWeight in 774, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=774)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Text and data mining are a bundle of technologies that are closely linked to the fields of statistics, machine learning, and pattern recognition. The usual definitions include a large number of different methods without drawing an exact boundary. Data mining denotes the search for patterns, regularities, or anomalies in highly structured and, above all, numerical data. "Any algorithm that enumerates patterns from, or fits models to, data is a data mining algorithm." Numerical data and database contents are referred to as structured data. By contrast, text documents in natural language are considered unstructured data.

Theme

Data Mining
Lischka, K.: Spurensuche im Datenwust : Data-Mining-Software fahndet nach kriminellen Mitarbeitern, guten Kunden - und bald vielleicht auch nach Terroristen (2002) 0.08
```
0.07577345 = product of:
  0.1515469 = sum of:
    0.1515469 = sum of:
      0.1309548 = weight(_text_:mining in 1178) [ClassicSimilarity], result of:
        0.1309548 = score(doc=1178,freq=12.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.45810968 = fieldWeight in 1178, product of:
            3.4641016 = tf(freq=12.0), with freq of:
              12.0 = termFreq=12.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0234375 = fieldNorm(doc=1178)
      0.0205921 = weight(_text_:22 in 1178) [ClassicSimilarity], result of:
        0.0205921 = score(doc=1178,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.116070345 = fieldWeight in 1178, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0234375 = fieldNorm(doc=1178)
  0.5 = coord(1/2)
```
Content

"Whether you are a terrorist planning an attack on the United States, a cashier embezzling bills from the till, or someone who particularly likes spending money on certain products: data-mining software makes no distinction. Such programs analyze huge volumes of data and reach statistical verdicts. With these methods, researchers at the "Information Awareness Office" in the United States now want to find traces of terrorists in the databases of government agencies and private companies such as credit card firms. The annual budget for the various research projects amounts to 200 million dollars. That such software works in practice is shown by the rising revenues of vendors of so-called customer relationship management software. Last year, according to the market research institute IDC, the potential for analytical CRM applications grew by 22 percent worldwide; until 2006 it is expected to keep growing in Germany at an annual rate of 14.1 percent. And that despite a weak economy, or precisely because of it. For just as data mining is supposed to help the US government find terrorists, CRM programs today decide which customers are profitable for a company. And which will be in the future, as Manuela Schnaubelt, spokeswoman for the CRM vendor SAP, describes: "Customer valuation is a central component of analytical CRM. It enables companies to focus on the customers that are important and right for them. In addition, companies can use special scoring procedures to determine which customers will contribute to the company's success in the long term, and to what extent." The consequences of these valuations are not always positive for those affected: attractive customers benefit from individual special offers and special attention. Others may be left waiting in the telephone service queue until the more profitable customers have been dealt with.
This is what a practical implementation of what SAP spokeswoman Schnaubelt describes in the abstract could look like: "In many companies, customer valuation is carried out with the classic ABC analysis, in which customers are categorized on the basis of data such as revenue. A customers, as particularly important customers, are looked after differently from C customers." Even closer to the planned use of data mining for hunting terrorists is an application that many companies already use successfully today: they track down employees who commit fraud. Werner Sülzer of the large CRM vendor NCR Teradata describes the possibilities as follows: "Today practically every perpetrator, whether employee, customer or supplier, leaves data traces in the course of white-collar crime. The priority must be to condense individual traces into patterns of action and perpetrator profiles. This is achieved by means of central data warehouses and highly developed search and analysis tools." Companies do not like to talk about concrete successes, that is, dismissals of criminal employees, following the use of such programs. Matthias Wilke of the "Beratungsstelle für Technologiefolgen und Qualifizierung" (BTQ), an advisory office of the trade union Verdi, knows of a case from Switzerland. There the retail chain "Pick Pay" uses the program "Lord Lose Prevention". Two months after its introduction, embezzlement amounting to about 200,000 francs had reportedly been uncovered. That cost more than 50 suspected cashiers their jobs.
Every cash register sends its data on cancellations, returns, corrections and the like to a central database. From this information the program computes cashier profiles. Anyone whose work deviates strongly from the average becomes a suspect. The criteria are set in detail by the audit departments, but in general: "In the case of anomalies such as an above-average number of cancellations, opening the cash drawer without a sale after a cancellation, or returns of goods without a receipt, the transactions can subsequently be attributed to individual persons," says Rene Schiller, head of marketing at the Lord manufacturer Logware. Such a collection of data is not grounds for dismissal in court. But on this basis companies can deploy detectives in a targeted way. Or they confront the employees with the material, whereupon the guilty usually confess. Wilke takes a critical view of programs like Lord: "Anyone who stands out in the screening can be a potential fraudster or thief and deserves special observation." Yet one may deviate from the standard simply because one has not slept well and is therefore unfocused. Here Wilke sees the danger of technologized performance monitoring. "It is not difficult, after all, to use the programs to calculate how long, for example, ringing up a Saturday shopping trip takes on average." The works councils, whose consent is required for the use of technical monitoring equipment, condemn the evaluating software less unequivocally. On the contrary: at Kaufhof and Edeka they have agreed to its use. Because: "They do not want entire departments to fall under general suspicion because of inventory losses or the like," explains trade unionist Wilke. Given the capabilities of commercial data-mining programs, it is surprising that in the United States the "Information Awareness Office" is still budgeting three years for research and testing of its own programs. Early prototypes for the terrorist search are to be deployed in 2005.
But protest is already stirring. Privacy advocates such as Marc Botenberg of the information center for data protection speak of the "most ambitious public surveillance system ever proposed". They warn in particular against analyzing data from Internet use and private e-mails. The Department of Defense is backpedaling. It says it has no intention of becoming active domestically with the software. "That will be done by the intelligence services, counterintelligence and law enforcement," says Under Secretary Edward Aldridge. During development and testing, work will be done with constructed information and some real information that is unobjectionable from the privacy advocates' point of view. What gives pause, however, is Aldridge's answer to the question of why so much money has been earmarked for the development of translation software: so that databases in other languages can be used, provided one obtains lawful access to them."

Theme

Data Mining

Chen, S.Y.; Liu, X.: ¬The contribution of data mining to information science : making sense of it all (2005) 0.08

0.075606786 = product of:
  0.15121357 = sum of:
    0.15121357 = product of:
      0.30242714 = sum of:
        0.30242714 = weight(_text_:mining in 4655) [ClassicSimilarity], result of:
          0.30242714 = score(doc=4655,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.057959 = fieldWeight in 4655, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.09375 = fieldNorm(doc=4655)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Theme: Data Mining

Schultheiß, G.F.: ¬The Battle for Mindshare : Information Access and Retrieval in the Year 2010: NFAIS 46th Annual conference, Philadelphia, PA, 22.-24.2.2004 (2004) 0.07
```
0.074054174 = product of:
  0.14810835 = sum of:
    0.14810835 = sum of:
      0.10692415 = weight(_text_:mining in 2184) [ClassicSimilarity], result of:
        0.10692415 = score(doc=2184,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.37404498 = fieldWeight in 2184, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.046875 = fieldNorm(doc=2184)
      0.0411842 = weight(_text_:22 in 2184) [ClassicSimilarity], result of:
        0.0411842 = score(doc=2184,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.23214069 = fieldWeight in 2184, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2184)
  0.5 = coord(1/2)
```
Abstract

Despite the broadly defined time frame, the conference clearly brought out important aspects of future developments. It became clear that there is no such thing as the generic user and that good communication and differentiation for different assessments are urgently needed. Professional searchers are indispensable for qualified analyses, particularly in the patent field. There is growing demand for linking different data formats and subject areas; the associated problems, particularly in processing and archiving, have been recognized and are being worked on. IBM has made the integration of web information its cause with the Web Fountain project, but at the same time points out that the analysis of large volumes of data by text mining is not yet mature. Business models suited to the future are not yet discernible. The focus remains on the advertising business. The Open Access Initiative is seen as unavoidable, but close cooperation with publishers is recommended in order to avoid damage to science and research through immature approaches.

Toldo, L.; Rippmann, F.: Integrated bioinformatics application for automated target discovery. (2005) 0.07

0.074054174 = product of:
  0.14810835 = sum of:
    0.14810835 = sum of:
      0.10692415 = weight(_text_:mining in 5260) [ClassicSimilarity], result of:
        0.10692415 = score(doc=5260,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.37404498 = fieldWeight in 5260, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.046875 = fieldNorm(doc=5260)
      0.0411842 = weight(_text_:22 in 5260) [ClassicSimilarity], result of:
        0.0411842 = score(doc=5260,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.23214069 = fieldWeight in 5260, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=5260)
  0.5 = coord(1/2)

Abstract: In this article we present an in silico method that automatically assigns putative functions to DNA sequences. The annotations are at an increasingly conceptual level, up to identifying general biomedical fields to which the sequences could contribute. This bioinformatics data-mining system makes substantial use of several resources: a locally stored MEDLINE® database; a manually built classification system; the MeSH® taxonomy; relational technology; and bioinformatics methods. Knowledge is generated from various data sources by using well-defined semantics, and by exploiting direct links between them. A two-dimensional Concept Map(TM) displays the knowledge graph, which allows causal connections to be followed. The use of this method has been valuable and has saved considerable time in our in-house projects, and can be generally exploited for any sequence-annotation or knowledge-condensation task.
Date: 22. 7.2006 14:31:06

Arbelaitz, O.; Martínez-Otzeta, J.M.; Muguerza, J.: User modeling in a social network for cognitively disabled people (2016) 0.07

0.074054174 = product of:
  0.14810835 = sum of:
    0.14810835 = sum of:
      0.10692415 = weight(_text_:mining in 2639) [ClassicSimilarity], result of:
        0.10692415 = score(doc=2639,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.37404498 = fieldWeight in 2639, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.046875 = fieldNorm(doc=2639)
      0.0411842 = weight(_text_:22 in 2639) [ClassicSimilarity], result of:
        0.0411842 = score(doc=2639,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.23214069 = fieldWeight in 2639, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2639)
  0.5 = coord(1/2)

Abstract: Online communities are becoming an important tool in the communication and participation processes in our society. However, the most widespread applications are difficult to use for people with disabilities, or may involve some risks if no previous training has been undertaken. This work describes a novel social network for cognitively disabled people along with a clustering-based method for modeling activity and socialization processes of its users in a noninvasive way. The closed social network, called Guremintza, is specifically designed for people with cognitive disabilities and provides the network administrators (e.g., social workers) with two types of reports: summary statistics of the network usage and behavior patterns discovered by a data mining process. Experiments made in an initial stage of the network show that the discovered patterns are meaningful to the social workers, who find them useful in monitoring the progress of the users.
Date: 22. 1.2016 12:02:26
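
The score breakdown for doc 2639 above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names `idf` and `term_weight` are illustrative and not part of any library. All numeric inputs are copied from the listing, and small deviations in the last digits are expected because Lucene computes in 32-bit floats.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_weight(freq, doc_freq, max_docs, query_norm, field_norm):
    # weight = queryWeight * fieldWeight, as in the explain tree
    tf = math.sqrt(freq)                        # tf(freq=2.0) = 1.4142135
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from the explain output for doc 2639
MAX_DOCS = 44218
QUERY_NORM = 0.05066224
FIELD_NORM = 0.046875

mining = term_weight(2.0, 425, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_22 = term_weight(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score = (mining + term_22) * 0.5  # sum of term weights times coord(1/2)

print(f"mining={mining:.6f} 22={term_22:.6f} total={score:.6f}")
```

The three printed values should agree with the listed 0.10692415, 0.0411842 and 0.074054174 to within float rounding.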

Priss, U.: Description logic and faceted knowledge representation (1999) 0.07
```
0.074054174 = product of:
  0.14810835 = sum of:
    0.14810835 = sum of:
      0.10692415 = weight(_text_:mining in 2655) [ClassicSimilarity], result of:
        0.10692415 = score(doc=2655,freq=2.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.37404498 = fieldWeight in 2655, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.046875 = fieldNorm(doc=2655)
      0.0411842 = weight(_text_:22 in 2655) [ClassicSimilarity], result of:
        0.0411842 = score(doc=2655,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.23214069 = fieldWeight in 2655, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=2655)
  0.5 = coord(1/2)
```
Abstract: The term "facet" was introduced into the field of library classification systems by Ranganathan in the 1930's [Ranganathan, 1962]. A facet is a viewpoint or aspect. In contrast to traditional classification systems, faceted systems are modular in that a domain is analyzed in terms of baseline facets which are then synthesized. In this paper, the term "facet" is used in a broader meaning. Facets can describe different aspects on the same level of abstraction or the same aspect on different levels of abstraction. The notion of facets is related to database views, multicontexts and conceptual scaling in formal concept analysis [Ganter and Wille, 1999], polymorphism in object-oriented design, aspect-oriented programming, views and contexts in description logic and semantic networks. This paper presents a definition of facets in terms of faceted knowledge representation that incorporates the traditional narrower notion of facets and potentially facilitates translation between different knowledge representation formalisms. A goal of this approach is a modular, machine-aided knowledge base design mechanism. A possible application is faceted thesaurus construction for information retrieval and data mining. Reasoning complexity depends on the size of the modules (facets). A more general analysis of complexity will be left for future research.
Date: 22. 1.2016 17:30:31

Varathan, K.D.; Giachanou, A.; Crestani, F.: Comparative opinion mining : a review (2017) 0.07
```
0.07388068 = product of:
  0.14776136 = sum of:
    0.14776136 = product of:
      0.29552272 = sum of:
        0.29552272 = weight(_text_:mining in 3540) [ClassicSimilarity], result of:
          0.29552272 = score(doc=3540,freq=22.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            1.0338057 = fieldWeight in 3540, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3540)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract: Opinion mining refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information in textual material. Opinion mining, also known as sentiment analysis, has received a lot of attention in recent times, as it provides a number of tools to analyze public opinion on a number of different topics. Comparative opinion mining is a subfield of opinion mining which deals with identifying and extracting information that is expressed in a comparative form (e.g., "paper X is better than the Y"). Comparative opinion mining plays a very important role when one tries to evaluate something because it provides a reference point for the comparison. This paper provides a review of the area of comparative opinion mining. It is the first review that specifically covers this topic, as all previous reviews dealt mostly with general opinion mining. This survey covers comparative opinion mining from two different angles: one from the perspective of techniques and the other from the perspective of comparative opinion elements. It also incorporates preprocessing tools as well as data sets that were used by past researchers and that can be useful to future researchers in the field of comparative opinion mining.
Theme: Data Mining

Malaise, V.; Zweigenbaum, P.; Bachimont, B.: Mining defining contexts to help structuring differential ontologies (2005) 0.07

0.07128276 = product of:
  0.14256552 = sum of:
    0.14256552 = product of:
      0.28513104 = sum of:
        0.28513104 = weight(_text_:mining in 6598) [ClassicSimilarity], result of:
          0.28513104 = score(doc=6598,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.9974533 = fieldWeight in 6598, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.125 = fieldNorm(doc=6598)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
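
The last two entries (doc 3540 and doc 6598) end up with nearly the same score even though one matches "mining" 22 times and the other only twice. A minimal sketch of why, again assuming the standard ClassicSimilarity formulas and using a hypothetical helper `classic_score`: the higher term frequency of doc 3540 is largely offset by its smaller fieldNorm, which in ClassicSimilarity shrinks as the field gets longer. Inputs are copied from the two explain trees, including both coord(1/2) factors.

```python
import math

# Values shared by both explain trees for the term "mining"
MAX_DOCS = 44218
DOC_FREQ_MINING = 425
QUERY_NORM = 0.05066224

def classic_score(freq, field_norm, coords=(0.5, 0.5)):
    # Assumed ClassicSimilarity formulas: tf = sqrt(freq),
    # idf = 1 + ln(maxDocs / (docFreq + 1))
    idf = 1.0 + math.log(MAX_DOCS / (DOC_FREQ_MINING + 1))
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    score = query_weight * field_weight
    for c in coords:  # apply each coord factor shown in the listing
        score *= c
    return score

long_doc = classic_score(freq=22.0, field_norm=0.0390625)  # doc 3540
short_doc = classic_score(freq=2.0, field_norm=0.125)      # doc 6598

print(f"doc 3540: {long_doc:.5f}  doc 6598: {short_doc:.5f}")
```

Because tf grows only with the square root of the frequency, an elevenfold increase in matches raises tf by a factor of about 3.3, roughly what the lower fieldNorm takes away.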
