Search (23 results, page 1 of 2)

  • × theme_ss:"Data Mining"
  1. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.03
    0.02846486 = product of:
      0.05692972 = sum of:
        0.05692972 = product of:
          0.11385944 = sum of:
            0.11385944 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
              0.11385944 = score(doc=4577,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.5416616 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    2. 4.2000 18:01:22
  2. Dang, X.H.; Ong, K.-L.: Knowledge discovery in data streams (2009) 0.03
    0.02817536 = product of:
      0.05635072 = sum of:
        0.05635072 = product of:
          0.11270144 = sum of:
            0.11270144 = weight(_text_:encyclopedia in 3829) [ClassicSimilarity], result of:
              0.11270144 = score(doc=3829,freq=2.0), product of:
                0.3194549 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.060026903 = queryNorm
                0.35279295 = fieldWeight in 3829, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3829)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  3. Maaten, L. van den; Hinton, G.: Visualizing non-metric similarities in multiple maps (2012) 0.03
    0.026496232 = product of:
      0.052992463 = sum of:
        0.052992463 = product of:
          0.15897739 = sum of:
            0.15897739 = weight(_text_:objects in 3884) [ClassicSimilarity], result of:
              0.15897739 = score(doc=3884,freq=4.0), product of:
                0.31904724 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.060026903 = queryNorm
                0.49828792 = fieldWeight in 3884, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3884)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Techniques for multidimensional scaling visualize objects as points in a low-dimensional metric map. As a result, the visualizations are subject to the fundamental limitations of metric spaces. These limitations prevent multidimensional scaling from faithfully representing non-metric similarity data such as word associations or event co-occurrences. In particular, multidimensional scaling cannot faithfully represent intransitive pairwise similarities in a visualization, and it cannot faithfully visualize "central" objects. In this paper, we present an extension of a recently proposed multidimensional scaling technique called t-SNE. The extension aims to address the problems of traditional multidimensional scaling techniques when these techniques are used to visualize non-metric similarities. The new technique, called multiple maps t-SNE, alleviates these problems by constructing a collection of maps that reveal complementary structure in the similarity data. We apply multiple maps t-SNE to a large data set of word association data and to a data set of NIPS co-authorships, demonstrating its ability to successfully visualize non-metric similarities.
  4. Fayyad, U.M.; Djorgovski, S.G.; Weir, N.: From digitized images to online catalogs : data mining a sky survey (1996) 0.02
    0.024980886 = product of:
      0.04996177 = sum of:
        0.04996177 = product of:
          0.14988531 = sum of:
            0.14988531 = weight(_text_:objects in 6625) [ClassicSimilarity], result of:
              0.14988531 = score(doc=6625,freq=2.0), product of:
                0.31904724 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.060026903 = queryNorm
                0.46979034 = fieldWeight in 6625, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6625)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Offers a data mining approach based on machine learning classification methods to the problem of automated cataloguing of online databases of digital images resulting from sky surveys. The SKICAT system automates the reduction and analysis of 3 terabytes of images expected to contain about 2 billion sky objects. It offers a solution to problems associated with the analysis of large data sets in science.
  5. KDD : techniques and applications (1998) 0.02
    0.02439845 = product of:
      0.0487969 = sum of:
        0.0487969 = product of:
          0.0975938 = sum of:
            0.0975938 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
              0.0975938 = score(doc=6783,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.46428138 = fieldWeight in 6783, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6783)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held in Singapore, 22-23 Feb 1997
  6. Maaten, L. van den: Accelerating t-SNE using Tree-Based Algorithms (2014) 0.02
    0.021858275 = product of:
      0.04371655 = sum of:
        0.04371655 = product of:
          0.13114965 = sum of:
            0.13114965 = weight(_text_:objects in 3886) [ClassicSimilarity], result of:
              0.13114965 = score(doc=3886,freq=2.0), product of:
                0.31904724 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.060026903 = queryNorm
                0.41106653 = fieldWeight in 3886, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3886)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    The paper investigates the acceleration of t-SNE, an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots, using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N). Our experiments show that the resulting algorithms substantially accelerate t-SNE, and that they make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.
  7. Loh, S.; Oliveira, J.P.M. de; Gastal, F.L.: Knowledge discovery in textual documentation : qualitative and quantitative analyses (2001) 0.02
    0.018735666 = product of:
      0.03747133 = sum of:
        0.03747133 = product of:
          0.11241399 = sum of:
            0.11241399 = weight(_text_:objects in 4482) [ClassicSimilarity], result of:
              0.11241399 = score(doc=4482,freq=2.0), product of:
                0.31904724 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.060026903 = queryNorm
                0.35234275 = fieldWeight in 4482, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4482)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    This paper presents an approach for performing knowledge discovery in texts through qualitative and quantitative analyses of high-level textual characteristics. Instead of applying mining techniques on attribute values, terms or keywords extracted from texts, the discovery process works over concepts identified in texts. Concepts represent real world events and objects, and they help the user to understand ideas, trends, thoughts, opinions and intentions present in texts. The approach combines a quasi-automatic categorisation task (for qualitative analysis) with a mining process (for quantitative analysis). The goal is to find new and useful knowledge inside a textual collection through the use of mining techniques applied over concepts (representing text content). In this paper, an application of the approach to medical records of a psychiatric hospital is presented. The approach helps physicians to extract knowledge about patients and diseases. This knowledge may be used for epidemiological studies and for training professionals, and it may also be used to support physicians in diagnosing and evaluating diseases.
  8. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.02
    0.016265634 = product of:
      0.03253127 = sum of:
        0.03253127 = product of:
          0.06506254 = sum of:
            0.06506254 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
              0.06506254 = score(doc=1737,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.30952093 = fieldWeight in 1737, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22.11.1998 18:57:22
  9. Lusti, M.: Data Warehousing and Data Mining : Eine Einführung in entscheidungsunterstützende Systeme (1999) 0.02
    0.016265634 = product of:
      0.03253127 = sum of:
        0.03253127 = product of:
          0.06506254 = sum of:
            0.06506254 = weight(_text_:22 in 4261) [ClassicSimilarity], result of:
              0.06506254 = score(doc=4261,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.30952093 = fieldWeight in 4261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4261)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    17. 7.2002 19:22:06
  10. Amir, A.; Feldman, R.; Kashi, R.: ¬A new and versatile method for association generation (1997) 0.02
    0.016265634 = product of:
      0.03253127 = sum of:
        0.03253127 = product of:
          0.06506254 = sum of:
            0.06506254 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
              0.06506254 = score(doc=1270,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.30952093 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
  11. Ku, L.-W.; Chen, H.-H.: Mining opinions from the Web : beyond relevance retrieval (2007) 0.02
    0.015613055 = product of:
      0.03122611 = sum of:
        0.03122611 = product of:
          0.093678325 = sum of:
            0.093678325 = weight(_text_:objects in 605) [ClassicSimilarity], result of:
              0.093678325 = score(doc=605,freq=2.0), product of:
                0.31904724 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.060026903 = queryNorm
                0.29361898 = fieldWeight in 605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=605)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Documents discussing public affairs, common themes, interesting products, and so on, are reported and distributed on the Web. Positive and negative opinions embedded in documents are useful references and feedback for governments to improve their services, for companies to market their products, and for customers to purchase their objects. Web opinion mining aims to extract, summarize, and track various aspects of subjective information on the Web. Mining subjective information enables traditional information retrieval (IR) systems to retrieve more data from human viewpoints and provide information with finer granularity. Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarities. Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and the nonsupportive evidence. Opinion tracking captures subjective information from various genres and monitors the developments of opinions from spatial and temporal dimensions. To demonstrate and evaluate the proposed opinion mining algorithms, news and bloggers' articles are adopted. Documents in the evaluation corpora are tagged in different granularities from words, sentences to documents. In the experiments, positive and negative sentiment words and their weights are mined on the basis of Chinese word structures. The f-measure is 73.18% and 63.75% for verbs and nouns, respectively. Utilizing the sentiment words mined together with topical words, we achieve f-measure 62.16% at the sentence level and 74.37% at the document level.
  12. Maaten, L. van den; Hinton, G.: Visualizing data using t-SNE (2008) 0.02
    0.015613055 = product of:
      0.03122611 = sum of:
        0.03122611 = product of:
          0.093678325 = sum of:
            0.093678325 = weight(_text_:objects in 3888) [ClassicSimilarity], result of:
              0.093678325 = score(doc=3888,freq=2.0), product of:
                0.31904724 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.060026903 = queryNorm
                0.29361898 = fieldWeight in 3888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3888)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    We present a new technique called "t-SNE" that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. t-SNE is better than existing techniques at creating a single map that reveals structure at many different scales. This is particularly important for high-dimensional data that lie on several different, but related, low-dimensional manifolds, such as images of objects from multiple classes seen from multiple viewpoints. For visualizing the structure of very large data sets, we show how t-SNE can use random walks on neighborhood graphs to allow the implicit structure of all of the data to influence the way in which a subset of the data is displayed. We illustrate the performance of t-SNE on a wide variety of data sets and compare it with many other non-parametric visualization techniques, including Sammon mapping, Isomap, and Locally Linear Embedding. The visualizations produced by t-SNE are significantly better than those produced by the other techniques on almost all of the data sets.
  13. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.01
    0.01423243 = product of:
      0.02846486 = sum of:
        0.02846486 = product of:
          0.05692972 = sum of:
            0.05692972 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
              0.05692972 = score(doc=2908,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.2708308 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
  14. Lackes, R.; Tillmanns, C.: Data Mining für die Unternehmenspraxis : Entscheidungshilfen und Fallstudien mit führenden Softwarelösungen (2006) 0.01
    0.012199225 = product of:
      0.02439845 = sum of:
        0.02439845 = product of:
          0.0487969 = sum of:
            0.0487969 = weight(_text_:22 in 1383) [ClassicSimilarity], result of:
              0.0487969 = score(doc=1383,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.23214069 = fieldWeight in 1383, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1383)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2008 14:46:06
  15. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.01
    0.010166021 = product of:
      0.020332042 = sum of:
        0.020332042 = product of:
          0.040664084 = sum of:
            0.040664084 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.040664084 = score(doc=668,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2013 19:43:01
  16. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.01
    0.010166021 = product of:
      0.020332042 = sum of:
        0.020332042 = product of:
          0.040664084 = sum of:
            0.040664084 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.040664084 = score(doc=1605,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
  17. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.01
    0.010166021 = product of:
      0.020332042 = sum of:
        0.020332042 = product of:
          0.040664084 = sum of:
            0.040664084 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.040664084 = score(doc=5011,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    7. 3.2019 16:32:22
  18. Peters, G.; Gaese, V.: ¬Das DocCat-System in der Textdokumentation von G+J (2003) 0.01
    0.008132817 = product of:
      0.016265634 = sum of:
        0.016265634 = product of:
          0.03253127 = sum of:
            0.03253127 = weight(_text_:22 in 1507) [ClassicSimilarity], result of:
              0.03253127 = score(doc=1507,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.15476047 = fieldWeight in 1507, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1507)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 4.2003 11:45:36
  19. Hölzig, C.: Google spürt Grippewellen auf : Die neue Anwendung ist bisher auf die USA beschränkt (2008) 0.01
    0.008132817 = product of:
      0.016265634 = sum of:
        0.016265634 = product of:
          0.03253127 = sum of:
            0.03253127 = weight(_text_:22 in 2403) [ClassicSimilarity], result of:
              0.03253127 = score(doc=2403,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.15476047 = fieldWeight in 2403, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2403)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    3. 5.1997 8:44:22
  20. Jäger, L.: Von Big Data zu Big Brother (2018) 0.01
    0.008132817 = product of:
      0.016265634 = sum of:
        0.016265634 = product of:
          0.03253127 = sum of:
            0.03253127 = weight(_text_:22 in 5234) [ClassicSimilarity], result of:
              0.03253127 = score(doc=5234,freq=2.0), product of:
                0.21020399 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.060026903 = queryNorm
                0.15476047 = fieldWeight in 5234, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5234)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2018 11:33:49
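
The score trees above all follow the same arithmetic: queryWeight (idf × queryNorm) times fieldWeight (tf × idf × fieldNorm), scaled by each coord factor. The sketch below recomputes that product in Python. It is an illustration only: the function names are hypothetical, and the formulas idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq) are assumed because they match the values displayed in the listing, not taken from the configuration of this search system.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it matches e.g. idf(docFreq=3622, maxDocs=44218) = 3.5018296.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    # Assumed tf form; it matches tf(freq=2.0) = 1.4142135 and tf(freq=4.0) = 2.0.
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recompute a single-term score from the values shown in one score tree.

    coords is a list of (matched, total) pairs, one per coord(...) line.
    """
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                    # "queryWeight, product of"
    field_weight = classic_tf(freq) * idf * field_norm  # "fieldWeight in N, product of"
    score = query_weight * field_weight
    for matched, total in coords:
        score *= matched / total                       # each coord(matched/total)
    return score


# Result 1 (doc 4577): term "22", freq=2.0, fieldNorm=0.109375, coord(1/2) twice.
result_1 = explain_score(2.0, 3622, 44218, 0.060026903, 0.109375, [(1, 2), (1, 2)])

# Result 3 (doc 3884): term "objects", freq=4.0, fieldNorm=0.046875, coord(1/3) then coord(1/2).
result_3 = explain_score(4.0, 590, 44218, 0.060026903, 0.046875, [(1, 3), (1, 2)])

print(f"{result_1:.8f}  {result_3:.8f}")
```

Under these assumptions the two values agree with the listed 0.02846486 and 0.026496232 up to floating-point rounding. Since idf and queryNorm are fixed per term, the ranking among results matching the same term is driven by tf, fieldNorm and coord.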

Languages

  • e 16
  • d 7

Types