Search (15 results, page 1 of 1)

Short, M.: Text mining and subject analysis for fiction; or, using machine learning and information extraction to assign subject headings to dime novels (2019) 0.02
```
0.022510704 = product of:
  0.045021407 = sum of:
    0.045021407 = product of:
      0.090042815 = sum of:
        0.090042815 = weight(_text_:subject in 5481) [ClassicSimilarity], result of:
          0.090042815 = score(doc=5481,freq=8.0), product of:
            0.16275941 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.04550679 = queryNorm
            0.5532265 = fieldWeight in 5481, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5481)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between 1860 and 1915. NIU holds more than 55,000 dime novels in its collections, which it is in the process of comprehensively digitizing. Classification, keyword extraction, named-entity recognition, clustering, and topic modeling are discussed as means of assigning subject headings to improve their discoverability by researchers and to increase the productivity of digitization workflows.
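The score explanation for this first hit can be re-derived from its leaf values. The sketch below is a rough reconstruction, not the search engine's own code. It assumes the ClassicSimilarity TF-IDF factors: tf as the square root of the term frequency, and idf as 1 + ln(maxDocs / (docFreq + 1)). That idf variant is an assumption, chosen because it reproduces the printed value 3.576596; other Lucene versions use a slightly different formula. The function names are hypothetical.

```python
import math


def classic_tf(freq: float) -> float:
    # ClassicSimilarity term frequency: square root of the raw count.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf variant: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    # queryWeight = idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm
    # score       = queryWeight * fieldWeight, scaled by each coord factor
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    score = query_weight * field_weight
    for coord in coords:
        score *= coord
    return score


# Leaf values copied from the explanation for doc 5481 (term "subject").
score_5481 = explain_score(
    freq=8.0,
    doc_freq=3361,
    max_docs=44218,
    query_norm=0.04550679,
    field_norm=0.0546875,
    coords=(0.5, 0.5),
)
print(f"{score_5481:.9f}")
```

The result should agree with the listed 0.022510704 to within single-precision rounding, since the engine reports 32-bit floats while Python computes in double precision.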

Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.02

```
0.021579396 = product of:
  0.043158792 = sum of:
    0.043158792 = product of:
      0.086317584 = sum of:
        0.086317584 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
          0.086317584 = score(doc=4577,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.5416616 = fieldWeight in 4577, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=4577)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 2. 4.2000 18:01:22

KDD : techniques and applications (1998) 0.02

```
0.018496625 = product of:
  0.03699325 = sum of:
    0.03699325 = product of:
      0.0739865 = sum of:
        0.0739865 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
          0.0739865 = score(doc=6783,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.46428138 = fieldWeight in 6783, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=6783)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Footnote: A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held in Singapore, 22-23 Feb 1997

Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.01

```
0.012331083 = product of:
  0.024662167 = sum of:
    0.024662167 = product of:
      0.049324334 = sum of:
        0.049324334 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
          0.049324334 = score(doc=1737,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.30952093 = fieldWeight in 1737, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1737)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 22.11.1998 18:57:22

Amir, A.; Feldman, R.; Kashi, R.: A new and versatile method for association generation (1997) 0.01

```
0.012331083 = product of:
  0.024662167 = sum of:
    0.024662167 = product of:
      0.049324334 = sum of:
        0.049324334 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
          0.049324334 = score(doc=1270,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.30952093 = fieldWeight in 1270, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1270)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Source: Information systems. 22(1997) nos.5/6, pp.333-347

Haravu, L.J.; Neelameghan, A.: Text mining and data mining in knowledge organization and discovery : the making of knowledge-based products (2003) 0.01
```
0.011369622 = product of:
  0.022739245 = sum of:
    0.022739245 = product of:
      0.04547849 = sum of:
        0.04547849 = weight(_text_:subject in 5653) [ClassicSimilarity], result of:
          0.04547849 = score(doc=5653,freq=4.0), product of:
            0.16275941 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.04550679 = queryNorm
            0.27942157 = fieldWeight in 5653, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5653)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Discusses the importance of knowledge organization in the context of the information overload caused by the vast quantities of data and information accessible on internal and external networks of an organization. Defines the characteristics of a knowledge-based product. Elaborates on the techniques and applications of text mining in developing knowledge products. Presents two approaches, as case studies, to the making of knowledge products: (1) steps and processes in the planning, designing and development of a composite multilingual multimedia CD product, with the potential international, inter-cultural end users in view, and (2) application of natural language processing software in text mining. Using a text mining software, it is possible to link concept terms from a processed text to a related thesaurus, glossary, schedules of a classification scheme, and facet structured subject representations. Concludes that the products of text mining and data mining could be made more useful if the features of a faceted scheme for subject classification are incorporated into text mining techniques and products.

Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.01

```
0.010789698 = product of:
  0.021579396 = sum of:
    0.021579396 = product of:
      0.043158792 = sum of:
        0.043158792 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
          0.043158792 = score(doc=2908,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.2708308 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2908)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Source: Information systems. 22(1997) nos.5/6, pp.349-385

Mohr, J.W.; Bogdanov, P.: Topic models : what they are and why they matter (2013) 0.01
```
0.009647444 = product of:
  0.019294888 = sum of:
    0.019294888 = product of:
      0.038589776 = sum of:
        0.038589776 = weight(_text_:subject in 1142) [ClassicSimilarity], result of:
          0.038589776 = score(doc=1142,freq=2.0), product of:
            0.16275941 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.04550679 = queryNorm
            0.23709705 = fieldWeight in 1142, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.046875 = fieldNorm(doc=1142)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

We provide a brief, non-technical introduction to the text mining methodology known as "topic modeling." We summarize the theory and background of the method and discuss what kinds of things are found by topic models. Using a text corpus comprised of the eight articles from the special issue of Poetics on the subject of topic models, we run a topic model on these articles, both as a way to introduce the methodology and also to help summarize some of the ways in which social and cultural scientists are using topic models. We review some of the critiques and debates over the use of the method and finally, we link these developments back to some of the original innovations in the field of content analysis that were pioneered by Harold D. Lasswell and colleagues during and just after World War II.

Maaten, L. van den; Hinton, G.: Visualizing non-metric similarities in multiple maps (2012) 0.01
```
0.009647444 = product of:
  0.019294888 = sum of:
    0.019294888 = product of:
      0.038589776 = sum of:
        0.038589776 = weight(_text_:subject in 3884) [ClassicSimilarity], result of:
          0.038589776 = score(doc=3884,freq=2.0), product of:
            0.16275941 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.04550679 = queryNorm
            0.23709705 = fieldWeight in 3884, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.046875 = fieldNorm(doc=3884)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Techniques for multidimensional scaling visualize objects as points in a low-dimensional metric map. As a result, the visualizations are subject to the fundamental limitations of metric spaces. These limitations prevent multidimensional scaling from faithfully representing non-metric similarity data such as word associations or event co-occurrences. In particular, multidimensional scaling cannot faithfully represent intransitive pairwise similarities in a visualization, and it cannot faithfully visualize "central" objects. In this paper, we present an extension of a recently proposed multidimensional scaling technique called t-SNE. The extension aims to address the problems of traditional multidimensional scaling techniques when these techniques are used to visualize non-metric similarities. The new technique, called multiple maps t-SNE, alleviates these problems by constructing a collection of maps that reveal complementary structure in the similarity data. We apply multiple maps t-SNE to a large data set of word association data and to a data set of NIPS co-authorships, demonstrating its ability to successfully visualize non-metric similarities.

Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.01

```
0.009647444 = product of:
  0.019294888 = sum of:
    0.019294888 = product of:
      0.038589776 = sum of:
        0.038589776 = weight(_text_:subject in 720) [ClassicSimilarity], result of:
          0.038589776 = score(doc=720,freq=2.0), product of:
            0.16275941 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.04550679 = queryNorm
            0.23709705 = fieldWeight in 720, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.046875 = fieldNorm(doc=720)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Footnote: Part of a special issue: Artificial intelligence (AI) and automated processes for subject access

Classification, automation, and new media : Proceedings of the 24th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Passau, March 15 - 17, 2000 (2002) 0.01

```
0.008039537 = product of:
  0.016079074 = sum of:
    0.016079074 = product of:
      0.032158148 = sum of:
        0.032158148 = weight(_text_:subject in 5997) [ClassicSimilarity], result of:
          0.032158148 = score(doc=5997,freq=2.0), product of:
            0.16275941 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.04550679 = queryNorm
            0.19758089 = fieldWeight in 5997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5997)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Content: Data Analysis, Statistics, and Classification.- Pattern Recognition and Automation.- Data Mining, Information Processing, and Automation.- New Media, Web Mining, and Automation.- Applications in Management Science, Finance, and Marketing.- Applications in Medicine, Biology, Archaeology, and Others.- Author Index.- Subject Index.

Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.01

```
0.0077069276 = product of:
  0.015413855 = sum of:
    0.015413855 = product of:
      0.03082771 = sum of:
        0.03082771 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
          0.03082771 = score(doc=668,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.19345059 = fieldWeight in 668, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=668)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 22. 3.2013 19:43:01

Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.01

```
0.0077069276 = product of:
  0.015413855 = sum of:
    0.015413855 = product of:
      0.03082771 = sum of:
        0.03082771 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
          0.03082771 = score(doc=1605,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.19345059 = fieldWeight in 1605, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1605)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Source: Journal of the Association for Information Science and Technology. 66(2015) no.1, pp.13-22

Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.01

```
0.0077069276 = product of:
  0.015413855 = sum of:
    0.015413855 = product of:
      0.03082771 = sum of:
        0.03082771 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
          0.03082771 = score(doc=5011,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.19345059 = fieldWeight in 5011, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5011)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 7. 3.2019 16:32:22

Information visualization in data mining and knowledge discovery (2002) 0.00

```
0.0030827709 = product of:
  0.0061655417 = sum of:
    0.0061655417 = product of:
      0.012331083 = sum of:
        0.012331083 = weight(_text_:22 in 1789) [ClassicSimilarity], result of:
          0.012331083 = score(doc=1789,freq=2.0), product of:
            0.15935703 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04550679 = queryNorm
            0.07738023 = fieldWeight in 1789, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.015625 = fieldNorm(doc=1789)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 23. 3.2008 19:10:22
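All hits on the term "22" in this listing share the same termFreq (2.0), idf, queryNorm, and coord factors, so their relative order is decided by fieldNorm alone. The sketch below illustrates this using only constants copied from the explanations above. The variable and function names are hypothetical, and nothing here is taken from the search engine's implementation.

```python
# Constants printed in every "_text_:22" explanation in the listing.
QUERY_WEIGHT = 0.15935703  # idf * queryNorm
TF = 1.4142135             # sqrt(termFreq) with termFreq = 2.0
IDF = 3.5018296
COORD = 0.5 * 0.5          # the two coord(1/2) factors

# fieldNorm per document id, as reported for the "22" hits.
FIELD_NORMS = {
    4577: 0.109375,
    6783: 0.09375,
    1737: 0.0625,
    1270: 0.0625,
    2908: 0.0546875,
    668: 0.0390625,
    1605: 0.0390625,
    5011: 0.0390625,
    1789: 0.015625,
}


def score_for(field_norm: float) -> float:
    # score = coord * queryWeight * fieldWeight,
    # where fieldWeight = tf * idf * fieldNorm.
    return COORD * QUERY_WEIGHT * (TF * IDF * field_norm)


scores = {doc: score_for(norm) for doc, norm in FIELD_NORMS.items()}

# Sort by descending score; ties keep their original listing order.
ranked = sorted(scores.items(), key=lambda item: -item[1])
for doc, value in ranked:
    print(f"doc={doc:<5} fieldNorm={FIELD_NORMS[doc]:<10} score={value:.9f}")
```

Since the score is linear in fieldNorm here, the ratio of any two of these scores equals the ratio of their fieldNorm values. For example, docs 4577 and 6783 differ by a factor of 0.109375 / 0.09375.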
