Search (2 results, page 1 of 1)

Lowe, D.B.; Dollinger, I.; Koster, T.; Herbert, B.E.: Text mining for type of research classification (2021) 0.00
```
0.001757696 = product of:
  0.003515392 = sum of:
    0.003515392 = product of:
      0.007030784 = sum of:
        0.007030784 = weight(_text_:a in 720) [ClassicSimilarity], result of:
          0.007030784 = score(doc=720,freq=6.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.13239266 = fieldWeight in 720, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=720)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This project brought together undergraduate students in Computer Science with librarians to mine abstracts of articles from the Texas A&M University Libraries' institutional repository, OAKTrust, in order to probe the creation of new metadata to improve discovery and use. The mining operation task consisted simply of classifying the articles into two categories of research type: basic research ("for understanding," "curiosity-based," or "knowledge-based") and applied research ("use-based"). These categories are fundamental especially for funders but are also important to researchers. The mining-to-classification steps took several iterations, but ultimately, we achieved good results with the toolkit BERT (Bidirectional Encoder Representations from Transformers). The project and its workflows represent a preview of what may lie ahead in the future of crafting metadata using text mining techniques to enhance discoverability.

Type

a
Short, M.: Text mining and subject analysis for fiction; or, using machine learning and information extraction to assign subject headings to dime novels (2019) 0.00
```
0.001674345 = product of:
  0.00334869 = sum of:
    0.00334869 = product of:
      0.00669738 = sum of:
        0.00669738 = weight(_text_:a in 5481) [ClassicSimilarity], result of:
          0.00669738 = score(doc=5481,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.12611452 = fieldWeight in 5481, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5481)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article describes multiple experiments in text mining at Northern Illinois University that were undertaken to improve the efficiency and accuracy of cataloging. It focuses narrowly on subject analysis of dime novels, a format of inexpensive fiction that was popular in the United States between 1860 and 1915. NIU holds more than 55,000 dime novels in its collections, which it is in the process of comprehensively digitizing. Classification, keyword extraction, named-entity recognition, clustering, and topic modeling are discussed as means of assigning subject headings to improve their discoverability by researchers and to increase the productivity of digitization workflows.

Type

a

Search (2 results, page 1 of 1)

Authors

Years

Themes