Document (#41020)

Author
Tonkin, E.L.
Tourte, G.J.L.
Title
Working with text : tools, techniques and approaches for text mining
Imprint
Cambridge (MA) : Chandos Publishing
Year
2016
Pages
xiii, 330 p.
Isbn
978-1-84334-749-1
Series
Chandos Information Professional Series
Abstract
What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.
Footnote
Reviewed in: JASIST 69(2018) no.1, pp.181-184 (Jacques Savoy).
Theme
Data Mining
LCSH
Data mining
RSWK
Text Mining / Aufsatzsammlung
DDC
005.7
LCC
QA76.9.D343
RVK
ST 680

Similar documents (content)

  1. Mining text data (2012) 0.32
    0.31562334 = sum of:
      0.31562334 = product of:
        1.3150973 = sum of:
          0.12364312 = weight(subject_txt:aufsatzsammlung in 362) [ClassicSimilarity], result of:
            0.12364312 = score(doc=362,freq=1.0), product of:
              0.075175636 = queryWeight, product of:
                1.0329536 = boost
                6.578893 = idf(docFreq=166, maxDocs=44218)
                0.01106225 = queryNorm
              1.6447233 = fieldWeight in 362, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.578893 = idf(docFreq=166, maxDocs=44218)
                0.25 = fieldNorm(doc=362)
          0.01975018 = weight(abstract_txt:data in 362) [ClassicSimilarity], result of:
            0.01975018 = score(doc=362,freq=6.0), product of:
              0.03866732 = queryWeight, product of:
                1.0476816 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.01106225 = queryNorm
              0.5107719 = fieldWeight in 362, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.0050869077 = weight(abstract_txt:with in 362) [ClassicSimilarity], result of:
            0.0050869077 = score(doc=362,freq=1.0), product of:
              0.03255968 = queryWeight, product of:
                1.1774514 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.01106225 = queryNorm
              0.15623334 = fieldWeight in 362, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.017671213 = weight(abstract_txt:science in 362) [ClassicSimilarity], result of:
            0.017671213 = score(doc=362,freq=2.0), product of:
              0.051782303 = queryWeight, product of:
                1.2124057 = boost
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.01106225 = queryNorm
              0.3412597 = fieldWeight in 362, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8609126 = idf(docFreq=2529, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.22859193 = weight(abstract_txt:text in 362) [ClassicSimilarity], result of:
            0.22859193 = score(doc=362,freq=6.0), product of:
              0.36923975 = queryWeight, product of:
                8.25407 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.01106225 = queryNorm
              0.6190881 = fieldWeight in 362, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.92035395 = weight(abstract_txt:mining in 362) [ClassicSimilarity], result of:
            0.92035395 = score(doc=362,freq=9.0), product of:
              0.7948527 = queryWeight, product of:
                11.635263 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.01106225 = queryNorm
              1.1578925 = fieldWeight in 362, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
        0.24 = coord(6/25)
    
  2. Relational data mining (2001) 0.26
    0.26181254 = sum of:
      0.26181254 = product of:
        1.0908856 = sum of:
          0.022536706 = weight(abstract_txt:data in 1303) [ClassicSimilarity], result of:
            0.022536706 = score(doc=1303,freq=5.0), product of:
              0.03866732 = queryWeight, product of:
                1.0476816 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.01106225 = queryNorm
              0.582836 = fieldWeight in 1303, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.06262273 = weight(abstract_txt:chapters in 1303) [ClassicSimilarity], result of:
            0.06262273 = score(doc=1303,freq=2.0), product of:
              0.08232691 = queryWeight, product of:
                1.0809689 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.01106225 = queryNorm
              0.76065934 = fieldWeight in 1303, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.0063586347 = weight(abstract_txt:with in 1303) [ClassicSimilarity], result of:
            0.0063586347 = score(doc=1303,freq=1.0), product of:
              0.03255968 = queryWeight, product of:
                1.1774514 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.01106225 = queryNorm
              0.19529167 = fieldWeight in 1303, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.025225675 = weight(abstract_txt:techniques in 1303) [ClassicSimilarity], result of:
            0.025225675 = score(doc=1303,freq=1.0), product of:
              0.07128021 = queryWeight, product of:
                1.4224656 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.01106225 = queryNorm
              0.3538945 = fieldWeight in 1303, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.11665284 = weight(abstract_txt:text in 1303) [ClassicSimilarity], result of:
            0.11665284 = score(doc=1303,freq=1.0), product of:
              0.36923975 = queryWeight, product of:
                8.25407 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.01106225 = queryNorm
              0.3159271 = fieldWeight in 1303, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.8574891 = weight(abstract_txt:mining in 1303) [ClassicSimilarity], result of:
            0.8574891 = score(doc=1303,freq=5.0), product of:
              0.7948527 = queryWeight, product of:
                11.635263 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.01106225 = queryNorm
              1.0788026 = fieldWeight in 1303, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
        0.24 = coord(6/25)
    
  3. Varathan, K.D.; Giachanou, A.; Crestani, F.: Comparative opinion mining : a review (2017) 0.26
    0.25559857 = sum of:
      0.25559857 = product of:
        1.0649941 = sum of:
          0.008062977 = weight(abstract_txt:data in 3540) [ClassicSimilarity], result of:
            0.008062977 = score(doc=3540,freq=1.0), product of:
              0.03866732 = queryWeight, product of:
                1.0476816 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.01106225 = queryNorm
              0.20852174 = fieldWeight in 3540, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=3540)
          0.007193974 = weight(abstract_txt:with in 3540) [ClassicSimilarity], result of:
            0.007193974 = score(doc=3540,freq=2.0), product of:
              0.03255968 = queryWeight, product of:
                1.1774514 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.01106225 = queryNorm
              0.22094731 = fieldWeight in 3540, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=3540)
          0.015880376 = weight(abstract_txt:language in 3540) [ClassicSimilarity], result of:
            0.015880376 = score(doc=3540,freq=1.0), product of:
              0.060755786 = queryWeight, product of:
                1.3132612 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.01106225 = queryNorm
              0.26138046 = fieldWeight in 3540, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=3540)
          0.02018054 = weight(abstract_txt:techniques in 3540) [ClassicSimilarity], result of:
            0.02018054 = score(doc=3540,freq=1.0), product of:
              0.07128021 = queryWeight, product of:
                1.4224656 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.01106225 = queryNorm
              0.2831156 = fieldWeight in 3540, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.0625 = fieldNorm(doc=3540)
          0.09332227 = weight(abstract_txt:text in 3540) [ClassicSimilarity], result of:
            0.09332227 = score(doc=3540,freq=1.0), product of:
              0.36923975 = queryWeight, product of:
                8.25407 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.01106225 = queryNorm
              0.25274166 = fieldWeight in 3540, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=3540)
          0.92035395 = weight(abstract_txt:mining in 3540) [ClassicSimilarity], result of:
            0.92035395 = score(doc=3540,freq=9.0), product of:
              0.7948527 = queryWeight, product of:
                11.635263 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.01106225 = queryNorm
              1.1578925 = fieldWeight in 3540, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=3540)
        0.24 = coord(6/25)
    
  4. Haravu, L.J.; Neelameghan, A.: Text mining and data mining in knowledge organization and discovery : the making of knowledge-based products (2003) 0.25
    0.24983218 = sum of:
      0.24983218 = product of:
        1.0409675 = sum of:
          0.011402772 = weight(abstract_txt:data in 5653) [ClassicSimilarity], result of:
            0.011402772 = score(doc=5653,freq=2.0), product of:
              0.03866732 = queryWeight, product of:
                1.0476816 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.01106225 = queryNorm
              0.29489428 = fieldWeight in 5653, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=5653)
          0.0050869077 = weight(abstract_txt:with in 5653) [ClassicSimilarity], result of:
            0.0050869077 = score(doc=5653,freq=1.0), product of:
              0.03255968 = queryWeight, product of:
                1.1774514 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.01106225 = queryNorm
              0.15623334 = fieldWeight in 5653, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=5653)
          0.015880376 = weight(abstract_txt:language in 5653) [ClassicSimilarity], result of:
            0.015880376 = score(doc=5653,freq=1.0), product of:
              0.060755786 = queryWeight, product of:
                1.3132612 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.01106225 = queryNorm
              0.26138046 = fieldWeight in 5653, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=5653)
          0.028539592 = weight(abstract_txt:techniques in 5653) [ClassicSimilarity], result of:
            0.028539592 = score(doc=5653,freq=2.0), product of:
              0.07128021 = queryWeight, product of:
                1.4224656 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.01106225 = queryNorm
              0.40038592 = fieldWeight in 5653, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.0625 = fieldNorm(doc=5653)
          0.22859193 = weight(abstract_txt:text in 5653) [ClassicSimilarity], result of:
            0.22859193 = score(doc=5653,freq=6.0), product of:
              0.36923975 = queryWeight, product of:
                8.25407 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.01106225 = queryNorm
              0.6190881 = fieldWeight in 5653, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=5653)
          0.75146586 = weight(abstract_txt:mining in 5653) [ClassicSimilarity], result of:
            0.75146586 = score(doc=5653,freq=6.0), product of:
              0.7948527 = queryWeight, product of:
                11.635263 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.01106225 = queryNorm
              0.94541526 = fieldWeight in 5653, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=5653)
        0.24 = coord(6/25)
    
  5. Zhou, L.; Chaovalit, P.: Ontology-supported polarity mining (2008) 0.25
    0.24684112 = sum of:
      0.24684112 = product of:
        1.2342056 = sum of:
          0.0076303617 = weight(abstract_txt:with in 1343) [ClassicSimilarity], result of:
            0.0076303617 = score(doc=1343,freq=1.0), product of:
              0.03255968 = queryWeight, product of:
                1.1774514 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.01106225 = queryNorm
              0.23435001 = fieldWeight in 1343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.09375 = fieldNorm(doc=1343)
          0.027334021 = weight(abstract_txt:support in 1343) [ClassicSimilarity], result of:
            0.027334021 = score(doc=1343,freq=1.0), product of:
              0.066591986 = queryWeight, product of:
                1.374891 = boost
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.01106225 = queryNorm
              0.41047013 = fieldWeight in 1343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.378348 = idf(docFreq=1507, maxDocs=44218)
                0.09375 = fieldNorm(doc=1343)
          0.030270807 = weight(abstract_txt:techniques in 1343) [ClassicSimilarity], result of:
            0.030270807 = score(doc=1343,freq=1.0), product of:
              0.07128021 = queryWeight, product of:
                1.4224656 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.01106225 = queryNorm
              0.42467338 = fieldWeight in 1343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.09375 = fieldNorm(doc=1343)
          0.1399834 = weight(abstract_txt:text in 1343) [ClassicSimilarity], result of:
            0.1399834 = score(doc=1343,freq=1.0), product of:
              0.36923975 = queryWeight, product of:
                8.25407 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.01106225 = queryNorm
              0.37911248 = fieldWeight in 1343, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=1343)
          1.028987 = weight(abstract_txt:mining in 1343) [ClassicSimilarity], result of:
            1.028987 = score(doc=1343,freq=5.0), product of:
              0.7948527 = queryWeight, product of:
                11.635263 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.01106225 = queryNorm
              1.2945632 = fieldWeight in 1343, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.09375 = fieldNorm(doc=1343)
        0.2 = coord(5/25)
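
The score breakdowns above all follow the same arithmetic, which can be reproduced by hand. The sketch below recomputes the score of entry 1 (doc=362) from the frequencies, document counts, boosts and field norms printed in its explanation. It assumes the standard ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the figures listed. The function and variable names are illustrative only and are not part of Lucene or of this catalogue.

```python
import math

# Collection-wide constants copied from the explanations above.
MAX_DOCS = 44218
QUERY_NORM = 0.01106225


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, matched, total_clauses):
    """Sum the term scores, then scale by coord = matched / total_clauses."""
    coord = matched / total_clauses
    return coord * sum(
        term_score(freq, doc_freq, boost, field_norm)
        for _term, freq, doc_freq, boost, field_norm in terms
    )


# (term, freq, docFreq, boost, fieldNorm) as listed for doc=362.
TERMS_362 = [
    ("aufsatzsammlung", 1.0, 166, 1.0329536, 0.25),
    ("data", 6.0, 4274, 1.0476816, 0.0625),
    ("with", 1.0, 9868, 1.1774514, 0.0625),
    ("science", 2.0, 2529, 1.2124057, 0.0625),
    ("text", 6.0, 2106, 8.25407, 0.0625),
    ("mining", 9.0, 249, 11.635263, 0.0625),
]

score_362 = document_score(TERMS_362, matched=6, total_clauses=25)
print(f"doc=362 score: {score_362:.6f}")
```

The result should agree with the 0.31562334 reported for entry 1 up to single-precision rounding, since the listed values are 32-bit floats. The sketch also shows why "mining" dominates every ranking here: its boost of 11.635263 and its idf both enter the query weight, and the idf is applied again in the field weight.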