Document (#41021)

Author
Tonkin, E.L.
Tourte, G.J.L.
Title
Working with text : tools, techniques and approaches for text mining
Imprint
Cambridge (MA) : Chandos Publishing
Year
2016
Pages
xiii, 330 p.
ISBN
978-1-84334-749-1
Series
Chandos Information Professional Series
Abstract
What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.
Footnote
Review in: JASIST 69(2018) no.1, pp.181-184 (Jacques Savoy).
Theme
Data Mining
LCSH
Data mining
RSWK
Text Mining / Aufsatzsammlung
DDC
005.7
LCC
QA76.9.D343
RVK
ST 680

Similar documents (content)

  1. Mining text data (2012) 0.32
    0.31862408 = sum of:
      0.31862408 = product of:
        1.3276004 = sum of:
          0.120973565 = weight(subject_txt:aufsatzsammlung in 2363) [ClassicSimilarity], result of:
            0.120973565 = score(doc=2363,freq=1.0), product of:
              0.07365943 = queryWeight, product of:
                1.0065609 = boost
                6.569346 = idf(docFreq=159, maxDocs=41962)
                0.011139512 = queryNorm
              1.6423365 = fieldWeight in 2363, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.569346 = idf(docFreq=159, maxDocs=41962)
                0.25 = fieldNorm(doc=2363)
          0.02049613 = weight(abstract_txt:data in 2363) [ClassicSimilarity], result of:
            0.02049613 = score(doc=2363,freq=6.0), product of:
              0.039404772 = queryWeight, product of:
                1.0411547 = boost
                3.3975618 = idf(docFreq=3815, maxDocs=41962)
                0.011139512 = queryNorm
              0.52014333 = fieldWeight in 2363, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3975618 = idf(docFreq=3815, maxDocs=41962)
                0.0625 = fieldNorm(doc=2363)
          0.005209254 = weight(abstract_txt:with in 2363) [ClassicSimilarity], result of:
            0.005209254 = score(doc=2363,freq=1.0), product of:
              0.032887597 = queryWeight, product of:
                1.1649374 = boost
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.011139512 = queryNorm
              0.15839571 = fieldWeight in 2363, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.0625 = fieldNorm(doc=2363)
          0.018374534 = weight(abstract_txt:science in 2363) [ClassicSimilarity], result of:
            0.018374534 = score(doc=2363,freq=2.0), product of:
              0.052838713 = queryWeight, product of:
                1.2056382 = boost
                3.9343145 = idf(docFreq=2230, maxDocs=41962)
                0.011139512 = queryNorm
              0.34774756 = fieldWeight in 2363, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9343145 = idf(docFreq=2230, maxDocs=41962)
                0.0625 = fieldNorm(doc=2363)
          0.22660908 = weight(abstract_txt:text in 2363) [ClassicSimilarity], result of:
            0.22660908 = score(doc=2363,freq=6.0), product of:
              0.3649698 = queryWeight, product of:
                8.078412 = boost
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.011139512 = queryNorm
              0.6208982 = fieldWeight in 2363, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.0625 = fieldNorm(doc=2363)
          0.9359378 = weight(abstract_txt:mining in 2363) [ClassicSimilarity], result of:
            0.9359378 = score(doc=2363,freq=9.0), product of:
              0.79913276 = queryWeight, product of:
                11.484867 = boost
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.011139512 = queryNorm
              1.1711919 = fieldWeight in 2363, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.0625 = fieldNorm(doc=2363)
        0.24 = coord(6/25)
    
  2. Relational data mining (2001) 0.27
    0.26518774 = sum of:
      0.26518774 = product of:
        1.104949 = sum of:
          0.0233879 = weight(abstract_txt:data in 2304) [ClassicSimilarity], result of:
            0.0233879 = score(doc=2304,freq=5.0), product of:
              0.039404772 = queryWeight, product of:
                1.0411547 = boost
                3.3975618 = idf(docFreq=3815, maxDocs=41962)
                0.011139512 = queryNorm
              0.59352964 = fieldWeight in 2304, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.3975618 = idf(docFreq=3815, maxDocs=41962)
                0.078125 = fieldNorm(doc=2304)
          0.0626528 = weight(abstract_txt:chapters in 2304) [ClassicSimilarity], result of:
            0.0626528 = score(doc=2304,freq=2.0), product of:
              0.08187506 = queryWeight, product of:
                1.0612109 = boost
                6.9260206 = idf(docFreq=111, maxDocs=41962)
                0.011139512 = queryNorm
              0.7652244 = fieldWeight in 2304, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9260206 = idf(docFreq=111, maxDocs=41962)
                0.078125 = fieldNorm(doc=2304)
          0.0065115676 = weight(abstract_txt:with in 2304) [ClassicSimilarity], result of:
            0.0065115676 = score(doc=2304,freq=1.0), product of:
              0.032887597 = queryWeight, product of:
                1.1649374 = boost
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.011139512 = queryNorm
              0.19799463 = fieldWeight in 2304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.078125 = fieldNorm(doc=2304)
          0.024747195 = weight(abstract_txt:techniques in 2304) [ClassicSimilarity], result of:
            0.024747195 = score(doc=2304,freq=1.0), product of:
              0.06996734 = queryWeight, product of:
                1.3873581 = boost
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.011139512 = queryNorm
              0.3536964 = fieldWeight in 2304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.078125 = fieldNorm(doc=2304)
          0.11564096 = weight(abstract_txt:text in 2304) [ClassicSimilarity], result of:
            0.11564096 = score(doc=2304,freq=1.0), product of:
              0.3649698 = queryWeight, product of:
                8.078412 = boost
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.011139512 = queryNorm
              0.31685078 = fieldWeight in 2304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.078125 = fieldNorm(doc=2304)
          0.8720086 = weight(abstract_txt:mining in 2304) [ClassicSimilarity], result of:
            0.8720086 = score(doc=2304,freq=5.0), product of:
              0.79913276 = queryWeight, product of:
                11.484867 = boost
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.011139512 = queryNorm
              1.0911937 = fieldWeight in 2304, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.078125 = fieldNorm(doc=2304)
        0.24 = coord(6/25)
    
  3. Varathan, K.D.; Giachanou, A.; Crestani, F.: Comparative opinion mining : a review (2017) 0.26
    0.2591534 = sum of:
      0.2591534 = product of:
        1.0798059 = sum of:
          0.00836751 = weight(abstract_txt:data in 105) [ClassicSimilarity], result of:
            0.00836751 = score(doc=105,freq=1.0), product of:
              0.039404772 = queryWeight, product of:
                1.0411547 = boost
                3.3975618 = idf(docFreq=3815, maxDocs=41962)
                0.011139512 = queryNorm
              0.21234761 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3975618 = idf(docFreq=3815, maxDocs=41962)
                0.0625 = fieldNorm(doc=105)
          0.0073669977 = weight(abstract_txt:with in 105) [ClassicSimilarity], result of:
            0.0073669977 = score(doc=105,freq=2.0), product of:
              0.032887597 = queryWeight, product of:
                1.1649374 = boost
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.011139512 = queryNorm
              0.22400536 = fieldWeight in 105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.0625 = fieldNorm(doc=105)
          0.015823008 = weight(abstract_txt:language in 105) [ClassicSimilarity], result of:
            0.015823008 = score(doc=105,freq=1.0), product of:
              0.060257446 = queryWeight, product of:
                1.287497 = boost
                4.2014413 = idf(docFreq=1707, maxDocs=41962)
                0.011139512 = queryNorm
              0.26259008 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2014413 = idf(docFreq=1707, maxDocs=41962)
                0.0625 = fieldNorm(doc=105)
          0.019797757 = weight(abstract_txt:techniques in 105) [ClassicSimilarity], result of:
            0.019797757 = score(doc=105,freq=1.0), product of:
              0.06996734 = queryWeight, product of:
                1.3873581 = boost
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.011139512 = queryNorm
              0.28295714 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.0625 = fieldNorm(doc=105)
          0.092512764 = weight(abstract_txt:text in 105) [ClassicSimilarity], result of:
            0.092512764 = score(doc=105,freq=1.0), product of:
              0.3649698 = queryWeight, product of:
                8.078412 = boost
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.011139512 = queryNorm
              0.2534806 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.0625 = fieldNorm(doc=105)
          0.9359378 = weight(abstract_txt:mining in 105) [ClassicSimilarity], result of:
            0.9359378 = score(doc=105,freq=9.0), product of:
              0.79913276 = queryWeight, product of:
                11.484867 = boost
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.011139512 = queryNorm
              1.1711919 = fieldWeight in 105, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.0625 = fieldNorm(doc=105)
        0.24 = coord(6/25)
    
  4. Haravu, L.J.; Neelameghan, A.: Text mining and data mining in knowledge organization and discovery : the making of knowledge-based products (2003) 0.25
    0.25239915 = sum of:
      0.25239915 = product of:
        1.0516632 = sum of:
          0.011833444 = weight(abstract_txt:data in 654) [ClassicSimilarity], result of:
            0.011833444 = score(doc=654,freq=2.0), product of:
              0.039404772 = queryWeight, product of:
                1.0411547 = boost
                3.3975618 = idf(docFreq=3815, maxDocs=41962)
                0.011139512 = queryNorm
              0.30030486 = fieldWeight in 654, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3975618 = idf(docFreq=3815, maxDocs=41962)
                0.0625 = fieldNorm(doc=654)
          0.005209254 = weight(abstract_txt:with in 654) [ClassicSimilarity], result of:
            0.005209254 = score(doc=654,freq=1.0), product of:
              0.032887597 = queryWeight, product of:
                1.1649374 = boost
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.011139512 = queryNorm
              0.15839571 = fieldWeight in 654, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.0625 = fieldNorm(doc=654)
          0.015823008 = weight(abstract_txt:language in 654) [ClassicSimilarity], result of:
            0.015823008 = score(doc=654,freq=1.0), product of:
              0.060257446 = queryWeight, product of:
                1.287497 = boost
                4.2014413 = idf(docFreq=1707, maxDocs=41962)
                0.011139512 = queryNorm
              0.26259008 = fieldWeight in 654, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2014413 = idf(docFreq=1707, maxDocs=41962)
                0.0625 = fieldNorm(doc=654)
          0.027998256 = weight(abstract_txt:techniques in 654) [ClassicSimilarity], result of:
            0.027998256 = score(doc=654,freq=2.0), product of:
              0.06996734 = queryWeight, product of:
                1.3873581 = boost
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.011139512 = queryNorm
              0.4001618 = fieldWeight in 654, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.0625 = fieldNorm(doc=654)
          0.22660908 = weight(abstract_txt:text in 654) [ClassicSimilarity], result of:
            0.22660908 = score(doc=654,freq=6.0), product of:
              0.3649698 = queryWeight, product of:
                8.078412 = boost
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.011139512 = queryNorm
              0.6208982 = fieldWeight in 654, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.0625 = fieldNorm(doc=654)
          0.7641901 = weight(abstract_txt:mining in 654) [ClassicSimilarity], result of:
            0.7641901 = score(doc=654,freq=6.0), product of:
              0.79913276 = queryWeight, product of:
                11.484867 = boost
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.011139512 = queryNorm
              0.9562743 = fieldWeight in 654, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.0625 = fieldNorm(doc=654)
        0.24 = coord(6/25)
    
  5. Zhou, L.; Chaovalit, P.: Ontology-supported polarity mining (2008) 0.25
    0.2501026 = sum of:
      0.2501026 = product of:
        1.250513 = sum of:
          0.007813881 = weight(abstract_txt:with in 3344) [ClassicSimilarity], result of:
            0.007813881 = score(doc=3344,freq=1.0), product of:
              0.032887597 = queryWeight, product of:
                1.1649374 = boost
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.011139512 = queryNorm
              0.23759356 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5343313 = idf(docFreq=9046, maxDocs=41962)
                0.09375 = fieldNorm(doc=3344)
          0.027822807 = weight(abstract_txt:support in 3344) [ClassicSimilarity], result of:
            0.027822807 = score(doc=3344,freq=1.0), product of:
              0.06699224 = queryWeight, product of:
                1.3575416 = boost
                4.430015 = idf(docFreq=1358, maxDocs=41962)
                0.011139512 = queryNorm
              0.4153139 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.430015 = idf(docFreq=1358, maxDocs=41962)
                0.09375 = fieldNorm(doc=3344)
          0.029696636 = weight(abstract_txt:techniques in 3344) [ClassicSimilarity], result of:
            0.029696636 = score(doc=3344,freq=1.0), product of:
              0.06996734 = queryWeight, product of:
                1.3873581 = boost
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.011139512 = queryNorm
              0.4244357 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527314 = idf(docFreq=1232, maxDocs=41962)
                0.09375 = fieldNorm(doc=3344)
          0.13876915 = weight(abstract_txt:text in 3344) [ClassicSimilarity], result of:
            0.13876915 = score(doc=3344,freq=1.0), product of:
              0.3649698 = queryWeight, product of:
                8.078412 = boost
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.011139512 = queryNorm
              0.38022092 = fieldWeight in 3344, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.05569 = idf(docFreq=1975, maxDocs=41962)
                0.09375 = fieldNorm(doc=3344)
          1.0464104 = weight(abstract_txt:mining in 3344) [ClassicSimilarity], result of:
            1.0464104 = score(doc=3344,freq=5.0), product of:
              0.79913276 = queryWeight, product of:
                11.484867 = boost
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.011139512 = queryNorm
              1.3094325 = fieldWeight in 3344, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.246357 = idf(docFreq=220, maxDocs=41962)
                0.09375 = fieldNorm(doc=3344)
        0.2 = coord(5/25)
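
How the scores above are composed

The explain trees above follow one pattern: each matching term contributes queryWeight x fieldWeight, the contributions are summed, and the sum is multiplied by coord (matched query terms / total query terms). The Python sketch below recomputes the score of hit 1 (doc 2363) from the raw inputs shown in its tree. It is an illustration, not the search engine's own code. The function names are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumptions inferred from the listed values, which they reproduce.

```python
import math

# Index-wide constants copied from the explain output above.
MAX_DOCS = 41962
QUERY_NORM = 0.011139512


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency; formula assumed from the listed idf values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """Score of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def hit_score(terms, matched, total):
    """Sum of the term scores, scaled by coord = matched / total."""
    raw = sum(term_score(f, df, b, n) for _, f, df, b, n in terms)
    return raw * (matched / total)


# (term, freq, docFreq, boost, fieldNorm) for hit 1, doc 2363.
HIT_1 = [
    ("aufsatzsammlung", 1.0, 159, 1.0065609, 0.25),
    ("data", 6.0, 3815, 1.0411547, 0.0625),
    ("with", 1.0, 9046, 1.1649374, 0.0625),
    ("science", 2.0, 2230, 1.2056382, 0.0625),
    ("text", 6.0, 1975, 8.078412, 0.0625),
    ("mining", 9.0, 220, 11.484867, 0.0625),
]

score = hit_score(HIT_1, matched=6, total=25)
print(f"{score:.4f}")  # prints 0.3186, shown rounded as 0.32 in the hit list
```

The same function applied to hit 5 would use coord(5/25) = 0.2, which is why that record ranks last even though its "mining" term score is the highest of the five.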