Search (10 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Data Mining"
  1. Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: ¬The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.04
    0.04188432 = product of:
      0.2094216 = sum of:
        0.2094216 = weight(_text_:grams in 3015) [ClassicSimilarity], result of:
          0.2094216 = score(doc=3015,freq=2.0), product of:
            0.39198354 = queryWeight, product of:
              8.059301 = idf(docFreq=37, maxDocs=44218)
              0.04863741 = queryNorm
            0.5342612 = fieldWeight in 3015, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.059301 = idf(docFreq=37, maxDocs=44218)
              0.046875 = fieldNorm(doc=3015)
      0.2 = coord(1/5)
    
    Abstract
    We analyze the linguistic evolution of selected scientific disciplines over a 30-year time span (1970s to 2000s). Our focus is on four highly specialized disciplines at the boundaries of computer science that emerged during that time: computational linguistics, bioinformatics, digital construction, and microelectronics. Our analysis is driven by the question of whether these disciplines develop a distinctive language use, both individually and collectively, over the given time period. The data set is the English Scientific Text Corpus (scitex), which includes texts from the 1970s/1980s and early 2000s. Our theoretical basis is register theory. In terms of methods, we combine corpus-based methods of feature extraction (various aggregated features [part-of-speech based], n-grams, lexico-grammatical patterns) and automatic text classification. The results of our research are directly relevant to the study of linguistic variation and languages for specific purposes (LSP) and have implications for various natural language processing (NLP) tasks, for example, authorship attribution, text mining, or training NLP tools.
  2. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.02
    0.018451152 = product of:
      0.09225576 = sum of:
        0.09225576 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
          0.09225576 = score(doc=4577,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.5416616 = fieldWeight in 4577, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=4577)
      0.2 = coord(1/5)
    
    Date
    2. 4.2000 18:01:22
  3. KDD : techniques and applications (1998) 0.02
    0.015815273 = product of:
      0.079076365 = sum of:
        0.079076365 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
          0.079076365 = score(doc=6783,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.46428138 = fieldWeight in 6783, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=6783)
      0.2 = coord(1/5)
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held in Singapore, 22-23 Feb 1997
  4. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.01
    0.010543516 = product of:
      0.052717578 = sum of:
        0.052717578 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
          0.052717578 = score(doc=1737,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.30952093 = fieldWeight in 1737, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1737)
      0.2 = coord(1/5)
    
    Date
    22.11.1998 18:57:22
  5. Amir, A.; Feldman, R.; Kashi, R.: ¬A new and versatile method for association generation (1997) 0.01
    0.010543516 = product of:
      0.052717578 = sum of:
        0.052717578 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
          0.052717578 = score(doc=1270,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.30952093 = fieldWeight in 1270, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1270)
      0.2 = coord(1/5)
    
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
  6. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.01
    0.009225576 = product of:
      0.04612788 = sum of:
        0.04612788 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
          0.04612788 = score(doc=2908,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.2708308 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2908)
      0.2 = coord(1/5)
    
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
  7. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.01
    0.006589697 = product of:
      0.032948487 = sum of:
        0.032948487 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
          0.032948487 = score(doc=668,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.19345059 = fieldWeight in 668, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=668)
      0.2 = coord(1/5)
    
    Date
    22. 3.2013 19:43:01
  8. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.01
    0.006589697 = product of:
      0.032948487 = sum of:
        0.032948487 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
          0.032948487 = score(doc=1605,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.19345059 = fieldWeight in 1605, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1605)
      0.2 = coord(1/5)
    
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
  9. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.01
    0.006589697 = product of:
      0.032948487 = sum of:
        0.032948487 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
          0.032948487 = score(doc=5011,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.19345059 = fieldWeight in 5011, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5011)
      0.2 = coord(1/5)
    
    Date
    7. 3.2019 16:32:22
  10. Information visualization in data mining and knowledge discovery (2002) 0.00
    0.002635879 = product of:
      0.013179394 = sum of:
        0.013179394 = weight(_text_:22 in 1789) [ClassicSimilarity], result of:
          0.013179394 = score(doc=1789,freq=2.0), product of:
            0.17031991 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04863741 = queryNorm
            0.07738023 = fieldWeight in 1789, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.015625 = fieldNorm(doc=1789)
      0.2 = coord(1/5)
    
    Date
    23. 3.2008 19:10:22
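
The score breakdowns in the listing above can be reproduced by hand. The following is a minimal Python sketch, assuming the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the figures shown. The function names are illustrative only and are not part of any library; the input values are copied from the explain trees for doc 3015 and doc 1789.

```python
import math

def tf(freq):
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    # Inverse document frequency (assumed form; it matches the listed values).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_score(freq, doc_freq, max_docs, query_norm, field_norm,
                  matched_clauses, total_clauses):
    # queryWeight = idf * queryNorm
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf(freq) * term_idf * field_norm
    # coord = fraction of query clauses that matched, e.g. 1/5
    coord = matched_clauses / total_clauses
    return query_weight * field_weight * coord

# Result 1 (doc 3015, term "grams"): listed score 0.04188432
score_3015 = classic_score(2.0, 37, 44218, 0.04863741, 0.046875, 1, 5)

# Result 10 (doc 1789, term "22"): listed score 0.002635879
score_1789 = classic_score(2.0, 3622, 44218, 0.04863741, 0.015625, 1, 5)

print(score_3015, score_1789)
```

Under these assumptions the computed values should agree with the listed scores up to floating-point rounding. Results 2 to 10 share the same term, idf, and queryNorm, so their ranking differs only through fieldNorm.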