Document (#37239)

Author
Alberts, I.
Forest, D.
Title
Email pragmatics and automatic classification : a study in the organizational context
Source
Journal of the American Society for Information Science and Technology. 63(2012) no.5, pp.904-922
Year
2012
Abstract
This paper presents a two-phased research project aiming to improve email triage for public administration managers. The first phase developed a typology of email classification patterns through a qualitative study involving 34 participants. Inspired by the fields of pragmatics and speech act theory, this typology, comprising four top-level categories and 13 subcategories, represents the typical email triage behaviors of managers in an organizational context. The second study phase was conducted on a corpus of 1,703 messages using email samples of two managers. Using the k-NN (k-nearest neighbor) algorithm, statistical treatments automatically classified the email according to lexical and nonlexical features representative of managers' triage patterns. The automatic classification of email according to the lexicon of the messages was found to be substantially more efficient when k = 2 and n = 2,000. For four categories, the average recall rate was 94.32%, the average precision rate was 94.50%, and the accuracy rate was 94.54%. For 13 categories, the average recall rate was 91.09%, the average precision rate was 84.18%, and the accuracy rate was 88.70%. It appears that a message's nonlexical features are also deeply influenced by email pragmatics. Features related to the recipient and the sender were the most relevant for characterizing email.
Theme
Automatic classification
Object
Email

Similar documents (content)

  1. Gao, N.; Dredze, M.; Oard, D.W.: Person entity linking in email with NIL detection (2017) 0.14
    0.13598467 = sum of:
      0.13598467 = product of:
        0.84990424 = sum of:
          0.049958464 = weight(abstract_txt:accuracy in 3830) [ClassicSimilarity], result of:
            0.049958464 = score(doc=3830,freq=2.0), product of:
              0.09467534 = queryWeight, product of:
                1.3470801 = boost
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.011772434 = queryNorm
              0.5276819 = fieldWeight in 3830, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.0625 = fieldNorm(doc=3830)
          0.023290448 = weight(abstract_txt:features in 3830) [ClassicSimilarity], result of:
            0.023290448 = score(doc=3830,freq=1.0), product of:
              0.08209621 = queryWeight, product of:
                1.5363216 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.011772434 = queryNorm
              0.28369698 = fieldWeight in 3830, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.0625 = fieldNorm(doc=3830)
          0.05476243 = weight(abstract_txt:messages in 3830) [ClassicSimilarity], result of:
            0.05476243 = score(doc=3830,freq=1.0), product of:
              0.12681267 = queryWeight, product of:
                1.5590365 = boost
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.011772434 = queryNorm
              0.43183723 = fieldWeight in 3830, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.0625 = fieldNorm(doc=3830)
          0.7218929 = weight(abstract_txt:email in 3830) [ClassicSimilarity], result of:
            0.7218929 = score(doc=3830,freq=4.0), product of:
              0.73599267 = queryWeight, product of:
                7.967425 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.011772434 = queryNorm
              0.9808425 = fieldWeight in 3830, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=3830)
        0.16 = coord(4/25)
    
  2. Jörgensen, P.: Incorporating context in text analysis by interactive activation with competition artificial neural networks (2005) 0.13
    0.12797828 = sum of:
      0.12797828 = product of:
        0.6398914 = sum of:
          0.009994963 = weight(abstract_txt:study in 1039) [ClassicSimilarity], result of:
            0.009994963 = score(doc=1039,freq=1.0), product of:
              0.04670808 = queryWeight, product of:
                1.1588217 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.011772434 = queryNorm
              0.21398787 = fieldWeight in 1039, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.0625 = fieldNorm(doc=1039)
          0.02397457 = weight(abstract_txt:according in 1039) [ClassicSimilarity], result of:
            0.02397457 = score(doc=1039,freq=1.0), product of:
              0.07311526 = queryWeight, product of:
                1.1838018 = boost
                5.2464166 = idf(docFreq=632, maxDocs=44218)
                0.011772434 = queryNorm
              0.32790104 = fieldWeight in 1039, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2464166 = idf(docFreq=632, maxDocs=44218)
                0.0625 = fieldNorm(doc=1039)
          0.040704034 = weight(abstract_txt:phase in 1039) [ClassicSimilarity], result of:
            0.040704034 = score(doc=1039,freq=1.0), product of:
              0.10405568 = queryWeight, product of:
                1.4122379 = boost
                6.258808 = idf(docFreq=229, maxDocs=44218)
                0.011772434 = queryNorm
              0.3911755 = fieldWeight in 1039, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.258808 = idf(docFreq=229, maxDocs=44218)
                0.0625 = fieldNorm(doc=1039)
          0.05476243 = weight(abstract_txt:messages in 1039) [ClassicSimilarity], result of:
            0.05476243 = score(doc=1039,freq=1.0), product of:
              0.12681267 = queryWeight, product of:
                1.5590365 = boost
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.011772434 = queryNorm
              0.43183723 = fieldWeight in 1039, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.0625 = fieldNorm(doc=1039)
          0.51045537 = weight(abstract_txt:email in 1039) [ClassicSimilarity], result of:
            0.51045537 = score(doc=1039,freq=2.0), product of:
              0.73599267 = queryWeight, product of:
                7.967425 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.011772434 = queryNorm
              0.69356036 = fieldWeight in 1039, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=1039)
        0.2 = coord(5/25)
    
  3. Rodriguez-Esteban, R.; Vishnyakova, D.; Rinaldi, F.: Revisiting the decay of scientific email addresses (2022) 0.12
    0.123308875 = sum of:
      0.123308875 = product of:
        1.541361 = sum of:
          0.009994963 = weight(abstract_txt:study in 449) [ClassicSimilarity], result of:
            0.009994963 = score(doc=449,freq=1.0), product of:
              0.04670808 = queryWeight, product of:
                1.1588217 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.011772434 = queryNorm
              0.21398787 = fieldWeight in 449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.0625 = fieldNorm(doc=449)
          1.531366 = weight(abstract_txt:email in 449) [ClassicSimilarity], result of:
            1.531366 = score(doc=449,freq=18.0), product of:
              0.73599267 = queryWeight, product of:
                7.967425 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.011772434 = queryNorm
              2.080681 = fieldWeight in 449, product of:
                4.2426405 = tf(freq=18.0), with freq of:
                  18.0 = termFreq=18.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=449)
        0.08 = coord(2/25)
    
  4. Na, J.-C.; Sui, H.; Khoo, C.; Chan, S.; Zhou, Y.: Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews (2004) 0.08
    0.08084013 = sum of:
      0.08084013 = product of:
        0.3368339 = sum of:
          0.022349415 = weight(abstract_txt:study in 2624) [ClassicSimilarity], result of:
            0.022349415 = score(doc=2624,freq=5.0), product of:
              0.04670808 = queryWeight, product of:
                1.1588217 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.011772434 = queryNorm
              0.47849143 = fieldWeight in 2624, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.0625 = fieldNorm(doc=2624)
          0.023284616 = weight(abstract_txt:automatic in 2624) [ClassicSimilarity], result of:
            0.023284616 = score(doc=2624,freq=1.0), product of:
              0.07170568 = queryWeight, product of:
                1.172335 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.011772434 = queryNorm
              0.32472485 = fieldWeight in 2624, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=2624)
          0.07065194 = weight(abstract_txt:accuracy in 2624) [ClassicSimilarity], result of:
            0.07065194 = score(doc=2624,freq=4.0), product of:
              0.09467534 = queryWeight, product of:
                1.3470801 = boost
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.011772434 = queryNorm
              0.7462549 = fieldWeight in 2624, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9700394 = idf(docFreq=306, maxDocs=44218)
                0.0625 = fieldNorm(doc=2624)
          0.03542704 = weight(abstract_txt:classification in 2624) [ClassicSimilarity], result of:
            0.03542704 = score(doc=2624,freq=5.0), product of:
              0.06349962 = queryWeight, product of:
                1.3511581 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.011772434 = queryNorm
              0.5579095 = fieldWeight in 2624, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=2624)
          0.023290448 = weight(abstract_txt:features in 2624) [ClassicSimilarity], result of:
            0.023290448 = score(doc=2624,freq=1.0), product of:
              0.08209621 = queryWeight, product of:
                1.5363216 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.011772434 = queryNorm
              0.28369698 = fieldWeight in 2624, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.0625 = fieldNorm(doc=2624)
          0.16183044 = weight(abstract_txt:rate in 2624) [ClassicSimilarity], result of:
            0.16183044 = score(doc=2624,freq=2.0), product of:
              0.2989359 = queryWeight, product of:
                4.1459556 = boost
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.011772434 = queryNorm
              0.541355 = fieldWeight in 2624, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.0625 = fieldNorm(doc=2624)
        0.24 = coord(6/25)
    
  5. Kim, Y.H.; Kim, H.H.: Development and validation of evaluation indicators for a consortium of institutional repositories : a case study of dcollection (2008) 0.08
    0.079195656 = sum of:
      0.079195656 = product of:
        0.39597827 = sum of:
          0.009994963 = weight(abstract_txt:study in 1882) [ClassicSimilarity], result of:
            0.009994963 = score(doc=1882,freq=1.0), product of:
              0.04670808 = queryWeight, product of:
                1.1588217 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.011772434 = queryNorm
              0.21398787 = fieldWeight in 1882, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.0625 = fieldNorm(doc=1882)
          0.023224236 = weight(abstract_txt:four in 1882) [ClassicSimilarity], result of:
            0.023224236 = score(doc=1882,freq=1.0), product of:
              0.07158166 = queryWeight, product of:
                1.1713208 = boost
                5.191103 = idf(docFreq=668, maxDocs=44218)
                0.011772434 = queryNorm
              0.32444394 = fieldWeight in 1882, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.191103 = idf(docFreq=668, maxDocs=44218)
                0.0625 = fieldNorm(doc=1882)
          0.048886564 = weight(abstract_txt:categories in 1882) [ClassicSimilarity], result of:
            0.048886564 = score(doc=1882,freq=2.0), product of:
              0.10682041 = queryWeight, product of:
                1.7524585 = boost
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.011772434 = queryNorm
              0.45765188 = fieldWeight in 1882, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.17774 = idf(docFreq=677, maxDocs=44218)
                0.0625 = fieldNorm(doc=1882)
          0.0850097 = weight(abstract_txt:managers in 1882) [ClassicSimilarity], result of:
            0.0850097 = score(doc=1882,freq=1.0), product of:
              0.2142051 = queryWeight, product of:
                2.8655293 = boost
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.011772434 = queryNorm
              0.39686123 = fieldWeight in 1882, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.0625 = fieldNorm(doc=1882)
          0.2288628 = weight(abstract_txt:rate in 1882) [ClassicSimilarity], result of:
            0.2288628 = score(doc=1882,freq=4.0), product of:
              0.2989359 = queryWeight, product of:
                4.1459556 = boost
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.011772434 = queryNorm
              0.7655916 = fieldWeight in 1882, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.0625 = fieldNorm(doc=1882)
        0.2 = coord(5/25)
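
The score trees above follow the TF-IDF arithmetic of Lucene's ClassicSimilarity: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), each term score is queryWeight times fieldWeight, and the summed term scores are scaled by the coord factor. The following is a minimal sketch that recomputes the first entry (doc 3830) from the values listed in its tree. The function names `idf` and `term_score` and the variable names are illustrative assumptions, not part of Lucene's API, and Lucene itself works in 32-bit floats, so the last digits may differ slightly.

```python
import math

# Constants copied from the explanation tree of entry 1 (doc 3830).
MAX_DOCS = 44218
QUERY_NORM = 0.011772434
FIELD_NORM = 0.0625


def idf(doc_freq, max_docs):
    """Inverse document frequency as printed in the tree: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost):
    """Score of one term: queryWeight * fieldWeight (hypothetical helper)."""
    tf = math.sqrt(freq)                           # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = boost * term_idf * QUERY_NORM   # boost * idf * queryNorm
    field_weight = tf * term_idf * FIELD_NORM      # tf * idf * fieldNorm
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) as listed for doc 3830.
terms = [
    ("accuracy", 2.0, 306, 1.3470801),
    ("features", 1.0, 1283, 1.5363216),
    ("messages", 1.0, 119, 1.5590365),
    ("email", 4.0, 46, 7.967425),
]

scores = {name: term_score(freq, df, boost) for name, freq, df, boost in terms}

# coord(4/25): 4 of the 25 query terms matched this document.
coord = 4 / 25
total = sum(scores.values()) * coord

for name, value in scores.items():
    print(f"{name}: {value:.6f}")
print(f"total: {total:.6f}")
```

Read this way, the ranking is easy to interpret: the term "email" carries a boost of about 7.97 and a high idf, so it dominates every total, while the coord factor penalizes documents that match only a few of the 25 query terms (entry 3 matches only 2, entry 4 matches 6).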