Document (#37240)

Author
Alberts, I.
Forest, D.
Title
Email pragmatics and automatic classification : a study in the organizational context
Source
Journal of the American Society for Information Science and Technology. 63(2012) no.5, pp. 904-922
Year
2012
Abstract
This paper presents a two-phase research project aimed at improving email triage for public administration managers. The first phase developed a typology of email classification patterns through a qualitative study involving 34 participants. Inspired by pragmatics and speech act theory, this typology, comprising four top-level categories and 13 subcategories, represents the typical email triage behaviors of managers in an organizational context. The second phase was conducted on a corpus of 1,703 messages, using email samples from two managers. Using the k-NN (k-nearest neighbor) algorithm, the messages were automatically classified according to lexical and nonlexical features representative of managers' triage patterns. Automatic classification based on the lexicon of the messages was found to be substantially more efficient when k = 2 and n = 2,000. For four categories, the average recall rate was 94.32%, the average precision rate was 94.50%, and the accuracy rate was 94.54%. For 13 categories, the average recall rate was 91.09%, the average precision rate was 84.18%, and the accuracy rate was 88.70%. It appears that a message's nonlexical features are also deeply influenced by email pragmatics. Features related to the recipient and the sender were the most relevant for characterizing email.
Theme
Automatic classification (Automatisches Klassifizieren)
Object
Email

Similar documents (content)

  1. Gao, N.; Dredze, M.; Oard, D.W.: Person entity linking in email with NIL detection (2017) 0.14
    0.13676752 = sum of:
      0.13676752 = product of:
        0.85479707 = sum of:
          0.0511518 = weight(abstract_txt:accuracy in 749) [ClassicSimilarity], result of:
            0.0511518 = score(doc=749,freq=2.0), product of:
              0.096051544 = queryWeight, product of:
                1.3569816 = boost
                6.025063 = idf(docFreq=277, maxDocs=42306)
                0.011748131 = queryNorm
              0.5325453 = fieldWeight in 749, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.025063 = idf(docFreq=277, maxDocs=42306)
                0.0625 = fieldNorm(doc=749)
          0.023651639 = weight(abstract_txt:features in 749) [ClassicSimilarity], result of:
            0.023651639 = score(doc=749,freq=1.0), product of:
              0.08283457 = queryWeight, product of:
                1.5433812 = boost
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.011748131 = queryNorm
              0.2855286 = fieldWeight in 749, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.0625 = fieldNorm(doc=749)
          0.05492579 = weight(abstract_txt:messages in 749) [ClassicSimilarity], result of:
            0.05492579 = score(doc=749,freq=1.0), product of:
              0.12689891 = queryWeight, product of:
                1.5597347 = boost
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.011748131 = queryNorm
              0.43283102 = fieldWeight in 749, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.0625 = fieldNorm(doc=749)
          0.72506785 = weight(abstract_txt:email in 749) [ClassicSimilarity], result of:
            0.72506785 = score(doc=749,freq=4.0), product of:
              0.73718584 = queryWeight, product of:
                7.97474 = boost
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.011748131 = queryNorm
              0.9835618 = fieldWeight in 749, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.0625 = fieldNorm(doc=749)
        0.16 = coord(4/25)
    
  2. Jörgensen, P.: Incorporating context in text analysis by interactive activation with competition artificial neural networks (2005) 0.13
    0.12856084 = sum of:
      0.12856084 = product of:
        0.6428042 = sum of:
          0.010504879 = weight(abstract_txt:study in 3040) [ClassicSimilarity], result of:
            0.010504879 = score(doc=3040,freq=1.0), product of:
              0.048220478 = queryWeight, product of:
                1.1775603 = boost
                3.485616 = idf(docFreq=3522, maxDocs=42306)
                0.011748131 = queryNorm
              0.217851 = fieldWeight in 3040, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.485616 = idf(docFreq=3522, maxDocs=42306)
                0.0625 = fieldNorm(doc=3040)
          0.02412344 = weight(abstract_txt:according in 3040) [ClassicSimilarity], result of:
            0.02412344 = score(doc=3040,freq=1.0), product of:
              0.07332181 = queryWeight, product of:
                1.1856005 = boost
                5.264123 = idf(docFreq=594, maxDocs=42306)
                0.011748131 = queryNorm
              0.3290077 = fieldWeight in 3040, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.264123 = idf(docFreq=594, maxDocs=42306)
                0.0625 = fieldNorm(doc=3040)
          0.040549714 = weight(abstract_txt:phase in 3040) [ClassicSimilarity], result of:
            0.040549714 = score(doc=3040,freq=1.0), product of:
              0.10365707 = queryWeight, product of:
                1.4096823 = boost
                6.2590566 = idf(docFreq=219, maxDocs=42306)
                0.011748131 = queryNorm
              0.39119104 = fieldWeight in 3040, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2590566 = idf(docFreq=219, maxDocs=42306)
                0.0625 = fieldNorm(doc=3040)
          0.05492579 = weight(abstract_txt:messages in 3040) [ClassicSimilarity], result of:
            0.05492579 = score(doc=3040,freq=1.0), product of:
              0.12689891 = queryWeight, product of:
                1.5597347 = boost
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.011748131 = queryNorm
              0.43283102 = fieldWeight in 3040, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.0625 = fieldNorm(doc=3040)
          0.5127004 = weight(abstract_txt:email in 3040) [ClassicSimilarity], result of:
            0.5127004 = score(doc=3040,freq=2.0), product of:
              0.73718584 = queryWeight, product of:
                7.97474 = boost
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.011748131 = queryNorm
              0.6954832 = fieldWeight in 3040, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.8684945 = idf(docFreq=43, maxDocs=42306)
                0.0625 = fieldNorm(doc=3040)
        0.2 = coord(5/25)
    
  3. Na, J.-C.; Sui, H.; Khoo, C.; Chan, S.; Zhou, Y.: Effectiveness of simple linguistic processing in automatic sentiment classification of product reviews (2004) 0.08
    0.08139686 = sum of:
      0.08139686 = product of:
        0.33915362 = sum of:
          0.023197822 = weight(abstract_txt:automatic in 3625) [ClassicSimilarity], result of:
            0.023197822 = score(doc=3625,freq=1.0), product of:
              0.071434036 = queryWeight, product of:
                1.1702385 = boost
                5.1959147 = idf(docFreq=636, maxDocs=42306)
                0.011748131 = queryNorm
              0.32474467 = fieldWeight in 3625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1959147 = idf(docFreq=636, maxDocs=42306)
                0.0625 = fieldNorm(doc=3625)
          0.023489624 = weight(abstract_txt:study in 3625) [ClassicSimilarity], result of:
            0.023489624 = score(doc=3625,freq=5.0), product of:
              0.048220478 = queryWeight, product of:
                1.1775603 = boost
                3.485616 = idf(docFreq=3522, maxDocs=42306)
                0.011748131 = queryNorm
              0.48712966 = fieldWeight in 3625, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.485616 = idf(docFreq=3522, maxDocs=42306)
                0.0625 = fieldNorm(doc=3625)
          0.03570624 = weight(abstract_txt:classification in 3625) [ClassicSimilarity], result of:
            0.03570624 = score(doc=3625,freq=5.0), product of:
              0.063749515 = queryWeight, product of:
                1.35396 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.011748131 = queryNorm
              0.56010216 = fieldWeight in 3625, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.0625 = fieldNorm(doc=3625)
          0.07233958 = weight(abstract_txt:accuracy in 3625) [ClassicSimilarity], result of:
            0.07233958 = score(doc=3625,freq=4.0), product of:
              0.096051544 = queryWeight, product of:
                1.3569816 = boost
                6.025063 = idf(docFreq=277, maxDocs=42306)
                0.011748131 = queryNorm
              0.7531329 = fieldWeight in 3625, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.025063 = idf(docFreq=277, maxDocs=42306)
                0.0625 = fieldNorm(doc=3625)
          0.023651639 = weight(abstract_txt:features in 3625) [ClassicSimilarity], result of:
            0.023651639 = score(doc=3625,freq=1.0), product of:
              0.08283457 = queryWeight, product of:
                1.5433812 = boost
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.011748131 = queryNorm
              0.2855286 = fieldWeight in 3625, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.0625 = fieldNorm(doc=3625)
          0.1607687 = weight(abstract_txt:rate in 3625) [ClassicSimilarity], result of:
            0.1607687 = score(doc=3625,freq=2.0), product of:
              0.29723853 = queryWeight, product of:
                4.134614 = boost
                6.1192946 = idf(docFreq=252, maxDocs=42306)
                0.011748131 = queryNorm
              0.54087436 = fieldWeight in 3625, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1192946 = idf(docFreq=252, maxDocs=42306)
                0.0625 = fieldNorm(doc=3625)
        0.24 = coord(6/25)
    
  4. Kim, Y.H.; Kim, H.H.: Development and validation of evaluation indicators for a consortium of institutional repositories : a case study of dcollection (2008) 0.08
    0.07941965 = sum of:
      0.07941965 = product of:
        0.39709824 = sum of:
          0.010504879 = weight(abstract_txt:study in 3883) [ClassicSimilarity], result of:
            0.010504879 = score(doc=3883,freq=1.0), product of:
              0.048220478 = queryWeight, product of:
                1.1775603 = boost
                3.485616 = idf(docFreq=3522, maxDocs=42306)
                0.011748131 = queryNorm
              0.217851 = fieldWeight in 3883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.485616 = idf(docFreq=3522, maxDocs=42306)
                0.0625 = fieldNorm(doc=3883)
          0.023782767 = weight(abstract_txt:four in 3883) [ClassicSimilarity], result of:
            0.023782767 = score(doc=3883,freq=1.0), product of:
              0.07262988 = queryWeight, product of:
                1.179993 = boost
                5.2392254 = idf(docFreq=609, maxDocs=42306)
                0.011748131 = queryNorm
              0.3274516 = fieldWeight in 3883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2392254 = idf(docFreq=609, maxDocs=42306)
                0.0625 = fieldNorm(doc=3883)
          0.049569968 = weight(abstract_txt:categories in 3883) [ClassicSimilarity], result of:
            0.049569968 = score(doc=3883,freq=2.0), product of:
              0.10767294 = queryWeight, product of:
                1.7596272 = boost
                5.208553 = idf(docFreq=628, maxDocs=42306)
                0.011748131 = queryNorm
              0.46037537 = fieldWeight in 3883, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.208553 = idf(docFreq=628, maxDocs=42306)
                0.0625 = fieldNorm(doc=3883)
          0.085879356 = weight(abstract_txt:managers in 3883) [ClassicSimilarity], result of:
            0.085879356 = score(doc=3883,freq=1.0), product of:
              0.21538208 = queryWeight, product of:
                2.8737009 = boost
                6.3796844 = idf(docFreq=194, maxDocs=42306)
                0.011748131 = queryNorm
              0.39873028 = fieldWeight in 3883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3796844 = idf(docFreq=194, maxDocs=42306)
                0.0625 = fieldNorm(doc=3883)
          0.22736126 = weight(abstract_txt:rate in 3883) [ClassicSimilarity], result of:
            0.22736126 = score(doc=3883,freq=4.0), product of:
              0.29723853 = queryWeight, product of:
                4.134614 = boost
                6.1192946 = idf(docFreq=252, maxDocs=42306)
                0.011748131 = queryNorm
              0.76491183 = fieldWeight in 3883, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1192946 = idf(docFreq=252, maxDocs=42306)
                0.0625 = fieldNorm(doc=3883)
        0.2 = coord(5/25)
    
  5. Aksnes, D.W.: Citation rates and perceptions of scientific contribution (2006) 0.08
    0.07702024 = sum of:
      0.07702024 = product of:
        0.3851012 = sum of:
          0.02274373 = weight(abstract_txt:study in 926) [ClassicSimilarity], result of:
            0.02274373 = score(doc=926,freq=3.0), product of:
              0.048220478 = queryWeight, product of:
                1.1775603 = boost
                3.485616 = idf(docFreq=3522, maxDocs=42306)
                0.011748131 = queryNorm
              0.47166124 = fieldWeight in 926, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.485616 = idf(docFreq=3522, maxDocs=42306)
                0.078125 = fieldNorm(doc=926)
          0.030154299 = weight(abstract_txt:according in 926) [ClassicSimilarity], result of:
            0.030154299 = score(doc=926,freq=1.0), product of:
              0.07332181 = queryWeight, product of:
                1.1856005 = boost
                5.264123 = idf(docFreq=594, maxDocs=42306)
                0.011748131 = queryNorm
              0.4112596 = fieldWeight in 926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.264123 = idf(docFreq=594, maxDocs=42306)
                0.078125 = fieldNorm(doc=926)
          0.045212235 = weight(abstract_txt:accuracy in 926) [ClassicSimilarity], result of:
            0.045212235 = score(doc=926,freq=1.0), product of:
              0.096051544 = queryWeight, product of:
                1.3569816 = boost
                6.025063 = idf(docFreq=277, maxDocs=42306)
                0.011748131 = queryNorm
              0.47070804 = fieldWeight in 926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.025063 = idf(docFreq=277, maxDocs=42306)
                0.078125 = fieldNorm(doc=926)
          0.08603004 = weight(abstract_txt:average in 926) [ClassicSimilarity], result of:
            0.08603004 = score(doc=926,freq=1.0), product of:
              0.1858277 = queryWeight, product of:
                2.669267 = boost
                5.9258366 = idf(docFreq=306, maxDocs=42306)
                0.011748131 = queryNorm
              0.46295598 = fieldWeight in 926, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9258366 = idf(docFreq=306, maxDocs=42306)
                0.078125 = fieldNorm(doc=926)
          0.20096089 = weight(abstract_txt:rate in 926) [ClassicSimilarity], result of:
            0.20096089 = score(doc=926,freq=2.0), product of:
              0.29723853 = queryWeight, product of:
                4.134614 = boost
                6.1192946 = idf(docFreq=252, maxDocs=42306)
                0.011748131 = queryNorm
              0.676093 = fieldWeight in 926, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1192946 = idf(docFreq=252, maxDocs=42306)
                0.078125 = fieldNorm(doc=926)
        0.2 = coord(5/25)
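
The score trees above follow Lucene's ClassicSimilarity (TF-IDF) formula: tf = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the summed term scores are multiplied by coord (matching terms / query terms). As a minimal sketch, the following recomputes the score of hit 1 (doc=749) from the values listed in its tree. The function and variable names are illustrative assumptions and not part of the catalog system; small deviations from the listed figures are expected because Lucene works with 32-bit floats.

```python
import math

# Values copied from the explain output above.
MAX_DOCS = 42306
QUERY_NORM = 0.011748131
QUERY_TERMS = 25  # denominator of coord(4/25)


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Matching terms of hit 1 (doc=749): term -> (termFreq, docFreq, boost)
hit1_terms = {
    "accuracy": (2.0, 277, 1.3569816),
    "features": (1.0, 1192, 1.5433812),
    "messages": (1.0, 112, 1.5597347),
    "email": (4.0, 43, 7.97474),
}
HIT1_FIELD_NORM = 0.0625

term_sum = sum(
    term_score(freq, doc_freq, boost, HIT1_FIELD_NORM)
    for freq, doc_freq, boost in hit1_terms.values()
)
coord = len(hit1_terms) / QUERY_TERMS
hit1_score = term_sum * coord

# Should land close to the 0.13676752 reported for hit 1.
print(f"sum of term scores: {term_sum:.6f}")
print(f"coord: {coord:.2f}")
print(f"final score: {hit1_score:.6f}")
```

The same function reproduces the other hits if their termFreq, docFreq, boost, and fieldNorm values are substituted; note that hit 5 uses fieldNorm 0.078125 rather than 0.0625.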