Document (#43358)

Author
Zhang, Y.
Ren, P.
Rijke, M. de
Title
A taxonomy, data set, and benchmark for detecting and classifying malevolent dialogue responses
Source
Journal of the Association for Information Science and Technology. 72(2021) no.12, pp. 1477-1497
Year
2021
Abstract
Conversational interfaces are increasingly popular as a way of connecting people to information. With the increased generative capacity of corpus-based conversational agents comes the need to classify and filter out malevolent responses that are inappropriate in terms of content and dialogue acts. Previous studies on the topic of detecting and classifying inappropriate content are mostly focused on a specific category of malevolence or on single sentences instead of an entire dialogue. We make three contributions to advance research on the malevolent dialogue response detection and classification (MDRDC) task. First, we define the task and present a hierarchical malevolent dialogue taxonomy. Second, we create a labeled multiturn dialogue data set and formulate the MDRDC task as a hierarchical classification task. Last, we apply state-of-the-art text classification methods to the MDRDC task, and report on experiments aimed at assessing the performance of these approaches.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24496.

Similar documents (author)

  1. De Rijke, M. -> Rijke, M. de: 2.01
    2.0092688 = sum of:
      2.0092688 = product of:
        4.0185375 = sum of:
          4.0185375 = weight(author_txt:rijke in 4116) [ClassicSimilarity], result of:
            4.0185375 = score(doc=4116,freq=2.0), product of:
              0.8757637 = queryWeight, product of:
                1.3469048 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.07514762 = queryNorm
              4.588609 = fieldWeight in 4116, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.375 = fieldNorm(doc=4116)
        0.5 = coord(1/2)
    
  2. Li, X.; Rijke, M. de: Characterizing and predicting downloads in academic search (2019) 1.89
    1.8943567 = sum of:
      1.8943567 = product of:
        3.7887135 = sum of:
          3.7887135 = weight(author_txt:rijke in 5103) [ClassicSimilarity], result of:
            3.7887135 = score(doc=5103,freq=1.0), product of:
              0.8757637 = queryWeight, product of:
                1.3469048 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.07514762 = queryNorm
              4.3261824 = fieldWeight in 5103, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.5 = fieldNorm(doc=5103)
        0.5 = coord(1/2)
    
  3. DeRijke, M. -> Rijke, M. de: 1.66
    1.6575621 = sum of:
      1.6575621 = product of:
        3.3151243 = sum of:
          3.3151243 = weight(author_txt:rijke in 117) [ClassicSimilarity], result of:
            3.3151243 = score(doc=117,freq=1.0), product of:
              0.8757637 = queryWeight, product of:
                1.3469048 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.07514762 = queryNorm
              3.7854095 = fieldWeight in 117, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.4375 = fieldNorm(doc=117)
        0.5 = coord(1/2)
    
  4. Meij, E.; Rijke, M. de: Thesaurus-based feedback to support mixed search and browsing environments (2007) 1.66
    1.6575621 = sum of:
      1.6575621 = product of:
        3.3151243 = sum of:
          3.3151243 = weight(author_txt:rijke in 2432) [ClassicSimilarity], result of:
            3.3151243 = score(doc=2432,freq=1.0), product of:
              0.8757637 = queryWeight, product of:
                1.3469048 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.07514762 = queryNorm
              3.7854095 = fieldWeight in 2432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.4375 = fieldNorm(doc=2432)
        0.5 = coord(1/2)
    
  5. Cai, F.; Rijke, M. de: Learning from homologous queries and semantically related terms for query auto completion (2016) 1.66
    1.6575621 = sum of:
      1.6575621 = product of:
        3.3151243 = sum of:
          3.3151243 = weight(author_txt:rijke in 2971) [ClassicSimilarity], result of:
            3.3151243 = score(doc=2971,freq=1.0), product of:
              0.8757637 = queryWeight, product of:
                1.3469048 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.07514762 = queryNorm
              3.7854095 = fieldWeight in 2971, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.4375 = fieldNorm(doc=2971)
        0.5 = coord(1/2)
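
The author-match breakdowns above follow the TF-IDF scheme of Lucene's ClassicSimilarity. As a minimal sketch, assuming the usual definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), the first entry (doc 4116) can be recomputed from the values listed in its tree. The function names `idf` and `term_score` are illustrative only and are not part of Lucene's API.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # tf is the square root of the raw term frequency
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * term_idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the first entry (author_txt:rijke in doc 4116)
score = term_score(
    freq=2.0,
    doc_freq=20,
    max_docs=44218,
    boost=1.3469048,
    query_norm=0.07514762,
    field_norm=0.375,
)

# coord(1/2): one of the two query clauses matched
final = score * 0.5
print(f"{final:.4f}")
```

The result should agree with the listed 2.0092688 up to small rounding differences, since the explain output uses single-precision floats.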
    

Similar documents (content)

  1. ChatGPT : Optimizing language models for dialogue (2022) 0.14
    0.14181776 = sum of:
      0.14181776 = product of:
        1.1818147 = sum of:
          0.25504926 = weight(abstract_txt:inappropriate in 836) [ClassicSimilarity], result of:
            0.25504926 = score(doc=836,freq=1.0), product of:
              0.25635612 = queryWeight, product of:
                2.3854966 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.013501888 = queryNorm
              0.9949022 = fieldWeight in 836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.125 = fieldNorm(doc=836)
          0.2923372 = weight(abstract_txt:conversational in 836) [ClassicSimilarity], result of:
            0.2923372 = score(doc=836,freq=1.0), product of:
              0.2807698 = queryWeight, product of:
                2.4965034 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.013501888 = queryNorm
              1.041199 = fieldWeight in 836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.125 = fieldNorm(doc=836)
          0.63442814 = weight(abstract_txt:dialogue in 836) [ClassicSimilarity], result of:
            0.63442814 = score(doc=836,freq=1.0), product of:
              0.6787706 = queryWeight, product of:
                6.723248 = boost
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.013501888 = queryNorm
              0.9346724 = fieldWeight in 836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.125 = fieldNorm(doc=836)
        0.12 = coord(3/25)
    
  2. Sood, S.O.; Churchill, E.F.; Antin, J.: Automatic identification of personal insults on social news sites (2012) 0.08
    0.082028754 = sum of:
      0.082028754 = product of:
        0.41014376 = sum of:
          0.06839363 = weight(abstract_txt:detection in 4976) [ClassicSimilarity], result of:
            0.06839363 = score(doc=4976,freq=3.0), product of:
              0.09312673 = queryWeight, product of:
                1.016668 = boost
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.013501888 = queryNorm
              0.73441464 = fieldWeight in 4976, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.0625 = fieldNorm(doc=4976)
          0.054505613 = weight(abstract_txt:labeled in 4976) [ClassicSimilarity], result of:
            0.054505613 = score(doc=4976,freq=1.0), product of:
              0.1154512 = queryWeight, product of:
                1.1319864 = boost
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.013501888 = queryNorm
              0.47210953 = fieldWeight in 4976, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.0625 = fieldNorm(doc=4976)
          0.031992223 = weight(abstract_txt:content in 4976) [ClassicSimilarity], result of:
            0.031992223 = score(doc=4976,freq=3.0), product of:
              0.070702836 = queryWeight, product of:
                1.2527816 = boost
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.013501888 = queryNorm
              0.45248854 = fieldWeight in 4976, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.17991 = idf(docFreq=1838, maxDocs=44218)
                0.0625 = fieldNorm(doc=4976)
          0.18034706 = weight(abstract_txt:inappropriate in 4976) [ClassicSimilarity], result of:
            0.18034706 = score(doc=4976,freq=2.0), product of:
              0.25635612 = queryWeight, product of:
                2.3854966 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.013501888 = queryNorm
              0.70350206 = fieldWeight in 4976, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.0625 = fieldNorm(doc=4976)
          0.074905254 = weight(abstract_txt:task in 4976) [ClassicSimilarity], result of:
            0.074905254 = score(doc=4976,freq=1.0), product of:
              0.24402586 = queryWeight, product of:
                3.6799753 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.013501888 = queryNorm
              0.30695623 = fieldWeight in 4976, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.0625 = fieldNorm(doc=4976)
        0.2 = coord(5/25)
    
  3. Ding, W.; Chen, C.: Dynamic topic detection and tracking : a comparison of HDP, C-word, and cocitation methods (2014) 0.08
    0.07831715 = sum of:
      0.07831715 = product of:
        0.48948222 = sum of:
          0.049358856 = weight(abstract_txt:detection in 1502) [ClassicSimilarity], result of:
            0.049358856 = score(doc=1502,freq=1.0), product of:
              0.09312673 = queryWeight, product of:
                1.016668 = boost
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.013501888 = queryNorm
              0.53001815 = fieldWeight in 1502, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.078125 = fieldNorm(doc=1502)
          0.15268818 = weight(abstract_txt:generative in 1502) [ClassicSimilarity], result of:
            0.15268818 = score(doc=1502,freq=3.0), product of:
              0.13708633 = queryWeight, product of:
                1.2334996 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.013501888 = queryNorm
              1.1138103 = fieldWeight in 1502, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.078125 = fieldNorm(doc=1502)
          0.08414731 = weight(abstract_txt:hierarchical in 1502) [ClassicSimilarity], result of:
            0.08414731 = score(doc=1502,freq=2.0), product of:
              0.13289984 = queryWeight, product of:
                1.7175888 = boost
                5.7307405 = idf(docFreq=389, maxDocs=44218)
                0.013501888 = queryNorm
              0.6331634 = fieldWeight in 1502, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7307405 = idf(docFreq=389, maxDocs=44218)
                0.078125 = fieldNorm(doc=1502)
          0.20328787 = weight(abstract_txt:detecting in 1502) [ClassicSimilarity], result of:
            0.20328787 = score(doc=1502,freq=2.0), product of:
              0.23927937 = queryWeight, product of:
                2.3046746 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.013501888 = queryNorm
              0.84958375 = fieldWeight in 1502, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.078125 = fieldNorm(doc=1502)
        0.16 = coord(4/25)
    
  4. Li, Y.; Belkin, N.J.: A faceted approach to conceptualizing tasks in information seeking (2008) 0.08
    0.076232366 = sum of:
      0.076232366 = product of:
        0.4764523 = sum of:
          0.04529522 = weight(abstract_txt:advance in 2442) [ClassicSimilarity], result of:
            0.04529522 = score(doc=2442,freq=1.0), product of:
              0.10204833 = queryWeight, product of:
                1.0642531 = boost
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.013501888 = queryNorm
              0.44386047 = fieldWeight in 2442, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1017675 = idf(docFreq=98, maxDocs=44218)
                0.0625 = fieldNorm(doc=2442)
          0.05912158 = weight(abstract_txt:classification in 2442) [ClassicSimilarity], result of:
            0.05912158 = score(doc=2442,freq=6.0), product of:
              0.09673679 = queryWeight, product of:
                1.7947271 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.013501888 = queryNorm
              0.6111592 = fieldWeight in 2442, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=2442)
          0.07241449 = weight(abstract_txt:classifying in 2442) [ClassicSimilarity], result of:
            0.07241449 = score(doc=2442,freq=1.0), product of:
              0.17579153 = queryWeight, product of:
                1.975404 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.013501888 = queryNorm
              0.41193387 = fieldWeight in 2442, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.0625 = fieldNorm(doc=2442)
          0.29962102 = weight(abstract_txt:task in 2442) [ClassicSimilarity], result of:
            0.29962102 = score(doc=2442,freq=16.0), product of:
              0.24402586 = queryWeight, product of:
                3.6799753 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.013501888 = queryNorm
              1.2278249 = fieldWeight in 2442, product of:
                4.0 = tf(freq=16.0), with freq of:
                  16.0 = termFreq=16.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.0625 = fieldNorm(doc=2442)
        0.16 = coord(4/25)
    
  5. Price, L.; Robinson, L.: Tag analysis as a tool for investigating information behaviour : comparing fan-tagging on Tumblr, Archive of Our Own and Etsy (2021) 0.07
    0.07422408 = sum of:
      0.07422408 = product of:
        0.4639005 = sum of:
          0.021119248 = weight(abstract_txt:classification in 339) [ClassicSimilarity], result of:
            0.021119248 = score(doc=339,freq=1.0), product of:
              0.09673679 = queryWeight, product of:
                1.7947271 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.013501888 = queryNorm
              0.21831661 = fieldWeight in 339, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0546875 = fieldNorm(doc=339)
          0.10185627 = weight(abstract_txt:taxonomy in 339) [ClassicSimilarity], result of:
            0.10185627 = score(doc=339,freq=3.0), product of:
              0.16726062 = queryWeight, product of:
                1.9268763 = boost
                6.429029 = idf(docFreq=193, maxDocs=44218)
                0.013501888 = queryNorm
              0.6089674 = fieldWeight in 339, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.429029 = idf(docFreq=193, maxDocs=44218)
                0.0546875 = fieldNorm(doc=339)
          0.06336267 = weight(abstract_txt:classifying in 339) [ClassicSimilarity], result of:
            0.06336267 = score(doc=339,freq=1.0), product of:
              0.17579153 = queryWeight, product of:
                1.975404 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.013501888 = queryNorm
              0.36044213 = fieldWeight in 339, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.0546875 = fieldNorm(doc=339)
          0.27756232 = weight(abstract_txt:dialogue in 339) [ClassicSimilarity], result of:
            0.27756232 = score(doc=339,freq=1.0), product of:
              0.6787706 = queryWeight, product of:
                6.723248 = boost
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.013501888 = queryNorm
              0.4089192 = fieldWeight in 339, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.0546875 = fieldNorm(doc=339)
        0.16 = coord(4/25)
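
In the content-match breakdowns above, each document's score is the sum of its matching term weights multiplied by a coordination factor, where coord(3/25) means that 3 of the 25 query terms occur in the document. A minimal sketch of that final step, using the three term weights listed for the first entry (doc 836); the helper name `combined_score` is hypothetical:

```python
def combined_score(term_weights, total_query_terms):
    # coord = matched query terms / total query terms
    coord = len(term_weights) / total_query_terms
    # final score = (sum of per-term weights) * coord
    return sum(term_weights) * coord

# Term weights for "inappropriate", "conversational", "dialogue" in doc 836
weights_836 = [0.25504926, 0.2923372, 0.63442814]
score_836 = combined_score(weights_836, total_query_terms=25)
print(f"{score_836:.4f}")
```

This illustrates why a document matching only a few rare terms can still rank first: the per-term weights for doc 836 are large enough to outweigh its lower coord factor.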