Document (#38847)

Author
Miah, M.W.R.
Yearwood, J.
Kulkarni, S.
Title
Constructing an inter-post similarity measure to differentiate the psychological stages in offensive chats
Source
Journal of the Association for Information Science and Technology. 66(2015) no.5, pp. 1065-1081
Year
2015
Abstract
Offensive Internet chats, particularly the child-exploiting type, tend to follow a documented psychological behavioral pattern. Researchers have identified some important stages in this pattern. The psychological stages broadly include befriending, information exchange, grooming, and approach. Similarities among the posts of a chat play an important role in differentiating as well as in identifying these stages. In this article a novel similarity measure is constructed that gives high inter-post similarity among the chat posts within a particular behavioral stage and low inter-post similarity across different behavioral stages. A psychological stage corpus-based dictionary is constructed by mining the terms associated with each stage. The dictionary works as a background knowledge base to support the similarity measure. To find the inter-post similarity, a modified sentence similarity measure is used. The proposed measure gives improved recognition of inter-stage and intra-stage similarity among the chat posts compared with other types of similarity measures. The pairwise inter-post similarity is used for clustering chat posts into the psychological stages. Results of experiments demonstrate that the new clustering method gives better results than some current clustering methods.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23247/abstract.

Similar documents (content)

  1. Zhang, J.; Korfhage, R.R.: ¬A distance and angle similarity measure method (1999) 0.12
    0.116580695 = sum of:
      0.116580695 = product of:
        0.9715058 = sum of:
          0.06138794 = weight(abstract_txt:gives in 3915) [ClassicSimilarity], result of:
            0.06138794 = score(doc=3915,freq=1.0), product of:
              0.10498709 = queryWeight, product of:
                2.1733484 = boost
                5.3460016 = idf(docFreq=572, maxDocs=44218)
                0.009036026 = queryNorm
              0.58471894 = fieldWeight in 3915, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3460016 = idf(docFreq=572, maxDocs=44218)
                0.109375 = fieldNorm(doc=3915)
          0.26367652 = weight(abstract_txt:measure in 3915) [ClassicSimilarity], result of:
            0.26367652 = score(doc=3915,freq=6.0), product of:
              0.18100643 = queryWeight, product of:
                3.6841114 = boost
                5.437306 = idf(docFreq=522, maxDocs=44218)
                0.009036026 = queryNorm
              1.4567246 = fieldWeight in 3915, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.437306 = idf(docFreq=522, maxDocs=44218)
                0.109375 = fieldNorm(doc=3915)
          0.64644134 = weight(abstract_txt:similarity in 3915) [ClassicSimilarity], result of:
            0.64644134 = score(doc=3915,freq=6.0), product of:
              0.41464436 = queryWeight, product of:
                7.8856707 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.009036026 = queryNorm
              1.559026 = fieldWeight in 3915, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.109375 = fieldNorm(doc=3915)
        0.12 = coord(3/25)
    
  2. Ellis, D.; Furner-Hines, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of inter-linker consistency (1994) 0.11
    0.10747474 = sum of:
      0.10747474 = product of:
        0.53737366 = sum of:
          0.010027571 = weight(abstract_txt:important in 7493) [ClassicSimilarity], result of:
            0.010027571 = score(doc=7493,freq=1.0), product of:
              0.043504473 = queryWeight, product of:
                1.1423067 = boost
                4.2147684 = idf(docFreq=1775, maxDocs=44218)
                0.009036026 = queryNorm
              0.23049515 = fieldWeight in 7493, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2147684 = idf(docFreq=1775, maxDocs=44218)
                0.0546875 = fieldNorm(doc=7493)
          0.053822745 = weight(abstract_txt:measure in 7493) [ClassicSimilarity], result of:
            0.053822745 = score(doc=7493,freq=1.0), product of:
              0.18100643 = queryWeight, product of:
                3.6841114 = boost
                5.437306 = idf(docFreq=522, maxDocs=44218)
                0.009036026 = queryNorm
              0.29735267 = fieldWeight in 7493, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.437306 = idf(docFreq=522, maxDocs=44218)
                0.0546875 = fieldNorm(doc=7493)
          0.07635915 = weight(abstract_txt:stage in 7493) [ClassicSimilarity], result of:
            0.07635915 = score(doc=7493,freq=1.0), product of:
              0.22853751 = queryWeight, product of:
                4.1396585 = boost
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.009036026 = queryNorm
              0.33412087 = fieldWeight in 7493, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.0546875 = fieldNorm(doc=7493)
          0.21055269 = weight(abstract_txt:inter in 7493) [ClassicSimilarity], result of:
            0.21055269 = score(doc=7493,freq=3.0), product of:
              0.33111385 = queryWeight, product of:
                5.4583964 = boost
                6.7132807 = idf(docFreq=145, maxDocs=44218)
                0.009036026 = queryNorm
              0.63589215 = fieldWeight in 7493, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7132807 = idf(docFreq=145, maxDocs=44218)
                0.0546875 = fieldNorm(doc=7493)
          0.18661153 = weight(abstract_txt:similarity in 7493) [ClassicSimilarity], result of:
            0.18661153 = score(doc=7493,freq=2.0), product of:
              0.41464436 = queryWeight, product of:
                7.8856707 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.009036026 = queryNorm
              0.45005202 = fieldWeight in 7493, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.0546875 = fieldNorm(doc=7493)
        0.2 = coord(5/25)
    
  3. Ellis, D.; Furner-Hines, J.; Willett, P.: ¬The creation of hypertext links in full-text documents (1994) 0.09
    0.09444045 = sum of:
      0.09444045 = product of:
        0.47220224 = sum of:
          0.01146008 = weight(abstract_txt:important in 1084) [ClassicSimilarity], result of:
            0.01146008 = score(doc=1084,freq=1.0), product of:
              0.043504473 = queryWeight, product of:
                1.1423067 = boost
                4.2147684 = idf(docFreq=1775, maxDocs=44218)
                0.009036026 = queryNorm
              0.26342303 = fieldWeight in 1084, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2147684 = idf(docFreq=1775, maxDocs=44218)
                0.0625 = fieldNorm(doc=1084)
          0.021275504 = weight(abstract_txt:among in 1084) [ClassicSimilarity], result of:
            0.021275504 = score(doc=1084,freq=1.0), product of:
              0.07522447 = queryWeight, product of:
                1.8396744 = boost
                4.5252304 = idf(docFreq=1301, maxDocs=44218)
                0.009036026 = queryNorm
              0.2828269 = fieldWeight in 1084, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5252304 = idf(docFreq=1301, maxDocs=44218)
                0.0625 = fieldNorm(doc=1084)
          0.0872676 = weight(abstract_txt:stage in 1084) [ClassicSimilarity], result of:
            0.0872676 = score(doc=1084,freq=1.0), product of:
              0.22853751 = queryWeight, product of:
                4.1396585 = boost
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.009036026 = queryNorm
              0.38185242 = fieldWeight in 1084, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.0625 = fieldNorm(doc=1084)
          0.13892876 = weight(abstract_txt:inter in 1084) [ClassicSimilarity], result of:
            0.13892876 = score(doc=1084,freq=1.0), product of:
              0.33111385 = queryWeight, product of:
                5.4583964 = boost
                6.7132807 = idf(docFreq=145, maxDocs=44218)
                0.009036026 = queryNorm
              0.41958004 = fieldWeight in 1084, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7132807 = idf(docFreq=145, maxDocs=44218)
                0.0625 = fieldNorm(doc=1084)
          0.21327032 = weight(abstract_txt:similarity in 1084) [ClassicSimilarity], result of:
            0.21327032 = score(doc=1084,freq=2.0), product of:
              0.41464436 = queryWeight, product of:
                7.8856707 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.009036026 = queryNorm
              0.51434517 = fieldWeight in 1084, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.0625 = fieldNorm(doc=1084)
        0.2 = coord(5/25)
    
  4. Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.09
    0.089501955 = sum of:
      0.089501955 = product of:
        0.44750977 = sum of:
          0.010027571 = weight(abstract_txt:important in 4214) [ClassicSimilarity], result of:
            0.010027571 = score(doc=4214,freq=1.0), product of:
              0.043504473 = queryWeight, product of:
                1.1423067 = boost
                4.2147684 = idf(docFreq=1775, maxDocs=44218)
                0.009036026 = queryNorm
              0.23049515 = fieldWeight in 4214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2147684 = idf(docFreq=1775, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4214)
          0.018616065 = weight(abstract_txt:among in 4214) [ClassicSimilarity], result of:
            0.018616065 = score(doc=4214,freq=1.0), product of:
              0.07522447 = queryWeight, product of:
                1.8396744 = boost
                4.5252304 = idf(docFreq=1301, maxDocs=44218)
                0.009036026 = queryNorm
              0.24747354 = fieldWeight in 4214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5252304 = idf(docFreq=1301, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4214)
          0.07635915 = weight(abstract_txt:stage in 4214) [ClassicSimilarity], result of:
            0.07635915 = score(doc=4214,freq=1.0), product of:
              0.22853751 = queryWeight, product of:
                4.1396585 = boost
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.009036026 = queryNorm
              0.33412087 = fieldWeight in 4214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4214)
          0.21055269 = weight(abstract_txt:inter in 4214) [ClassicSimilarity], result of:
            0.21055269 = score(doc=4214,freq=3.0), product of:
              0.33111385 = queryWeight, product of:
                5.4583964 = boost
                6.7132807 = idf(docFreq=145, maxDocs=44218)
                0.009036026 = queryNorm
              0.63589215 = fieldWeight in 4214, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7132807 = idf(docFreq=145, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4214)
          0.13195428 = weight(abstract_txt:similarity in 4214) [ClassicSimilarity], result of:
            0.13195428 = score(doc=4214,freq=1.0), product of:
              0.41464436 = queryWeight, product of:
                7.8856707 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.009036026 = queryNorm
              0.31823483 = fieldWeight in 4214, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4214)
        0.2 = coord(5/25)
    
  5. Huang, M.-H.; Wu, L.-L.; Wu, Y.-C.: ¬A study of research collaboration in the pre-web and post-web stages : a coauthorship analysis of the information systems discipline (2015) 0.09
    0.088144355 = sum of:
      0.088144355 = product of:
        0.55090225 = sum of:
          0.021275504 = weight(abstract_txt:among in 1729) [ClassicSimilarity], result of:
            0.021275504 = score(doc=1729,freq=1.0), product of:
              0.07522447 = queryWeight, product of:
                1.8396744 = boost
                4.5252304 = idf(docFreq=1301, maxDocs=44218)
                0.009036026 = queryNorm
              0.2828269 = fieldWeight in 1729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5252304 = idf(docFreq=1301, maxDocs=44218)
                0.0625 = fieldNorm(doc=1729)
          0.2137611 = weight(abstract_txt:stage in 1729) [ClassicSimilarity], result of:
            0.2137611 = score(doc=1729,freq=6.0), product of:
              0.22853751 = queryWeight, product of:
                4.1396585 = boost
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.009036026 = queryNorm
              0.9353436 = fieldWeight in 1729, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.0625 = fieldNorm(doc=1729)
          0.2011175 = weight(abstract_txt:post in 1729) [ClassicSimilarity], result of:
            0.2011175 = score(doc=1729,freq=5.0), product of:
              0.23318398 = queryWeight, product of:
                4.181529 = boost
                6.1714344 = idf(docFreq=250, maxDocs=44218)
                0.009036026 = queryNorm
              0.8624842 = fieldWeight in 1729, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1714344 = idf(docFreq=250, maxDocs=44218)
                0.0625 = fieldNorm(doc=1729)
          0.11474812 = weight(abstract_txt:stages in 1729) [ClassicSimilarity], result of:
            0.11474812 = score(doc=1729,freq=1.0), product of:
              0.2914828 = queryWeight, product of:
                5.121331 = boost
                6.2987247 = idf(docFreq=220, maxDocs=44218)
                0.009036026 = queryNorm
              0.3936703 = fieldWeight in 1729, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2987247 = idf(docFreq=220, maxDocs=44218)
                0.0625 = fieldNorm(doc=1729)
        0.16 = coord(4/25)
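
The score trees above all follow the same arithmetic, so it may help to see it recomputed for one entry. The following is a minimal sketch, not the catalogue's own code: it rebuilds the 0.116580695 score of the first similar document (doc 3915) from the boost, docFreq, maxDocs, queryNorm, fieldNorm and coord values printed in the listing. The function names `idf` and `term_score` are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption inferred from the fact that it reproduces the idf values shown; tf is taken as the square root of the term frequency, as the listed tf values suggest.

```python
import math

# Constants copied from the explain tree of result 1 (doc 3915).
MAX_DOCS = 44218
QUERY_NORM = 0.009036026
FIELD_NORM = 0.109375


def idf(doc_freq: int, max_docs: int) -> float:
    """Assumed idf form; it matches the idf values printed in the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    """score = queryWeight * fieldWeight, as laid out in each tree."""
    i = idf(doc_freq, MAX_DOCS)
    query_weight = boost * i * QUERY_NORM          # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * i * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) for the three matching query terms.
terms = [
    ("gives", 1.0, 572, 2.1733484),
    ("measure", 6.0, 522, 3.6841114),
    ("similarity", 6.0, 356, 7.8856707),
]

partial = {name: term_score(f, df, b) for name, f, df, b in terms}

# coord(3/25): 3 of the 25 query terms matched this document.
total = sum(partial.values()) * (3 / 25)

for name, value in partial.items():
    print(f"{name}: {value:.6f}")
print(f"total: {total:.6f}")  # the listing gives 0.116580695 for this entry
```

The same function can be reused for the other entries by swapping in their fieldNorm, term frequencies and coord fraction. Small differences in the last digits are to be expected, since the listing appears to use single-precision arithmetic while Python uses double precision.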