Document (#38845)

Author
Miah, M.W.R.
Yearwood, J.
Kulkarni, S.
Title
Constructing an inter-post similarity measure to differentiate the psychological stages in offensive chats
Source
Journal of the Association for Information Science and Technology. 66(2015) no.5, p.1065-1081
Year
2015
Abstract
Offensive Internet chats, particularly the child-exploiting type, tend to follow a documented psychological behavioral pattern. Researchers have identified some important stages in this pattern. The psychological stages broadly include befriending, information exchange, grooming, and approach. Similarities among the posts of a chat play an important role in differentiating as well as in identifying these stages. In this article, a novel similarity measure is constructed that gives high inter-post similarity among the chat posts within a particular behavioral stage and low inter-post similarity across different behavioral stages. A psychological stage corpus-based dictionary is constructed by mining the terms associated with each stage. The dictionary works as a background knowledge base to support the similarity measure. To find the inter-post similarity, a modified sentence similarity measure is used. The proposed measure gives improved recognition of inter-stage and intra-stage similarity among the chat posts compared with other types of similarity measures. The pairwise inter-post similarity is used for clustering chat posts into the psychological stages. Results of experiments demonstrate that the new clustering method gives better results than some current clustering methods.
Content
Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23247/abstract.

Similar documents (content)

  1. Zhang, J.; Korfhage, R.R.: A distance and angle similarity measure method (1999) 0.12
    0.11554221 = sum of:
      0.11554221 = product of:
        0.96285176 = sum of:
          0.06061799 = weight(abstract_txt:gives in 4913) [ClassicSimilarity], result of:
            0.06061799 = score(doc=4913,freq=1.0), product of:
              0.10382722 = queryWeight, product of:
                2.166557 = boost
                5.337922 = idf(docFreq=568, maxDocs=43556)
                0.008977778 = queryNorm
              0.58383524 = fieldWeight in 4913, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.337922 = idf(docFreq=568, maxDocs=43556)
                0.109375 = fieldNorm(doc=4913)
          0.26132125 = weight(abstract_txt:measure in 4913) [ClassicSimilarity], result of:
            0.26132125 = score(doc=4913,freq=6.0), product of:
              0.17944272 = queryWeight, product of:
                3.6770692 = boost
                5.435696 = idf(docFreq=515, maxDocs=43556)
                0.008977778 = queryNorm
              1.4562933 = fieldWeight in 4913, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.435696 = idf(docFreq=515, maxDocs=43556)
                0.109375 = fieldNorm(doc=4913)
          0.64091253 = weight(abstract_txt:similarity in 4913) [ClassicSimilarity], result of:
            0.64091253 = score(doc=4913,freq=6.0), product of:
              0.41116726 = queryWeight, product of:
                7.8716025 = boost
                5.8181715 = idf(docFreq=351, maxDocs=43556)
                0.008977778 = queryNorm
              1.5587635 = fieldWeight in 4913, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.8181715 = idf(docFreq=351, maxDocs=43556)
                0.109375 = fieldNorm(doc=4913)
        0.12 = coord(3/25)
    
  2. Ellis, D.; Furner-Hines, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of inter-linker consistency (1994) 0.11
    0.107435666 = sum of:
      0.107435666 = product of:
        0.53717834 = sum of:
          0.010059064 = weight(abstract_txt:important in 7490) [ClassicSimilarity], result of:
            0.010059064 = score(doc=7490,freq=1.0), product of:
              0.043478195 = queryWeight, product of:
                1.1447341 = boost
                4.2305613 = idf(docFreq=1721, maxDocs=43556)
                0.008977778 = queryNorm
              0.23135883 = fieldWeight in 7490, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2305613 = idf(docFreq=1721, maxDocs=43556)
                0.0546875 = fieldNorm(doc=7490)
          0.053341974 = weight(abstract_txt:measure in 7490) [ClassicSimilarity], result of:
            0.053341974 = score(doc=7490,freq=1.0), product of:
              0.17944272 = queryWeight, product of:
                3.6770692 = boost
                5.435696 = idf(docFreq=515, maxDocs=43556)
                0.008977778 = queryNorm
              0.29726464 = fieldWeight in 7490, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.435696 = idf(docFreq=515, maxDocs=43556)
                0.0546875 = fieldNorm(doc=7490)
          0.07602899 = weight(abstract_txt:stage in 7490) [ClassicSimilarity], result of:
            0.07602899 = score(doc=7490,freq=1.0), product of:
              0.22726503 = queryWeight, product of:
                4.13814 = boost
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.008977778 = queryNorm
              0.33453888 = fieldWeight in 7490, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.0546875 = fieldNorm(doc=7490)
          0.21273282 = weight(abstract_txt:inter in 7490) [ClassicSimilarity], result of:
            0.21273282 = score(doc=7490,freq=3.0), product of:
              0.3324983 = queryWeight, product of:
                5.4830756 = boost
                6.754549 = idf(docFreq=137, maxDocs=43556)
                0.008977778 = queryNorm
              0.6398012 = fieldWeight in 7490, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.754549 = idf(docFreq=137, maxDocs=43556)
                0.0546875 = fieldNorm(doc=7490)
          0.1850155 = weight(abstract_txt:similarity in 7490) [ClassicSimilarity], result of:
            0.1850155 = score(doc=7490,freq=2.0), product of:
              0.41116726 = queryWeight, product of:
                7.8716025 = boost
                5.8181715 = idf(docFreq=351, maxDocs=43556)
                0.008977778 = queryNorm
              0.44997624 = fieldWeight in 7490, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8181715 = idf(docFreq=351, maxDocs=43556)
                0.0546875 = fieldNorm(doc=7490)
        0.2 = coord(5/25)
    
  3. Ellis, D.; Furner-Hines, J.; Willett, P.: The creation of hypertext links in full-text documents (1994) 0.09
    0.09427726 = sum of:
      0.09427726 = product of:
        0.4713863 = sum of:
          0.011496073 = weight(abstract_txt:important in 1150) [ClassicSimilarity], result of:
            0.011496073 = score(doc=1150,freq=1.0), product of:
              0.043478195 = queryWeight, product of:
                1.1447341 = boost
                4.2305613 = idf(docFreq=1721, maxDocs=43556)
                0.008977778 = queryNorm
              0.26441008 = fieldWeight in 1150, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2305613 = idf(docFreq=1721, maxDocs=43556)
                0.0625 = fieldNorm(doc=1150)
          0.021186445 = weight(abstract_txt:among in 1150) [ClassicSimilarity], result of:
            0.021186445 = score(doc=1150,freq=1.0), product of:
              0.074812524 = queryWeight, product of:
                1.8390844 = boost
                4.531101 = idf(docFreq=1274, maxDocs=43556)
                0.008977778 = queryNorm
              0.28319383 = fieldWeight in 1150, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.531101 = idf(docFreq=1274, maxDocs=43556)
                0.0625 = fieldNorm(doc=1150)
          0.08689027 = weight(abstract_txt:stage in 1150) [ClassicSimilarity], result of:
            0.08689027 = score(doc=1150,freq=1.0), product of:
              0.22726503 = queryWeight, product of:
                4.13814 = boost
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.008977778 = queryNorm
              0.38233015 = fieldWeight in 1150, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.0625 = fieldNorm(doc=1150)
          0.14036725 = weight(abstract_txt:inter in 1150) [ClassicSimilarity], result of:
            0.14036725 = score(doc=1150,freq=1.0), product of:
              0.3324983 = queryWeight, product of:
                5.4830756 = boost
                6.754549 = idf(docFreq=137, maxDocs=43556)
                0.008977778 = queryNorm
              0.4221593 = fieldWeight in 1150, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.754549 = idf(docFreq=137, maxDocs=43556)
                0.0625 = fieldNorm(doc=1150)
          0.21144629 = weight(abstract_txt:similarity in 1150) [ClassicSimilarity], result of:
            0.21144629 = score(doc=1150,freq=2.0), product of:
              0.41116726 = queryWeight, product of:
                7.8716025 = boost
                5.8181715 = idf(docFreq=351, maxDocs=43556)
                0.008977778 = queryNorm
              0.51425856 = fieldWeight in 1150, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8181715 = idf(docFreq=351, maxDocs=43556)
                0.0625 = fieldNorm(doc=1150)
        0.2 = coord(5/25)
    
  4. Ellis, D.; Furner, J.; Willett, P.: On the creation of hypertext links in full-text documents : measurement of retrieval effectiveness (1996) 0.09
    0.089636944 = sum of:
      0.089636944 = product of:
        0.44818473 = sum of:
          0.010059064 = weight(abstract_txt:important in 4280) [ClassicSimilarity], result of:
            0.010059064 = score(doc=4280,freq=1.0), product of:
              0.043478195 = queryWeight, product of:
                1.1447341 = boost
                4.2305613 = idf(docFreq=1721, maxDocs=43556)
                0.008977778 = queryNorm
              0.23135883 = fieldWeight in 4280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2305613 = idf(docFreq=1721, maxDocs=43556)
                0.0546875 = fieldNorm(doc=4280)
          0.01853814 = weight(abstract_txt:among in 4280) [ClassicSimilarity], result of:
            0.01853814 = score(doc=4280,freq=1.0), product of:
              0.074812524 = queryWeight, product of:
                1.8390844 = boost
                4.531101 = idf(docFreq=1274, maxDocs=43556)
                0.008977778 = queryNorm
              0.2477946 = fieldWeight in 4280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.531101 = idf(docFreq=1274, maxDocs=43556)
                0.0546875 = fieldNorm(doc=4280)
          0.07602899 = weight(abstract_txt:stage in 4280) [ClassicSimilarity], result of:
            0.07602899 = score(doc=4280,freq=1.0), product of:
              0.22726503 = queryWeight, product of:
                4.13814 = boost
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.008977778 = queryNorm
              0.33453888 = fieldWeight in 4280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.0546875 = fieldNorm(doc=4280)
          0.21273282 = weight(abstract_txt:inter in 4280) [ClassicSimilarity], result of:
            0.21273282 = score(doc=4280,freq=3.0), product of:
              0.3324983 = queryWeight, product of:
                5.4830756 = boost
                6.754549 = idf(docFreq=137, maxDocs=43556)
                0.008977778 = queryNorm
              0.6398012 = fieldWeight in 4280, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.754549 = idf(docFreq=137, maxDocs=43556)
                0.0546875 = fieldNorm(doc=4280)
          0.13082571 = weight(abstract_txt:similarity in 4280) [ClassicSimilarity], result of:
            0.13082571 = score(doc=4280,freq=1.0), product of:
              0.41116726 = queryWeight, product of:
                7.8716025 = boost
                5.8181715 = idf(docFreq=351, maxDocs=43556)
                0.008977778 = queryNorm
              0.31818125 = fieldWeight in 4280, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8181715 = idf(docFreq=351, maxDocs=43556)
                0.0546875 = fieldNorm(doc=4280)
        0.2 = coord(5/25)
    
  5. Huang, M.-H.; Wu, L.-L.; Wu, Y.-C.: A study of research collaboration in the pre-web and post-web stages : a coauthorship analysis of the information systems discipline (2015) 0.09
    0.08788869 = sum of:
      0.08788869 = product of:
        0.5493043 = sum of:
          0.021186445 = weight(abstract_txt:among in 3727) [ClassicSimilarity], result of:
            0.021186445 = score(doc=3727,freq=1.0), product of:
              0.074812524 = queryWeight, product of:
                1.8390844 = boost
                4.531101 = idf(docFreq=1274, maxDocs=43556)
                0.008977778 = queryNorm
              0.28319383 = fieldWeight in 3727, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.531101 = idf(docFreq=1274, maxDocs=43556)
                0.0625 = fieldNorm(doc=3727)
          0.21283685 = weight(abstract_txt:stage in 3727) [ClassicSimilarity], result of:
            0.21283685 = score(doc=3727,freq=6.0), product of:
              0.22726503 = queryWeight, product of:
                4.13814 = boost
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.008977778 = queryNorm
              0.93651384 = fieldWeight in 3727, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1172824 = idf(docFreq=260, maxDocs=43556)
                0.0625 = fieldNorm(doc=3727)
          0.20078105 = weight(abstract_txt:post in 3727) [ClassicSimilarity], result of:
            0.20078105 = score(doc=3727,freq=5.0), product of:
              0.23229702 = queryWeight, product of:
                4.1837015 = boost
                6.1846347 = idf(docFreq=243, maxDocs=43556)
                0.008977778 = queryNorm
              0.864329 = fieldWeight in 3727, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1846347 = idf(docFreq=243, maxDocs=43556)
                0.0625 = fieldNorm(doc=3727)
          0.114499964 = weight(abstract_txt:stages in 3727) [ClassicSimilarity], result of:
            0.114499964 = score(doc=3727,freq=1.0), product of:
              0.29027912 = queryWeight, product of:
                5.123154 = boost
                6.311165 = idf(docFreq=214, maxDocs=43556)
                0.008977778 = queryNorm
              0.3944478 = fieldWeight in 3727, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.311165 = idf(docFreq=214, maxDocs=43556)
                0.0625 = fieldNorm(doc=3727)
        0.16 = coord(4/25)
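
The score breakdowns above follow Lucene's ClassicSimilarity (TF-IDF) scheme: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matching terms / query terms). As a minimal sketch, the score of the first similar document (doc 4913) can be reproduced from the values printed in its breakdown. The function and variable names below are illustrative assumptions, not part of the database or of Lucene's API; the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are the ones ClassicSimilarity uses.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as in ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Contribution of one query term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm               # query side
    field_weight = math.sqrt(freq) * term_idf * field_norm     # document side
    return query_weight * field_weight

# Values copied from the breakdown for doc 4913.
MAX_DOCS = 43556
QUERY_NORM = 0.008977778
FIELD_NORM = 0.109375

# (termFreq, docFreq, boost) for each matching abstract_txt term.
terms = {
    "gives":      (1.0, 568, 2.166557),
    "measure":    (6.0, 515, 3.6770692),
    "similarity": (6.0, 351, 7.8716025),
}

term_sum = sum(
    term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for freq, doc_freq, boost in terms.values()
)
coord = 3 / 25            # 3 of the 25 query terms matched
score = coord * term_sum  # total document score

print(round(term_sum, 4), round(score, 4))
```

The result agrees with the listed 0.11554221 up to rounding, since Lucene computes these values in single precision. The same calculation applies to the other entries once their own termFreq, docFreq, boost, fieldNorm, and coord values are substituted.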