Document (#41462)

Author
Zhu, J.
Han, L.
Gou, Z.
Yuan, X.
Title
¬A fuzzy clustering-based denoising model for evaluating uncertainty in collaborative filtering recommender systems
Source
Journal of the Association for Information Science and Technology. 69(2018) no.9, p.1109-1121
Year
2018
Abstract
Recommender systems are effective in predicting the most suitable products for users, such as movies and books. To facilitate personalized recommendations, the quality of item ratings should be guaranteed. However, a few ratings might not be accurate enough due to the uncertainty of user behavior and are referred to as natural noise. In this article, we present a novel fuzzy clustering-based method for detecting noisy ratings. The entropy of a subset of the original ratings dataset is used to indicate the data-driven uncertainty, and evaluation metrics are adopted to represent the prediction-driven uncertainty. After the repetition of resampling and the execution of a recommendation algorithm, the entropy and evaluation metrics vectors are obtained and are empirically categorized to identify the proportion of the potential noise. Then, the fuzzy C-means-based denoising (FCMD) algorithm is performed to verify the natural noise under the assumption that natural noise is primarily the result of the exceptional behavior of users. Finally, a case study is performed using two real-world datasets. The experimental results show that our proposal outperforms previous proposals and has an advantage in dealing with natural noise.
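As a rough illustration of the data-driven uncertainty step in the abstract, the sketch below computes the Shannon entropy of repeatedly resampled subsets of a ratings list. The abstract does not give the paper's exact entropy definition or resampling scheme, so the function names, the use of base-2 logarithms, and sampling without replacement are all assumptions made for this example. The fuzzy C-means step (FCMD) is not shown.

```python
import math
import random
from collections import Counter


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical distribution of rating values."""
    counts = Counter(ratings)
    total = len(ratings)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def resampled_entropies(ratings, sample_size, repetitions, seed=0):
    """Entropy of `repetitions` random subsets, each drawn without replacement.

    Hypothetical helper: the sampling scheme is an assumption, not the paper's.
    """
    rng = random.Random(seed)
    return [
        rating_entropy(rng.sample(ratings, sample_size))
        for _ in range(repetitions)
    ]


# Toy data: ratings on a 1-5 scale (made up for illustration only).
toy_ratings = [5, 4, 4, 3, 5, 1, 2, 4, 5, 3, 4, 4]
entropy_vector = resampled_entropies(toy_ratings, sample_size=6, repetitions=10)
```

A subset in which every rating is identical has entropy 0, and a subset spread evenly over four distinct values has entropy 2 bits. Higher values indicate a more dispersed, and in this reading more uncertain, sample.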
Content
Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24036.
Theme
Retrieval algorithms

Similar documents (author)

  1. Yuan, W.: End-user searching behavior in information retrieval : a longitudinal study (1997) 5.20
    5.2002997 = sum of:
      5.2002997 = weight(author_txt:yuan in 395) [ClassicSimilarity], result of:
        5.2002997 = fieldWeight in 395, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.320479 = idf(docFreq=27, maxDocs=42306)
          0.625 = fieldNorm(doc=395)
    
  2. Yuan, W.; Meadow, C.T.: ¬A study of the use of variables in information retrieval user studies (1999) 4.16
    4.1602397 = sum of:
      4.1602397 = weight(author_txt:yuan in 3944) [ClassicSimilarity], result of:
        4.1602397 = fieldWeight in 3944, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.320479 = idf(docFreq=27, maxDocs=42306)
          0.5 = fieldNorm(doc=3944)
    
  3. Jin, Z.; Yuan, C.: On the ambiguity of information retrieval for visualization (1998) 4.16
    4.1602397 = sum of:
      4.1602397 = weight(author_txt:yuan in 4217) [ClassicSimilarity], result of:
        4.1602397 = fieldWeight in 4217, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.320479 = idf(docFreq=27, maxDocs=42306)
          0.5 = fieldNorm(doc=4217)
    
  4. Yuan, X.; Belkin, N.J.: Investigating information retrieval support techniques for different information-seeking strategies (2010) 4.16
    4.1602397 = sum of:
      4.1602397 = weight(author_txt:yuan in 700) [ClassicSimilarity], result of:
        4.1602397 = fieldWeight in 700, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.320479 = idf(docFreq=27, maxDocs=42306)
          0.5 = fieldNorm(doc=700)
    
  5. Yuan, X.; Belkin, N.J.: Evaluating an integrated system supporting multiple information-seeking strategies (2010) 4.16
    4.1602397 = sum of:
      4.1602397 = weight(author_txt:yuan in 993) [ClassicSimilarity], result of:
        4.1602397 = fieldWeight in 993, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.320479 = idf(docFreq=27, maxDocs=42306)
          0.5 = fieldNorm(doc=993)
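
The author-similarity scores above can be reproduced from the numbers in each tree: fieldWeight is the product of tf, idf, and fieldNorm. The sketch below assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). These formulas are inferred from the printed values, which they match, and correspond to the classic Lucene TF-IDF similarity. The function names are illustrative only.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as it appears in the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entry 1 (doc 395): freq=1, docFreq=27, maxDocs=42306, fieldNorm=0.625
score_entry_1 = field_weight(1.0, 27, 42306, 0.625)
# Entries 2-5 differ only in fieldNorm (0.5)
score_entry_2 = field_weight(1.0, 27, 42306, 0.5)
```

The only quantity that separates entry 1 (5.20) from entries 2 to 5 (4.16) is fieldNorm, which is larger for shorter author fields.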
    

Similar documents (content)

  1. Cole, C.: Shannon revisited : information in terms of uncertainty (1993) 0.11
    0.11334749 = sum of:
      0.11334749 = product of:
        0.94456244 = sum of:
          0.16536418 = weight(abstract_txt:entropy in 4069) [ClassicSimilarity], result of:
            0.16536418 = score(doc=4069,freq=1.0), product of:
              0.22148767 = queryWeight, product of:
                2.1072896 = boost
                7.9638047 = idf(docFreq=39, maxDocs=42306)
                0.013197898 = queryNorm
              0.7466067 = fieldWeight in 4069, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9638047 = idf(docFreq=39, maxDocs=42306)
                0.09375 = fieldNorm(doc=4069)
          0.39039752 = weight(abstract_txt:uncertainty in 4069) [ClassicSimilarity], result of:
            0.39039752 = score(doc=4069,freq=3.0), product of:
              0.3430543 = queryWeight, product of:
                3.7089062 = boost
                7.008293 = idf(docFreq=103, maxDocs=42306)
                0.013197898 = queryNorm
              1.138005 = fieldWeight in 4069, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.008293 = idf(docFreq=103, maxDocs=42306)
                0.09375 = fieldNorm(doc=4069)
          0.38880077 = weight(abstract_txt:noise in 4069) [ClassicSimilarity], result of:
            0.38880077 = score(doc=4069,freq=1.0), product of:
              0.5315205 = queryWeight, product of:
                5.1615415 = boost
                7.8025365 = idf(docFreq=46, maxDocs=42306)
                0.013197898 = queryNorm
              0.7314878 = fieldWeight in 4069, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8025365 = idf(docFreq=46, maxDocs=42306)
                0.09375 = fieldNorm(doc=4069)
        0.12 = coord(3/25)
    
  2. Agarwal, B.; Ramampiaro, H.; Langseth, H.; Ruocco, M.: ¬A deep network model for paraphrase detection in short text messages (2018) 0.11
    0.11226746 = sum of:
      0.11226746 = product of:
        0.46778107 = sum of:
          0.05226994 = weight(abstract_txt:detecting in 1962) [ClassicSimilarity], result of:
            0.05226994 = score(doc=1962,freq=1.0), product of:
              0.10689091 = queryWeight, product of:
                1.0351536 = boost
                7.824043 = idf(docFreq=45, maxDocs=42306)
                0.013197898 = queryNorm
              0.48900267 = fieldWeight in 1962, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.824043 = idf(docFreq=45, maxDocs=42306)
                0.0625 = fieldNorm(doc=1962)
          0.0628642 = weight(abstract_txt:noisy in 1962) [ClassicSimilarity], result of:
            0.0628642 = score(doc=1962,freq=1.0), product of:
              0.12088573 = queryWeight, product of:
                1.1008343 = boost
                8.320479 = idf(docFreq=27, maxDocs=42306)
                0.013197898 = queryNorm
              0.52002996 = fieldWeight in 1962, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.320479 = idf(docFreq=27, maxDocs=42306)
                0.0625 = fieldNorm(doc=1962)
          0.019802108 = weight(abstract_txt:evaluation in 1962) [ClassicSimilarity], result of:
            0.019802108 = score(doc=1962,freq=1.0), product of:
              0.07051103 = queryWeight, product of:
                1.1889893 = boost
                4.4933925 = idf(docFreq=1285, maxDocs=42306)
                0.013197898 = queryNorm
              0.28083703 = fieldWeight in 1962, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4933925 = idf(docFreq=1285, maxDocs=42306)
                0.0625 = fieldNorm(doc=1962)
          0.01538906 = weight(abstract_txt:based in 1962) [ClassicSimilarity], result of:
            0.01538906 = score(doc=1962,freq=2.0), product of:
              0.054151595 = queryWeight, product of:
                1.2761469 = boost
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.013197898 = queryNorm
              0.28418478 = fieldWeight in 1962, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.0625 = fieldNorm(doc=1962)
          0.05825527 = weight(abstract_txt:natural in 1962) [ClassicSimilarity], result of:
            0.05825527 = score(doc=1962,freq=1.0), product of:
              0.1823964 = queryWeight, product of:
                2.70441 = boost
                5.1102123 = idf(docFreq=693, maxDocs=42306)
                0.013197898 = queryNorm
              0.31938827 = fieldWeight in 1962, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1102123 = idf(docFreq=693, maxDocs=42306)
                0.0625 = fieldNorm(doc=1962)
          0.2592005 = weight(abstract_txt:noise in 1962) [ClassicSimilarity], result of:
            0.2592005 = score(doc=1962,freq=1.0), product of:
              0.5315205 = queryWeight, product of:
                5.1615415 = boost
                7.8025365 = idf(docFreq=46, maxDocs=42306)
                0.013197898 = queryNorm
              0.48765853 = fieldWeight in 1962, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8025365 = idf(docFreq=46, maxDocs=42306)
                0.0625 = fieldNorm(doc=1962)
        0.24 = coord(6/25)
    
  3. Zhao, L.; Wu, L.; Huang, X.: Using query expansion in graph-based approach for query-focused multi-document summarization (2009) 0.09
    0.09328383 = sum of:
      0.09328383 = product of:
        0.46641916 = sum of:
          0.024752636 = weight(abstract_txt:evaluation in 269) [ClassicSimilarity], result of:
            0.024752636 = score(doc=269,freq=1.0), product of:
              0.07051103 = queryWeight, product of:
                1.1889893 = boost
                4.4933925 = idf(docFreq=1285, maxDocs=42306)
                0.013197898 = queryNorm
              0.3510463 = fieldWeight in 269, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4933925 = idf(docFreq=1285, maxDocs=42306)
                0.078125 = fieldNorm(doc=269)
          0.013602135 = weight(abstract_txt:based in 269) [ClassicSimilarity], result of:
            0.013602135 = score(doc=269,freq=1.0), product of:
              0.054151595 = queryWeight, product of:
                1.2761469 = boost
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.013197898 = queryNorm
              0.25118622 = fieldWeight in 269, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.078125 = fieldNorm(doc=269)
          0.05128671 = weight(abstract_txt:algorithm in 269) [ClassicSimilarity], result of:
            0.05128671 = score(doc=269,freq=1.0), product of:
              0.1145986 = queryWeight, product of:
                1.5157901 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.013197898 = queryNorm
              0.44753346 = fieldWeight in 269, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.078125 = fieldNorm(doc=269)
          0.05277706 = weight(abstract_txt:performed in 269) [ClassicSimilarity], result of:
            0.05277706 = score(doc=269,freq=1.0), product of:
              0.11680809 = queryWeight, product of:
                1.5303327 = boost
                5.783387 = idf(docFreq=353, maxDocs=42306)
                0.013197898 = queryNorm
              0.4518271 = fieldWeight in 269, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.783387 = idf(docFreq=353, maxDocs=42306)
                0.078125 = fieldNorm(doc=269)
          0.32400063 = weight(abstract_txt:noise in 269) [ClassicSimilarity], result of:
            0.32400063 = score(doc=269,freq=1.0), product of:
              0.5315205 = queryWeight, product of:
                5.1615415 = boost
                7.8025365 = idf(docFreq=46, maxDocs=42306)
                0.013197898 = queryNorm
              0.6095732 = fieldWeight in 269, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8025365 = idf(docFreq=46, maxDocs=42306)
                0.078125 = fieldNorm(doc=269)
        0.2 = coord(5/25)
    
  4. Longshu, L.; Xia, Z.: On an approximate fuzzy information retrieval agent (1998) 0.09
    0.08998877 = sum of:
      0.08998877 = product of:
        0.7499064 = sum of:
          0.02720427 = weight(abstract_txt:based in 4295) [ClassicSimilarity], result of:
            0.02720427 = score(doc=4295,freq=1.0), product of:
              0.054151595 = queryWeight, product of:
                1.2761469 = boost
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.013197898 = queryNorm
              0.50237244 = fieldWeight in 4295, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.2151837 = idf(docFreq=4616, maxDocs=42306)
                0.15625 = fieldNorm(doc=4295)
          0.14506072 = weight(abstract_txt:algorithm in 4295) [ClassicSimilarity], result of:
            0.14506072 = score(doc=4295,freq=2.0), product of:
              0.1145986 = queryWeight, product of:
                1.5157901 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.013197898 = queryNorm
              1.2658157 = fieldWeight in 4295, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.15625 = fieldNorm(doc=4295)
          0.5776414 = weight(abstract_txt:fuzzy in 4295) [ClassicSimilarity], result of:
            0.5776414 = score(doc=4295,freq=5.0), product of:
              0.24282986 = queryWeight, product of:
                2.702378 = boost
                6.808497 = idf(docFreq=126, maxDocs=42306)
                0.013197898 = queryNorm
              2.3787909 = fieldWeight in 4295, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.808497 = idf(docFreq=126, maxDocs=42306)
                0.15625 = fieldNorm(doc=4295)
        0.12 = coord(3/25)
    
  5. Colina, J.: ¬Un algoritmo informetrico para la evaluacion de un vocabulario de busqueda (1995) 0.08
    0.08189291 = sum of:
      0.08189291 = product of:
        1.0236614 = sum of:
          0.37566018 = weight(abstract_txt:uncertainty in 6824) [ClassicSimilarity], result of:
            0.37566018 = score(doc=6824,freq=1.0), product of:
              0.3430543 = queryWeight, product of:
                3.7089062 = boost
                7.008293 = idf(docFreq=103, maxDocs=42306)
                0.013197898 = queryNorm
              1.0950458 = fieldWeight in 6824, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.008293 = idf(docFreq=103, maxDocs=42306)
                0.15625 = fieldNorm(doc=6824)
          0.64800125 = weight(abstract_txt:noise in 6824) [ClassicSimilarity], result of:
            0.64800125 = score(doc=6824,freq=1.0), product of:
              0.5315205 = queryWeight, product of:
                5.1615415 = boost
                7.8025365 = idf(docFreq=46, maxDocs=42306)
                0.013197898 = queryNorm
              1.2191464 = fieldWeight in 6824, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8025365 = idf(docFreq=46, maxDocs=42306)
                0.15625 = fieldNorm(doc=6824)
        0.08 = coord(2/25)
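
The content-similarity scores combine three pieces visible in each tree: a per-term queryWeight (boost * idf * queryNorm), a per-term fieldWeight (tf * idf * fieldNorm), and a coord factor equal to the share of query terms that matched. The sketch below recomputes the score of entry 5 (doc 6824) from the printed values. It assumes tf = sqrt(freq), as the trees suggest. The function and variable names are illustrative and not part of any library.

```python
import math

QUERY_NORM = 0.013197898   # queryNorm as printed in the explain output
TOTAL_QUERY_TERMS = 25     # denominator of the coord(n/25) factor


def query_weight(boost, idf, query_norm=QUERY_NORM):
    """queryWeight = boost * idf * queryNorm."""
    return boost * idf * query_norm


def field_weight(freq, idf, field_norm):
    """fieldWeight = tf * idf * fieldNorm, with tf assumed to be sqrt(freq)."""
    return math.sqrt(freq) * idf * field_norm


def document_score(terms, total_query_terms=TOTAL_QUERY_TERMS):
    """Sum of per-term scores, scaled by coord = matched terms / query terms."""
    raw = sum(
        query_weight(t["boost"], t["idf"])
        * field_weight(t["freq"], t["idf"], t["field_norm"])
        for t in terms
    )
    return raw * (len(terms) / total_query_terms)


# Matched terms for entry 5 (doc 6824), values copied from the tree above.
doc_6824_terms = [
    {"term": "uncertainty", "boost": 3.7089062, "idf": 7.008293,
     "freq": 1.0, "field_norm": 0.15625},
    {"term": "noise", "boost": 5.1615415, "idf": 7.8025365,
     "freq": 1.0, "field_norm": 0.15625},
]
score_6824 = document_score(doc_6824_terms)
```

Because coord penalizes documents that match few query terms, entry 5 ranks last despite having the largest raw sum (1.0236614): only 2 of 25 terms matched.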