Document (#41461)

Author
Zhu, J.
Han, L.
Gou, Z.
Yuan, X.
Title
¬A fuzzy clustering-based denoising model for evaluating uncertainty in collaborative filtering recommender systems
Source
Journal of the Association for Information Science and Technology. 69(2018) no.9, pp.1109-1121
Year
2018
Abstract
Recommender systems are effective in predicting the most suitable products for users, such as movies and books. To facilitate personalized recommendations, the quality of item ratings should be guaranteed. However, a few ratings might not be accurate enough due to the uncertainty of user behavior and are referred to as natural noise. In this article, we present a novel fuzzy clustering-based method for detecting noisy ratings. The entropy of a subset of the original ratings dataset is used to indicate the data-driven uncertainty, and evaluation metrics are adopted to represent the prediction-driven uncertainty. After the repetition of resampling and the execution of a recommendation algorithm, the entropy and evaluation metrics vectors are obtained and are empirically categorized to identify the proportion of the potential noise. Then, the fuzzy C-means-based denoising (FCMD) algorithm is performed to verify the natural noise under the assumption that natural noise is primarily the result of the exceptional behavior of users. Finally, a case study is performed using two real-world datasets. The experimental results show that our proposal outperforms previous proposals and has an advantage in dealing with natural noise.
Content
Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24036.
Theme
Retrieval algorithms
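
The data-driven uncertainty measure named in the abstract, the entropy of a resampled subset of ratings, can be illustrated with a minimal sketch. The abstract does not state the exact formulation, so Shannon entropy over the empirical distribution of rating values is assumed here. The function names, the sample ratings, and the subset size are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def rating_entropy(ratings):
    """Shannon entropy (in bits) of the empirical distribution of rating values."""
    counts = Counter(ratings)
    total = len(ratings)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def resampled_entropies(ratings, subset_size, repeats, seed=0):
    """Entropy of `repeats` random subsets, each drawn without replacement."""
    rng = random.Random(seed)
    return [
        rating_entropy(rng.sample(ratings, subset_size))
        for _ in range(repeats)
    ]


# Hypothetical 1-5 star ratings (assumption: not from the paper's datasets).
ratings = [5, 4, 4, 3, 5, 1, 2, 4, 5, 3, 4, 4]
entropies = resampled_entropies(ratings, subset_size=8, repeats=5)
```

Each entry of `entropies` is one component of the kind of entropy vector the abstract describes; how the paper pairs it with evaluation metrics and categorizes it is not specified in this record.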

Similar documents (author)

  1. Yuan, W.: End-user searching behavior in information retrieval : a longitudinal study (1997) 5.18
    5.184806 = sum of:
      5.184806 = weight(author_txt:yuan in 394) [ClassicSimilarity], result of:
        5.184806 = fieldWeight in 394, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.625 = fieldNorm(doc=394)
    
  2. Yuan, W.; Meadow, C.T.: ¬A study of the use of variables in information retrieval user studies (1999) 4.15
    4.147845 = sum of:
      4.147845 = weight(author_txt:yuan in 2943) [ClassicSimilarity], result of:
        4.147845 = fieldWeight in 2943, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.5 = fieldNorm(doc=2943)
    
  3. Jin, Z.; Yuan, C.: On the ambiguity of information retrieval for visualization (1998) 4.15
    4.147845 = sum of:
      4.147845 = weight(author_txt:yuan in 3216) [ClassicSimilarity], result of:
        4.147845 = fieldWeight in 3216, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.5 = fieldNorm(doc=3216)
    
  4. Yuan, X.; Belkin, N.J.: Investigating information retrieval support techniques for different information-seeking strategies (2010) 4.15
    4.147845 = sum of:
      4.147845 = weight(author_txt:yuan in 3699) [ClassicSimilarity], result of:
        4.147845 = fieldWeight in 3699, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.5 = fieldNorm(doc=3699)
    
  5. Yuan, X.; Belkin, N.J.: Evaluating an integrated system supporting multiple information-seeking strategies (2010) 4.15
    4.147845 = sum of:
      4.147845 = weight(author_txt:yuan in 3992) [ClassicSimilarity], result of:
        4.147845 = fieldWeight in 3992, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.29569 = idf(docFreq=29, maxDocs=44218)
          0.5 = fieldNorm(doc=3992)
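
Each author score in the list above is a single ClassicSimilarity field weight, the product tf × idf × fieldNorm. The arithmetic can be reproduced with a short sketch, assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values printed in the explanations. The function names are illustrative and are not Lucene API calls.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as it appears in the explanations."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entry 1 (doc 394): compare with the listed 5.184806.
score_394 = field_weight(freq=1.0, doc_freq=29, max_docs=44218, field_norm=0.625)

# Entries 2-5 share fieldNorm 0.5: compare with the listed 4.147845.
score_2943 = field_weight(freq=1.0, doc_freq=29, max_docs=44218, field_norm=0.5)
```

The only factor that differs between the entries is fieldNorm, which is why the first entry scores higher than the other four despite an identical term match.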
    

Similar documents (content)

  1. Tay, W.; Zhang, X.; Karimi, S.: Beyond mean rating : probabilistic aggregation of star ratings based on helpfulness (2020) 0.16
    0.15601252 = sum of:
      0.15601252 = product of:
        0.78006256 = sum of:
          0.08596812 = weight(abstract_txt:noisy in 5917) [ClassicSimilarity], result of:
            0.08596812 = score(doc=5917,freq=2.0), product of:
              0.11860649 = queryWeight, product of:
                1.0988369 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.013162588 = queryNorm
              0.724818 = fieldWeight in 5917, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0625 = fieldNorm(doc=5917)
          0.06451625 = weight(abstract_txt:movies in 5917) [ClassicSimilarity], result of:
            0.06451625 = score(doc=5917,freq=1.0), product of:
              0.12340694 = queryWeight, product of:
                1.1208534 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.013162588 = queryNorm
              0.5227927 = fieldWeight in 5917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.0625 = fieldNorm(doc=5917)
          0.015152429 = weight(abstract_txt:based in 5917) [ClassicSimilarity], result of:
            0.015152429 = score(doc=5917,freq=2.0), product of:
              0.0537748 = queryWeight, product of:
                1.2815309 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013162588 = queryNorm
              0.28177565 = fieldWeight in 5917, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=5917)
          0.35438362 = weight(abstract_txt:ratings in 5917) [ClassicSimilarity], result of:
            0.35438362 = score(doc=5917,freq=4.0), product of:
              0.38418695 = queryWeight, product of:
                3.955308 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.013162588 = queryNorm
              0.9224249 = fieldWeight in 5917, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=5917)
          0.26004216 = weight(abstract_txt:noise in 5917) [ClassicSimilarity], result of:
            0.26004216 = score(doc=5917,freq=1.0), product of:
              0.5344569 = queryWeight, product of:
                5.2157936 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.013162588 = queryNorm
              0.48655403 = fieldWeight in 5917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.0625 = fieldNorm(doc=5917)
        0.2 = coord(5/25)
    
  2. Cole, C.: Shannon revisited : information in terms of uncertainty (1993) 0.11
    0.112833224 = sum of:
      0.112833224 = product of:
        0.94027686 = sum of:
          0.16674507 = weight(abstract_txt:entropy in 4069) [ClassicSimilarity], result of:
            0.16674507 = score(doc=4069,freq=1.0), product of:
              0.22346593 = queryWeight, product of:
                2.1330433 = boost
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.013162588 = queryNorm
              0.74617666 = fieldWeight in 4069, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.9592175 = idf(docFreq=41, maxDocs=44218)
                0.09375 = fieldNorm(doc=4069)
          0.3834685 = weight(abstract_txt:uncertainty in 4069) [ClassicSimilarity], result of:
            0.3834685 = score(doc=4069,freq=3.0), product of:
              0.3401199 = queryWeight, product of:
                3.72156 = boost
                6.943297 = idf(docFreq=115, maxDocs=44218)
                0.013162588 = queryNorm
              1.127451 = fieldWeight in 4069, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.943297 = idf(docFreq=115, maxDocs=44218)
                0.09375 = fieldNorm(doc=4069)
          0.39006326 = weight(abstract_txt:noise in 4069) [ClassicSimilarity], result of:
            0.39006326 = score(doc=4069,freq=1.0), product of:
              0.5344569 = queryWeight, product of:
                5.2157936 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.013162588 = queryNorm
              0.72983104 = fieldWeight in 4069, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.09375 = fieldNorm(doc=4069)
        0.12 = coord(3/25)
    
  3. Agarwal, B.; Ramampiaro, H.; Langseth, H.; Ruocco, M.: ¬A deep network model for paraphrase detection in short text messages (2018) 0.11
    0.11131136 = sum of:
      0.11131136 = product of:
        0.46379736 = sum of:
          0.050121505 = weight(abstract_txt:detecting in 5043) [ClassicSimilarity], result of:
            0.050121505 = score(doc=5043,freq=1.0), product of:
              0.10429006 = queryWeight, product of:
                1.0303873 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.013162588 = queryNorm
              0.48059714 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0625 = fieldNorm(doc=5043)
          0.06078864 = weight(abstract_txt:noisy in 5043) [ClassicSimilarity], result of:
            0.06078864 = score(doc=5043,freq=1.0), product of:
              0.11860649 = queryWeight, product of:
                1.0988369 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.013162588 = queryNorm
              0.5125237 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0625 = fieldNorm(doc=5043)
          0.019904368 = weight(abstract_txt:evaluation in 5043) [ClassicSimilarity], result of:
            0.019904368 = score(doc=5043,freq=1.0), product of:
              0.07099086 = queryWeight, product of:
                1.2022512 = boost
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.013162588 = queryNorm
              0.2803793 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.0625 = fieldNorm(doc=5043)
          0.015152429 = weight(abstract_txt:based in 5043) [ClassicSimilarity], result of:
            0.015152429 = score(doc=5043,freq=2.0), product of:
              0.0537748 = queryWeight, product of:
                1.2815309 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013162588 = queryNorm
              0.28177565 = fieldWeight in 5043, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=5043)
          0.057788245 = weight(abstract_txt:natural in 5043) [ClassicSimilarity], result of:
            0.057788245 = score(doc=5043,freq=1.0), product of:
              0.1820287 = queryWeight, product of:
                2.7225692 = boost
                5.0794845 = idf(docFreq=747, maxDocs=44218)
                0.013162588 = queryNorm
              0.31746778 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0794845 = idf(docFreq=747, maxDocs=44218)
                0.0625 = fieldNorm(doc=5043)
          0.26004216 = weight(abstract_txt:noise in 5043) [ClassicSimilarity], result of:
            0.26004216 = score(doc=5043,freq=1.0), product of:
              0.5344569 = queryWeight, product of:
                5.2157936 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.013162588 = queryNorm
              0.48655403 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.0625 = fieldNorm(doc=5043)
        0.24 = coord(6/25)
    
  4. Zhao, L.; Wu, L.; Huang, X.: Using query expansion in graph-based approach for query-focused multi-document summarization (2009) 0.09
    0.093460426 = sum of:
      0.093460426 = product of:
        0.4673021 = sum of:
          0.02488046 = weight(abstract_txt:evaluation in 2449) [ClassicSimilarity], result of:
            0.02488046 = score(doc=2449,freq=1.0), product of:
              0.07099086 = queryWeight, product of:
                1.2022512 = boost
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.013162588 = queryNorm
              0.35047412 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4860687 = idf(docFreq=1353, maxDocs=44218)
                0.078125 = fieldNorm(doc=2449)
          0.013392982 = weight(abstract_txt:based in 2449) [ClassicSimilarity], result of:
            0.013392982 = score(doc=2449,freq=1.0), product of:
              0.0537748 = queryWeight, product of:
                1.2815309 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013162588 = queryNorm
              0.24905685 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.078125 = fieldNorm(doc=2449)
          0.05118281 = weight(abstract_txt:algorithm in 2449) [ClassicSimilarity], result of:
            0.05118281 = score(doc=2449,freq=1.0), product of:
              0.11482759 = queryWeight, product of:
                1.529034 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.013162588 = queryNorm
              0.44573617 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.078125 = fieldNorm(doc=2449)
          0.05279316 = weight(abstract_txt:performed in 2449) [ClassicSimilarity], result of:
            0.05279316 = score(doc=2449,freq=1.0), product of:
              0.11722366 = queryWeight, product of:
                1.5449046 = boost
                5.7646422 = idf(docFreq=376, maxDocs=44218)
                0.013162588 = queryNorm
              0.45036268 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7646422 = idf(docFreq=376, maxDocs=44218)
                0.078125 = fieldNorm(doc=2449)
          0.3250527 = weight(abstract_txt:noise in 2449) [ClassicSimilarity], result of:
            0.3250527 = score(doc=2449,freq=1.0), product of:
              0.5344569 = queryWeight, product of:
                5.2157936 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.013162588 = queryNorm
              0.60819256 = fieldWeight in 2449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.078125 = fieldNorm(doc=2449)
        0.2 = coord(5/25)
    
  5. Longshu, L.; Xia, Z.: On an approximate fuzzy information retrieval agent (1998) 0.09
    0.09173082 = sum of:
      0.09173082 = product of:
        0.7644235 = sum of:
          0.026785964 = weight(abstract_txt:based in 3294) [ClassicSimilarity], result of:
            0.026785964 = score(doc=3294,freq=1.0), product of:
              0.0537748 = queryWeight, product of:
                1.2815309 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.013162588 = queryNorm
              0.4981137 = fieldWeight in 3294, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.15625 = fieldNorm(doc=3294)
          0.14476685 = weight(abstract_txt:algorithm in 3294) [ClassicSimilarity], result of:
            0.14476685 = score(doc=3294,freq=2.0), product of:
              0.11482759 = queryWeight, product of:
                1.529034 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.013162588 = queryNorm
              1.2607323 = fieldWeight in 3294, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.15625 = fieldNorm(doc=3294)
          0.59287065 = weight(abstract_txt:fuzzy in 3294) [ClassicSimilarity], result of:
            0.59287065 = score(doc=3294,freq=5.0), product of:
              0.24790801 = queryWeight, product of:
                2.7515976 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.013162588 = queryNorm
              2.3914945 = fieldWeight in 3294, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.15625 = fieldNorm(doc=3294)
        0.12 = coord(3/25)
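
The content scores above combine per-term scores (queryWeight × fieldWeight) with a coordination factor, coord = matched terms / query terms. The sketch below recomputes the score of entry 5 (doc 3294) under the same tf and idf assumptions that match the printed values. The boost and queryNorm values are copied from the explanations rather than derived, and the function names are illustrative, not Lucene API calls.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.013162588  # copied from the explanations; depends on the whole query


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as it appears in the explanations."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm):
    """score = queryWeight * fieldWeight for one matching term."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, query_terms):
    """coord(matched/query_terms) times the sum of the term scores."""
    coord = matched / query_terms
    return coord * sum(term_score(f, df, b, field_norm) for f, df, b in terms)


# (freq, docFreq, boost) for "based", "algorithm", "fuzzy" in doc 3294.
terms_3294 = [
    (1.0, 4958, 1.2815309),
    (2.0, 399, 1.529034),
    (5.0, 127, 2.7515976),
]
score_3294 = document_score(terms_3294, field_norm=0.15625, matched=3, query_terms=25)
# Compare with the listed 0.09173082.
```

Because only 3 of the 25 query terms match, coord scales the summed term scores by 0.12, which is why a strong match on "fuzzy" still yields a low overall score.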