Search (6 results, page 1 of 1)

Dalip, D.H.; Gonçalves, M.A.; Cristo, M.; Calado, P.: ¬A general multiview framework for assessing the quality of collaboratively created content on web 2.0 (2017) 0.03
```
0.031199675 = product of:
  0.06239935 = sum of:
    0.06239935 = sum of:
      0.031563994 = weight(_text_:b in 3343) [ClassicSimilarity], result of:
        0.031563994 = score(doc=3343,freq=2.0), product of:
          0.16126883 = queryWeight, product of:
            3.542962 = idf(docFreq=3476, maxDocs=44218)
            0.045518078 = queryNorm
          0.19572285 = fieldWeight in 3343, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.542962 = idf(docFreq=3476, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3343)
      0.030835358 = weight(_text_:22 in 3343) [ClassicSimilarity], result of:
        0.030835358 = score(doc=3343,freq=2.0), product of:
          0.15939656 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.045518078 = queryNorm
          0.19345059 = fieldWeight in 3343, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3343)
  0.5 = coord(1/2)
```
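The score tree above can be re-derived by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that reproduce the printed values: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and constant names are illustrative and are not part of the search system. `queryNorm` and `fieldNorm` are copied from the tree rather than recomputed.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one weight(...) node of the explain tree (illustrative helper)."""
    tf = math.sqrt(freq)                              # tf(freq=2.0) -> 1.4142135
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # assumed classic idf form
    query_weight = idf * query_norm                   # queryWeight node
    field_weight = tf * idf * field_norm              # fieldWeight node
    return query_weight * field_weight


# Values copied from the explain output for doc 3343.
QUERY_NORM = 0.045518078
MAX_DOCS = 44218
FIELD_NORM = 0.0390625

score_b = classic_term_score(2.0, 3476, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = classic_term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(1/2) halves the summed term scores.
total = 0.5 * (score_b + score_22)

print(f"b: {score_b:.6f}  22: {score_22:.6f}  total: {total:.6f}")
```

Small differences in the last digits are expected, because the search engine works in single precision while this sketch uses Python's double-precision floats.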
Abstract

User-generated content is one of the most interesting phenomena of current published media, as users are now able not only to consume, but also to produce content in a much faster and easier manner. However, such freedom also carries concerns about content quality. In this work, we propose an automatic framework to assess the quality of collaboratively generated content. Quality is addressed as a multidimensional concept, modeled as a combination of independent assessments, each regarding different quality dimensions. Accordingly, we adopt a machine-learning (ML)-based multiview approach to assess content quality. We perform a thorough analysis of our framework on two different domains: Questions and Answer Forums and Collaborative Encyclopedias. This allowed us to better understand when and how the proposed multiview approach is able to provide accurate quality assessments. Our main contributions are: (a) a general ML multiview framework that takes advantage of different views of quality indicators; (b) the improvement (up to 30%) in quality assessment over the best state-of-the-art baseline methods; (c) a thorough feature and view analysis regarding impact, informativeness, and correlation, based on two distinct domains.

Date

16.11.2017 13:04:22
Moura, E.S. de; Fernandes, D.; Ribeiro-Neto, B.; Silva, A.S. da; Gonçalves, M.A.: Using structural information to improve search in Web collections (2010) 0.01
```
0.013391469 = product of:
  0.026782937 = sum of:
    0.026782937 = product of:
      0.053565875 = sum of:
        0.053565875 = weight(_text_:b in 4119) [ClassicSimilarity], result of:
          0.053565875 = score(doc=4119,freq=4.0), product of:
            0.16126883 = queryWeight, product of:
              3.542962 = idf(docFreq=3476, maxDocs=44218)
              0.045518078 = queryNorm
            0.3321527 = fieldWeight in 4119, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.542962 = idf(docFreq=3476, maxDocs=44218)
              0.046875 = fieldNorm(doc=4119)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In this work, we investigate the problem of using the block structure of Web pages to improve ranking results. Starting with basic intuitions provided by the concepts of term frequency (TF) and inverse document frequency (IDF), we propose nine block-weight functions to distinguish the impact of term occurrences inside page blocks, instead of inside whole pages. These are then used to compute a modified BM25 ranking function. Using four distinct Web collections, we ran extensive experiments to compare our block-weight ranking formulas with two other baselines: (a) a BM25 ranking applied to full pages, and (b) a BM25 ranking that takes into account best blocks. Our experiments suggest that our block-weighting ranking method is superior to all baselines across all collections we used, with average gains in precision ranging from 5 to 20%.
Pereira, D.A.; Ribeiro-Neto, B.; Ziviani, N.; Laender, A.H.F.; Gonçalves, M.A.: ¬A generic Web-based entity resolution framework (2011) 0.01
```
0.011159557 = product of:
  0.022319114 = sum of:
    0.022319114 = product of:
      0.044638228 = sum of:
        0.044638228 = weight(_text_:b in 4450) [ClassicSimilarity], result of:
          0.044638228 = score(doc=4450,freq=4.0), product of:
            0.16126883 = queryWeight, product of:
              3.542962 = idf(docFreq=3476, maxDocs=44218)
              0.045518078 = queryNorm
            0.2767939 = fieldWeight in 4450, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.542962 = idf(docFreq=3476, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4450)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Web data repositories usually contain references to thousands of real-world entities from multiple sources. It is not uncommon that multiple entities share the same label (polysemes) and that distinct label variations are associated with the same entity (synonyms), which frequently leads to ambiguous interpretations. Further, spelling variants, acronyms, abbreviated forms, and misspellings compound to worsen the problem. Solving this problem requires identifying which labels correspond to the same real-world entity, a process known as entity resolution. One approach to solve the entity resolution problem is to associate an authority identifier and a list of variant forms with each entity, a data structure known as an authority file. In this work, we propose a generic framework for implementing a method for generating authority files. Our method uses information from the Web to improve the quality of the authority file and, because of that, is referred to as WER (Web-based Entity Resolution). Our contribution here is threefold: (a) we discuss how to implement the WER framework, which is flexible and easy to adapt to new domains; (b) we run extended experimentation with our WER framework to show that it outperforms selected baselines; and (c) we compare the results of a specialized solution for author name resolution with those produced by the generic WER framework, and show that the WER results remain competitive.

Calado, P.; Cristo, M.; Gonçalves, M.A.; Moura, E.S. de; Ribeiro-Neto, B.; Ziviani, N.: Link-based similarity measures for the classification of Web documents (2006) 0.01

```
0.007890998 = product of:
  0.015781997 = sum of:
    0.015781997 = product of:
      0.031563994 = sum of:
        0.031563994 = weight(_text_:b in 4921) [ClassicSimilarity], result of:
          0.031563994 = score(doc=4921,freq=2.0), product of:
            0.16126883 = queryWeight, product of:
              3.542962 = idf(docFreq=3476, maxDocs=44218)
              0.045518078 = queryNorm
            0.19572285 = fieldWeight in 4921, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.542962 = idf(docFreq=3476, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4921)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Couto, T.; Cristo, M.; Gonçalves, M.A.; Calado, P.; Ziviani, N.; Moura, E.; Ribeiro-Neto, B.: ¬A comparative study of citations and links in document classification (2006) 0.01

```
0.007890998 = product of:
  0.015781997 = sum of:
    0.015781997 = product of:
      0.031563994 = sum of:
        0.031563994 = weight(_text_:b in 2531) [ClassicSimilarity], result of:
          0.031563994 = score(doc=2531,freq=2.0), product of:
            0.16126883 = queryWeight, product of:
              3.542962 = idf(docFreq=3476, maxDocs=44218)
              0.045518078 = queryNorm
            0.19572285 = fieldWeight in 2531, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.542962 = idf(docFreq=3476, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2531)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Belém, F.M.; Almeida, J.M.; Gonçalves, M.A.: ¬A survey on tag recommendation methods : a review (2017) 0.01

```
0.0077088396 = product of:
  0.015417679 = sum of:
    0.015417679 = product of:
      0.030835358 = sum of:
        0.030835358 = weight(_text_:22 in 3524) [ClassicSimilarity], result of:
          0.030835358 = score(doc=3524,freq=2.0), product of:
            0.15939656 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.045518078 = queryNorm
            0.19345059 = fieldWeight in 3524, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3524)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date

16.11.2017 13:30:22
