Search (5 results, page 1 of 1)

  • author_ss:"Cristo, M."
  • language_ss:"e"
  1. Calado, P.; Cristo, M.; Gonçalves, M.A.; Moura, E.S. de; Ribeiro-Neto, B.; Ziviani, N.: Link-based similarity measures for the classification of Web documents (2006) 0.03
    0.033307053 = sum of:
      0.011494946 = product of:
        0.068969674 = sum of:
          0.068969674 = weight(_text_:authors in 4921) [ClassicSimilarity], result of:
            0.068969674 = score(doc=4921,freq=4.0), product of:
              0.19364944 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.04247803 = queryNorm
              0.35615736 = fieldWeight in 4921, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4921)
        0.16666667 = coord(1/6)
      0.02181211 = product of:
        0.04362422 = sum of:
          0.04362422 = weight(_text_:n in 4921) [ClassicSimilarity], result of:
            0.04362422 = score(doc=4921,freq=2.0), product of:
              0.18315066 = queryWeight, product of:
                4.3116565 = idf(docFreq=1611, maxDocs=44218)
                0.04247803 = queryNorm
              0.23818761 = fieldWeight in 4921, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3116565 = idf(docFreq=1611, maxDocs=44218)
                0.0390625 = fieldNorm(doc=4921)
        0.5 = coord(1/2)
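
    The breakdown above is Lucene's ClassicSimilarity (TF-IDF) "explain" output. As a rough check, the first summand (the "authors" clause for doc 4921) can be recomputed from the factors listed there; the snippet below is a minimal sketch using those listed values and is not part of the record itself.

        import math

        # values copied from the explain tree for term "authors" in doc 4921
        freq = 4.0              # termFreq within the field
        idf = 4.558814          # idf(docFreq=1258, maxDocs=44218)
        query_norm = 0.04247803
        field_norm = 0.0390625
        coord = 1.0 / 6.0       # 1 of 6 query clauses matched

        tf = math.sqrt(freq)                  # 2.0
        query_weight = idf * query_norm       # ~0.19364944
        field_weight = tf * idf * field_norm  # ~0.35615736
        print(query_weight * field_weight * coord)  # ~0.011494946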
    
    Abstract
    Traditional text-based document classifiers tend to perform poorly on the Web. Text in Web documents is usually noisy and often does not contain enough information to determine their topic. However, the Web provides a different source that can be useful for document classification: its hyperlink structure. In this work, the authors evaluate how the link structure of the Web can be used to determine a measure of similarity appropriate for document classification. They experiment with five different similarity measures and determine their adequacy for predicting the topic of a Web page. Tests performed on a Web directory show that link information alone allows classifying documents with an average precision of 86%. Further, when combined with a traditional text-based classifier, precision increases to values of up to 90%, representing gains that range from 63 to 132% over the use of text-based classification alone. Because the measures proposed in this article are straightforward to compute, they provide a practical and effective solution for Web classification and related information retrieval tasks. Further, the authors provide an important set of guidelines on how link structure can be used effectively to classify Web documents.
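
    The abstract does not name the five link-based measures here; purely as an illustration of the general idea, the sketch below computes a simple normalized co-citation similarity (two pages are similar when many of the same pages link to both). The measure and the toy graph are assumptions for illustration, not taken from the paper.

        # Illustrative (assumed) link-based similarity: normalized co-citation.
        # "inlinks" maps each page to the set of pages that link to it.
        def cocitation_similarity(inlinks, a, b):
            common = inlinks.get(a, set()) & inlinks.get(b, set())
            union = inlinks.get(a, set()) | inlinks.get(b, set())
            return len(common) / len(union) if union else 0.0

        # toy graph: p3 and p4 point to both p1 and p2, p5 points only to p2
        inlinks = {"p1": {"p3", "p4"}, "p2": {"p3", "p4", "p5"}}
        print(cocitation_similarity(inlinks, "p1", "p2"))  # 2/3
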
  2. Couto, T.; Cristo, M.; Gonçalves, M.A.; Calado, P.; Ziviani, N.; Moura, E.; Ribeiro-Neto, B.: A comparative study of citations and links in document classification (2006) 0.01
    0.010906055 = product of:
      0.02181211 = sum of:
        0.02181211 = product of:
          0.04362422 = sum of:
            0.04362422 = weight(_text_:n in 2531) [ClassicSimilarity], result of:
              0.04362422 = score(doc=2531,freq=2.0), product of:
                0.18315066 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.04247803 = queryNorm
                0.23818761 = fieldWeight in 2531, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2531)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  3. Silva, A.J.C.; Gonçalves, M.A.; Laender, A.H.F.; Modesto, M.A.B.; Cristo, M.; Ziviani, N.: Finding what is missing from a digital library : a case study in the computer science field (2009) 0.01
    0.010906055 = product of:
      0.02181211 = sum of:
        0.02181211 = product of:
          0.04362422 = sum of:
            0.04362422 = weight(_text_:n in 4219) [ClassicSimilarity], result of:
              0.04362422 = score(doc=4219,freq=2.0), product of:
                0.18315066 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.04247803 = queryNorm
                0.23818761 = fieldWeight in 4219, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4219)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  4. Dalip, D.H.; Gonçalves, M.A.; Cristo, M.; Calado, P.: A general multiview framework for assessing the quality of collaboratively created content on web 2.0 (2017) 0.01
    0.007193983 = product of:
      0.014387966 = sum of:
        0.014387966 = product of:
          0.028775932 = sum of:
            0.028775932 = weight(_text_:22 in 3343) [ClassicSimilarity], result of:
              0.028775932 = score(doc=3343,freq=2.0), product of:
                0.14875081 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04247803 = queryNorm
                0.19345059 = fieldWeight in 3343, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3343)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    16.11.2017 13:04:22
  5. Souza, J.; Carvalho, A.; Cristo, M.; Moura, E.; Calado, P.; Chirita, P.-A.; Nejdl, W.: Using site-level connections to estimate link confidence (2012) 0.00
    0.004064077 = product of:
      0.008128154 = sum of:
        0.008128154 = product of:
          0.048768923 = sum of:
            0.048768923 = weight(_text_:authors in 498) [ClassicSimilarity], result of:
              0.048768923 = score(doc=498,freq=2.0), product of:
                0.19364944 = queryWeight, product of:
                  4.558814 = idf(docFreq=1258, maxDocs=44218)
                  0.04247803 = queryNorm
                0.25184128 = fieldWeight in 498, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.558814 = idf(docFreq=1258, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=498)
          0.16666667 = coord(1/6)
      0.5 = coord(1/2)
    
    Abstract
    Search engines are essential tools for web users today. They rely on a large number of features to compute the rank of search results for each given query. The estimated reputation of pages is among the effective features available for search engine designers, probably being adopted by most current commercial search engines. Page reputation is estimated by analyzing the linkage relationships between pages. This information is used by link analysis algorithms as a query-independent feature, to be taken into account when computing the rank of the results. Unfortunately, several types of links found on the web may damage the estimated page reputation and thus cause a negative effect on the quality of search results. This work studies alternatives to reduce the negative impact of such noisy links. More specifically, the authors propose and evaluate new methods that deal with noisy links, considering scenarios where the reputation of pages is computed using the PageRank algorithm. They show, through experiments with real web content, that their methods achieve significant improvements when compared to previous solutions proposed in the literature.
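
    The reputation scores discussed in this abstract are computed with the PageRank algorithm; the sketch below is a minimal power-iteration version on a toy link graph, meant only to show where link structure (including noisy links) enters the computation. It is not the authors' implementation, and the damping factor and iteration count are arbitrary.

        # Minimal PageRank power iteration on a small directed link graph
        # (illustrative only; damping and iteration count chosen arbitrarily).
        def pagerank(out_links, damping=0.85, iterations=50):
            pages = list(out_links)
            n = len(pages)
            rank = {p: 1.0 / n for p in pages}
            for _ in range(iterations):
                new_rank = {p: (1.0 - damping) / n for p in pages}
                for p, targets in out_links.items():
                    if not targets:        # dangling page: spread its mass evenly
                        for q in pages:
                            new_rank[q] += damping * rank[p] / n
                    else:
                        for q in targets:
                            new_rank[q] += damping * rank[p] / len(targets)
                rank = new_rank
            return rank

        links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
        print(pagerank(links))  # prints the converged reputation score of each page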