Search (1 result, page 1 of 1)

  • author_ss:"Craven, T.C."
  • theme_ss:"Informetrie"
  1. Craven, T.C.: Determining authorship of Web pages (2006) 0.07
    0.06704608 = product of:
      0.13409217 = sum of:
        0.13409217 = product of:
          0.26818433 = sum of:
            0.26818433 = weight(_text_:943 in 1498) [ClassicSimilarity], result of:
              0.26818433 = score(doc=1498,freq=2.0), product of:
                0.40901226 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04824389 = queryNorm
                0.65568775 = fieldWeight in 1498, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1498)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
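
    The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq)). The function names are hypothetical helpers, not part of any library; queryNorm and fieldNorm are taken directly from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # ClassicSimilarity term frequency factor
    return math.sqrt(freq)

def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    # Hypothetical helper mirroring the explain tree shown in the listing
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # queryWeight
    field_weight = classic_tf(freq) * idf * field_norm   # fieldWeight
    score = query_weight * field_weight                  # weight of the term
    for c in coords:                                     # nested coord(1/2) factors
        score *= c
    return score

# Values copied from the explanation for doc 1498
score = explain_score(
    freq=2.0,
    doc_freq=24,
    max_docs=44218,
    query_norm=0.04824389,
    field_norm=0.0546875,
    coords=(0.5, 0.5),
)
print(round(score, 4))  # about 0.067, the value shown next to the result title
```

    The two coord(1/2) factors each halve the term weight, which is why the displayed score (0.07 after rounding) is a quarter of the raw term weight 0.26818433.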
    
    Abstract
    Assignability of authors to Web pages using either normal browsing procedures or browsing assisted by simple automatic extraction was investigated. Candidate strings for 1000 pages were extracted automatically from title elements, meta-tags, and address-like and copyright-like passages; 539 of the pages produced at least one candidate: 310 candidates from titles, 66 from meta-tags, 91 from address-like passages, and 259 from copyright-like passages. An assistant attempted to identify personal authors for 943 pages by examining the pages themselves and related pages; this added 90 pages with authors to the pages from which no candidate strings were extracted. Specific problems are noted and some refinements to the extraction methods are suggested.