Search (4 results, page 1 of 1)

Shen, D.; Chen, Z.; Yang, Q.; Zeng, H.J.; Zhang, B.; Lu, Y.; Ma, W.Y.: Web page classification through summarization (2004) 0.06

0.05772092 = product of:
  0.11544184 = sum of:
    0.11544184 = product of:
      0.23088367 = sum of:
        0.23088367 = weight(_text_:q in 4132) [ClassicSimilarity], result of:
          0.23088367 = score(doc=4132,freq=2.0), product of:
            0.3190709 = queryWeight, product of:
              6.5493927 = idf(docFreq=171, maxDocs=44218)
              0.048717633 = queryNorm
            0.7236124 = fieldWeight in 4132, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.5493927 = idf(docFreq=171, maxDocs=44218)
              0.078125 = fieldNorm(doc=4132)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
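
The explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that Lucene documents for ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical and are not part of any Lucene API; the input numbers are copied from the tree for doc 4132.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as documented for ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # queryWeight = idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

# Values taken from the explain output for doc 4132.
term_score = classic_term_score(
    freq=2.0, doc_freq=171, max_docs=44218,
    query_norm=0.048717633, field_norm=0.078125,
)

# The tree applies two coord(1/2) factors on the way up.
final_score = term_score * 0.5 * 0.5
print(round(term_score, 5), round(final_score, 5))
```

Lucene does this arithmetic in single precision, so a recomputation in double precision should agree with the listed values only to about six significant digits.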

Chen, Z.; Fu, B.: On the complexity of Rocchio's similarity-based relevance feedback algorithm (2007) 0.04
```
0.040814858 = product of:
  0.081629716 = sum of:
    0.081629716 = product of:
      0.16325943 = sum of:
        0.16325943 = weight(_text_:q in 578) [ClassicSimilarity], result of:
          0.16325943 = score(doc=578,freq=4.0), product of:
            0.3190709 = queryWeight, product of:
              6.5493927 = idf(docFreq=171, maxDocs=44218)
              0.048717633 = queryNorm
            0.5116713 = fieldWeight in 578, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.5493927 = idf(docFreq=171, maxDocs=44218)
              0.0390625 = fieldNorm(doc=578)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Rocchio's similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive learning algorithm from examples in searching for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in the literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is O(d + d**2(log d + log n)) over the discretized vector space {0, ..., n - 1}**d when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier (q, 0) over {0, ..., n - 1}**d can be improved to, at most, 1 + 2k(n - 1)(log d + log(n - 1)), where k is the number of nonzero components in q. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound Omega((d choose 2) log n) on its learning complexity over the Boolean vector space {0,1}**d.
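
For orientation, the query update that the abstract refers to can be sketched as follows. This is the common textbook form of Rocchio's relevance feedback with inner product similarity, not necessarily the exact variant analysed in the article. The function names and the default weights alpha, beta, and gamma are assumptions chosen for illustration.

```python
def inner_product(u, v):
    # Inner product similarity between two vectors of equal length.
    return sum(a * b for a, b in zip(u, v))

def rocchio_update(query, relevant, nonrelevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    # Textbook Rocchio update: move the query toward the centroid of the
    # relevant examples and away from the centroid of the non-relevant ones.
    # alpha, beta, gamma are illustrative defaults (assumed, not from the source).
    d = len(query)

    def centroid(docs):
        if not docs:
            return [0.0] * d
        return [sum(doc[i] for doc in docs) / len(docs) for i in range(d)]

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    return [alpha * query[i] + beta * rel_c[i] - gamma * non_c[i]
            for i in range(d)]

# One feedback round over the Boolean vector space {0,1}**3.
q0 = [1.0, 0.0, 0.0]
q1 = rocchio_update(q0, relevant=[[0, 1, 1]], nonrelevant=[[1, 0, 0]],
                    alpha=1.0, beta=1.0, gamma=0.5)
print(q1, inner_product(q1, [0, 1, 1]))
```

Learning complexity in the abstract's sense counts how many such feedback rounds are needed before the query vector classifies all documents correctly.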

Shen, D.; Yang, Q.; Chen, Z.: Noise reduction through summarization for Web-page classification (2007) 0.03

0.034632552 = product of:
  0.069265105 = sum of:
    0.069265105 = product of:
      0.13853021 = sum of:
        0.13853021 = weight(_text_:q in 953) [ClassicSimilarity], result of:
          0.13853021 = score(doc=953,freq=2.0), product of:
            0.3190709 = queryWeight, product of:
              6.5493927 = idf(docFreq=171, maxDocs=44218)
              0.048717633 = queryNorm
            0.43416747 = fieldWeight in 953, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.5493927 = idf(docFreq=171, maxDocs=44218)
              0.046875 = fieldNorm(doc=953)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Chen, Z.; Wenyin, L.; Zhang, F.; Li, M.; Zhang, H.: Web mining for Web image retrieval (2001) 0.02
```
0.015192511 = product of:
  0.030385021 = sum of:
    0.030385021 = product of:
      0.121540084 = sum of:
        0.121540084 = weight(_text_:author's in 6521) [ClassicSimilarity], result of:
          0.121540084 = score(doc=6521,freq=2.0), product of:
            0.32738996 = queryWeight, product of:
              6.7201533 = idf(docFreq=144, maxDocs=44218)
              0.048717633 = queryNorm
            0.3712395 = fieldWeight in 6521, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.7201533 = idf(docFreq=144, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6521)
      0.25 = coord(1/4)
  0.5 = coord(1/2)
```
Abstract

The popularity of digital images is rapidly increasing due to improving digital imaging technologies and convenient availability facilitated by the Internet. However, finding user-intended images on the Internet is nontrivial. The main reason is that Web images are usually not annotated using semantic descriptors. In this article, we present an effective approach to, and a prototype system for, image retrieval from the Internet using Web mining. The system can also serve as a Web image search engine. One of the key ideas in the approach is to extract the text information on the Web pages to semantically describe the images. The text description is then combined with other low-level image features in the image similarity assessment. Another main contribution of this work is that we apply data mining on the log of users' feedback to improve image retrieval performance in three aspects. First, the accuracy of the document space model of image representation obtained from the Web pages is improved by removing clutter and irrelevant text information. Second, we construct the user space model of users' representation of images, which is then combined with the document space model to eliminate mismatch between the page author's expression and the user's understanding and expectation. Third, we discover the relationship between low-level and high-level features, which is extremely useful for assigning the low-level features' weights in similarity assessment.
