Search (2 results, page 1 of 1)

Yang, C.C.; Liu, N.: Web site topic-hierarchy generation based on link structure (2009) 0.04
```
0.04298305 = product of:
  0.064474575 = sum of:
    0.04744636 = weight(_text_:search in 2738) [ClassicSimilarity], result of:
      0.04744636 = score(doc=2738,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.27153727 = fieldWeight in 2738, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2738)
    0.017028214 = product of:
      0.03405643 = sum of:
        0.03405643 = weight(_text_:22 in 2738) [ClassicSimilarity], result of:
          0.03405643 = score(doc=2738,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.19345059 = fieldWeight in 2738, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2738)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
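The score breakdown above can be re-derived by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity formulas (`tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`). The function and variable names are illustrative only, and `queryNorm`, `fieldNorm`, and the `coord` factors are copied from the listing rather than recomputed.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, following the explain tree above.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 2738.
QUERY_NORM = 0.05027291
FIELD_NORM = 0.0390625
MAX_DOCS = 44218

# Term "search": freq=4, docFreq=3718.
search_score = term_score(4.0, 3718, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# Term "22": freq=2, docFreq=3622, scaled by the inner coord(1/2).
date_score = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# Outer coord(2/3): two of three query clauses matched.
total = (search_score + date_score) * (2.0 / 3.0)

print(search_score, date_score, total)
```

Small differences in the last digits are expected, since Lucene computes these values in 32-bit floats while Python uses 64-bit floats.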
Abstract

Navigating through hyperlinks within a Web site to look for information from one of its Web pages without the support of a site map can be inefficient and ineffective. Although the content of a Web site is usually organized with an inherent structure like a topic hierarchy, which is a directed tree rooted at a Web site's homepage whose vertices and edges correspond to Web pages and hyperlinks, such a topic hierarchy is not always available to the user. In this work, we studied the problem of automatic generation of Web sites' topic hierarchies. We modeled a Web site's link structure as a weighted directed graph and proposed methods for estimating edge weights based on eight types of features and three learning algorithms, namely decision trees, naïve Bayes classifiers, and logistic regression. Three graph algorithms, namely breadth-first search, shortest-path search, and directed minimum-spanning tree, were adapted to generate the topic hierarchy based on the graph model. We have tested the model and algorithms on real Web sites. It is found that the directed minimum-spanning tree algorithm with the decision tree as the weight learning algorithm achieves the highest performance with an average accuracy of 91.9%.
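
As a rough illustration of the simplest of the three graph algorithms named in the abstract, breadth-first search can turn a link graph into a tree rooted at the homepage. This is a sketch under stated assumptions, not the authors' implementation: it ignores the learned edge weights entirely, and the site, its URLs, and the function name `bfs_hierarchy` are hypothetical.

```python
from collections import deque

def bfs_hierarchy(links, homepage):
    """Return a child -> parent map forming a tree rooted at homepage.

    Each page is attached to the page from which it is first reached
    in breadth-first order; later links to the same page are ignored.
    """
    parent = {homepage: None}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:
                parent[target] = page
                queue.append(target)
    return parent

# Hypothetical site: keys are pages, values are outgoing hyperlinks.
links = {
    "/": ["/products", "/about"],
    "/products": ["/products/a", "/"],
    "/about": ["/products/a", "/contact"],
    "/products/a": ["/"],
}

tree = bfs_hierarchy(links, "/")
for page, par in tree.items():
    print(page, "<-", par)
```

Back links to the homepage and the second link to `/products/a` are dropped, so every page keeps exactly one parent. A weighted variant would instead choose each parent by edge weight.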

Date

22. 3.2009 12:51:47
Yang, C.C.; Chung, A.: A personal agent for Chinese financial news on the Web (2002) 0.02
```
0.019369897 = product of:
  0.058109686 = sum of:
    0.058109686 = weight(_text_:search in 205) [ClassicSimilarity], result of:
      0.058109686 = score(doc=205,freq=6.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.33256388 = fieldWeight in 205, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=205)
  0.33333334 = coord(1/3)
```
Abstract

As the Web has become a major channel of information dissemination, many newspapers have expanded their services by providing electronic versions of news on the Web. However, most investors find it difficult to search for the financial information of interest in the huge Web information space: the information overloading problem. In this article, we present a personal agent that utilizes user profiles and user relevance feedback to search for Chinese Web financial news articles on behalf of users. A Chinese indexing component is developed to index the continuously fetched Chinese financial news articles. User profiles capture the basic knowledge of user preferences based on the sources of news articles, the regions of the news reported, the categories of related industries, the listed companies, and user-specified keywords. User feedback captures the semantics of the user-rated news articles. The search engine ranks the top 20 news articles that users are most interested in and reports them to the user daily or on demand. Experiments are conducted to measure the performance of the agents based on the inputs from user profiles and user feedback. The results show that simply using the user profiles does not increase the precision of the retrieval. However, user relevance feedback helps to increase the performance of the retrieval as the user interacts with the system, until it reaches the optimal performance. Combining both user profiles and user relevance feedback produces the best performance.
