Search (146 results, page 8 of 8)

Shen, D.; Chen, Z.; Yang, Q.; Zeng, H.J.; Zhang, B.; Lu, Y.; Ma, W.Y.: Web page classification through summarization (2004) 0.00

0.001972769 = product of:
  0.0059183068 = sum of:
    0.0059183068 = product of:
      0.0118366135 = sum of:
        0.0118366135 = weight(_text_:of in 4132) [ClassicSimilarity], result of:
          0.0118366135 = score(doc=4132,freq=2.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.17277241 = fieldWeight in 4132, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=4132)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
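
The arithmetic in the score explanation above can be reproduced with a short sketch. It assumes the conventions visible in the tree itself: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)) (which matches the printed 1.5637573), and the two coord factors are applied as plain multipliers. The function names `idf` and `explain_score` are hypothetical and are not part of any library.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as implied by the explanation tree."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recompute a single-term score from the values shown in the tree.

    coords is a list of (matched, total) pairs, one per coord(...) line.
    """
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                     # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm   # fieldWeight
    score = query_weight * field_weight
    for matched, total in coords:
        score *= matched / total                             # coord(matched/total)
    return score


# Values copied from the explanations for doc 4132 and doc 1565.
score_4132 = explain_score(2.0, 25162, 44218, 0.043811057, 0.078125, [(1, 2), (1, 3)])
score_1565 = explain_score(4.0, 25162, 44218, 0.043811057, 0.046875, [(1, 2), (1, 3)])
print(f"doc 4132: {score_4132:.9f}")
print(f"doc 1565: {score_1565:.9f}")
```

The second call uses freq=4.0 and fieldNorm=0.046875, the values shared by the remaining entries on this page. Results should agree with the listed scores to several significant digits; differences in the last digits may come from the engine's single-precision arithmetic.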

Source: SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin et al.

Drori, O.; Alon, N.: Using document classification for displaying search results (2003) 0.00
```
0.0016739499 = product of:
  0.0050218496 = sum of:
    0.0050218496 = product of:
      0.010043699 = sum of:
        0.010043699 = weight(_text_:of in 1565) [ClassicSimilarity], result of:
          0.010043699 = score(doc=1565,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.14660224 = fieldWeight in 1565, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1565)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

In this paper, four self-developed user interfaces that display document search results using different methods were compared. To create the four interfaces, two information elements were used: document categories and lines from the document. A user study compared the four interfaces. It was found that adding categories to the interface was beneficial on both measurable and subjective measures. It was also found that displaying the relevant lines from the document increased effectiveness and shortened the search time in all cases and tasks. The participants preferred the interface containing categories and relevant lines to all other interfaces tested. It was also the fastest in the objective time measurement. A further sub-study showed that the most important parameter for the users was the confidence level that the answer was accurate, and the least important parameter was the feeling of comfort while conducting a search.

Source

Journal of information science. 29(2003) no.2, pp. 97-106

Liu, R.-L.: Dynamic category profiling for text filtering and classification (2007) 0.00
```
0.0016739499 = product of:
  0.0050218496 = sum of:
    0.0050218496 = product of:
      0.010043699 = sum of:
        0.010043699 = weight(_text_:of in 900) [ClassicSimilarity], result of:
          0.010043699 = score(doc=900,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.14660224 = fieldWeight in 900, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=900)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Information is often represented in text form and classified into categories. Unfortunately, automatic classifiers often misclassify documents. One of the reasons is that the documents for training the classifiers are mainly from the categories, leading the classifiers to derive category profiles for distinguishing each category from others, rather than measuring the extent to which a document's content overlaps that of a category. To tackle the problem, we present a technique, DP4FC, that selects suitable features to construct category profiles to distinguish relevant documents from irrelevant documents. More specifically, DP4FC is associated with various classifiers. Upon receiving a document, it helps the classifiers to create dynamic category profiles with respect to the document, and accordingly make proper decisions in filtering and classification. Theoretical analysis and empirical results show that DP4FC may significantly improve the performance of different classifiers under various environments.

Leroy, G.; Miller, T.; Rosemblat, G.; Browne, A.: A balanced approach to health information evaluation : a vocabulary-based naïve Bayes classifier and readability formulas (2008) 0.00
```
0.0016739499 = product of:
  0.0050218496 = sum of:
    0.0050218496 = product of:
      0.010043699 = sum of:
        0.010043699 = weight(_text_:of in 1998) [ClassicSimilarity], result of:
          0.010043699 = score(doc=1998,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.14660224 = fieldWeight in 1998, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1998)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Since millions seek health information online, it is vital for this information to be comprehensible. Most studies use readability formulas, which ignore vocabulary, and conclude that online health information is too difficult. We developed a vocabulary-based, naïve Bayes classifier to distinguish between three difficulty levels in text. It proved 98% accurate in a 250-document evaluation. We compared our classifier with readability formulas for 90 new documents with different origins and asked representative human evaluators, an expert and a consumer, to judge each document. Average readability grade levels for educational and commercial pages were 10th grade or higher, too difficult according to current literature. In contrast, the classifier showed that 70-90% of these pages were written at an intermediate, appropriate level, indicating that vocabulary usage is frequently appropriate in text considered too difficult by readability formula evaluations. The expert considered the pages more difficult for a consumer than the consumer did.

Source

Journal of the American Society for Information Science and Technology. 59(2008) no.9, pp. 1409-1419

Zhou, G.D.; Zhang, M.; Ji, D.H.; Zhu, Q.M.: Hierarchical learning strategy in semantic relation extraction (2008) 0.00
```
0.0016739499 = product of:
  0.0050218496 = sum of:
    0.0050218496 = product of:
      0.010043699 = sum of:
        0.010043699 = weight(_text_:of in 2077) [ClassicSimilarity], result of:
          0.010043699 = score(doc=2077,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.14660224 = fieldWeight in 2077, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2077)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

This paper proposes a novel hierarchical learning strategy to deal with the data sparseness problem in semantic relation extraction by modeling the commonality among related classes. For each class in the hierarchy, either manually predefined or automatically clustered, a discriminative function is determined in a top-down way. As the upper-level class normally has many more positive training examples than the lower-level class, the corresponding discriminative function can be determined more reliably and can guide the discriminative function learning in the lower-level one more effectively, which otherwise might suffer from limited training data. In this paper, two classifier learning approaches, i.e. the simple perceptron algorithm and the state-of-the-art Support Vector Machines, are applied using the hierarchical learning strategy. Moreover, several kinds of class hierarchies, either manually predefined or automatically clustered, are explored and compared. Evaluation on the ACE RDC 2003 and 2004 corpora shows that the hierarchical learning strategy substantially improves the performance on least- and medium-frequent relations.

Xu, Y.; Bernard, A.: Knowledge organization through statistical computation : a new approach (2009) 0.00
```
0.0016739499 = product of:
  0.0050218496 = sum of:
    0.0050218496 = product of:
      0.010043699 = sum of:
        0.010043699 = weight(_text_:of in 3252) [ClassicSimilarity], result of:
          0.010043699 = score(doc=3252,freq=4.0), product of:
            0.06850986 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043811057 = queryNorm
            0.14660224 = fieldWeight in 3252, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3252)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Knowledge organization (KO) is an interdisciplinary issue which includes some problems in knowledge classification, such as how to classify newly emerged knowledge. Given the great complexity and ambiguity of knowledge, it is sometimes becoming inefficient to classify knowledge by logical reasoning. This paper proposes a statistical approach to knowledge organization in order to resolve the problems in classifying complex and mass knowledge. By integrating the classification process into a mathematical model, a knowledge classifier, based on the maximum entropy theory, is constructed, and the experimental results show that the classification results acquired from the classifier are reliable. The approach proposed in this paper is quite formal and is not dependent on specific contexts, so it could easily be adapted to the use of knowledge classification in other domains within KO.
