Search (14 results, page 1 of 1)

  • × year_i:[2000 TO 2010}
  • × theme_ss:"Automatisches Abstracting"
  1. Jones, S.; Paynter, G.W.: Automatic extraction of document keyphrases for use in digital libraries : evaluations and applications (2002) 0.02
    0.018951688 = product of:
      0.101075664 = sum of:
        0.05836024 = weight(_text_:author in 601) [ClassicSimilarity], result of:
          0.05836024 = score(doc=601,freq=4.0), product of:
            0.15482868 = queryWeight, product of:
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.032090448 = queryNorm
            0.3769343 = fieldWeight in 601, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.0390625 = fieldNorm(doc=601)
        0.022109302 = weight(_text_:26 in 601) [ClassicSimilarity], result of:
          0.022109302 = score(doc=601,freq=2.0), product of:
            0.113328174 = queryWeight, product of:
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.032090448 = queryNorm
            0.19509095 = fieldWeight in 601, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.0390625 = fieldNorm(doc=601)
        0.02060612 = weight(_text_:american in 601) [ClassicSimilarity], result of:
          0.02060612 = score(doc=601,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.18834224 = fieldWeight in 601, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.0390625 = fieldNorm(doc=601)
      0.1875 = coord(3/16)
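As a sketch of how the explain tree above combines (assuming Lucene's ClassicSimilarity formula, where each term contributes queryWeight × fieldWeight, with queryWeight = idf · queryNorm and fieldWeight = √freq · idf · fieldNorm, and the matching clauses are then scaled by coord):

```python
import math

def classic_similarity_weight(freq, idf, query_norm, field_norm):
    """One term's weight in a Lucene ClassicSimilarity explain tree."""
    query_weight = idf * query_norm        # queryWeight = idf * queryNorm
    tf = math.sqrt(freq)                   # tf(freq) = sqrt(freq)
    field_weight = tf * idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from the explain tree for result 1 (doc 601):
query_norm = 0.032090448
field_norm = 0.0390625
w_author   = classic_similarity_weight(4.0, 4.824759,  query_norm, field_norm)
w_26       = classic_similarity_weight(2.0, 3.5315237, query_norm, field_norm)
w_american = classic_similarity_weight(2.0, 3.4093587, query_norm, field_norm)

# 3 of the 16 query clauses matched, so coord(3/16) = 0.1875
doc_score = (w_author + w_26 + w_american) * (3 / 16)
```

Running this reproduces the listed values: w_author ≈ 0.05836024 and doc_score ≈ 0.018951688, matching the first result's score.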
    
    Abstract
    This article describes an evaluation of the Kea automatic keyphrase extraction algorithm. Document keyphrases are conventionally used as concise descriptors of document content, and are increasingly used in novel ways, including document clustering, searching and browsing interfaces, and retrieval engines. However, it is costly and time consuming to manually assign keyphrases to documents, motivating the development of tools that automatically perform this function. Previous studies have evaluated Kea's performance by measuring its ability to identify author keywords and keyphrases, but this methodology has a number of well-known limitations. The results presented in this article are based on evaluations by human assessors of the quality and appropriateness of Kea keyphrases. The results indicate that, in general, Kea produces keyphrases that are rated positively by human assessors. However, typical Kea settings can degrade performance, particularly those relating to keyphrase length and domain specificity. We found that for some settings, Kea's performance is better than that of similar systems, and that Kea's ranking of extracted keyphrases is effective. We also determined that author-specified keyphrases appear to exhibit an inherent ranking, and that they are rated highly and therefore suitable for use in training and evaluation of automatic keyphrasing systems.
    Date
    26. 5.2002 15:32:08
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.8, S.653-677
  2. Craven, T.C.: Abstracts produced using computer assistance (2000) 0.01
    0.009280956 = product of:
      0.07424765 = sum of:
        0.049520306 = weight(_text_:author in 4809) [ClassicSimilarity], result of:
          0.049520306 = score(doc=4809,freq=2.0), product of:
            0.15482868 = queryWeight, product of:
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.032090448 = queryNorm
            0.31983936 = fieldWeight in 4809, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.046875 = fieldNorm(doc=4809)
        0.024727343 = weight(_text_:american in 4809) [ClassicSimilarity], result of:
          0.024727343 = score(doc=4809,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.22601068 = fieldWeight in 4809, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.046875 = fieldNorm(doc=4809)
      0.125 = coord(2/16)
    
    Abstract
    Experimental subjects wrote abstracts using a simplified version of the TEXNET abstracting assistance software. In addition to the full text, subjects were presented with either keywords or phrases extracted automatically. The resulting abstracts, and the times taken, were recorded automatically; some additional information was gathered by oral questionnaire. Selected abstracts produced were evaluated on various criteria by independent raters. Results showed considerable variation among subjects, but 37% found the keywords or phrases 'quite' or 'very' useful in writing their abstracts. Statistical analysis failed to support several hypothesized relations: phrases were not viewed as significantly more helpful than keywords; and abstracting experience did not correlate with originality of wording, approximation of the author abstract, or greater conciseness. Requiring further study are some unanticipated strong correlations including the following: Windows experience and writing an abstract like the author's; experience reading abstracts and thinking one had written a good abstract; gender and abstract length; gender and use of words and phrases from the original text. Results have also suggested possible modifications to the TEXNET software
    Source
    Journal of the American Society for Information Science. 51(2000) no.8, S.745-756
  3. Chen, H.-H.; Kuo, J.-J.; Huang, S.-J.; Lin, C.-J.; Wung, H.-C.: A summarization system for Chinese news from multiple sources (2003) 0.01
    0.006407313 = product of:
      0.051258504 = sum of:
        0.026531162 = weight(_text_:26 in 2115) [ClassicSimilarity], result of:
          0.026531162 = score(doc=2115,freq=2.0), product of:
            0.113328174 = queryWeight, product of:
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.032090448 = queryNorm
            0.23410915 = fieldWeight in 2115, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.046875 = fieldNorm(doc=2115)
        0.024727343 = weight(_text_:american in 2115) [ClassicSimilarity], result of:
          0.024727343 = score(doc=2115,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.22601068 = fieldWeight in 2115, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.046875 = fieldNorm(doc=2115)
      0.125 = coord(2/16)
    
    Date
    24. 1.2004 18:26:52
    Source
    Journal of the American Society for Information Science and Technology. 54(2003) no.13, S.1224-1236
  4. Kuhlen, R.: Informationsaufbereitung III : Referieren (Abstracts - Abstracting - Grundlagen) (2004) 0.01
    0.006187304 = product of:
      0.04949843 = sum of:
        0.033013538 = weight(_text_:author in 2917) [ClassicSimilarity], result of:
          0.033013538 = score(doc=2917,freq=2.0), product of:
            0.15482868 = queryWeight, product of:
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.032090448 = queryNorm
            0.21322623 = fieldWeight in 2917, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.824759 = idf(docFreq=964, maxDocs=44218)
              0.03125 = fieldNorm(doc=2917)
        0.016484896 = weight(_text_:american in 2917) [ClassicSimilarity], result of:
          0.016484896 = score(doc=2917,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.15067379 = fieldWeight in 2917, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.03125 = fieldNorm(doc=2917)
      0.125 = coord(2/16)
    
    Abstract
    What an abstract is (used in the following synonymously with Referat or Kurzreferat) is defined by the American National Standards Institute in a way that most experts can probably accept: "An abstract is defined as an abbreviated, accurate representation of the contents of a document"; the German standard DIN 1426 says almost the same: "The abstract reproduces the content of the document briefly and clearly." Abstracts are part of everyday scientific life. Nearly all publications, at least in the natural-science, technical, information-related, or medical fields, are preceded by abstracts, "preferably prepared by its author(s) for publication with it". There is probably no scientist who has not at some point written an abstract. Does the production of abstracts then belong to documentary or information-science methodology at all, if anyone can do it? What constitutes the informational added value that expert abstracts provide over abstracts written by laypersons? This is not easy to answer, especially since suitable evaluation procedures for measuring the quality of abstracts comparatively and "objectively" are lacking. Abstracts are produced to a considerable extent by information specialists, often on the assumption that authors themselves are less suited to the task. Let us review what we know about abstracts and abstracting. A particularly successful abstract is sometimes clearer than the source text itself, but may not contain more information than it does: "Good abstracts are highly structured, concise, and coherent, and are the result of a thorough analysis of the content of the abstracted materials. Abstracts may be more readable than the basis documents, but because of size constraints they rarely equal and never surpass the information content of the basic document".
    This is understandable, for an "abstract" is at first nothing other than the result of a process of abstraction. Without losing ourselves too deeply in the philosophical background of abstraction, it consists "in the disregarding of certain ideational or conceptual contents, from which one 'abstracts' in favor of other partial contents. It is always connected with a fixation of (interesting) features through active attention, which from a particular pragmatic point of view are regarded as 'essential' for an imagined object or for an object (or a plurality of objects) falling under a concept". Abstracts reduce not so much conceptual contents as texts with respect to their propositional content. Borko/Bernier have even quantified this; they estimate the reduction factor at 1:10 to 1:12
  5. Wei, F.; Li, W.; Lu, Q.; He, Y.: Applying two-level reinforcement ranking in query-oriented multidocument summarization (2009) 0.01
    0.005339428 = product of:
      0.042715423 = sum of:
        0.022109302 = weight(_text_:26 in 3120) [ClassicSimilarity], result of:
          0.022109302 = score(doc=3120,freq=2.0), product of:
            0.113328174 = queryWeight, product of:
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.032090448 = queryNorm
            0.19509095 = fieldWeight in 3120, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3120)
        0.02060612 = weight(_text_:american in 3120) [ClassicSimilarity], result of:
          0.02060612 = score(doc=3120,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.18834224 = fieldWeight in 3120, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3120)
      0.125 = coord(2/16)
    
    Date
    26. 9.2009 11:16:24
    Source
    Journal of the American Society for Information Science and Technology. 60(2009) no.10, S.2119-2131
  6. Wu, Y.-f.B.; Li, Q.; Bot, R.S.; Chen, X.: Finding nuggets in documents : a machine learning approach (2006) 0.00
    0.0039344565 = product of:
      0.031475652 = sum of:
        0.02060612 = weight(_text_:american in 5290) [ClassicSimilarity], result of:
          0.02060612 = score(doc=5290,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.18834224 = fieldWeight in 5290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5290)
        0.010869532 = product of:
          0.021739064 = sum of:
            0.021739064 = weight(_text_:22 in 5290) [ClassicSimilarity], result of:
              0.021739064 = score(doc=5290,freq=2.0), product of:
                0.11237528 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032090448 = queryNorm
                0.19345059 = fieldWeight in 5290, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5290)
          0.5 = coord(1/2)
      0.125 = coord(2/16)
    
    Date
    22. 7.2006 17:25:48
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.6, S.740-752
  7. Endres-Niggemeyer, B.: SimSum : an empirically founded simulation of summarizing (2000) 0.00
    0.003869128 = product of:
      0.061906047 = sum of:
        0.061906047 = weight(_text_:26 in 3343) [ClassicSimilarity], result of:
          0.061906047 = score(doc=3343,freq=2.0), product of:
            0.113328174 = queryWeight, product of:
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.032090448 = queryNorm
            0.5462547 = fieldWeight in 3343, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.109375 = fieldNorm(doc=3343)
      0.0625 = coord(1/16)
    
    Date
    15. 8.2002 18:26:20
  8. Ercan, G.; Cicekli, I.: Using lexical chains for keyword extraction (2007) 0.00
    0.001934564 = product of:
      0.030953024 = sum of:
        0.030953024 = weight(_text_:26 in 951) [ClassicSimilarity], result of:
          0.030953024 = score(doc=951,freq=2.0), product of:
            0.113328174 = queryWeight, product of:
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.032090448 = queryNorm
            0.27312735 = fieldWeight in 951, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5315237 = idf(docFreq=3516, maxDocs=44218)
              0.0546875 = fieldNorm(doc=951)
      0.0625 = coord(1/16)
    
    Date
    26.12.2007 16:26:11
  9. Lam, W.; Chan, K.; Radev, D.; Saggion, H.; Teufel, S.: Context-based generic cross-lingual retrieval of documents and automated summaries (2005) 0.00
    0.0015454589 = product of:
      0.024727343 = sum of:
        0.024727343 = weight(_text_:american in 1965) [ClassicSimilarity], result of:
          0.024727343 = score(doc=1965,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.22601068 = fieldWeight in 1965, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.046875 = fieldNorm(doc=1965)
      0.0625 = coord(1/16)
    
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.2, S.129-139
  10. Marcu, D.: Automatic abstracting and summarization (2009) 0.00
    0.0013869622 = product of:
      0.022191396 = sum of:
        0.022191396 = product of:
          0.044382792 = sum of:
            0.044382792 = weight(_text_:ed in 3748) [ClassicSimilarity], result of:
              0.044382792 = score(doc=3748,freq=4.0), product of:
                0.11411327 = queryWeight, product of:
                  3.5559888 = idf(docFreq=3431, maxDocs=44218)
                  0.032090448 = queryNorm
                0.38893628 = fieldWeight in 3748, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5559888 = idf(docFreq=3431, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3748)
          0.5 = coord(1/2)
      0.0625 = coord(1/16)
    
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  11. Ou, S.; Khoo, S.G.; Goh, D.H.: Automatic multidocument summarization of research abstracts : design and user evaluation (2007) 0.00
    0.0012878824 = product of:
      0.02060612 = sum of:
        0.02060612 = weight(_text_:american in 522) [ClassicSimilarity], result of:
          0.02060612 = score(doc=522,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.18834224 = fieldWeight in 522, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.0390625 = fieldNorm(doc=522)
      0.0625 = coord(1/16)
    
    Source
    Journal of the American Society for Information Science and Technology. 58(2007) no.10, S.1419-1435
  12. Yang, C.C.; Wang, F.L.: Hierarchical summarization of large documents (2008) 0.00
    0.0012878824 = product of:
      0.02060612 = sum of:
        0.02060612 = weight(_text_:american in 1719) [ClassicSimilarity], result of:
          0.02060612 = score(doc=1719,freq=2.0), product of:
            0.10940785 = queryWeight, product of:
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.032090448 = queryNorm
            0.18834224 = fieldWeight in 1719, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.4093587 = idf(docFreq=3973, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1719)
      0.0625 = coord(1/16)
    
    Source
    Journal of the American Society for Information Science and Technology. 59(2008) no.6, S.887-902
  13. Saggion, H.; Lapalme, G.: Selective analysis for the automatic generation of summaries (2000) 0.00
    9.807304E-4 = product of:
      0.015691686 = sum of:
        0.015691686 = product of:
          0.031383373 = sum of:
            0.031383373 = weight(_text_:ed in 132) [ClassicSimilarity], result of:
              0.031383373 = score(doc=132,freq=2.0), product of:
                0.11411327 = queryWeight, product of:
                  3.5559888 = idf(docFreq=3431, maxDocs=44218)
                  0.032090448 = queryNorm
                0.27501947 = fieldWeight in 132, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5559888 = idf(docFreq=3431, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=132)
          0.5 = coord(1/2)
      0.0625 = coord(1/16)
    
    Source
    Dynamism and stability in knowledge organization: Proceedings of the 6th International ISKO-Conference, 10-13 July 2000, Toronto, Canada. Ed.: C. Beghtol et al
  14. Vanderwende, L.; Suzuki, H.; Brockett, J.M.; Nenkova, A.: Beyond SumBasic : task-focused summarization with sentence simplification and lexical expansion (2007) 0.00
    8.152149E-4 = product of:
      0.013043438 = sum of:
        0.013043438 = product of:
          0.026086876 = sum of:
            0.026086876 = weight(_text_:22 in 948) [ClassicSimilarity], result of:
              0.026086876 = score(doc=948,freq=2.0), product of:
                0.11237528 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032090448 = queryNorm
                0.23214069 = fieldWeight in 948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=948)
          0.5 = coord(1/2)
      0.0625 = coord(1/16)
    
    Abstract
    In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.