Search (101 results, page 2 of 6)

  • × theme_ss:"Automatisches Indexieren"
  • × year_i:[1990 TO 2000}
  1. Dow Jones unveils knowledge indexing system (1997) 0.00
    0.0047777384 = product of:
      0.023888692 = sum of:
        0.023888692 = weight(_text_:of in 751) [ClassicSimilarity], result of:
          0.023888692 = score(doc=751,freq=14.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.36569026 = fieldWeight in 751, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=751)
      0.2 = coord(1/5)
    
    Abstract
    Dow Jones Interactive Publishing has developed a sophisticated automatic knowledge indexing system that will allow searchers of the Dow Jones News / Retrieval service to get highly targeted results from a search in the service's Publications Library. Instead of relying on a thesaurus of company names, the new system uses a combination of that basic algorithm plus unique rules based on the editorial styles of individual publications in the Library. Dow Jones has also announced its acceptance of the definitions of 'selected full text' and 'full text' from Bibliodata's Fulltext Sources Online directory
  2. Bonzi, S.: Representation of concepts in text : a comparison of within-document frequency, anaphora, and synonymy (1991) 0.00
    0.004740265 = product of:
      0.023701325 = sum of:
        0.023701325 = weight(_text_:of in 4933) [ClassicSimilarity], result of:
          0.023701325 = score(doc=4933,freq=18.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.36282203 = fieldWeight in 4933, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4933)
      0.2 = coord(1/5)
    
    Abstract
    Investigates the 3 major ways by which a concept may be represented in text: within-document frequency, anaphoric reference, and synonyms, in order to determine which provides the optimal means of representation. Analyses a sample of 60 abstracts, drawn at random from the abstracting journals of 4 disciplines. Results show that in general, initial within-document frequency is higher for keyword terms. Additionally, frequency of keyword terms referenced anaphorically or with intellectually related terms is higher than that of other keyword terms. It appears that initial document length influences both the number and impact of both anaphoric resolutions and intellectually related terms
    Source
    Canadian journal of information science. 16(1991) no.3, S.21-31
  3. Koch, T.: Experiments with automatic classification of WAIS databases and indexing of WWW : some results from the Nordic WAIS/WWW project (1994) 0.00
    0.004740265 = product of:
      0.023701325 = sum of:
        0.023701325 = weight(_text_:of in 7209) [ClassicSimilarity], result of:
          0.023701325 = score(doc=7209,freq=18.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.36282203 = fieldWeight in 7209, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7209)
      0.2 = coord(1/5)
    
    Abstract
    The Nordic WAIS/WWW project sponsored by NORDINFO is a joint project between Lund University Library and the National Technological Library of Denmark. It aims to improve the existing networked information discovery and retrieval tools Wide Area Information Servers (WAIS) and World Wide Web (WWW), and to move towards unifying WWW and WAIS. Details current results focusing on the WAIS side of the project. Describes research into automatic indexing and classification of WAIS sources, development of an orientation tool for WAIS, and development of a WAIS index of WWW resources
    Source
    Internet world and document delivery world international 94: Proceedings of the 2nd Annual Conference, London, May 1994
  4. Bookstein, A.; Klein, S.T.; Raita, T.: Clumping properties of content-bearing words (1998) 0.00
    0.004740265 = product of:
      0.023701325 = sum of:
        0.023701325 = weight(_text_:of in 442) [ClassicSimilarity], result of:
          0.023701325 = score(doc=442,freq=18.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.36282203 = fieldWeight in 442, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=442)
      0.2 = coord(1/5)
    
    Abstract
    Information Retrieval Systems identify content-bearing words, and possibly also assign weights, as part of the process of formulating requests. For optimal retrieval efficiency, it is desirable that this be done automatically. This article defines the notion of serial clustering of words in text, and explores the value of such clustering as an indicator of a word's bearing content. This approach is flexible in the sense that it is sensitive to context: a term may be assessed as content-bearing within one collection, but not another. Our approach, being numerical, may also be of value in assigning weights to terms in requests. Experimental support is obtained from natural text databases in three different languages
    Source
    Journal of the American Society for Information Science. 49(1998) no.2, S.102-114
  5. Salton, G.; Araya, J.: On the use of clustered file organizations in information search and retrieval (1990) 0.00
    0.004691646 = product of:
      0.02345823 = sum of:
        0.02345823 = weight(_text_:of in 2409) [ClassicSimilarity], result of:
          0.02345823 = score(doc=2409,freq=6.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.3591007 = fieldWeight in 2409, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.09375 = fieldNorm(doc=2409)
      0.2 = coord(1/5)
    
    Imprint
    Edmonton, Alberta : Univ. of Alberta, Faculty of Extension
  6. Humphrey, S.M.: Automatic indexing of documents from journal descriptors : a preliminary investigation (1999) 0.00
    0.004691646 = product of:
      0.02345823 = sum of:
        0.02345823 = weight(_text_:of in 3769) [ClassicSimilarity], result of:
          0.02345823 = score(doc=3769,freq=24.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.3591007 = fieldWeight in 3769, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3769)
      0.2 = coord(1/5)
    
    Abstract
    A new, fully automated approach for indexing documents is presented based on associating textwords in a training set of bibliographic citations with the indexing of journals. This journal-level indexing is in the form of a consistent, timely set of journal descriptors (JDs) indexing the individual journals themselves. This indexing is maintained in journal records in a serials authority database. The advantage of this novel approach is that the training set does not depend on previous manual indexing of thousands of documents (i.e., any such indexing already in the training set is not used), but rather the relatively small intellectual effort of indexing at the journal level, usually a matter of a few thousand unique journals for which retrospective indexing to maintain consistency and currency may be feasible. If successful, JD indexing would provide topical categorization of documents outside the training set, i.e., journal articles, monographs, Web documents, reports from the grey literature, etc., and therefore be applied in searching. Because JDs are quite general, corresponding to subject domains, their most probable use would be for improving or refining search results
    Source
    Journal of the American Society for Information Science. 50(1999) no.8, S.661-674
  7. Yongcheng, W.; Xiaoming, G.; Lixia, W.: Automatic indexing on subject of Chinese text (1998) 0.00
    0.0045145387 = product of:
      0.022572692 = sum of:
        0.022572692 = weight(_text_:of in 3241) [ClassicSimilarity], result of:
          0.022572692 = score(doc=3241,freq=8.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.34554482 = fieldWeight in 3241, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=3241)
      0.2 = coord(1/5)
    
    Abstract
    Outlines the underlying ideas, the basic algorithm and structure of CSAIS 2.1, an automatic indexing system for the subjects of Chinese documents, developed by the authors in 1993
    Source
    Journal of the China Society for Scientific and Technical Information. 17(1998) no.3, S.219-225
  8. Hersh, W.R.; Hickam, D.H.: ¬A comparison of two methods for indexing and retrieval from a full-text medical database (1992) 0.00
    0.004469165 = product of:
      0.022345824 = sum of:
        0.022345824 = weight(_text_:of in 4526) [ClassicSimilarity], result of:
          0.022345824 = score(doc=4526,freq=16.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.34207192 = fieldWeight in 4526, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4526)
      0.2 = coord(1/5)
    
    Abstract
    Reports results of a study of 2 information retrieval systems on a 2,000-document full text medical database. The first system, SAPHIRE, features concept based automatic indexing and statistical retrieval techniques, while the second system, SWORD, features traditional word based Boolean techniques. 16 medical students at Oregon Health Sciences Univ. each performed 10 searches and their results, recorded in terms of recall and precision, showed nearly equal performance for both systems. SAPHIRE was also compared with a version of SWORD modified to use automatic indexing and ranked retrieval. Using batch input of queries, the latter method performed slightly better
    Source
    Proceedings of the 55th Annual Meeting of the American Society for Information Science, Pittsburgh, 26.-29.10.92. Ed.: D. Shaw
  9. Warner, A.J.: ¬A linguistic approach to the automated hierarchical organization of phrases (1990) 0.00
    0.004469165 = product of:
      0.022345824 = sum of:
        0.022345824 = weight(_text_:of in 4902) [ClassicSimilarity], result of:
          0.022345824 = score(doc=4902,freq=16.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.34207192 = fieldWeight in 4902, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4902)
      0.2 = coord(1/5)
    
    Abstract
    A linguistic analysis was carried out on 8 sets of phrases automatically selected from document surrogates in mathematics. The purpose of this analysis was to derive an algorithm which would automatically generate a hierarchically organised arrangement of phrases for online display to the user. This would replace an alphabetical display and would be particularly useful in online browsing of large numbers of items. It is also the first step toward an automatic thesaurus generator
    Source
    ASIS'90: Information in the year 2000, from research to applications. Proc. of the 53rd Annual Meeting of the American Society for Information Science, Toronto, Canada, 4.-8.11.1990. Ed. by Diana Henderson
  10. Thiel, T.J.: Automated indexing of information stored on optical disk electronic document image management systems (1994) 0.00
    0.004469165 = product of:
      0.022345824 = sum of:
        0.022345824 = weight(_text_:of in 1260) [ClassicSimilarity], result of:
          0.022345824 = score(doc=1260,freq=4.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.34207192 = fieldWeight in 1260, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.109375 = fieldNorm(doc=1260)
      0.2 = coord(1/5)
    
    Source
    Encyclopedia of library and information science. Vol.54, [=Suppl.17]
  11. Silvester, J.P.: Computer supported indexing : a history and evaluation of NASA's MAI system (1998) 0.00
    0.004469165 = product of:
      0.022345824 = sum of:
        0.022345824 = weight(_text_:of in 1302) [ClassicSimilarity], result of:
          0.022345824 = score(doc=1302,freq=4.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.34207192 = fieldWeight in 1302, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.109375 = fieldNorm(doc=1302)
      0.2 = coord(1/5)
    
    Source
    Encyclopedia of library and information science. Vol.61, [=Suppl.24]
  12. Cheng, K.-H.: Automatic identification for topics of electronic documents (1997) 0.00
    0.004469165 = product of:
      0.022345824 = sum of:
        0.022345824 = weight(_text_:of in 1811) [ClassicSimilarity], result of:
          0.022345824 = score(doc=1811,freq=16.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.34207192 = fieldWeight in 1811, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1811)
      0.2 = coord(1/5)
    
    Abstract
    With the rapid rise in numbers of electronic documents on the Internet, how to effectively assign topics to documents becomes an important issue. Current research in this area focuses on the behaviour of nouns in documents. Proposes, however, that nouns and verbs together contribute to the process of topic identification. Constructs a mathematical model taking into account the following factors: word importance, word frequency, word co-occurrence, and word distance. Preliminary experiments show that the performance of the proposed model is equivalent to that of a human being
    Source
    Bulletin of the Library Association of China. 1997, no.59, Dec., S.43-58
  13. Garfield, E.: ¬The relationship between mechanical indexing, structural linguistics and information retrieval (1992) 0.00
    0.004423326 = product of:
      0.02211663 = sum of:
        0.02211663 = weight(_text_:of in 3632) [ClassicSimilarity], result of:
          0.02211663 = score(doc=3632,freq=12.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.33856338 = fieldWeight in 3632, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3632)
      0.2 = coord(1/5)
    
    Abstract
    It is possible to locate over 60% of indexing terms used in the Current List of Medical Literature by analysing the titles of the articles. Citation indexes contain 'noise' and lack many pertinent citations. Mechanical indexing or analysis of text must begin with some linguistic technique. Discusses Harris' methods of structural linguistics, discourse analysis and transformational analysis. Provides 3 examples with references, abstracts and index entries
    Source
    Journal of information science. 18(1992) no.5, S.343-354
  14. Wellisch, H.H.: ¬The art of indexing and some fallacies of its automation (1992) 0.00
    0.004423326 = product of:
      0.02211663 = sum of:
        0.02211663 = weight(_text_:of in 3958) [ClassicSimilarity], result of:
          0.02211663 = score(doc=3958,freq=12.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.33856338 = fieldWeight in 3958, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3958)
      0.2 = coord(1/5)
    
    Abstract
    Reviews the history of indexing, which began with the rise of the universities in the 13th century, before the invention of printing. Describes the different skills needed for indexing books, periodicals and databases. States the belief that the quest for fully automatic indexing is a futile endeavour; machine-generated indexes need the services of human post-editors if they are to be useful and acceptable
  15. Kim, P.K.: ¬An automatic indexing of compound words based on mutual information for Korean text retrieval (1995) 0.00
    0.004423326 = product of:
      0.02211663 = sum of:
        0.02211663 = weight(_text_:of in 620) [ClassicSimilarity], result of:
          0.02211663 = score(doc=620,freq=12.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.33856338 = fieldWeight in 620, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=620)
      0.2 = coord(1/5)
    
    Abstract
    Presents an automatic indexing technique for compound words suitable for an agglutinative language, specifically Korean. Discusses some construction conditions for compound words and the rules for decomposing compound words to enhance the exhaustivity of indexing, demonstrating that this system, mutual information, enhances both the exhaustivity of indexing and the specificity of terms. Suggests that the construction conditions and rules for decomposition presented may be used in multilingual information retrieval systems to translate the indexing terms of the specific language into those of the language required
  16. Hirawa, M.: Role of keywords in the network searching era (1998) 0.00
    0.004423326 = product of:
      0.02211663 = sum of:
        0.02211663 = weight(_text_:of in 3446) [ClassicSimilarity], result of:
          0.02211663 = score(doc=3446,freq=12.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.33856338 = fieldWeight in 3446, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3446)
      0.2 = coord(1/5)
    
    Abstract
    A survey of Japanese OPACs available on the Internet was conducted relating to use of keywords for subject access. The findings suggest that present OPACs are not capable of storing subject-oriented information. Currently available keyword access derives from a merely title-based retrieval system. Contents data should be added to bibliographic records as an efficient way of providing subject access, and costings for this process should be estimated. Word standardisation issues must also be addressed
    Source
    Igaku Toshokan (Journal of the Japan Medical Library Association). 45(1998) no.2, S.222-227
  17. Wacholder, N.; Byrd, R.J.: Retrieving information from full text using linguistic knowledge (1994) 0.00
    0.004282867 = product of:
      0.021414334 = sum of:
        0.021414334 = weight(_text_:of in 8524) [ClassicSimilarity], result of:
          0.021414334 = score(doc=8524,freq=20.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.32781258 = fieldWeight in 8524, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=8524)
      0.2 = coord(1/5)
    
    Abstract
    Examines how techniques in the field of natural language processing can be applied to the analysis of text in information retrieval. State of the art text searching programs cannot distinguish, for example, between occurrences of the sickness AIDS and aids as a tool, or between library school and school, nor equate such terms as online and on-line, which are variants of the same form. To make these distinctions, systems must incorporate knowledge about the meaning of words in context. Research in natural language processing has concentrated on the automatic 'understanding' of language; how to analyze the grammatical structure and meaning of text. Although many aspects of this research remain experimental, describes how these techniques can be used to recognize spelling variants, names, acronyms, and abbreviations
    Source
    Proceedings of the 15th National Online Meeting 1994, New York, 10-12 May 1994. Ed. by M.E. Williams
  18. Gil-Leiva, I.; Munoz, J.V.R.: Analisis de los descriptores de diferentes areas del conocimiento indizades en bases de datos del CSIC : Aplicacion a la indizacion automatica (1997) 0.00
    0.004282867 = product of:
      0.021414334 = sum of:
        0.021414334 = weight(_text_:of in 2637) [ClassicSimilarity], result of:
          0.021414334 = score(doc=2637,freq=20.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.32781258 = fieldWeight in 2637, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2637)
      0.2 = coord(1/5)
    
    Abstract
    Studies the value of scientific articles' titles and abstracts as sources of terms for document indexing in relation to 6 areas of knowledge: library and information science, medicine, chemistry, biology, psychology and physics, indexed in the databases ISOC, IME and ICYT of the CSIC. Also examines the syntagmatic structures of the indexing terms found in the field 'descriptors', as well as the relationship between length of document and number of descriptors. Concludes that if the abstracts are not well made and the titles are not precise, they are not definitive sources for the extraction of concepts; the most common syntactic structure is the noun phrase, followed by noun+adjective and noun+noun; and no significant relationship was found between length of document and number of descriptors assigned to it
  19. Buckley, C.; Allan, J.; Salton, G.: Automatic routing and retrieval using Smart : TREC-2 (1995) 0.00
    0.004282867 = product of:
      0.021414334 = sum of:
        0.021414334 = weight(_text_:of in 5699) [ClassicSimilarity], result of:
          0.021414334 = score(doc=5699,freq=20.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.32781258 = fieldWeight in 5699, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=5699)
      0.2 = coord(1/5)
    
    Abstract
    The Smart information retrieval project emphasizes completely automatic approaches to the understanding and retrieval of large quantities of text. The work in the TREC-2 environment continues, performing both routing and ad hoc experiments. The ad hoc work extends investigations into combining global similarities, giving an overall indication of how a document matches a query, with local similarities identifying a smaller part of the document that matches the query. The performance of ad hoc runs is good, but it is clear that full advantage is not being taken of the available local information. The routing experiments use conventional relevance feedback approaches to routing, but with a much greater degree of query expansion than was previously done. The length of a query vector is increased by a factor of 5 to 10 by adding terms found in previously seen relevant documents. This approach improves effectiveness by 30-40% over the original query
  20. Renouf, A.: Sticking to the text : a corpus linguist's view of language (1993) 0.00
    0.0041805212 = product of:
      0.020902606 = sum of:
        0.020902606 = weight(_text_:of in 2314) [ClassicSimilarity], result of:
          0.020902606 = score(doc=2314,freq=14.0), product of:
            0.06532493 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.04177434 = queryNorm
            0.31997898 = fieldWeight in 2314, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2314)
      0.2 = coord(1/5)
    
    Abstract
    Corpus linguistics is the study of large, computer held bodies of text. Some corpus linguists are concerned with language description for its own sake. On the corpus-linguistic continuum, the study of raw ASCII text is situated at one end, and the study of heavily pre-coded text at the other. Discusses the use of word frequency to identify changes in the lexicon; word repetition and word positioning in automatic abstracting and word clusters in automatic text retrieval. Compares the machine extract with manual abstracts. Abstractors and indexers may find themselves taking the original wording of the text more into account as the focus moves towards the electronic medium and away from the hard copy
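
Note on the score breakdowns: the explain trees above can be reproduced by hand. The sketch below assumes the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function name and argument names are illustrative, not part of any library. The values for queryNorm, fieldNorm and coord are copied from the listing rather than derived.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    """Recompute one single-term explain tree (hypothetical helper, not a Lucene API)."""
    tf = math.sqrt(freq)                             # tf(freq) = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                  # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm             # fieldWeight = tf * idf * fieldNorm
    # Only one of the five query clauses matched, hence coord = 1/5 in the listing.
    return query_weight * field_weight * coord


# Result 1 (doc=751): freq=14, fieldNorm=0.0625
score_751 = classic_similarity_score(14.0, 25162, 44218, 0.04177434, 0.0625, 0.2)
# Result 2 (doc=4933): freq=18, fieldNorm=0.0546875
score_4933 = classic_similarity_score(18.0, 25162, 44218, 0.04177434, 0.0546875, 0.2)

print(f"doc 751:  {score_751:.9f}")
print(f"doc 4933: {score_4933:.9f}")
```

The results should agree with the listed 0.0047777384 and 0.004740265 up to single-precision rounding. The ranking differences on this page come only from tf and fieldNorm, because idf, queryNorm and coord are identical for every hit.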
