Search (132 results, page 1 of 7)

  • language_ss:"e"
  • theme_ss:"Automatisches Indexieren"
  1. Milstead, J.L.: Thesauri in a full-text world (1998) 0.07
    0.07255965 = product of:
      0.10883947 = sum of:
        0.022522911 = weight(_text_:im in 2337) [ClassicSimilarity], result of:
          0.022522911 = score(doc=2337,freq=2.0), product of:
            0.1442303 = queryWeight, product of:
              2.8267863 = idf(docFreq=7115, maxDocs=44218)
              0.051022716 = queryNorm
            0.15615936 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.8267863 = idf(docFreq=7115, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.086316556 = sum of:
          0.025961377 = weight(_text_:online in 2337) [ClassicSimilarity], result of:
            0.025961377 = score(doc=2337,freq=2.0), product of:
              0.1548489 = queryWeight, product of:
                3.0349014 = idf(docFreq=5778, maxDocs=44218)
                0.051022716 = queryNorm
              0.16765618 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.0349014 = idf(docFreq=5778, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
          0.025790809 = weight(_text_:retrieval in 2337) [ClassicSimilarity], result of:
            0.025790809 = score(doc=2337,freq=2.0), product of:
              0.15433937 = queryWeight, product of:
                3.024915 = idf(docFreq=5836, maxDocs=44218)
                0.051022716 = queryNorm
              0.16710453 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.024915 = idf(docFreq=5836, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
          0.03456437 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
            0.03456437 = score(doc=2337,freq=2.0), product of:
              0.17867287 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051022716 = queryNorm
              0.19345059 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
      0.6666667 = coord(2/3)
    
    Date
    22. 9.1997 19:16:05
    Theme
    Verbale Doksprachen im Online-Retrieval
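    The score breakdown shown with each entry is Lucene's ClassicSimilarity (TF-IDF) explanation: for every matching term, queryWeight = idf * queryNorm and fieldWeight = tf(freq) * idf * fieldNorm, the term's contribution is the product of the two, and the entry score is the sum of the matching terms' contributions multiplied by the coordination factor coord(matching clauses / query clauses). A minimal Python sketch (an editorial illustration, not part of the search output) reproduces the figures reported for entry 1 above:

      import math

      def term_score(freq, idf, query_norm, field_norm):
          """One term's contribution under Lucene's ClassicSimilarity (TF-IDF)."""
          query_weight = idf * query_norm                      # queryWeight
          field_weight = math.sqrt(freq) * idf * field_norm    # tf(freq) * idf * fieldNorm
          return query_weight * field_weight

      # Values copied from the explanation of entry 1 (doc 2337), term "im":
      im = term_score(freq=2.0, idf=2.8267863, query_norm=0.051022716, field_norm=0.0390625)
      print(im)                            # ~0.022522911 = weight(_text_:im in 2337)

      # Entry score: sum of the matching term weights, times coord(2/3) because
      # two of the three query clauses matched this document.
      other_clause = 0.086316556           # the online/retrieval/22 sub-sum from the explanation
      print((im + other_clause) * 0.6666667)   # ~0.07255965, the displayed score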
  2. Micco, M.; Popp, R.: Improving library subject access (ILSA) : a theory of clustering based in classification (1994) 0.05
    0.053222746 = product of:
      0.07983412 = sum of:
        0.031532075 = weight(_text_:im in 7715) [ClassicSimilarity], result of:
          0.031532075 = score(doc=7715,freq=2.0), product of:
            0.1442303 = queryWeight, product of:
              2.8267863 = idf(docFreq=7115, maxDocs=44218)
              0.051022716 = queryNorm
            0.2186231 = fieldWeight in 7715, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.8267863 = idf(docFreq=7115, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7715)
        0.04830204 = product of:
          0.07245306 = sum of:
            0.03634593 = weight(_text_:online in 7715) [ClassicSimilarity], result of:
              0.03634593 = score(doc=7715,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.23471867 = fieldWeight in 7715, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7715)
            0.03610713 = weight(_text_:retrieval in 7715) [ClassicSimilarity], result of:
              0.03610713 = score(doc=7715,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.23394634 = fieldWeight in 7715, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7715)
          0.6666667 = coord(2/3)
      0.6666667 = coord(2/3)
    
    Theme
    Klassifikationssysteme im Online-Retrieval
  3. Munkelt, J.: Erstellung einer DNB-Retrieval-Testkollektion (2018) 0.04
    0.044433918 = product of:
      0.066650875 = sum of:
        0.054615162 = weight(_text_:im in 4310) [ClassicSimilarity], result of:
          0.054615162 = score(doc=4310,freq=6.0), product of:
            0.1442303 = queryWeight, product of:
              2.8267863 = idf(docFreq=7115, maxDocs=44218)
              0.051022716 = queryNorm
            0.37866634 = fieldWeight in 4310, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.8267863 = idf(docFreq=7115, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4310)
        0.012035711 = product of:
          0.03610713 = sum of:
            0.03610713 = weight(_text_:retrieval in 4310) [ClassicSimilarity], result of:
              0.03610713 = score(doc=4310,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.23394634 = fieldWeight in 4310, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4310)
          0.33333334 = coord(1/3)
      0.6666667 = coord(2/3)
    
    Abstract
    Since autumn 2017, the subject indexing of certain categories of publications at the Deutsche Nationalbibliothek has been carried out entirely by machine. The quality of this procedure, which can significantly shape the way libraries organize their processes, is a matter of controversy among experts. Their positions are first set out in sufficient detail before the need for a quality assessment of the procedure and its foundations are presented. A central component of any future assessment is a test collection; its construction and documentation are the focus of this thesis. In this context, the history of and the requirements for successful test collections are also discussed. Finally, a retrieval test is carried out which demonstrates that the test collection produced here is fit for use. Its results serve solely to verify functionality; an assessment of the quality of automatic subject indexing, whether in this specific case or in general, is not carried out and is not the aim of this work.
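    For the retrieval test mentioned in the abstract, the basic check is the standard one: compare the automatically assigned subject terms of each test document with the gold-standard terms of the collection and report precision and recall. A purely illustrative sketch (the data and term names below are hypothetical, not taken from the thesis):

      def precision_recall(assigned, relevant):
          """Document-level precision and recall of assigned subject terms."""
          assigned, relevant = set(assigned), set(relevant)
          if not assigned or not relevant:
              return 0.0, 0.0
          hits = len(assigned & relevant)
          return hits / len(assigned), hits / len(relevant)

      # Hypothetical document from a test collection:
      automatic = {"Automatisches Indexieren", "Retrieval", "Bibliothek"}
      gold      = {"Automatisches Indexieren", "Inhaltserschließung"}
      print(precision_recall(automatic, gold))   # (0.333..., 0.5)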
  4. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.04
    0.04291924 = product of:
      0.12875772 = sum of:
        0.12875772 = product of:
          0.19313657 = sum of:
            0.08253059 = weight(_text_:retrieval in 402) [ClassicSimilarity], result of:
              0.08253059 = score(doc=402,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.5347345 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
            0.110605985 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.110605985 = score(doc=402,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  5. Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.03
    0.031572483 = product of:
      0.09471744 = sum of:
        0.09471744 = product of:
          0.14207616 = sum of:
            0.07294742 = weight(_text_:retrieval in 1952) [ClassicSimilarity], result of:
              0.07294742 = score(doc=1952,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.47264296 = fieldWeight in 1952, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1952)
            0.06912874 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
              0.06912874 = score(doc=1952,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.38690117 = fieldWeight in 1952, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1952)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Date
    16. 8.1998 12:51:22
    Footnote
    Reprinted in: Readings in information retrieval. Ed.: K. Sparck Jones and P. Willett. San Francisco: Morgan Kaufmann 1997. S.513-517.
    Source
    Proceedings of the 11th annual conference on research and development in information retrieval. Ed.: Y. Chiaramella
  6. Hodges, P.R.: Keyword in title indexes : effectiveness of retrieval in computer searches (1983) 0.03
    0.026800975 = product of:
      0.080402926 = sum of:
        0.080402926 = product of:
          0.12060438 = sum of:
            0.07221426 = weight(_text_:retrieval in 5001) [ClassicSimilarity], result of:
              0.07221426 = score(doc=5001,freq=8.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.46789268 = fieldWeight in 5001, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5001)
            0.048390117 = weight(_text_:22 in 5001) [ClassicSimilarity], result of:
              0.048390117 = score(doc=5001,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.2708308 = fieldWeight in 5001, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5001)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    A study was done to test the effectiveness of retrieval using title word searching. It was based on actual search profiles used in the Mechanized Information Center at Ohio State University, in order to replicate as closely as possible actual searching conditions. Fewer than 50% of the relevant titles were retrieved by keywords in titles. The low rate of retrieval can be attributed to three sources: the titles themselves, user and information specialist ignorance of the subject vocabulary in use, and general language problems. Across fields it was found that the social sciences had the best retrieval rate, with science having the next best, and arts and humanities the lowest. Ways to enhance and supplement keyword in title searching on the computer and in printed indexes are discussed.
    Date
    14. 3.1996 13:22:21
  7. Advances in intelligent retrieval: Proc. of a conference ... Wadham College, Oxford, 16.-17.4.1985 (1986) 0.03
    0.02511932 = product of:
      0.07535796 = sum of:
        0.07535796 = product of:
          0.11303693 = sum of:
            0.031153653 = weight(_text_:online in 1384) [ClassicSimilarity], result of:
              0.031153653 = score(doc=1384,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.20118743 = fieldWeight in 1384, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1384)
            0.081883274 = weight(_text_:retrieval in 1384) [ClassicSimilarity], result of:
              0.081883274 = score(doc=1384,freq=14.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.5305404 = fieldWeight in 1384, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1384)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Content
    Contains the contributions: ADDIS, T.: Extended relational analysis: a design approach to knowledge-based systems; PARKINSON, D.: Supercomputers and non-numeric processing; McGREGOR, D.R. and J.R. MALONE: An architectural approach to advances in information retrieval; ALLEN, M.J. and O.S. HARRISON: Word processing and information retrieval: some practical problems; MURTAGH, F.: Clustering and nearest neighborhood searching; ENSER, P.G.B.: Experimenting with the automatic classification of books; TESKEY, N. and Z. RAZAK: An analysis of ranking for free text retrieval systems; ZARRI, G.P.: Interactive information retrieval: an artificial intelligence approach to deal with biographical data; HANCOX, P. and F. SMITH: A case system processor for the PRECIS indexing language; ROUAULT, J.: Linguistic methods in information retrieval systems; ARAGON-RAMIREZ, V. and C.D. PAICE: Design of a system for the online elucidation of natural language search statements; BROOKS, H.M., P.J. DANIELS and N.J. BELKIN: Problem descriptions and user models: developing an intelligent interface for document retrieval systems; BLACK, W.J., P. HARGREAVES and P.B. MAYES: HEADS: a cataloguing advisory system; BELL, D.A.: An architecture for integrating data, knowledge, and information bases
  8. Pritchard-Schoch, T.: Natural language comes of age (1993) 0.02
    0.022199143 = product of:
      0.066597424 = sum of:
        0.066597424 = product of:
          0.09989613 = sum of:
            0.0415382 = weight(_text_:online in 2570) [ClassicSimilarity], result of:
              0.0415382 = score(doc=2570,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.2682499 = fieldWeight in 2570, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2570)
            0.058357935 = weight(_text_:retrieval in 2570) [ClassicSimilarity], result of:
              0.058357935 = score(doc=2570,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.37811437 = fieldWeight in 2570, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2570)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Discusses natural languages and the natural language implementations of Westlaw's full-text legal documents, Westlaw Is Natural (WIN). Natural language is not artificial intelligence but a hybrid of linguistics, mathematics and statistics. Provides 3 classes of retrieval models. Explains how Westlaw processes an English query. Assesses WIN. Covers WIN enhancements; the natural language features of Congressional Quarterly's Washington Alert using a document for a query; the Personal Librarian front-end search software and DowQuest from Dow Jones News/Retrieval. Considers whether natural language encourages fuzzy thinking and whether Boolean logic will still be needed
    Source
    Online. 17(1993) no.3, S.33-43
  9. Bordoni, L.; Pazienza, M.T.: Documents automatic indexing in an environmental domain (1997) 0.02
    0.022100737 = product of:
      0.06630221 = sum of:
        0.06630221 = product of:
          0.09945331 = sum of:
            0.05106319 = weight(_text_:retrieval in 530) [ClassicSimilarity], result of:
              0.05106319 = score(doc=530,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.33085006 = fieldWeight in 530, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=530)
            0.048390117 = weight(_text_:22 in 530) [ClassicSimilarity], result of:
              0.048390117 = score(doc=530,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.2708308 = fieldWeight in 530, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=530)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Describes an application of Natural Language Processing (NLP) techniques, in HIRMA (Hypertextual Information Retrieval Managed by ARIOSTO), to the problem of document indexing by referring to a system which incorporates natural language processing techniques to determine the subject of the text of documents and to associate them with relevant semantic indexes. Describes briefly the overall system, details of its implementation on a corpus of scientific abstracts related to environmental topics and experimental evidence of the system's behaviour. Analyzes in detail an experiment designed to evaluate the system's retrieval ability in terms of recall and precision
    Source
    International forum on information and documentation. 22(1997) no.1, S.17-28
  10. Dow Jones unveils knowledge indexing system (1997) 0.02
    0.018400777 = product of:
      0.05520233 = sum of:
        0.05520233 = product of:
          0.082803495 = sum of:
            0.0415382 = weight(_text_:online in 751) [ClassicSimilarity], result of:
              0.0415382 = score(doc=751,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.2682499 = fieldWeight in 751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0625 = fieldNorm(doc=751)
            0.041265294 = weight(_text_:retrieval in 751) [ClassicSimilarity], result of:
              0.041265294 = score(doc=751,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.26736724 = fieldWeight in 751, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=751)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Dow Jones Interactive Publishing has developed a sophisticated automatic knowledge indexing system that will allow searchers of the Dow Jones News / Retrieval service to get highly targeted results from a search in the service's Publications Library. Instead of relying on a thesaurus of company names, the new system uses a combination of that basic algorithm plus unique rules based on the editorial styles of individual publications in the Library. Dow Jones has also announced its acceptance of the definitions of 'selected full text' and 'full text' from Bibliodata's Fulltext Sources Online directory
  11. Wacholder, N.; Byrd, R.J.: Retrieving information from full text using linguistic knowledge (1994) 0.02
    0.016668199 = product of:
      0.050004594 = sum of:
        0.050004594 = product of:
          0.07500689 = sum of:
            0.04405792 = weight(_text_:online in 8524) [ClassicSimilarity], result of:
              0.04405792 = score(doc=8524,freq=4.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.284522 = fieldWeight in 8524, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.046875 = fieldNorm(doc=8524)
            0.03094897 = weight(_text_:retrieval in 8524) [ClassicSimilarity], result of:
              0.03094897 = score(doc=8524,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.20052543 = fieldWeight in 8524, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=8524)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Examines how techniques in the field of natural language processing can be applied to the analysis of text in information retrieval. State of the art text searching programs cannot distinguish, for example, between occurrences of AIDS (the sickness) and aids (as a tool), or between library school and school, nor can they equate terms such as online and on-line, which are variants of the same form. To make these distinctions, systems must incorporate knowledge about the meaning of words in context. Research in natural language processing has concentrated on the automatic 'understanding' of language; how to analyze the grammatical structure and meaning of text. Although many aspects of this research remain experimental, describes how these techniques can be used to recognize spelling variants, names, acronyms, and abbreviations
    Source
    Proceedings of the 15th National Online Meeting 1994, New York, 10-12 May 1994. Ed. by M.E. Williams
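    The term-conflation problem described in the abstract above (equating online with on-line while keeping AIDS distinct from aids) can be illustrated with a small token-normalization step; the rule and the exception list below are editorial stand-ins, not those of the paper:

      # Illustrative normalization: conflate hyphen/case variants but keep
      # listed case-sensitive acronyms distinct (hypothetical exception list).
      ACRONYMS = {"AIDS", "PRECIS"}

      def normalize(token):
          if token in ACRONYMS:
              return token                       # preserve known acronyms as-is
          return token.replace("-", "").lower()

      print(normalize("on-line") == normalize("online"))   # True  - variants conflated
      print(normalize("AIDS") == normalize("aids"))        # False - acronym kept distinct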
  12. MacDougall, S.: Rethinking indexing : the impact of the Internet (1996) 0.02
    0.016649358 = product of:
      0.049948074 = sum of:
        0.049948074 = product of:
          0.07492211 = sum of:
            0.031153653 = weight(_text_:online in 704) [ClassicSimilarity], result of:
              0.031153653 = score(doc=704,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.20118743 = fieldWeight in 704, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.046875 = fieldNorm(doc=704)
            0.043768454 = weight(_text_:retrieval in 704) [ClassicSimilarity], result of:
              0.043768454 = score(doc=704,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.2835858 = fieldWeight in 704, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=704)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Considers the challenge to professional indexers posed by the Internet. Indexing and searching on the Internet appear to have taken a retrograde step, as well-developed and efficient information retrieval techniques have been replaced by cruder techniques involving automatic keyword indexing and frequency ranking, leading to large retrieval sets and low precision. This is made worse by the apparent acceptance of this poor performance by Internet users and the feeling, on the part of indexers, that they are being bypassed by the producers of these hyperlinked menus and search engines. Key issues are: how far 'human' indexing will still be required in the Internet environment; how indexing techniques will have to change to stay relevant; and the future role of indexers. The challenge facing indexers is to adapt their skills to suit the online environment and to convince publishers of the need for efficient indexes on the Internet
  13. Rasmussen, E.M.: Indexing and retrieval for the Web (2002) 0.01
    0.014652936 = product of:
      0.043958806 = sum of:
        0.043958806 = product of:
          0.065938205 = sum of:
            0.018172964 = weight(_text_:online in 4285) [ClassicSimilarity], result of:
              0.018172964 = score(doc=4285,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.11735933 = fieldWeight in 4285, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4285)
            0.047765244 = weight(_text_:retrieval in 4285) [ClassicSimilarity], result of:
              0.047765244 = score(doc=4285,freq=14.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.30948192 = fieldWeight in 4285, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4285)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    The introduction and growth of the World Wide Web (WWW, or Web) have resulted in a profound change in the way individuals and organizations access information. In terms of volume, nature, and accessibility, the characteristics of electronic information are significantly different from those of even five or six years ago. Control of, and access to, this flood of information rely heavily on automated techniques for indexing and retrieval. According to Gudivada, Raghavan, Grosky, and Kasanagottu (1997, p. 58), "The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential." Almost 93 percent of those surveyed consider the Web an "indispensable" Internet technology, second only to e-mail (Graphic, Visualization & Usability Center, 1998). Although there are other ways of locating information on the Web (browsing or following directory structures), 85 percent of users identify Web pages by means of a search engine (Graphic, Visualization & Usability Center, 1998). A more recent study conducted by the Stanford Institute for the Quantitative Study of Society confirms the finding that searching for information is second only to e-mail as an Internet activity (Nie & Ebring, 2000, online). In fact, Nie and Ebring conclude, "... the Internet today is a giant public library with a decidedly commercial tilt. The most widespread use of the Internet today is as an information search utility for products, travel, hobbies, and general information. Virtually all users interviewed responded that they engaged in one or more of these information gathering activities."
    Techniques for automated indexing and information retrieval (IR) have been developed, tested, and refined over the past 40 years, and are well documented (see, for example, Agosti & Smeaton, 1996; Baeza-Yates & Ribeiro-Neto, 1999a; Frakes & Baeza-Yates, 1992; Korfhage, 1997; Salton, 1989; Witten, Moffat, & Bell, 1999). With the introduction of the Web, and the capability to index and retrieve via search engines, these techniques have been extended to a new environment. They have been adopted, altered, and in some cases extended to include new methods. "In short, search engines are indispensable for searching the Web, they employ a variety of relatively advanced IR techniques, and there are some peculiar aspects of search engines that make searching the Web different than more conventional information retrieval" (Gordon & Pathak, 1999, p. 145). The environment for information retrieval on the World Wide Web differs from that of "conventional" information retrieval in a number of fundamental ways. The collection is very large and changes continuously, with pages being added, deleted, and altered. Wide variability between the size, structure, focus, quality, and usefulness of documents makes Web documents much more heterogeneous than a typical electronic document collection. The wide variety of document types includes images, video, audio, and scripts, as well as many different document languages. Duplication of documents and sites is common. Documents are interconnected through networks of hyperlinks. Because of the size and dynamic nature of the Web, preprocessing all documents requires considerable resources and is often not feasible, certainly not on the frequent basis required to ensure currency. Query length is usually much shorter than in other environments-only a few words-and user behavior differs from that in other environments. These differences make the Web a novel environment for information retrieval (Baeza-Yates & Ribeiro-Neto, 1999b; Bharat & Henzinger, 1998; Huang, 2000).
  14. Plaunt, C.; Norgard, B.A.: An association-based method for automatic indexing with a controlled vocabulary (1998) 0.01
    0.013412262 = product of:
      0.040236786 = sum of:
        0.040236786 = product of:
          0.06035518 = sum of:
            0.025790809 = weight(_text_:retrieval in 1794) [ClassicSimilarity], result of:
              0.025790809 = score(doc=1794,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.16710453 = fieldWeight in 1794, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1794)
            0.03456437 = weight(_text_:22 in 1794) [ClassicSimilarity], result of:
              0.03456437 = score(doc=1794,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.19345059 = fieldWeight in 1794, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1794)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    In this article, we describe and test a two-stage algorithm based on a lexical collocation technique which maps from the lexical clues contained in a document representation into a controlled vocabulary list of subject headings. Using a collection of 4,626 INSPEC documents, we create a 'dictionary' of associations between the lexical items contained in the titles, authors, and abstracts, and controlled vocabulary subject headings assigned to those records by human indexers using a likelihood ratio statistic as the measure of association. In the deployment stage, we use the dictionary to predict which of the controlled vocabulary subject headings best describe new documents when they are presented to the system. Our evaluation of this algorithm, in which we compare the automatically assigned subject headings to the subject headings assigned to the test documents by human catalogers, shows that we can obtain results comparable to, and consistent with, human cataloging. In effect we have cast this as a classic partial match information retrieval problem. We consider the problem to be one of 'retrieving' (or assigning) the most probably 'relevant' (or correct) controlled vocabulary subject headings to a document based on the clues contained in that document
    Date
    11. 9.2000 19:53:22
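    The two-stage procedure sketched in the abstract above (learn term-to-heading associations with a likelihood ratio statistic, then rank candidate headings for a new document) can be outlined roughly as follows. The Dunning-style G² form of the likelihood ratio and the cut-off value are common choices assumed here for illustration, not details taken from the article:

      import math
      from collections import Counter
      from itertools import product

      def llr(k11, k12, k21, k22):
          """Log-likelihood ratio (G^2) for a 2x2 term/heading contingency table."""
          def h(*ks):                              # unnormalised entropy of the counts
              total = sum(ks)
              return sum(k * math.log(k / total) for k in ks if k > 0)
          return 2 * (h(k11, k12, k21, k22)
                      - h(k11 + k12, k21 + k22)    # row sums
                      - h(k11 + k21, k12 + k22))   # column sums

      def build_dictionary(docs, min_llr=10.83):   # 10.83 ~ chi^2(1 df) at p = 0.001
          """docs: list of (terms, headings) pairs. Returns {(term, heading): association}."""
          n = len(docs)
          term_df, head_df, pair_df = Counter(), Counter(), Counter()
          for terms, headings in docs:
              terms, headings = set(terms), set(headings)
              term_df.update(terms)
              head_df.update(headings)
              pair_df.update(product(terms, headings))
          assoc = {}
          for (t, h), k11 in pair_df.items():
              k12 = term_df[t] - k11               # term without heading
              k21 = head_df[h] - k11               # heading without term
              k22 = n - k11 - k12 - k21            # neither
              score = llr(k11, k12, k21, k22)
              if score >= min_llr:
                  assoc[(t, h)] = score
          return assoc

      def rank_headings(terms, assoc, top_k=5):
          """Score candidate headings for a new document by summing its terms' associations."""
          scores = Counter()
          for (t, h), s in assoc.items():
              if t in terms:
                  scores[h] += s
          return scores.most_common(top_k)

    The threshold plays the role of a significance cut-off: only term-heading pairs whose co-occurrence is unlikely to be accidental enter the dictionary.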
  15. Jardine, N.; Rijsbergen, C.J. van: The use of hierarchic clustering in information retrieval (1971) 0.01
    0.012968431 = product of:
      0.038905293 = sum of:
        0.038905293 = product of:
          0.11671587 = sum of:
            0.11671587 = weight(_text_:retrieval in 5170) [ClassicSimilarity], result of:
              0.11671587 = score(doc=5170,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.75622874 = fieldWeight in 5170, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.125 = fieldNorm(doc=5170)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Information storage and retrieval. 7(1971), S.217-240
  16. Sparck Jones, K.; Jackson, D.M.: The use of automatically obtained keyword classification for information retrieval (1970) 0.01
    0.012968431 = product of:
      0.038905293 = sum of:
        0.038905293 = product of:
          0.11671587 = sum of:
            0.11671587 = weight(_text_:retrieval in 5177) [ClassicSimilarity], result of:
              0.11671587 = score(doc=5177,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.75622874 = fieldWeight in 5177, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.125 = fieldNorm(doc=5177)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Information storage and retrieval. 5(1970), S.175-201
  17. Kantor, P.B.; Voorhees, E.: Information retrieval with scanned texts (2000) 0.01
    0.012968431 = product of:
      0.038905293 = sum of:
        0.038905293 = product of:
          0.11671587 = sum of:
            0.11671587 = weight(_text_:retrieval in 3901) [ClassicSimilarity], result of:
              0.11671587 = score(doc=3901,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.75622874 = fieldWeight in 3901, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.125 = fieldNorm(doc=3901)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Information retrieval. 2(2000), S.165-176
  18. Salton, G.: Another look at automatic text-retrieval systems (1986) 0.01
    0.0128155565 = product of:
      0.03844667 = sum of:
        0.03844667 = product of:
          0.11534 = sum of:
            0.11534 = weight(_text_:retrieval in 1356) [ClassicSimilarity], result of:
              0.11534 = score(doc=1356,freq=10.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.74731416 = fieldWeight in 1356, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1356)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Footnote
    Refers to: Blair, D.C.: An evaluation of retrieval effectiveness for a full-text document-retrieval system. Comm. ACM 28(1985) S.280-299. - See also: Blair, D.C.: Full text retrieval ... Int. Class. 13(1986) S.18-23; Blair, D.C., M.E. Maron: Full-text information retrieval ... Inf. Proc. Man. 26(1990) S.437-447.
  19. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.01
    0.012671877 = product of:
      0.03801563 = sum of:
        0.03801563 = product of:
          0.057023443 = sum of:
            0.029371947 = weight(_text_:online in 5499) [ClassicSimilarity], result of:
              0.029371947 = score(doc=5499,freq=4.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.18968134 = fieldWeight in 5499, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
            0.027651496 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
              0.027651496 = score(doc=5499,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.15476047 = fieldWeight in 5499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose: Modern mathematicians and scientists of math-related disciplines often use Document Preparation Systems (DPS) to write and Computer Algebra Systems (CAS) to calculate mathematical expressions. Usually, they translate the expressions manually between DPS and CAS. This process is time-consuming and error-prone. The purpose of this paper is to automate this translation. This paper uses Maple and Mathematica as the CAS, and LaTeX as the DPS. Design/methodology/approach: Bruce Miller at the National Institute of Standards and Technology (NIST) developed a collection of special LaTeX macros that create links from mathematical symbols to their definitions in the NIST Digital Library of Mathematical Functions (DLMF). The authors are using these macros to perform rule-based translations between the formulae in the DLMF and CAS. Moreover, the authors develop software to ease the creation of new rules and to discover inconsistencies. Findings: The authors created 396 mappings and translated 58.8 percent of DLMF formulae (2,405 expressions) successfully between Maple and DLMF. For a significant percentage, the special function definitions in Maple and the DLMF were different. An atomic symbol in one system maps to a composite expression in the other system. The translator was also successfully used for automatic verification of mathematical online compendia and CAS. The evaluation techniques discovered two errors in the DLMF and one defect in Maple. Originality/value: This paper introduces the first translation tool for special functions between LaTeX and CAS. The approach improves error-prone manual translations and can be used to verify mathematical online compendia and CAS.
    Date
    20. 1.2015 18:30:22
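    The rule-based translation described in the abstract above (semantic DLMF LaTeX macros rewritten to CAS calls) amounts to a pattern-rewriting table. The two rules below are editorial stand-ins that show the shape of such a rule, not entries from the authors' 396 mappings:

      import re

      # Hypothetical DLMF-macro -> Maple rewrite rules (illustrative only).
      RULES = [
          (re.compile(r"\\EulerGamma@\{(?P<z>[^{}]*)\}"), r"GAMMA(\g<z>)"),
          (re.compile(r"\\BesselJ\{(?P<nu>[^{}]*)\}@\{(?P<z>[^{}]*)\}"), r"BesselJ(\g<nu>, \g<z>)"),
      ]

      def latex_to_maple(expr):
          """Apply the rewrite rules left to right; unmatched input passes through unchanged."""
          for pattern, replacement in RULES:
              expr = pattern.sub(replacement, expr)
          return expr

      print(latex_to_maple(r"\EulerGamma@{z+1}"))   # GAMMA(z+1)
      print(latex_to_maple(r"\BesselJ{\nu}@{x}"))   # BesselJ(\nu, x)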
  20. Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints 0.01
    0.011500487 = product of:
      0.03450146 = sum of:
        0.03450146 = product of:
          0.051752187 = sum of:
            0.025961377 = weight(_text_:online in 4309) [ClassicSimilarity], result of:
              0.025961377 = score(doc=4309,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.16765618 = fieldWeight in 4309, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4309)
            0.025790809 = weight(_text_:retrieval in 4309) [ClassicSimilarity], result of:
              0.025790809 = score(doc=4309,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.16710453 = fieldWeight in 4309, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4309)
          0.6666667 = coord(2/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Semantic annotations have to satisfy quality constraints to be useful for digital libraries, which is particularly challenging on large and diverse datasets. Confidence scores of multi-label classification methods typically refer only to the relevance of particular subjects, disregarding indicators of insufficient content representation at the document-level. Therefore, we propose a novel approach that detects documents rather than concepts where quality criteria are met. Our approach uses a deep, multi-layered regression architecture, which comprises a variety of content-based indicators. We evaluated multiple configurations using text collections from law and economics, where the available content is restricted to very short texts. Notably, we demonstrate that the proposed quality estimation technique can determine subsets of the previously unseen data where considerable gains in document-level recall can be achieved, while upholding precision at the same time. Hence, the approach effectively performs a filtering that ensures high data quality standards in operative information retrieval systems.
    Content
    This is an authors' manuscript version of a paper accepted for the proceedings of TPDL-2018, Porto, Portugal, Sept 10-13. The final authenticated publication is available online at https://doi.org/ (DOI to be added as soon as available).
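    The filtering idea in the abstract above (accept automatic annotations only for documents whose estimated quality clears a threshold chosen so that a precision constraint still holds) can be sketched as follows; the per-document confidence scores are assumed to come from some quality-regression model, which is not reproduced here:

      def pick_threshold(scores, correct, min_precision=0.9):
          """Lowest confidence threshold whose accepted subset still meets the
          precision constraint on held-out data (i.e. maximal coverage/recall)."""
          ranked = sorted(zip(scores, correct), reverse=True)   # most confident first
          threshold, accepted, hits = None, 0, 0
          for score, ok in ranked:
              accepted += 1
              hits += ok
              if hits / accepted >= min_precision:
                  threshold = score    # accepting everything down to here stays precise enough
          return threshold

      # Hypothetical held-out documents: (estimated quality, indexing was correct?)
      scores  = [0.95, 0.90, 0.80, 0.60, 0.40]
      correct = [1,    1,    1,    0,    1]
      print(pick_threshold(scores, correct))   # 0.8 -> accept documents scoring >= 0.8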

Types

  • a 122
  • el 10
  • m 4
  • s 4