Search (36 results, page 2 of 2)

Biebricher, P.; Fuhr, N.; Niewelt, B.: ¬Der AIR-Retrievaltest (1986) 0.01

0.0068394816 = product of:
  0.04103689 = sum of:
    0.04103689 = weight(_text_:und in 4040) [ClassicSimilarity], result of:
      0.04103689 = score(doc=4040,freq=6.0), product of:
        0.09675359 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.043654136 = queryNorm
        0.42413816 = fieldWeight in 4040, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.078125 = fieldNorm(doc=4040)
  0.16666667 = coord(1/6)

Abstract: Der Beitrag enthält eine Darstellung zur Durchführung und zu den Ergebnissen des Retrievaltests zum AIR/PHYS-Projekt. Er zählt mit seinen 309 Fragen und 15.000 Dokumenten zu den größten Retrievaltests, die bisher zur Evaluierung automatisierter Indexierungs- oder Retrievalverfahren vorgenommen wurden.
Source: Automatische Indexierung zwischen Forschung und Anwendung, Hrsg.: G. Lustig

Lück, W.: Erfahrungen mit dem automatischen Indexierungssystem AIR/PHYS (1988) 0.01

0.006701296 = product of:
  0.040207777 = sum of:
    0.040207777 = weight(_text_:und in 526) [ClassicSimilarity], result of:
      0.040207777 = score(doc=526,freq=4.0), product of:
        0.09675359 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.043654136 = queryNorm
        0.41556883 = fieldWeight in 526, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.09375 = fieldNorm(doc=526)
  0.16666667 = coord(1/6)

Source: Von der Information zum Wissen - vom Wissen zur Information: traditionelle und moderne Informationssysteme für Wissenschaft und Praxis, Deutscher Dokumentartag 1987, Bad Dürkheim, vom 23.-25.9.1987. Hrsg.: H. Strohl-Goebel

Schneider, C.; Womser-Hacker, C.: Inhaltserschließungssysteme für Patenttexte : Test und Systemvergleich im Projekt PADOK (1986) 0.01

0.006701296 = product of:
  0.040207777 = sum of:
    0.040207777 = weight(_text_:und in 2648) [ClassicSimilarity], result of:
      0.040207777 = score(doc=2648,freq=4.0), product of:
        0.09675359 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.043654136 = queryNorm
        0.41556883 = fieldWeight in 2648, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.09375 = fieldNorm(doc=2648)
  0.16666667 = coord(1/6)

Source: Deutscher Dokumentartag 1986, Freiburg, 8.-10.10.1986: Bedarfsorientierte Fachinformation: Methoden und Techniken am Arbeitsplatz. Bearb.: H. Strohl-Goebel

Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.01

0.0059145326 = product of:
  0.035487194 = sum of:
    0.035487194 = product of:
      0.07097439 = sum of:
        0.07097439 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
          0.07097439 = score(doc=58,freq=2.0), product of:
            0.15286934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.043654136 = queryNorm
            0.46428138 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
      0.5 = coord(1/2)
  0.16666667 = coord(1/6)

Date: 14. 6.2015 22:12:44

Biebricher, P.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: ¬Das automatische Indexierungssystem AIR/PHYS (1988) 0.01

0.005584413 = product of:
  0.03350648 = sum of:
    0.03350648 = weight(_text_:und in 528) [ClassicSimilarity], result of:
      0.03350648 = score(doc=528,freq=4.0), product of:
        0.09675359 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.043654136 = queryNorm
        0.34630734 = fieldWeight in 528, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.078125 = fieldNorm(doc=528)
  0.16666667 = coord(1/6)

Source: Von der Information zum Wissen - vom Wissen zur Information: traditionelle und moderne Informationssysteme für Wissenschaft und Praxis, Deutscher Dokumentartag 1987, Bad Dürkheim, vom 23.-25.9.1987. Hrsg.: H. Strohl-Goebel

Lustig, G.: Weiterentwicklung der automatischen Indexierung im Projekt AIR (1984) 0.01

0.005528287 = product of:
  0.03316972 = sum of:
    0.03316972 = weight(_text_:und in 458) [ClassicSimilarity], result of:
      0.03316972 = score(doc=458,freq=2.0), product of:
        0.09675359 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.043654136 = queryNorm
        0.34282678 = fieldWeight in 458, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.109375 = fieldNorm(doc=458)
  0.16666667 = coord(1/6)

Source: Deutscher Dokumentartag 1983, Göttingen, 3.-7.10.1983: Fachinformation und Bildschirmtext. Bearb.: H. Strohl-Goebel

Oliver, C.T.: One-eyed king: automated indexing (1989) 0.00

0.0026606917 = product of:
  0.01596415 = sum of:
    0.01596415 = weight(_text_:in in 2316) [ClassicSimilarity], result of:
      0.01596415 = score(doc=2316,freq=10.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.26884392 = fieldWeight in 2316, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=2316)
  0.16666667 = coord(1/6)

Abstract: In a work entitled 'Adagia' published in 1508, Erasmus collected ancient Greek and Roman proverbs. He included this proverb: "Among the blind, the one-eyed man is king". In a field where there is little interest in the theoretical research of related fields, and in understanding the theoretical assumptions on which practical activity is based, a one-eyed man, such as autumatic or mechanical indexing, easily appears respectable and becomes widely practiced despite its obvious deficiencies

Porter, M.F.: ¬An algorithm for suffix stripping (1980) 0.00

0.0025241538 = product of:
  0.015144923 = sum of:
    0.015144923 = weight(_text_:in in 3122) [ClassicSimilarity], result of:
      0.015144923 = score(doc=3122,freq=4.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.25504774 = fieldWeight in 3122, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.09375 = fieldNorm(doc=3122)
  0.16666667 = coord(1/6)

Footnote: Wiederabgedruckt in: Readings in information retrieval. Ed.: K. Sparck Jones u. P. Willett. San Francisco: Morgan Kaufmann 1997. S.313-316.

Willett, P.: Recent trends in hierarchic document clustering : a critical review (1988) 0.00

0.0023797948 = product of:
  0.014278769 = sum of:
    0.014278769 = weight(_text_:in in 2604) [ClassicSimilarity], result of:
      0.014278769 = score(doc=2604,freq=2.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.24046129 = fieldWeight in 2604, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.125 = fieldNorm(doc=2604)
  0.16666667 = coord(1/6)

Needham, R.M.; Sparck Jones, K.: Keywords and clumps (1985) 0.00
```
0.0022086345 = product of:
  0.013251807 = sum of:
    0.013251807 = weight(_text_:in in 3645) [ClassicSimilarity], result of:
      0.013251807 = score(doc=3645,freq=36.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22316676 = fieldWeight in 3645, product of:
          6.0 = tf(freq=36.0), with freq of:
            36.0 = termFreq=36.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02734375 = fieldNorm(doc=3645)
  0.16666667 = coord(1/6)
```
Abstract

The selection that follows was chosen as it represents "a very early paper an the possibilities allowed by computers an documentation." In the early 1960s computers were being used to provide simple automatic indexing systems wherein keywords were extracted from documents. The problem with such systems was that they lacked vocabulary control, thus documents related in subject matter were not always collocated in retrieval. To improve retrieval by improving recall is the raison d'être of vocabulary control tools such as classifications and thesauri. The question arose whether it was possible by automatic means to construct classes of terms, which when substituted, one for another, could be used to improve retrieval performance? One of the first theoretical approaches to this question was initiated by R. M. Needham and Karen Sparck Jones at the Cambridge Language Research Institute in England.t The question was later pursued using experimental methodologies by Sparck Jones, who, as a Senior Research Associate in the Computer Laboratory at the University of Cambridge, has devoted her life's work to research in information retrieval and automatic naturai language processing. Based an the principles of numerical taxonomy, automatic classification techniques start from the premise that two objects are similar to the degree that they share attributes in common. When these two objects are keywords, their similarity is measured in terms of the number of documents they index in common. Step 1 in automatic classification is to compute mathematically the degree to which two terms are similar. Step 2 is to group together those terms that are "most similar" to each other, forming equivalence classes of intersubstitutable terms. The technique for forming such classes varies and is the factor that characteristically distinguishes different approaches to automatic classification. The technique used by Needham and Sparck Jones, that of clumping, is described in the selection that follows. Questions that must be asked are whether the use of automatically generated classes really does improve retrieval performance and whether there is a true eco nomic advantage in substituting mechanical for manual labor. Several years after her work with clumping, Sparck Jones was to observe that while it was not wholly satisfactory in itself, it was valuable in that it stimulated research into automatic classification. To this it might be added that it was valuable in that it introduced to libraryl information science the methods of numerical taxonomy, thus stimulating us to think again about the fundamental nature and purpose of classification. In this connection it might be useful to review how automatically derived classes differ from those of manually constructed classifications: 1) the manner of their derivation is purely a posteriori, the ultimate operationalization of the principle of literary warrant; 2) the relationship between members forming such classes is essentially statistical; the members of a given class are similar to each other not because they possess the class-defining characteristic but by virtue of sharing a family resemblance; and finally, 3) automatically derived classes are not related meaningfully one to another, that is, they are not ordered in traditional hierarchical and precedence relationships.

Footnote

Original in: Journal of documentation 20(1964) no.1, S.5-15.
Fagan, J.L.: ¬The effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval (1989) 0.00
```
0.0021034614 = product of:
  0.012620768 = sum of:
    0.012620768 = weight(_text_:in in 1845) [ClassicSimilarity], result of:
      0.012620768 = score(doc=1845,freq=16.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.21253976 = fieldWeight in 1845, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1845)
  0.16666667 = coord(1/6)
```
Abstract

It may be possible to improve the quality of automatic indexing systems by using complex descriptors, for example, phrases, in addition to the simple descriptors (words or word stems) that are normally used in automatically constructed representations of document content. This study is directed toward the goal of developing effective methods of identifying phrases in natural language text from which good quality phrase descriptors can be constructed. The effectiveness of one method, a simple nonsyntactic phrase indexing procedure, has been tested on five experimental document collections. The results have been analyzed in order to identify the inadequacies of the procedure, and to determine what kinds of information about text structure are needed in order to construct phrase descriptors that are good indicators of document content. Two primary conclusions have been reached: (1) In the retrieval experiments, the nonsyntactic phrase construction procedure did not consistently yield substantial improvements in effectiveness. It is therefore not likely that phrase indexing of this kind will prove to be an important method of enhancing the performance of automatic document indexing and retrieval systems in operational environments. (2) Many of the shortcomings of the nonsyntactic approach can be overcome by incorporating syntactic information into the phrase construction process. However, a general syntactic analysis facility may be required, since many useful sources of phrases cannot be exploited if only a limited inventory of syntactic patterns can be recognized. Further research should be conducted into methods of incorporating automatic syntactic analysis into content analysis for document retrieval.

Griffiths, A.; Luckhurst, H.C.; Willett, P.: Using interdocument similarity information in document retrieval systems (1986) 0.00

0.0020823204 = product of:
  0.012493922 = sum of:
    0.012493922 = weight(_text_:in in 2415) [ClassicSimilarity], result of:
      0.012493922 = score(doc=2415,freq=2.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.21040362 = fieldWeight in 2415, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.109375 = fieldNorm(doc=2415)
  0.16666667 = coord(1/6)

Lochbaum, K.E.; Streeter, A.R.: Comparing and combining the effectiveness of latent semantic indexing and the ordinary vector space model for information retrieval (1989) 0.00
```
0.0019955188 = product of:
  0.011973113 = sum of:
    0.011973113 = weight(_text_:in in 3458) [ClassicSimilarity], result of:
      0.011973113 = score(doc=3458,freq=10.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.20163295 = fieldWeight in 3458, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3458)
  0.16666667 = coord(1/6)
```
Abstract

A retrievalsystem was built to find individuals with appropriate expertise within a large research establishment on the basis of their authored documents. The expert-locating system uses a new method for automatic indexing and retrieval based on singular value decomposition, a matrix decomposition technique related to the factor analysis. Organizational groups, represented by the documents they write, and the terms contained in these documents, are fit simultaneously into a 100-dimensional "semantic" space. User queries are positioned in the semantic space, and the most similar groups are returned to the user. Here we compared the standard vector-space model with this new technique and found that combining the two methods improved performance over either alone. We also examined the effects of various experimental variables on the system`s retrieval accuracy. In particular, the effects of: term weighting functions in the semantic space construction and in query construction, suffix stripping, and using lexical units larger than a a single word were studied.

Fuhr, N.; Knorz, G.: Retrieval test evaluation of a rule based automatic indexing (AIR/PHYS) (1984) 0.00

0.0017848461 = product of:
  0.010709076 = sum of:
    0.010709076 = weight(_text_:in in 2321) [ClassicSimilarity], result of:
      0.010709076 = score(doc=2321,freq=2.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.18034597 = fieldWeight in 2321, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.09375 = fieldNorm(doc=2321)
  0.16666667 = coord(1/6)

Source: Research and development in information retrieval. Proc. of the 3rd joint BCS and ACM symp., Cambridge, 2.-6.7.1984. Ed.: C.J. van Rijsbergen

Vledutz-Stokolov, N.: Concept recognition in an automatic text-processing system for the life sciences (1987) 0.00
```
0.0016629322 = product of:
  0.009977593 = sum of:
    0.009977593 = weight(_text_:in in 2849) [ClassicSimilarity], result of:
      0.009977593 = score(doc=2849,freq=10.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.16802745 = fieldWeight in 2849, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2849)
  0.16666667 = coord(1/6)
```
Abstract

This article describes a natural-language text-processing system designed as an automatic aid to subject indexing at BIOSIS. The intellectual procedure the system should model is a deep indexing with a controlled vocabulary of biological concepts - Concept Headings (CHs). On the average, ten CHs are assigned to each article by BIOSIS indexers. The automatic procedure consists of two stages: (1) translation of natural-language biological titles into title-semantic representations which are in the constructed formalized language of Concept Primitives, and (2) translation of the latter representations into the language of CHs. The first stage is performed by matching the titles agianst the system's Semantic Vocabulary (SV). The SV currently contains approximately 15.000 biological natural-language terms and their translations in the language of Concept Primitives. Tor the ambiguous terms, the SV contains the algorithmical rules of term disambiguation, ruels based on semantic analysis of the contexts. The second stage of the automatic procedure is performed by matching the title representations against the CH definitions, formulated as Boolean search strategies in the language of Concept Primitives. Three experiments performed with the system and their results are decribed. The most typical problems the system encounters, the problems of lexical and situational ambiguities, are discussed. The disambiguation techniques employed are described and demonstrated in many examples
Salton, G.: Automatic processing of foreign language documents (1985) 0.00
```
0.0013303459 = product of:
  0.007982075 = sum of:
    0.007982075 = weight(_text_:in in 3650) [ClassicSimilarity], result of:
      0.007982075 = score(doc=3650,freq=10.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.13442196 = fieldWeight in 3650, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=3650)
  0.16666667 = coord(1/6)
```
Abstract

The attempt to computerize a process, such as indexing, abstracting, classifying, or retrieving information, begins with an analysis of the process into its intellectual and nonintellectual components. That part of the process which is amenable to computerization is mechanical or algorithmic. What is not is intellectual or creative and requires human intervention. Gerard Salton has been an innovator, experimenter, and promoter in the area of mechanized information systems since the early 1960s. He has been particularly ingenious at analyzing the process of information retrieval into its algorithmic components. He received a doctorate in applied mathematics from Harvard University before moving to the computer science department at Cornell, where he developed a prototype automatic retrieval system called SMART. Working with this system he and his students contributed for over a decade to our theoretical understanding of the retrieval process. On a more practical level, they have contributed design criteria for operating retrieval systems. The following selection presents one of the early descriptions of the SMART system; it is valuable as it shows the direction automatic retrieval methods were to take beyond simple word-matching techniques. These include various word normalization techniques to improve recall, for instance, the separation of words into stems and affixes; the correlation and clustering, using statistical association measures, of related terms; and the identification, using a concept thesaurus, of synonymous, broader, narrower, and sibling terms. They include, as weIl, techniques, both linguistic and statistical, to deal with the thorny problem of how to automatically extract from texts index terms that consist of more than one word. They include weighting techniques and various documentrequest matching algorithms. Significant among the latter are those which produce a retrieval output of citations ranked in relevante order. During the 1970s, Salton and his students went an to further refine these various techniques, particularly the weighting and statistical association measures. Many of their early innovations seem commonplace today. Some of their later techniques are still ahead of their time and await technological developments for implementation. The particular focus of the selection that follows is an the evaluation of a particular component of the SMART system, a multilingual thesaurus. By mapping English language expressions and their German equivalents to a common concept number, the thesaurus permitted the automatic processing of German language documents against English language queries and vice versa. The results of the evaluation, as it turned out, were somewhat inconclusive. However, this SMART experiment suggested in a bold and optimistic way how one might proceed to answer such complex questions as What is meant by retrieval language compatability? How it is to be achieved, and how evaluated?

Footnote

Original in: Journal of the American Society for Information Science 21(1970) no.3, S.187-194.

Search (36 results, page 2 of 2)

Authors

Languages

Themes