Search (15 results, page 1 of 1)

  • author_ss:"Salton, G."
  1. Salton, G.: Thoughts about modern retrieval technologies (1988) 0.05
    0.052262664 = product of:
      0.15678799 = sum of:
        0.1323866 = weight(_text_:graphic in 1522) [ClassicSimilarity], result of:
          0.1323866 = score(doc=1522,freq=2.0), product of:
            0.25850594 = queryWeight, product of:
              6.6217136 = idf(docFreq=159, maxDocs=44218)
              0.03903913 = queryNorm
            0.51212204 = fieldWeight in 1522, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.6217136 = idf(docFreq=159, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1522)
        0.024401393 = product of:
          0.048802786 = sum of:
            0.048802786 = weight(_text_:methods in 1522) [ClassicSimilarity], result of:
              0.048802786 = score(doc=1522,freq=2.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.31093797 = fieldWeight in 1522, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1522)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Paper presented at the 30th Annual Conference of the National Federation of Abstracting and Information Services, Philadelphia, 28 Feb-2 Mar 88. In recent years, the amount and the variety of available machine-readable data have grown, and new technologies have been introduced, such as high-density storage devices and fancy graphic displays useful for information transformation and access. New approaches have also been considered for processing the stored data, based on the construction of knowledge bases representing the contents and structure of the information, and the use of expert system techniques to control the user-system interactions. Provides a brief evaluation of the new information processing technologies, and of the software methods proposed for information manipulation.
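The indented score breakdowns attached to each result are Lucene ClassicSimilarity explain trees. Their per-term arithmetic can be reproduced directly; the function names below are illustrative, but the formula (tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), queryWeight = idf · queryNorm, fieldWeight = tf · idf · fieldNorm) matches the leaves shown above:

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Lucene ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                  # the tf(freq) line in the explain tree
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm       # the queryWeight line
    field_weight = tf * idf * field_norm  # the fieldWeight line
    return query_weight * field_weight    # score = queryWeight * fieldWeight

# Leaf for _text_:graphic in doc 1522 above:
print(term_score(2.0, 159, 44218, 0.03903913, 0.0546875))  # ≈ 0.1323866
```

The parent nodes of the tree then sum the leaf scores and multiply by the coord factor, e.g. coord(2/6) = 0.33333334 when two of six query clauses match.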
  2. Salton, G.: Mathematics and information retrieval (1979) 0.01
    0.014928497 = product of:
      0.089570984 = sum of:
        0.089570984 = sum of:
          0.052210055 = weight(_text_:theory in 5467) [ClassicSimilarity], result of:
            0.052210055 = score(doc=5467,freq=2.0), product of:
              0.16234003 = queryWeight, product of:
                4.1583924 = idf(docFreq=1878, maxDocs=44218)
                0.03903913 = queryNorm
              0.32160926 = fieldWeight in 5467, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1583924 = idf(docFreq=1878, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5467)
          0.03736093 = weight(_text_:29 in 5467) [ClassicSimilarity], result of:
            0.03736093 = score(doc=5467,freq=2.0), product of:
              0.13732746 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.03903913 = queryNorm
              0.27205724 = fieldWeight in 5467, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5467)
      0.16666667 = coord(1/6)
    
    Abstract
    The development of a given discipline in science and technology often depends on the availability of theories capable of describing the processes which control the field and of modelling the interactions between the processes. The absence of an accepted theory of information retrieval has been blamed for the relative disorder and the lack of technical advances in the area. The main mathematical approaches to information retrieval are examined in this study, including both algebraic and probabilistic models, and the difficulties which impede the formalization of information retrieval processes are described. A number of developments are covered where new theoretical understandings have directly led to the improvement of retrieval techniques and operations.
    Source
    Journal of documentation. 35(1979) no.1, S.1-29
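Among the algebraic models this paper examines is Salton's own vector space model, in which documents and queries are term-weight vectors ranked by cosine similarity. A minimal sketch (the vectors and weights below are invented for illustration):

```python
import math

def cosine(u: dict, v: dict) -> float:
    # Cosine similarity between sparse term-weight vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

query = {"information": 1.0, "retrieval": 1.0}
docs = {
    "d1": {"information": 0.8, "retrieval": 0.6, "theory": 0.4},
    "d2": {"storage": 0.9, "devices": 0.7},
}
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```

Ranking by cosine rather than exact Boolean match is what allows partial matches and graded output, one of the theoretical advances the abstract alludes to.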
  3. Salton, G.: ¬A theory of indexing (1975) 0.01
    0.009944773 = product of:
      0.059668638 = sum of:
        0.059668638 = product of:
          0.119337276 = sum of:
            0.119337276 = weight(_text_:theory in 5475) [ClassicSimilarity], result of:
              0.119337276 = score(doc=5475,freq=2.0), product of:
                0.16234003 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.03903913 = queryNorm
                0.7351069 = fieldWeight in 5475, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.125 = fieldNorm(doc=5475)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
  4. Salton, G.: Automatic processing of foreign language documents (1985) 0.01
    0.009620271 = product of:
      0.028860811 = sum of:
        0.014917159 = product of:
          0.029834319 = sum of:
            0.029834319 = weight(_text_:theory in 3650) [ClassicSimilarity], result of:
              0.029834319 = score(doc=3650,freq=2.0), product of:
                0.16234003 = queryWeight, product of:
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.03903913 = queryNorm
                0.18377672 = fieldWeight in 3650, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.1583924 = idf(docFreq=1878, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3650)
          0.5 = coord(1/2)
        0.013943653 = product of:
          0.027887305 = sum of:
            0.027887305 = weight(_text_:methods in 3650) [ClassicSimilarity], result of:
              0.027887305 = score(doc=3650,freq=2.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.17767884 = fieldWeight in 3650, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3650)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    The attempt to computerize a process, such as indexing, abstracting, classifying, or retrieving information, begins with an analysis of the process into its intellectual and nonintellectual components. That part of the process which is amenable to computerization is mechanical or algorithmic. What is not is intellectual or creative and requires human intervention. Gerard Salton has been an innovator, experimenter, and promoter in the area of mechanized information systems since the early 1960s. He has been particularly ingenious at analyzing the process of information retrieval into its algorithmic components. He received a doctorate in applied mathematics from Harvard University before moving to the computer science department at Cornell, where he developed a prototype automatic retrieval system called SMART. Working with this system he and his students contributed for over a decade to our theoretical understanding of the retrieval process. On a more practical level, they have contributed design criteria for operating retrieval systems. The following selection presents one of the early descriptions of the SMART system; it is valuable as it shows the direction automatic retrieval methods were to take beyond simple word-matching techniques. These include various word normalization techniques to improve recall, for instance, the separation of words into stems and affixes; the correlation and clustering, using statistical association measures, of related terms; and the identification, using a concept thesaurus, of synonymous, broader, narrower, and sibling terms. They include, as well, techniques, both linguistic and statistical, to deal with the thorny problem of how to automatically extract from texts index terms that consist of more than one word. They include weighting techniques and various document-request matching algorithms. Significant among the latter are those which produce a retrieval output of citations ranked in relevance order.
During the 1970s, Salton and his students went on to further refine these various techniques, particularly the weighting and statistical association measures. Many of their early innovations seem commonplace today. Some of their later techniques are still ahead of their time and await technological developments for implementation. The particular focus of the selection that follows is on the evaluation of a particular component of the SMART system, a multilingual thesaurus. By mapping English language expressions and their German equivalents to a common concept number, the thesaurus permitted the automatic processing of German language documents against English language queries and vice versa. The results of the evaluation, as it turned out, were somewhat inconclusive. However, this SMART experiment suggested in a bold and optimistic way how one might proceed to answer such complex questions as: What is meant by retrieval language compatibility? How is it to be achieved, and how evaluated?
    Source
    Theory of subject analysis: a sourcebook. Ed.: L.M. Chan, et al
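The multilingual thesaurus evaluated in this selection mapped English expressions and their German equivalents to common concept numbers, so that matching proceeds in concept space rather than word space. A hypothetical sketch of that idea (the thesaurus entries and concept numbers below are invented for illustration, not SMART's actual data):

```python
from collections import Counter

# Hypothetical bilingual thesaurus: surface form -> shared concept number.
THESAURUS = {
    "information": 100,
    "library": 101, "bibliothek": 101,
    "book": 102, "buch": 102,
}

def concept_vector(text: str) -> Counter:
    # Map words of either language onto concept numbers; drop unknown words.
    tokens = text.lower().split()
    return Counter(THESAURUS[t] for t in tokens if t in THESAURUS)

english = concept_vector("library book information")
german = concept_vector("bibliothek buch")
overlap = sum((english & german).values())  # concepts shared across languages
```

Because both texts land on the same concept numbers, a German document can be scored against an English query with the same vector-matching machinery used monolingually.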
  5. Salton, G.; Buckley, C.: Parallel text search methods (1988) 0.01
    0.009295769 = product of:
      0.05577461 = sum of:
        0.05577461 = product of:
          0.11154922 = sum of:
            0.11154922 = weight(_text_:methods in 404) [ClassicSimilarity], result of:
              0.11154922 = score(doc=404,freq=2.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.71071535 = fieldWeight in 404, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.125 = fieldNorm(doc=404)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
  6. Salton, G.; Fox, E.A.; Voorhees, E.: Advanced feedback methods in information retrieval (1985) 0.01
    0.009295769 = product of:
      0.05577461 = sum of:
        0.05577461 = product of:
          0.11154922 = sum of:
            0.11154922 = weight(_text_:methods in 5445) [ClassicSimilarity], result of:
              0.11154922 = score(doc=5445,freq=2.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.71071535 = fieldWeight in 5445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.125 = fieldNorm(doc=5445)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
  7. Salton, G.; Voorhees, E.; Fox, E.A.: ¬A comparison of two methods for Boolean query relevance feedback (1984) 0.01
    0.009295769 = product of:
      0.05577461 = sum of:
        0.05577461 = product of:
          0.11154922 = sum of:
            0.11154922 = weight(_text_:methods in 5446) [ClassicSimilarity], result of:
              0.11154922 = score(doc=5446,freq=2.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.71071535 = fieldWeight in 5446, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.125 = fieldNorm(doc=5446)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
  8. Salton, G.; Yang, C.S.: On the specification of term values in automatic indexing (1973) 0.01
    0.007116368 = product of:
      0.04269821 = sum of:
        0.04269821 = product of:
          0.08539642 = sum of:
            0.08539642 = weight(_text_:29 in 5476) [ClassicSimilarity], result of:
              0.08539642 = score(doc=5476,freq=2.0), product of:
                0.13732746 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03903913 = queryNorm
                0.6218451 = fieldWeight in 5476, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.125 = fieldNorm(doc=5476)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Source
    Journal of documentation. 29(1973), S.351-372
  9. Salton, G.: Fast document classification in automatic information retrieval (1978) 0.01
    0.006573101 = product of:
      0.039438605 = sum of:
        0.039438605 = product of:
          0.07887721 = sum of:
            0.07887721 = weight(_text_:methods in 2331) [ClassicSimilarity], result of:
              0.07887721 = score(doc=2331,freq=4.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.5025517 = fieldWeight in 2331, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2331)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    A classified or clustered file is one where related or similar records are grouped into classes or clusters of items in such a way that all items within a cluster are jointly retrievable. Clustered files are easily adapted to broad and narrow search strategies, and simple file updating methods are available. An inexpensive file clustering method applicable to large files is given together with appropriate file search methods.
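An inexpensive clustering method of the kind described can be sketched as a single-pass algorithm: assign each record to the first sufficiently similar cluster, otherwise start a new one. The similarity measure (Jaccard over term sets) and threshold below are illustrative assumptions, not the paper's own choices:

```python
def jaccard(a: set, b: set) -> float:
    # Set-overlap similarity between two records' term sets.
    return len(a & b) / len(a | b) if a | b else 0.0

def single_pass_cluster(records: list, threshold: float = 0.3) -> list:
    clusters = []  # member indices per cluster
    reps = []      # cluster representative: union of member term sets
    for i, rec in enumerate(records):
        best, best_sim = None, 0.0
        for c, rep in enumerate(reps):
            sim = jaccard(rec, rep)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            clusters[best].append(i)  # join the most similar cluster
            reps[best] |= rec
        else:
            clusters.append([i])      # start a new cluster
            reps.append(set(rec))
    return clusters
```

A single pass over the file keeps cost roughly linear in the number of records times the number of clusters, which is what makes such methods viable for large files; narrow searches probe within one cluster, broad searches across cluster representatives.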
  10. Salton, G.; Buckley, C.: Improving retrieval performance by relevance feedback (1990) 0.01
    0.006573101 = product of:
      0.039438605 = sum of:
        0.039438605 = product of:
          0.07887721 = sum of:
            0.07887721 = weight(_text_:methods in 5442) [ClassicSimilarity], result of:
              0.07887721 = score(doc=5442,freq=4.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.5025517 = fieldWeight in 5442, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5442)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    Relevance feedback is an automatic process, introduced over 20 years ago, designed to produce improved query formulations following an initial retrieval operation. The principal relevance feedback methods described over the years are examined briefly, and evaluation data are included to demonstrate the effectiveness of the various methods. Prescriptions are given for conducting text retrieval operations iteratively using relevance feedback.
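The best known of the feedback methods surveyed in this line of work is the Rocchio formulation, which moves the query vector toward relevant documents and away from nonrelevant ones. A minimal sketch; the weights alpha, beta, gamma are conventional illustrative values, not prescriptions from the paper:

```python
def rocchio(query: dict, relevant: list, nonrelevant: list,
            alpha: float = 1.0, beta: float = 0.75, gamma: float = 0.15) -> dict:
    # q' = alpha*q + beta*mean(relevant) - gamma*mean(nonrelevant)
    terms = set(query)
    for d in relevant + nonrelevant:
        terms |= set(d)
    new_q = {}
    for t in terms:
        w = alpha * query.get(t, 0.0)
        if relevant:
            w += beta * sum(d.get(t, 0.0) for d in relevant) / len(relevant)
        if nonrelevant:
            w -= gamma * sum(d.get(t, 0.0) for d in nonrelevant) / len(nonrelevant)
        new_q[t] = max(w, 0.0)  # negative weights are usually clipped to zero
    return new_q
```

Running this after each retrieval pass, with the user's relevance judgments on the top-ranked documents, is the iterative prescription the abstract refers to.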
  11. Salton, G.; Buckley, C.: Approaches to global text analysis (1990) 0.01
    0.005751464 = product of:
      0.034508783 = sum of:
        0.034508783 = product of:
          0.06901757 = sum of:
            0.06901757 = weight(_text_:methods in 4901) [ClassicSimilarity], result of:
              0.06901757 = score(doc=4901,freq=4.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.43973273 = fieldWeight in 4901, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4901)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    Current approaches to the analysis of natural language text are not viable for documents of unrestricted scope. A global text analysis system is proposed, designed to identify homogeneous text environments in which the meaning of text words and phrases remains unambiguous, and useful term relationships may be automatically determined. The proposed methods include document clustering methods, as well as comparisons of local document excerpts in specified global contexts, leading to structured text representations in which similar texts, or text excerpts, are appropriately linked.
  12. Salton, G.: Automatic text structuring and summarization (1997) 0.01
    0.005751464 = product of:
      0.034508783 = sum of:
        0.034508783 = product of:
          0.06901757 = sum of:
            0.06901757 = weight(_text_:methods in 145) [ClassicSimilarity], result of:
              0.06901757 = score(doc=145,freq=4.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.43973273 = fieldWeight in 145, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=145)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    Applies the ideas from the automatic link generation research to automatic text summarisation. Using techniques for inter-document link generation, generates intra-document links between passages of a document. Based on the intra-document linkage pattern of a text, characterises the structure of the text. Applies the knowledge of text structure to do automatic text summarisation by passage extraction. Evaluates a set of 50 summaries generated using these techniques by comparing them to paragraph extracts constructed by humans. The automatic summarisation methods perform well, especially in view of the fact that the summaries generated by 2 humans for the same article are surprisingly dissimilar.
    Footnote
    Contribution to a special issue on methods and tools for the automatic construction of hypertext
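The intra-document linking this abstract describes can be sketched as: compute pairwise similarities between passages, link pairs above a threshold, and extract the most-linked passages as the summary. The word-overlap measure, threshold, and summary length below are illustrative assumptions:

```python
def overlap_sim(a: set, b: set) -> float:
    # Word-overlap similarity between two passages' term sets.
    return len(a & b) / min(len(a), len(b)) if a and b else 0.0

def summarize(passages: list, threshold: float = 0.5, k: int = 2) -> list:
    term_sets = [set(p.lower().split()) for p in passages]
    n = len(passages)
    links = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if overlap_sim(term_sets[i], term_sets[j]) >= threshold:
                links[i] += 1  # an intra-document link between passages i and j
                links[j] += 1
    # Extract the k most-linked passages, returned in original document order.
    top = sorted(range(n), key=lambda i: links[i], reverse=True)[:k]
    return sorted(top)
```

Passages with many links sit at the topical core of the text's linkage pattern, which is why extracting them yields a plausible summary.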
  13. Salton, G.; Buckley, C.; Allan, J.: Automatic structuring of text files (1992) 0.00
    0.0046478845 = product of:
      0.027887305 = sum of:
        0.027887305 = product of:
          0.05577461 = sum of:
            0.05577461 = weight(_text_:methods in 6507) [ClassicSimilarity], result of:
              0.05577461 = score(doc=6507,freq=2.0), product of:
                0.15695344 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.03903913 = queryNorm
                0.35535768 = fieldWeight in 6507, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6507)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    In many practical information retrieval situations, it is necessary to process heterogeneous text databases that vary greatly in scope and coverage and deal with many different subjects. In such an environment it is important to provide flexible access to individual text pieces and to structure the collection so that related text elements are identified and properly linked. Describes methods for the automatic structuring of heterogeneous text collections and the construction of browsing tools and access procedures that facilitate collection use. Illustrates these methods with searches using a large automated encyclopedia.
  14. Salton, G.: Another look at automatic text-retrieval systems (1986) 0.00
    0.0044477303 = product of:
      0.02668638 = sum of:
        0.02668638 = product of:
          0.05337276 = sum of:
            0.05337276 = weight(_text_:29 in 1356) [ClassicSimilarity], result of:
              0.05337276 = score(doc=1356,freq=2.0), product of:
                0.13732746 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03903913 = queryNorm
                0.38865322 = fieldWeight in 1356, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1356)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Source
    Communications of the Association for Computing Machinery. 29(1986), S.648-656
  15. Salton, G.; Allan, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine readable texts (1994) 0.00
    0.0044477303 = product of:
      0.02668638 = sum of:
        0.02668638 = product of:
          0.05337276 = sum of:
            0.05337276 = weight(_text_:29 in 1949) [ClassicSimilarity], result of:
              0.05337276 = score(doc=1949,freq=2.0), product of:
                0.13732746 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03903913 = queryNorm
                0.38865322 = fieldWeight in 1949, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1949)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Date
    16. 8.1998 12:30:29