Search (779 results, page 39 of 39)

  • language_ss:"e"
  • type_ss:"a"
  • year_i:[1980 TO 1990}
  1. Farradane, J.E.L.: Fundamental fallacies and new needs in classification (1985) 0.00
    0.0025748524 = product of:
      0.01029941 = sum of:
        0.01029941 = weight(_text_:information in 3642) [ClassicSimilarity], result of:
          0.01029941 = score(doc=3642,freq=8.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.116372846 = fieldWeight in 3642, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0234375 = fieldNorm(doc=3642)
      0.25 = coord(1/4)
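    The breakdown above (and those attached to every result below) is a Lucene "explain" tree for ClassicSimilarity TF-IDF scoring: a term's score is queryWeight × fieldWeight, scaled by a coordination factor. A minimal sketch that reproduces the arithmetic, taking the idf, queryNorm, and coord constants straight from the tree rather than recomputing them from the index:

```python
import math

# Constants copied from the explain tree above (assumed, not recomputed)
IDF = 1.7554779          # idf(docFreq=20772, maxDocs=44218)
QUERY_NORM = 0.050415643
COORD = 0.25             # coord(1/4): one of four query clauses matched

def explain_score(freq: float, field_norm: float) -> float:
    """Rebuild one term's score the way ClassicSimilarity reports it."""
    tf = math.sqrt(freq)                  # tf(freq) = sqrt(termFreq)
    query_weight = IDF * QUERY_NORM       # queryWeight = idf * queryNorm
    field_weight = tf * IDF * field_norm  # fieldWeight = tf * idf * fieldNorm
    return COORD * query_weight * field_weight

print(explain_score(8.0, 0.0234375))  # result 1: ~0.0025748524
print(explain_score(2.0, 0.046875))   # results 2-4: the same ~0.0025748524
```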
    
    Abstract
    This chapter from The Sayers Memorial Volume summarizes Farradane's earlier work, in which he developed his major themes by drawing in part upon research in psychology, particularly those discoveries called "cognitive" which now form part of cognitive science. Farradane, a chemist by training who later became an information scientist and Director of the Center for Information Science, City University, London, from 1958 to 1973, defines the various types of methods used to achieve classification systems: philosophic, scientific, and synthetic. Early on he distinguishes the view that classification is "some part of external 'reality' waiting to be discovered" from the view which considers it "an intellectual operation upon mental entities and concepts." Classification, therefore, is to be treated as a mental construct and not as something "out there" to be discovered as, say, in astronomy or botany. His approach could be termed, somewhat facetiously, an "in there" one, meaning found by utilizing the human brain as the key tool. This is not to say that discoveries in astronomy or botany do not require the use of the brain as a key tool. It is merely that the "material" worked upon by this tool is presented to it for observation by "that inward eye," by memory and by inference, rather than by planned physical observation, memory, and inference. This distinction could be refined or clarified by considering the initial "observation" as a specific kind of mental set required in each case. Farradane then proceeds to demolish the notion of main classes as "fictitious," partly because the various category-defining methodologies used in library classification are "randomly mixed." The implication, probably correct, is that this results in mixed metaphorical concepts. It is an interesting contrast to the approach of Julia Pettee (q.v.), who began with indexing terms and, in studying relationships between terms, discovered hidden hierarchies both between the terms themselves and between the cross-references leading from one term or set of terms to another. One is tempted to ask two questions: "Is hierarchy innate but misinterpreted?" and "Is it possible to have meaningful terms which have only categorical relationships (that is, no see also or equivalent relationships to other, out-of-category terms)?" Partly as a result of the rejection of existing general library classification systems, the Classification Research Group, of which Farradane was a charter member, decided to adopt the principles of Ranganathan's faceted classification system, while rejecting his limit on the number of fundamental categories. The advantage of the faceted method is that it is created by inductive, rather than deductive, methods. It can be altered more readily to keep up with changes in and additions to the knowledge base in a subject without having to re-do the major schedules. In 1961, when Farradane's paper appeared, the computer was beginning to be viewed as a tool for solving all information retrieval problems. He tartly remarks:
    The basic fallacy of mechanised information retrieval systems seems to be the often unconscious but apparently implied assumption that the machine can inject meaning into a group of juxtaposed terms although no methods of conceptual analysis and re-synthesis have been programmed (p. 203). As an example, he suggests considering the slight but vital differences in the meaning of the word "of" in selected examples: swarm of bees; house of the mayor; House of Lords; spectrum of the sun; basket of fish; meeting of councillors; cooking of meat; book of the film. Farradane's distinctive contribution is his matrix of basic relationships. The rows concern time and memory, in degree of happenstance: coincidentally, occasionally, or always. The columns represent degree of the "powers of discrimination": occurring together, linked by common elements only, or standing alone. To make these relationships easy to manage, he used symbols for each of the nine kinds, "symbols found on every typewriter": /θ (theta), /*, /;, /=, /+, /(, /), /_, and /:. Farradane has maintained his basic insights to the present day. Though he has gone on to do other kinds of research in classification, his work indicates that he still believes that "the primary task ... is that of establishing satisfactory and enduring principles of subject analysis, or classification" (p. 208).
  2. Miller, D.R.; Brewer, K.: Usefulness of OCLC archive tapes as a basis for local online systems (1982) 0.00
    0.0025748524 = product of:
      0.01029941 = sum of:
        0.01029941 = weight(_text_:information in 295) [ClassicSimilarity], result of:
          0.01029941 = score(doc=295,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.116372846 = fieldWeight in 295, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=295)
      0.25 = coord(1/4)
    
    Abstract
    Many health science libraries are now in the planning stages for acquisition of local online catalogs and circulation systems. Whether turn-key or in-house, in most cases such systems will be based on machine-readable records, or archive tapes, produced as a by-product of automated cataloging. Because most libraries originally used these systems as a more efficient means to produce catalog cards, the usefulness of the records is questioned. A review of selected aspects of cataloging via OCLC reveals several areas in which local card production priorities have made the resultant archive tapes more difficult and costly to use as a machine-readable database. Some specific suggestions are given for altering input procedures to improve the usefulness of archive tapes. In conclusion, it is recommended that librarians re-examine local input procedures in light of cost-effective production of archive tapes to produce consistent bibliographic entries for local online catalogs, resource sharing projects and management information systems.
  3. Markey, K.: Searching and browsing the Dewey Decimal Classification in an online catalog (1987) 0.00
    0.0025748524 = product of:
      0.01029941 = sum of:
        0.01029941 = weight(_text_:information in 384) [ClassicSimilarity], result of:
          0.01029941 = score(doc=384,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.116372846 = fieldWeight in 384, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=384)
      0.25 = coord(1/4)
    
    Abstract
    In the DDC Online Project, subject searching and browsing of the DDC schedules and relative index were featured in an experimental online catalog. The effectiveness of the DDC in an online catalog was tested in online retrieval experiments at four participating libraries. These experiments provided data for analyses of subject searchers' use of a library classification in the information retrieval environment of an online catalog. Recommendations were provided for the enhancement of bibliographic records, online catalogs, and online cataloging systems with a library classification. In this paper, subject searchers' use of the subject outline search capability of the experimental online catalog is described. This capability set the experimental online catalog apart from all other online catalogs: it referred searchers to online displays of the classification schedules based on their entry of subject terms. Failure analyses of subject outline searches demonstrated the search's specific strengths and weaknesses. Users' postsearch interview comments highlighted their experiences and their satisfaction with this search. Based on the failure analyses and users' interview comments, recommendations are provided for the improvement of the subject outline search in online catalogs.
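    A rough sketch of the kind of lookup the subject outline search performed, under stated assumptions: the searcher's term is matched in a relative index, which points to a class number, and the surrounding schedule outline is displayed. The index entries and captions below are invented for illustration; the project's actual data structures are not described in the abstract.

```python
# Hypothetical relative-index entries and schedule captions (illustrative only)
RELATIVE_INDEX = {"library catalogs": "025.3", "railroads": "385"}
SCHEDULES = {
    "025": "Operations of libraries",
    "025.3": "Bibliographic analysis and control",
    "385": "Railroad transportation",
}

def subject_outline(term: str) -> list[str]:
    """Map a searcher's term to a class number, then show its schedule context."""
    number = RELATIVE_INDEX.get(term.lower())
    if number is None:
        return []  # a candidate case for failure analysis
    # Show the class together with its broader classes (prefixes of the number)
    contexts = [number[:i] for i in range(3, len(number) + 1) if number[:i] in SCHEDULES]
    return [f"{n}  {SCHEDULES[n]}" for n in contexts]

print(subject_outline("library catalogs"))
# ['025  Operations of libraries', '025.3  Bibliographic analysis and control']
```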
  4. Carlyle, A.: Matching LCSH and user vocabulary in the library catalog (1989) 0.00
    0.0025748524 = product of:
      0.01029941 = sum of:
        0.01029941 = weight(_text_:information in 449) [ClassicSimilarity], result of:
          0.01029941 = score(doc=449,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.116372846 = fieldWeight in 449, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=449)
      0.25 = coord(1/4)
    
    Abstract
    Central to subject searching is the match between user vocabulary and the headings from the Library of Congress Subject Headings (LCSH) used in a library catalog. This paper evaluates previous matching studies, proposes a detailed list of matching categories, and tests LCSH in a study using these categories. Exact and partial match categories are defined for single-LCSH and multiple-LCSH matches to user expressions. One no-match category is included. Transaction logs from ORION, UCLA's online information system, were used to collect user expressions for a comparison of LCSH and user language. Results show that single LCSH headings match user expressions exactly about 47% of the time; that single subject heading matches, including exact matches, comprise 74% of the total; that partial matches, to both single and multiple headings, comprise about 21% of the total; and that no match occurs 5% of the time.
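    A toy version of the matching step, assuming the simplest possible operationalizations (case-folded equality for an exact match, word overlap for a partial match); Carlyle's actual category list is considerably more detailed:

```python
def match_category(user_expression: str, headings: list[str]) -> str:
    """Classify a user expression against a list of LCSH headings."""
    query = user_expression.casefold().strip()
    query_words = set(query.replace(",", " ").split())
    if any(h.casefold() == query for h in headings):
        return "exact match"
    if any(query_words & set(h.casefold().replace(",", " ").split()) for h in headings):
        return "partial match"
    return "no match"

headings = ["Cookery", "Cookery, French", "Motion pictures"]
print(match_category("Motion pictures", headings))  # exact match
print(match_category("french cooking", headings))   # partial match ("french")
print(match_category("skateboarding", headings))    # no match
```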
  5. Garfield, E.: Citation indexes for science (1985) 0.00
    0.002427594 = product of:
      0.009710376 = sum of:
        0.009710376 = weight(_text_:information in 3632) [ClassicSimilarity], result of:
          0.009710376 = score(doc=3632,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.10971737 = fieldWeight in 3632, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3632)
      0.25 = coord(1/4)
    
    Abstract
    Indexes in general seek to provide a "key" to a body of literature, intended to help the user in identifying, verifying, and/or locating individual or related items. The most common devices for collocation in indexes are authors' names and subjects. A different approach to collocating related items in an index is provided by a method called "citation indexing." Citation indexes attempt to link items through citations or references, in other words, by bringing together items cited in a particular work and the works citing a particular item. Citation indexing is based on the concept that there is a significant intellectual link between a document and each bibliographic item cited in it, and that this link is useful to the scholar because an author's references to earlier writings identify information relevant to the subject of his current work. One of the major differences between the citation index and the traditional subject index is that the former, while listing current literature, also provides a retrospective view of past literature. While each issue of a traditional index is normally concerned only with the current literature, the citation index brings back retrospective literature in the form of cited references, thereby linking current scholarly works with earlier works. The advantages of the citation index have been considered to be its value as a tool for tracing the history of ideas or discoveries, for associating ideas between current and past work, and for evaluating works of individual authors or library collections. The concept of citation indexing is not new. It has been applied to legal literature since 1873 in a legal reference tool called Shepard's Citations. In the 1950s Eugene Garfield, a documentation consultant and founder and President of the Institute for Scientific Information (Philadelphia), developed the technique of citation indexing for scientific literature. This new application was facilitated by the availability of computer technology, resulting in a series of services: Science Citation Index (1955- ), Social Sciences Citation Index (1966- ), and the Arts & Humanities Citation Index (1976- ). All three appear in printed versions and as machine-readable databases. In the following essay, the first in a series of articles and books elucidating the citation indexing system, Garfield traces the origin and beginning of this idea, its advantages, and the methods of preparing such indexes.
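    The mechanism reduces to inverting reference lists: from each document's outgoing references, derive, for every cited item, the set of works that cite it. A minimal sketch with invented sample records:

```python
from collections import defaultdict

# Sample data: each document mapped to the items it cites (invented)
references = {
    "Smith 1980": ["Garfield 1955", "Price 1965"],
    "Jones 1982": ["Garfield 1955"],
}

# Invert the mapping: cited item -> documents citing it
citation_index: dict[str, list[str]] = defaultdict(list)
for citing, cited_items in references.items():
    for cited in cited_items:
        citation_index[cited].append(citing)

print(citation_index["Garfield 1955"])  # ['Smith 1980', 'Jones 1982']
```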
  6. Vickery, B.C.: Systematic subject indexing (1985) 0.00
    0.002427594 = product of:
      0.009710376 = sum of:
        0.009710376 = weight(_text_:information in 3636) [ClassicSimilarity], result of:
          0.009710376 = score(doc=3636,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.10971737 = fieldWeight in 3636, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3636)
      0.25 = coord(1/4)
    
    Abstract
    Brian C. Vickery, Director and Professor, School of Library, Archive and Information Studies, University College, London, is a prolific writer on classification and information retrieval. This paper was one of the earliest to present initial efforts by the Classification Research Group (q.v.). In it he clearly outlined the need for classification in subject indexing, which, at the time he wrote, was not a commonplace understanding. In fact, some indexing systems were made in the first place specifically to avoid general classification systems, which were out of date in all fast-moving disciplines, especially in the "hard" sciences. Vickery picked up Julia Pettee's work (q.v.) on the concealed classification in subject headings (1947) and added to it, mainly adopting concepts from the work of S. R. Ranganathan (q.v.). He had already published a paper on notation in classification, pointing out connections between notation, words, and the concepts which they represent. He was especially concerned about the structure of notational symbols, as such symbols represent relationships among subjects. Vickery also emphasized that index terms should cover all aspects of a subject, so that, in addition to having a basis in classification, the ideal index system should also have standardized nomenclature, as well as show evidence of a systematic classing of elementary terms. The necessary linkage between system and terms should be one of a number of methods, notably:
  7. Fugmann, R.: ¬The complementarity of natural and indexing languages (1985) 0.00
    0.002427594 = product of:
      0.009710376 = sum of:
        0.009710376 = weight(_text_:information in 3641) [ClassicSimilarity], result of:
          0.009710376 = score(doc=3641,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.10971737 = fieldWeight in 3641, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3641)
      0.25 = coord(1/4)
    
    Abstract
    The second Cranfield experiment (Cranfield II) in the mid-1960s challenged assumptions held by librarians for nearly a century, namely, that the objective of providing subject access was to bring together all materials on a given topic and that achieving this objective required vocabulary control in the form of an index language. The results of Cranfield II were replicated by other retrieval experiments quick to follow its lead, and increasing support was given to the opinion that natural language information systems could perform at least as effectively, and certainly more economically, than those employing index languages. When the results of empirical research dramatically counter conventional wisdom, an obvious course is to question the validity of the research, and, in the case of retrieval experiments, this eventually happened. Retrieval experiments were criticized for their artificiality, their unrepresentative samples, and their problematic definitions, particularly the definition of relevance. In the minds of some, at least, the relative merits of natural languages vs. indexing languages continued to be an unresolved issue. As with many either/or options, a seemingly safe course to follow is to opt for "both," and indeed there seems to be an increasing amount of counsel advising a combination of natural language and index language search capabilities. One strong voice offering such counsel is that of Robert Fugmann, a chemist by training, a theoretician by predilection, and, currently, a practicing information scientist at Hoechst AG, Frankfurt/Main. This selection from his writings sheds light on the capabilities and limitations of both kinds of indexing. Its special significance lies in the fact that its arguments are based not on empirical but on rational grounds. Fugmann's major argument starts from the observation that in natural language there are essentially two different kinds of concepts: (1) individual concepts, represented by names of individual things (e.g., the name of the town Augsburg), and (2) general concepts, represented by names of classes of things (e.g., pesticides). Individual concepts can be represented in language simply and succinctly, often by a single string of alphanumeric characters; general concepts, on the other hand, can be expressed in a multiplicity of ways. The word pesticides refers to the concept of pesticides, but also referring to this concept are numerous circumlocutions, such as "Substance X was effective against pests." Because natural language is capable of infinite variety, we cannot predict a priori the manifold ways a general concept, like pesticides, will be represented by any given author. It is this lack of predictability that limits natural language retrieval and causes poor precision and recall. Thus, the essential and defining characteristic of an index language is that it is a tool for representational predictability.
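    Fugmann's "representational predictability" is what an entry vocabulary supplies: many natural-language variants collapse to one controlled descriptor, so the searcher need not anticipate every circumlocution. A small illustration, with invented entry terms:

```python
# Hypothetical entry vocabulary: free-text variants -> one controlled descriptor
ENTRY_VOCABULARY = {
    "pesticides": "PESTICIDES",
    "pest control agents": "PESTICIDES",
    "substance effective against pests": "PESTICIDES",
}

def controlled_terms(phrases: list[str]) -> set[str]:
    """Map free-text phrases to descriptors; phrases outside the entry
    vocabulary are simply missed, which is the predictability problem."""
    return {ENTRY_VOCABULARY[p] for p in phrases if p in ENTRY_VOCABULARY}

print(controlled_terms(["pest control agents", "crop rotation"]))  # {'PESTICIDES'}
```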
  8. Cleverdon, C.W.; Mills, J.: ¬The testing of index language devices (1985) 0.00
    0.002427594 = product of:
      0.009710376 = sum of:
        0.009710376 = weight(_text_:information in 3643) [ClassicSimilarity], result of:
          0.009710376 = score(doc=3643,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.10971737 = fieldWeight in 3643, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3643)
      0.25 = coord(1/4)
    
    Abstract
    A landmark event in the twentieth-century development of subject analysis theory was a retrieval experiment, begun in 1957, by Cyril Cleverdon, Librarian of the Cranfield Institute of Technology. For this work he received the Professional Award of the Special Libraries Association in 1962 and the Award of Merit of the American Society for Information Science in 1970. The objective of the experiment, called Cranfield I, was to test the ability of four indexing systems (UDC, Facet, Uniterm, and Alphabetic-Subject Headings) to retrieve material responsive to questions addressed to a collection of documents. The experiment was ambitious in scale, consisting of eighteen thousand documents and twelve hundred questions. Prior to Cranfield I, the question of what constitutes good indexing was approached subjectively, and reference was made to assumptions in the form of principles that should be observed or user needs that should be met. Cranfield I was the first large-scale effort to use objective criteria for determining the parameters of good indexing. Its creative impetus was the definition of user satisfaction in terms of precision and recall. Out of the experiment emerged the definition of recall as the percentage of relevant documents retrieved and precision as the percentage of retrieved documents that were relevant. Operationalizing the concept of user satisfaction, that is, making it measurable, meant that it could be studied empirically and manipulated as a variable in mathematical equations. Much has been made of the fact that the experimental methodology of Cranfield I was seriously flawed. This is unfortunate, as it tends to diminish Cleverdon's contribution, which was not methodological (such contributions can be left to benchmark researchers) but rather creative: the introduction of a new paradigm, one that proved to be eminently productive. The criticism leveled at the methodological shortcomings of Cranfield I underscored the need for more precise definitions of the variables involved in information retrieval. Particularly important was the need for a definition of the variable "index language." Like the definitions of precision and recall, that of index language provided a new way of looking at the indexing process. It was a re-visioning that stimulated research activity and led not only to a better understanding of indexing but also to the design of better retrieval systems. Cranfield I was followed by Cranfield II. While Cranfield I was a wholesale comparison of four indexing "systems," Cranfield II aimed to single out various individual factors in index languages, called "indexing devices," and to measure how variations in these affected retrieval performance. The following selection represents the thinking at Cranfield midway between these two notable retrieval experiments.
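    The two measures Cranfield I introduced are simple ratios over the retrieved and relevant document sets; a sketch:

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Precision: fraction of retrieved documents that are relevant.
    Recall: fraction of relevant documents that were retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d7"}
print(precision_recall(retrieved, relevant))  # (0.5, 0.666...)
```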
  9. Rolling, L.: ¬The role of graphic display of concept relationships in indexing and retrieval vocabularies (1985) 0.00
    0.002427594 = product of:
      0.009710376 = sum of:
        0.009710376 = weight(_text_:information in 3646) [ClassicSimilarity], result of:
          0.009710376 = score(doc=3646,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.10971737 = fieldWeight in 3646, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3646)
      0.25 = coord(1/4)
    
    Abstract
    The use of diagrams to express relationships in classification is not new. Many classificationists have used this approach, but usually in a minor display to make a point or for part of a difficult relational situation. Ranganathan, for example, used diagrams for some of his more elusive concepts. The thesaurus in particular and subject headings in general, with direct and indirect cross-references or equivalents, need many more diagrams than are normally included to make relationships and even semantics clear. A picture very often is worth a thousand words. Rolling has used directed graphs (arrowgraphs) to join terms as a practical method for rendering relationships between indexing terms lucid. He has succeeded very well in this endeavor. Four diagrams in this selection are all that one needs to explain how to employ the system, from initial listing to completed arrowgraph. The samples of his work include illustration of off-page connectors between arrowgraphs. The great advantage of using diagrams like this is that they present relations between individual terms in a format that is easy to comprehend. But of even greater value is the fact that one can use his arrowgraphs as schematics for making three-dimensional wire-and-ball models, in which the relationships may be seen even more clearly. In fact, errors or gaps in relations are much easier to find with this methodology. One also can get across the notion of the three-dimensionality of classification systems with such models. Pettee's "hand reaching up and over" (q.v.) is not a figment of the imagination. While the actual hand is a wire or stick, the concept visualized is helpful in illuminating the three-dimensional figure that is latent in all systems that have cross-references or "broader," "narrower," or, especially, "related" terms. Classification schedules, being hemmed in by the dimensions of the printed page, also benefit from such physical illustrations. Rolling, an engineer by conviction, was the developer of information systems for the Cobalt Institute, the European Atomic Energy Community, and the European Coal and Steel Community. He also developed and promoted computer-aided translation at the Commission of the European Communities in Luxembourg. One of his objectives has always been to increase the efficiency of mono- and multilingual thesauri for use in multinational information systems.
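    An arrowgraph is, in data terms, a directed graph over index terms. A minimal sketch that stores labeled term relations and emits Graphviz DOT so they can be drawn; the sample terms are invented:

```python
# Hypothetical labeled relations between indexing terms: (source, label, target)
RELATIONS = [
    ("cobalt alloys", "narrower", "cobalt-chromium alloys"),
    ("cobalt alloys", "related", "superalloys"),
]

def to_dot(relations: list[tuple[str, str, str]]) -> str:
    """Render term relations as Graphviz DOT, one labeled arrow per relation."""
    lines = ["digraph arrowgraph {"]
    for source, label, target in relations:
        lines.append(f'  "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)

print(to_dot(RELATIONS))  # paste the output into any DOT renderer
```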
  10. Lancaster, F.W.: Evaluating the performance of a large computerized information system (1985) 0.00
    0.002427594 = product of:
      0.009710376 = sum of:
        0.009710376 = weight(_text_:information in 3649) [ClassicSimilarity], result of:
          0.009710376 = score(doc=3649,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.10971737 = fieldWeight in 3649, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3649)
      0.25 = coord(1/4)
    
    Abstract
    F. W. Lancaster is known for his writing on the state of the art in library/information science. His skill in identifying significant contributions and synthesizing literature in fields as diverse as online systems, vocabulary control, measurement and evaluation, and the paperless society has earned him esteem as a chronicler of information science. Equally deserving of repute is his own contribution to research in the discipline: his evaluation of the MEDLARS operating system. The MEDLARS study is notable for several reasons. It was the first large-scale application of retrieval experiment methodology to the evaluation of an actual operating system. As such, problems had to be faced that do not arise in laboratory-like conditions. One example is the problem of recall: how to determine, for a very large and dynamic database, the number of documents relevant to a given search request. By solving this problem and others attendant upon transferring an experimental methodology to the real world, Lancaster created a constructive procedure that could be used to improve the design and functioning of retrieval systems. The MEDLARS study is notable also for its contribution to our understanding of what constitutes a good index language and good indexing. The ideal retrieval system would be one that retrieves all and only relevant documents. The failures that occur in real operating systems, when a relevant document is not retrieved (a recall failure) or an irrelevant document is retrieved (a precision failure), can be analysed to assess the impact of various factors on the performance of the system. This is exactly what Lancaster did. He found both the MEDLARS indexing and the MeSH index language to be significant factors affecting retrieval performance. The indexing, primarily because it was insufficiently exhaustive, explained a large number of recall failures. The index language, largely because of its insufficient specificity, accounted for a large number of precision failures. The purpose of identifying factors responsible for a system's failures is ultimately to improve the system. Unlike many user studies, the MEDLARS evaluation yielded recommendations that were eventually implemented. Indexing exhaustivity was increased and the MeSH index language was enriched with more specific terms and a larger entry vocabulary.
  11. Kaiser, J.O.: Systematic indexing (1985) 0.00
    0.002427594 = product of:
      0.009710376 = sum of:
        0.009710376 = weight(_text_:information in 571) [ClassicSimilarity], result of:
          0.009710376 = score(doc=571,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.10971737 = fieldWeight in 571, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=571)
      0.25 = coord(1/4)
    
    Abstract
    A native of Germany and a former teacher of languages and music, Julius Otto Kaiser (1868-1927) came to the Philadelphia Commercial Museum to be its librarian in 1896. Faced with the problem of making "information" accessible, he developed a method of indexing he called systematic indexing. The first draft of his scheme, published in 1896-97, was an important landmark in the history of subject analysis. R. K. Olding credits Kaiser with making the greatest single advance in indexing theory since Charles A. Cutter, and John Metcalfe eulogizes him by observing that "in sheer capacity for really scientific and logical thinking, Kaiser's was probably the best mind that has ever applied itself to subject indexing." Kaiser was an admirer of "system." By systematic indexing he meant indicating information not with natural language expressions as, for instance, Cutter had advocated, but with artificial expressions constructed according to formulas. Kaiser grudged natural language its approximateness, its vagaries, and its ambiguities. The formulas he introduced were to provide a "machinery for regularising or standardising language" (paragraph 67). Kaiser recognized three categories or "facets" of index terms: (1) terms of concretes, representing things, real or imaginary (e.g., money, machines); (2) terms of processes, representing either conditions attaching to things or their actions (e.g., trade, manufacture); and (3) terms of localities, representing, for the most part, countries (e.g., France, South Africa). Expressions in Kaiser's index language were called statements. Statements consisted of sequences of terms, the syntax of which was prescribed by formula. These formulas specified sequences of terms by reference to category types. Only three citation orders were permitted: (1) a term in the concrete category followed by one in the process category (e.g., Wool-Scouring); (2) a country term followed by a process term (e.g., Brazil-Education); and (3) a concrete term followed by a country term, followed by a process term (e.g., Nitrate-Chile-Trade). Kaiser's system was a precursor of two of the most significant developments in twentieth-century approaches to subject access: the special-purpose use of language for indexing, thus the concept of index language, which was to emerge as a generative idea at the time of the second Cranfield experiment (1966), and the use of facets to categorize subject indicators, which was to become the characterizing feature of analytico-synthetic indexing methods such as the Colon Classification. In addition to its visionary quality, Kaiser's work is notable for its meticulousness and honesty, as can be seen, for instance, in his observations about the difficulties in facet definition.
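    Kaiser's three permitted citation orders amount to a validation rule over term categories; a sketch, with the category assignments taken from the examples above:

```python
# Category assignments for the sample terms mentioned above
CATEGORY = {
    "wool": "concrete", "nitrate": "concrete",
    "scouring": "process", "trade": "process", "education": "process",
    "brazil": "country", "chile": "country",
}

# Kaiser's three permitted citation orders
PERMITTED = [
    ("concrete", "process"),
    ("country", "process"),
    ("concrete", "country", "process"),
]

def is_valid_statement(terms: list[str]) -> bool:
    """Check a sequence of index terms against the permitted formulas."""
    pattern = tuple(CATEGORY.get(t.lower(), "?") for t in terms)
    return pattern in PERMITTED

print(is_valid_statement(["Wool", "Scouring"]))           # True: concrete-process
print(is_valid_statement(["Nitrate", "Chile", "Trade"]))  # True: concrete-country-process
print(is_valid_statement(["Chile", "Nitrate"]))           # False: not a permitted order
```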
  12. Vledutz-Stokolov, N.: Concept recognition in an automatic text-processing system for the life sciences (1987) 0.00
    0.0021457102 = product of:
      0.008582841 = sum of:
        0.008582841 = weight(_text_:information in 2849) [ClassicSimilarity], result of:
          0.008582841 = score(doc=2849,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.09697737 = fieldWeight in 2849, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2849)
      0.25 = coord(1/4)
    
    Source
    Journal of the American Society for Information Science. 38(1987) no.4, pp.269-287
  13. McCain, K.W.; White, H.D.; Griffith, B.C.: Comparing retrieval performance in online data bases (1987) 0.00
    0.0021457102 = product of:
      0.008582841 = sum of:
        0.008582841 = weight(_text_:information in 1167) [ClassicSimilarity], result of:
          0.008582841 = score(doc=1167,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.09697737 = fieldWeight in 1167, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1167)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 23(1987), pp.539-553
  14. Needham, R.M.; Sparck Jones, K.: Keywords and clumps (1985) 0.00
    0.002124145 = product of:
      0.00849658 = sum of:
        0.00849658 = weight(_text_:information in 3645) [ClassicSimilarity], result of:
          0.00849658 = score(doc=3645,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.0960027 = fieldWeight in 3645, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02734375 = fieldNorm(doc=3645)
      0.25 = coord(1/4)
    
    Abstract
    The selection that follows was chosen as it represents "a very early paper on the possibilities allowed by computers in documentation." In the early 1960s computers were being used to provide simple automatic indexing systems wherein keywords were extracted from documents. The problem with such systems was that they lacked vocabulary control; thus documents related in subject matter were not always collocated in retrieval. To improve retrieval by improving recall is the raison d'être of vocabulary control tools such as classifications and thesauri. The question arose whether it was possible by automatic means to construct classes of terms which, when substituted one for another, could be used to improve retrieval performance. One of the first theoretical approaches to this question was initiated by R. M. Needham and Karen Sparck Jones at the Cambridge Language Research Institute in England. The question was later pursued using experimental methodologies by Sparck Jones, who, as a Senior Research Associate in the Computer Laboratory at the University of Cambridge, has devoted her life's work to research in information retrieval and automatic natural language processing. Based on the principles of numerical taxonomy, automatic classification techniques start from the premise that two objects are similar to the degree that they share attributes in common. When these two objects are keywords, their similarity is measured in terms of the number of documents they index in common. Step 1 in automatic classification is to compute mathematically the degree to which two terms are similar. Step 2 is to group together those terms that are "most similar" to each other, forming equivalence classes of intersubstitutable terms. The technique for forming such classes varies and is the factor that characteristically distinguishes different approaches to automatic classification. The technique used by Needham and Sparck Jones, that of clumping, is described in the selection that follows (a toy version is sketched below). Questions that must be asked are whether the use of automatically generated classes really does improve retrieval performance and whether there is a true economic advantage in substituting mechanical for manual labor. Several years after her work with clumping, Sparck Jones was to observe that while it was not wholly satisfactory in itself, it was valuable in that it stimulated research into automatic classification. To this it might be added that it was valuable in that it introduced to library/information science the methods of numerical taxonomy, thus stimulating us to think again about the fundamental nature and purpose of classification. In this connection it might be useful to review how automatically derived classes differ from those of manually constructed classifications: (1) the manner of their derivation is purely a posteriori, the ultimate operationalization of the principle of literary warrant; (2) the relationship between members forming such classes is essentially statistical; the members of a given class are similar to each other not because they possess the class-defining characteristic but by virtue of sharing a family resemblance; and finally, (3) automatically derived classes are not related meaningfully one to another, that is, they are not ordered in traditional hierarchical and precedence relationships.
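    Steps 1 and 2 can be sketched directly: score term-term similarity by the overlap of the documents each term indexes, then group terms whose similarity clears a threshold. Single-link grouping stands in here for the clumping criterion itself, and the posting lists are invented:

```python
from itertools import combinations

# Toy posting lists: each keyword mapped to the documents it indexes (invented)
POSTINGS = {
    "boiler": {1, 2, 3}, "furnace": {2, 3, 4},
    "turbine": {5, 6}, "rotor": {5, 6, 7},
}

def similarity(a: str, b: str) -> float:
    """Step 1: overlap of posting lists, here as a Jaccard coefficient."""
    docs_a, docs_b = POSTINGS[a], POSTINGS[b]
    return len(docs_a & docs_b) / len(docs_a | docs_b)

def clumps(threshold: float = 0.4) -> list[set[str]]:
    """Step 2: merge terms into classes when their similarity clears the threshold."""
    groups = [{term} for term in POSTINGS]
    for a, b in combinations(POSTINGS, 2):
        if similarity(a, b) >= threshold:
            group_a = next(g for g in groups if a in g)
            group_b = next(g for g in groups if b in g)
            if group_a is not group_b:
                group_a |= group_b
                groups.remove(group_b)
    return groups

print(clumps())  # [{'boiler', 'furnace'}, {'turbine', 'rotor'}]
```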
  15. Devadason, F.J.: Online construction of alphabetic classaurus : a vocabulary control and indexing tool (1985) 0.00
    0.0017165683 = product of:
      0.006866273 = sum of:
        0.006866273 = weight(_text_:information in 1467) [ClassicSimilarity], result of:
          0.006866273 = score(doc=1467,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.0775819 = fieldWeight in 1467, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=1467)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 21(1985), pp.11-26
  16. Hulme, E.W.: Principles of book classification (1985) 0.00
    0.0017165683 = product of:
      0.006866273 = sum of:
        0.006866273 = weight(_text_:information in 3626) [ClassicSimilarity], result of:
          0.006866273 = score(doc=3626,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.0775819 = fieldWeight in 3626, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3626)
      0.25 = coord(1/4)
    
    Abstract
    One of the earliest works on the theory of classification appeared in a series of six chapters on the "Principles of Book Classification" published between October 1911 and May 1912 in the Library Association Record. In this publication, the author, E. Wyndham Hulme (1859-1954), whose career included twenty-five years as Librarian of the British Patent Office, set forth the fundamentals of classification as manifested in both the classed and the alphabetical catalogs. The work and the ideas contained therein have largely been forgotten. However, one phrase stands out and has been used frequently in discussions of classification and indexing, particularly in reference to systems such as the Library of Congress Classification, the Dewey Decimal Classification, and the Library of Congress Subject Headings. That phrase is "literary warrant," meaning that the basis for classification is to be found in the actual published literature rather than in abstract philosophical ideas or concepts in the universe of knowledge or the "order of nature and system of the sciences." To the extent that classification and indexing systems should be based upon existing literature rather than the universe of human knowledge, the concept of "literary warrant" defines systems used in library and information services, as distinguished from a purely philosophical classification. Library classification attempts to classify library materials, the records of knowledge, rather than knowledge itself; the establishment of a class or a heading for a subject is based on existing literature treating that subject. The following excerpt contains Hulme's definition of "literary warrant." Hulme first rejects the notion of using "the nature of the subject matter to be divided" as the basis for establishing headings; then he proceeds to propose the use of "literary warrant," that is, "an accurate survey and measurement of classes in literature," as the determinant.
  17. Dewey, M.: Decimal classification and relativ index : introduction (1985) 0.00
    0.0017165683 = product of:
      0.006866273 = sum of:
        0.006866273 = weight(_text_:information in 3628) [ClassicSimilarity], result of:
          0.006866273 = score(doc=3628,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.0775819 = fieldWeight in 3628, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3628)
      0.25 = coord(1/4)
    
    Abstract
    To those outside the field of library science, the name Melvil Dewey (1851-1931) is virtually synonymous with library classification. To those in the field, Dewey has been recognized as the premier classification maker. His enormously successful system (i.e., successful in terms of the wide adoption of the system around the world for over one hundred years) has now undergone nineteen editions. The Dewey Decimal Classification has been translated into more than twenty languages and is the most widely adopted classification scheme in the world. Even in its earliest manifestations, the Dewey Decimal Classification contained features that anticipated modern classification theory. Among these are the use of mnemonics and the commonly applied standard subdivisions, later called "common isolates" by S. R. Ranganathan (q.v.), which are the mainstays of facet analysis and synthesis. The device of standard subdivisions is an indication of the recognition of common aspects that pervade all subjects. The use of mnemonics, whereby recurring concepts in the scheme are represented by the same notation, for example, geographic concepts and language concepts, eased the transition of the Dewey Decimal Classification from a largely enumerative system to an increasingly faceted one. Another significant feature of the Dewey Decimal Classification is the use of hierarchical notation based on the Arabic numeral system. To a large extent, this feature accounts for the wide use and success of the system in the world across language barriers. With the prospect of increasing online information retrieval, the hierarchical notation will have a significant impact on the effectiveness of the Dewey Decimal Classification as an online retrieval tool. Because the notation is hierarchical, with increasing digits in a number representing narrower subjects and decreasing digits indicating broader subjects, the Dewey Decimal Classification is particularly useful in generic searches for broadening or narrowing search results. In the preface to the second edition of his Decimal Classification, Dewey explained the features of his "new" system. The excerpt below presents his ideas and theory concerning the rational basis of his classification, the standard subdivisions, the hierarchical notation based on decimal numbers, the use of mnemonics, the relative index, and relative location. It also reflects Dewey's lifelong interest in simplified spelling.
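    Because a broader class's notation is a prefix of its narrower classes' notation, broadening and narrowing a search are plain string operations; a sketch with invented class numbers:

```python
# Invented class numbers held by some collection
COLLECTION = ["020", "025", "025.04", "025.3", "385", "385.2"]

def narrower(ddc: str) -> list[str]:
    """Narrow a search: every held class whose notation extends the number."""
    return [c for c in COLLECTION if c.startswith(ddc) and c != ddc]

def broader(ddc: str) -> str:
    """Broaden a search: drop the last digit (and a trailing decimal point)."""
    return ddc[:-1].rstrip(".")

print(narrower("025"))   # ['025.04', '025.3']
print(broader("025.3"))  # '025'
```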
  18. Pettee, J.: Public libraries and libraries as purveyors of information (1985) 0.00
    0.0017165683 = product of:
      0.006866273 = sum of:
        0.006866273 = weight(_text_:information in 3630) [ClassicSimilarity], result of:
          0.006866273 = score(doc=3630,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.0775819 = fieldWeight in 3630, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3630)
      0.25 = coord(1/4)
    
  19. Fairthorne, R.A.: Temporal structure in bibliographic classification (1985) 0.00
    0.0012874262 = product of:
      0.005149705 = sum of:
        0.005149705 = weight(_text_:information in 3651) [ClassicSimilarity], result of:
          0.005149705 = score(doc=3651,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.058186423 = fieldWeight in 3651, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0234375 = fieldNorm(doc=3651)
      0.25 = coord(1/4)
    
    Abstract
    The fan of past documents may be seen across time as a philosophical "wake," translated documents as a sideways relationship, and future documents as another fan spreading forward from a given document (p. 365). The "overlap of reading histories can be used to detect common interests among readers" (p. 365), and readers may be classified accordingly. Finally, Fairthorne rejects the notion of a "general" classification, which he regards as a mirage, to be replaced by a citation-type network to identify classes. An interesting feature of his work lies in his linkage between old and new documents via a bibliographic method (citations, authors' names, imprints, style, and vocabulary) rather than topical (subject) terms. This is an indirect method of creating classes. The subject (aboutness) is conceived as a finite, common sharing of knowledge over time (past, present, and future) as opposed to the more common hierarchy of topics in an infinite schema assumed to be universally useful. Fairthorne, a mathematician by training, is a prolific writer on the foundations of classification and information. His professional career includes work with the Royal Engineers Chemical Warfare Section and the Royal Aircraft Establishment (RAE). He was the founder of the Computing Unit, which became the RAE Mathematics Department.
