Search (28 results, page 2 of 2)

  • type_ss:"el"
  • type_ss:"r"
  1. Multilingual information management : current levels and future abilities. A report commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.00
    0.0038499737 = product of:
      0.019249868 = sum of:
        0.019249868 = weight(_text_:information in 6068) [ClassicSimilarity], result of:
          0.019249868 = score(doc=6068,freq=18.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.23274568 = fieldWeight in 6068, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=6068)
      0.2 = coord(1/5)
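    A note on the relevance scores: the indented numbers under each result are Lucene "explain" output for the ClassicSimilarity (tf-idf) ranking. The minimal Python sketch below recomputes the tree of result 1 from its leaf values; the tf, idf and coord expressions are Lucene's ClassicSimilarity defaults (tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), and coord(1/5) meaning one of five query clauses matched).

    ```python
    import math

    def classic_similarity(freq, doc_freq, max_docs, query_norm, field_norm, coord):
        """Recompute one branch of a Lucene ClassicSimilarity explain tree."""
        tf = math.sqrt(freq)                             # 4.2426405 for freq=18.0
        idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 1.7554779
        query_weight = idf * query_norm                  # 0.08270773
        field_weight = tf * idf * field_norm             # 0.23274568
        return coord * query_weight * field_weight       # 0.0038499737

    # Leaf values taken from the explain tree of result 1:
    print(classic_similarity(freq=18.0, doc_freq=20772, max_docs=44218,
                             query_norm=0.047114085, field_norm=0.03125, coord=0.2))
    ```

    The same arithmetic, with different leaf values, reproduces every score tree on this page.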
    
    Abstract
    Over the past 50 years, a variety of language-related capabilities have been developed in machine translation, information retrieval, speech recognition, text summarization, and so on. These applications rest upon a set of core techniques such as language modeling, information extraction, parsing, generation, and multimedia planning and integration; and they involve methods using statistics, rules, grammars, lexicons, ontologies, training techniques, and so on. It is a puzzling fact that although all of this work deals with language in some form or other, the major applications have each developed a separate research field. For example, there is no reason why speech recognition techniques involving n-grams and hidden Markov models could not have been used in machine translation 15 years earlier than they were, or why some of the lexical and semantic insights from the subarea called Computational Linguistics are still not used in information retrieval.
    This picture will rapidly change. The twin challenges of massive information overload via the web and ubiquitous computers present us with an unavoidable task: developing techniques to handle multilingual and multi-modal information robustly and efficiently, with as high-quality performance as possible. The most effective way for us to address such a mammoth task, and to ensure that our various techniques and applications fit together, is to start talking across the artificial research boundaries. Extending the current technologies will require integrating the various capabilities into multi-functional and multi-lingual natural language systems. However, at this time there is no clear vision of how these technologies could or should be assembled into a coherent framework. What would be involved in connecting a speech recognition system to an information retrieval engine, and then using machine translation and summarization software to process the retrieved text? How can traditional parsing and generation be enhanced with statistical techniques? What would be the effect of carefully crafted lexicons on traditional information retrieval? At which points should machine translation be interleaved within information retrieval systems to enable multilingual processing?
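    At the plumbing level, the report's question of chaining a speech recognizer, a retrieval engine, machine translation and a summarizer reduces to function composition. The sketch below is purely illustrative; every stage is a named stub standing in for a real component.

    ```python
    from typing import Callable

    Stage = Callable[[str], str]

    def compose(*stages: Stage) -> Stage:
        """Chain components so each one's output feeds the next."""
        def run(data: str) -> str:
            for stage in stages:
                data = stage(data)
            return data
        return run

    # Each stage below is a stand-in stub, not a real system:
    def recognize(audio: str) -> str: return "query decoded from " + audio  # ASR
    def retrieve(query: str) -> str:  return "document matching: " + query  # IR
    def translate(doc: str) -> str:   return doc + " [translated]"          # MT
    def summarize(doc: str) -> str:   return doc[:60]                       # summarizer

    speech_to_summary = compose(recognize, retrieve, translate, summarize)
    print(speech_to_summary("spoken-query.wav"))
    ```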
  2. Gradmann, S.: Knowledge = Information in context : on the importance of semantic contextualisation in Europeana (2010) 0.00
    0.0025666493 = product of:
      0.012833246 = sum of:
        0.012833246 = weight(_text_:information in 3475) [ClassicSimilarity], result of:
          0.012833246 = score(doc=3475,freq=8.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.1551638 = fieldWeight in 3475, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=3475)
      0.2 = coord(1/5)
    
    Abstract
    "Europeana.eu is about ideas and inspiration. It links you to 6 million digital items." This is the opening statement taken from the Europeana WWW-site (http://www.europeana.eu/portal/aboutus.html), and it clearly is concerned with the mission of Europeana - without, however, being over-explicit as to the precise nature of that mission. Europeana's current logo, too, has a programmatic aspect: the slogan "Think Culture" clearly again is related to Europeana's mission and at same time seems somewhat closer to the point: 'thinking' culture evokes notions like conceptualisation, reasoning, semantics and the like. Still, all this remains fragmentary and insufficient to actually clarify the functional scope and mission of Europeana. In fact, the author of the present contribution is convinced that Europeana has too often been described in terms of sheer quantity, as a high volume aggregation of digital representations of cultural heritage objects without sufficiently stressing the functional aspects of this endeavour. This conviction motivates the present contribution on some of the essential functional aspects of Europeana making clear that such a contribution - even if its author is deeply involved in building Europeana - should not be read as an official statement of the project or of the European Commission (which it is not!) - but as the personal statement from an information science perspective! From this perspective the opening statement is that Europeana is much more than a machine for mechanical accumulation of object representations but that one of its main characteristics should be to enable the generation of knowledge pertaining to cultural artefacts. The rest of the paper is about the implications of this initial statement in terms of information science, on the way we technically prepare to implement the necessary data structures and functionality and on the novel functionality Europeana will offer based on these elements and which go well beyond the 'traditional' digital library paradigm. However, prior to exploring these areas it may be useful to recall the notion of 'knowledge' that forms the basis of this contribution and which in turn is part of the well known continuum reaching from data via information and knowledge to wisdom.
  3. Report on the future of bibliographic control : draft for public comment (2007) 0.00
    0.0023576177 = product of:
      0.011788089 = sum of:
        0.011788089 = weight(_text_:information in 1271) [ClassicSimilarity], result of:
          0.011788089 = score(doc=1271,freq=12.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.14252704 = fieldWeight in 1271, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1271)
      0.2 = coord(1/5)
    
    Abstract
    The future of bibliographic control will be collaborative, decentralized, international in scope, and Web-based. Its realization will occur in cooperation with the private sector, and with the active collaboration of library users. Data will be gathered from multiple sources; change will happen quickly; and bibliographic control will be dynamic, not static. The underlying technology that makes this future possible and necessary - the World Wide Web - is now almost two decades old. Libraries must continue the transition to this future without delay in order to retain their relevance as information providers. The Working Group on the Future of Bibliographic Control encourages the library community to take a thoughtful and coordinated approach to effecting significant changes in bibliographic control. Such an approach will call for leadership that is neither unitary nor centralized. Nor will the responsibility to provide such leadership fall solely to the Library of Congress (LC). That said, the Working Group recognizes that LC plays a unique role in the library community of the United States, and the directions that LC takes have great impact on all libraries. We also recognize that there are many other institutions and organizations that have the expertise and the capacity to play significant roles in the bibliographic future. Wherever possible, those institutions must step forward and take responsibility for assisting with navigating the transition and for playing appropriate ongoing roles after that transition is complete. To achieve the goals set out in this document, we must look beyond individual libraries to a system-wide deployment of resources. We must realize efficiencies in order to be able to reallocate resources from certain lower-value components of the bibliographic control ecosystem into other higher-value components of that same ecosystem. The recommendations in this report are directed at a number of parties, indicated either by their common initialism (e.g., "LC" for Library of Congress, "PCC" for Program for Cooperative Cataloging) or by their general category (e.g., "Publishers," "National Libraries"). When the recommendation is addressed to "All," it is intended for the library community as a whole and its close collaborators.
    The Library of Congress must begin by prioritizing the recommendations that are directed in whole or in part at LC. Some define tasks that can be achieved immediately and with moderate effort; others will require analysis and planning that will have to be coordinated broadly and carefully. The Working Group has consciously not associated time frames with any of its recommendations. The recommendations fall into five general areas:
    1. Increase the efficiency of bibliographic production for all libraries through increased cooperation and increased sharing of bibliographic records, and by maximizing the use of data produced throughout the entire "supply chain" for information resources.
    2. Transfer effort into higher-value activity. In particular, expand the possibilities for knowledge creation by "exposing" rare and unique materials held by libraries that are currently hidden from view and, thus, underused.
    3. Position our technology for the future by recognizing that the World Wide Web is both our technology platform and the appropriate platform for the delivery of our standards. Recognize that people are not the only users of the data we produce in the name of bibliographic control; machine applications also interact with those data in a variety of ways.
    4. Position our community for the future by facilitating the incorporation of evaluative and other user-supplied information into our resource descriptions. Work to realize the potential of the FRBR framework for revealing and capitalizing on the various relationships that exist among information resources.
    5. Strengthen the library profession through education and the development of metrics that will inform decision-making now and in the future.
    The Working Group intends what follows to serve as a broad blueprint for the Library of Congress and its colleagues in the library and information technology communities for extending and promoting access to information resources.
  4. Riva, P.; Boeuf, P. le; Zumer, M.: IFLA Library Reference Model : a conceptual model for bibliographic information (2017) 0.00
    0.002245818 = product of:
      0.01122909 = sum of:
        0.01122909 = weight(_text_:information in 5179) [ClassicSimilarity], result of:
          0.01122909 = score(doc=5179,freq=2.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.13576832 = fieldWeight in 5179, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5179)
      0.2 = coord(1/5)
    
  5. Koch, T.; Ardö, A.; Brümmer, A.: ¬The building and maintenance of robot based internet search services : A review of current indexing and data collection methods. Prepared to meet the requirements of Work Package 3 of EU Telematics for Research, project DESIRE. Version D3.11v0.3 (Draft version 3) (1996) 0.00
    0.0022227836 = product of:
      0.011113917 = sum of:
        0.011113917 = weight(_text_:information in 1669) [ClassicSimilarity], result of:
          0.011113917 = score(doc=1669,freq=6.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.1343758 = fieldWeight in 1669, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=1669)
      0.2 = coord(1/5)
    
    Abstract
    After a short outline of the problems, possibilities and difficulties of systematic information retrieval on the Internet, and a description of development efforts in this area, the terminology used in this report is specified. Although the process of retrieval is generally seen as an iterative process of browsing and information retrieval, and several important services on the net have taken this fact into consideration, the emphasis of this report lies on the general retrieval tools for the Internet as a whole. In order to be able to evaluate the differences, possibilities and restrictions of the different services, it is necessary to begin by organizing the existing varieties in a typological/taxonomical survey. The possibilities and weaknesses will be briefly compared and described for the most important services in the categories robot-based WWW-catalogues of different types, list- or form-based catalogues, and simultaneous or collected search services respectively. For several reasons, however, it will not be possible to rank them in order of "best" services. Still more important are the weaknesses and problems common to all attempts at indexing the Internet. The problems of input quality, technical performance and the general problem of indexing virtual hypertext are shown to be at least as difficult as the different aspects of harvesting, indexing and information retrieval. Some of the attempts made in the area of further development of retrieval services will be mentioned in relation to descriptions of document contents and standardization efforts. Internet harvesting and indexing technology and retrieval software are thoroughly reviewed. Details about all services and software are listed in analytical forms in Annex 1-3.
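    As background for the review, the harvest-then-index cycle that all robot-based services implement can be sketched in a few lines. This is a toy illustration, not the DESIRE software; the URL and the tokenizer are placeholders.

    ```python
    from collections import defaultdict
    from urllib.request import urlopen
    import re

    def harvest(urls):
        """Robot step: fetch pages, skipping hosts that fail (a routine hazard)."""
        for url in urls:
            try:
                yield url, urlopen(url, timeout=10).read().decode("utf-8", "replace")
            except OSError:
                continue

    def build_index(pages):
        """Indexing step: a bare inverted index mapping term -> document URLs."""
        inverted = defaultdict(set)
        for url, text in pages:
            for term in set(re.findall(r"[a-z]+", text.lower())):
                inverted[term].add(url)
        return inverted

    index = build_index(harvest(["http://example.org/"]))
    print(len(index), "terms indexed")
    ```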
  6. Calhoun, K.: ¬The changing nature of the catalog and its integration with other discovery tools : Prepared for the Library of Congress (2006) 0.00
    0.0022227836 = product of:
      0.011113917 = sum of:
        0.011113917 = weight(_text_:information in 5013) [ClassicSimilarity], result of:
          0.011113917 = score(doc=5013,freq=6.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.1343758 = fieldWeight in 5013, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=5013)
      0.2 = coord(1/5)
    
    Abstract
    The destabilizing influences of the Web, widespread ownership of personal computers, and rising computer literacy have created an era of discontinuous change in research libraries, a time when the cumulated assets of the past do not guarantee future success. The library catalog is such an asset. Today, a large and growing number of students and scholars routinely bypass library catalogs in favor of other discovery tools, and the catalog represents a shrinking proportion of the universe of scholarly information. The catalog is in decline, its processes and structures are unsustainable, and change needs to be swift. At the same time, books and serials are not dead, and they are not yet digital. Notwithstanding widespread expansion of digitization projects, ubiquitous e-journals, and a market that seems poised to move to e-books, the role of catalog records in discovery and retrieval of the world's library collections seems likely to continue for at least a couple of decades and probably longer. This report, commissioned by the Library of Congress (LC), offers an analysis of the current situation, options for revitalizing research library catalogs, a feasibility assessment, a vision for change, and a blueprint for action. Library decision makers are the primary audience for this report, whose aim is to elicit support, dialogue, collaboration, and movement toward solutions. Readers from the business community, particularly those that directly serve libraries, may find the report helpful for defining research and development efforts. The same is true for readers from membership organizations such as OCLC Online Computer Library Center, the Research Libraries Group, the Association for Research Libraries, the Council on Library and Information Resources, the Coalition for Networked Information, and the Digital Library Federation. Library managers and practitioners from all functional groups are likely to take an interest in the interview findings and in specific actions laid out in the blueprint.
  7. Adler, R.; Ewing, J.; Taylor, P.: Citation statistics : A report from the International Mathematical Union (IMU) in cooperation with the International Council of Industrial and Applied Mathematics (ICIAM) and the Institute of Mathematical Statistics (IMS) (2008) 0.00
    0.0021522008 = product of:
      0.010761004 = sum of:
        0.010761004 = weight(_text_:information in 2417) [ClassicSimilarity], result of:
          0.010761004 = score(doc=2417,freq=10.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.1301088 = fieldWeight in 2417, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0234375 = fieldNorm(doc=2417)
      0.2 = coord(1/5)
    
    Abstract
    Using citation data to assess research ultimately means using citation-based statistics to rank things: journals, papers, people, programs, and disciplines. The statistical tools used to rank these things are often misunderstood and misused.
    - For journals, the impact factor is most often used for ranking. This is a simple average derived from the distribution of citations for a collection of articles in the journal. The average captures only a small amount of information about that distribution, and it is a rather crude statistic. In addition, there are many confounding factors when judging journals by citations, and any comparison of journals requires caution when using impact factors. Using the impact factor alone to judge a journal is like using weight alone to judge a person's health.
    - For papers, instead of relying on the actual count of citations to compare individual papers, people frequently substitute the impact factor of the journals in which the papers appear. They believe that higher impact factors must mean higher citation counts. But this is often not the case! This is a pervasive misuse of statistics that needs to be challenged whenever and wherever it occurs.
    - For individual scientists, complete citation records can be difficult to compare. As a consequence, there have been attempts to find simple statistics that capture the full complexity of a scientist's citation record with a single number. The most notable of these is the h-index, which seems to be gaining in popularity. But even a casual inspection of the h-index and its variants shows that these are naive attempts to understand complicated citation records. While they capture a small amount of information about the distribution of a scientist's citations, they lose crucial information that is essential for the assessment of research.
    The validity of statistics such as the impact factor and h-index is neither well understood nor well studied. The connection of these statistics with research quality is sometimes established on the basis of "experience." The justification for relying on them is that they are "readily available." The few studies of these statistics that were done focused narrowly on showing a correlation with some other measure of quality rather than on determining how one can best derive useful information from citation data. We do not dismiss citation statistics as a tool for assessing the quality of research; citation data and statistics can provide some valuable information. We recognize that assessment must be practical, and for this reason easily derived citation statistics almost surely will be part of the process. But citation data provide only a limited and incomplete view of research quality, and the statistics derived from citation data are sometimes poorly understood and misused. Research is too important to measure its value with only a single coarse tool. We hope those involved in assessment will read both the commentary and the details of this report in order to understand not only the limitations of citation statistics but also how better to use them. If we set high standards for the conduct of science, surely we should set equally high standards for assessing its quality.
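    To make the two statistics under discussion concrete, here is a minimal sketch with invented citation counts: the impact factor is in essence a plain average over a citation distribution, while the h-index compresses a whole record into one integer; both discard the shape of the distribution that the report argues matters.

    ```python
    from statistics import mean

    def h_index(citations):
        """Largest h such that h of the papers have at least h citations each."""
        h = 0
        for rank, c in enumerate(sorted(citations, reverse=True), start=1):
            if c >= rank:
                h = rank
        return h

    papers = [10, 8, 5, 3, 2, 0]   # invented citation counts for one scientist
    print(mean(papers))            # 4.67 -- an average, like the impact factor,
                                   # is pulled up by the one highly cited paper
    print(h_index(papers))         # 3  -- three papers are cited >= 3 times each
    ```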
  8. Babeu, A.: Building a "FRBR-inspired" catalog : the Perseus digital library experience (2008) 0.00
    0.001814895 = product of:
      0.009074475 = sum of:
        0.009074475 = weight(_text_:information in 2429) [ClassicSimilarity], result of:
          0.009074475 = score(doc=2429,freq=4.0), product of:
            0.08270773 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.047114085 = queryNorm
            0.10971737 = fieldWeight in 2429, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=2429)
      0.2 = coord(1/5)
    
    Abstract
    Our catalog should perhaps not be called a FRBR catalog, but instead a "FRBR-inspired catalog." As such, our main goal has been "practical findability": we are seeking to support the four identified user tasks of the FRBR model - to "Search, Identify, Select, and Obtain" - rather than to create a FRBR catalog per se. By encoding as much information as possible in the MODS and MADS records we have created, we believe that useful searching will be supported; that by using unique identifiers for works and authors, users will be able to identify that the entity they have located is the desired one; that by encoding expression-level information (such as the language of the work, the translator, etc.), users will be able to select which expression of a work they are interested in; and that by supplying links to different online manifestations, users will be able to obtain access to a digital copy of a work. This white paper will discuss previous and current efforts by the Perseus Project in creating a FRBRized catalog, including the cataloging workflow and lessons learned during the process, and will also seek to place this work in the larger context of research regarding FRBR, cataloging, Library 2.0 and the Semantic Web, and the growing importance of the FRBR model in the face of growing million-book digital libraries.
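    As an illustration of the layering described above - and emphatically not the project's actual MODS/MADS encoding - the following hypothetical Python sketch shows how work-, expression- and manifestation-level fields each serve one of the four FRBR user tasks; all identifiers, names and URLs are invented.

    ```python
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Expression:
        language: str                  # expression-level data drives "Select"
        translator: Optional[str] = None
        manifestation_urls: List[str] = field(default_factory=list)  # "Obtain"

    @dataclass
    class Work:
        work_id: str                   # a unique identifier drives "Identify"
        title: str                     # searchable metadata drives "Search"
        author_id: str
        expressions: List[Expression] = field(default_factory=list)

    # Hypothetical record in the spirit of the white paper:
    iliad = Work(work_id="urn:example:work:iliad", title="Iliad",
                 author_id="urn:example:author:homer",
                 expressions=[Expression(language="eng",
                                         translator="Samuel Butler",
                                         manifestation_urls=["http://example.org/iliad/eng"])])
    print(iliad.expressions[0].language)
    ```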