Search (214 results, page 10 of 11)

Day, M.; Koch, T.: The role of classification schemes in Internet resource description and discovery : DESIRE - Development of a European Service for Information on Research and Education. Specification for resource description methods, part 3 (1997) 0.00

3.2888478E-4 = product of:
  0.0049332716 = sum of:
    0.0049332716 = product of:
      0.009866543 = sum of:
        0.009866543 = weight(_text_:information in 3067) [ClassicSimilarity], result of:
          0.009866543 = score(doc=3067,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.19395474 = fieldWeight in 3067, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=3067)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
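The explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function name `classic_term_score` and its parameter names are illustrative and are not part of Lucene. The values for `queryNorm`, `fieldNorm`, and the two `coord` factors are copied from the listing rather than derived.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    """Recompute a single-term score the way the explain output nests it."""
    tf = math.sqrt(freq)                             # tf(freq) = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                  # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm             # fieldWeight = tf * idf * fieldNorm
    score = query_weight * field_weight              # weight(_text_:information in doc)
    for c in coords:                                 # apply each coord(...) factor
        score *= c
    return score

# Values taken from the entry for doc 3067: freq=2.0, coord(1/2) and coord(1/15).
score = classic_term_score(
    freq=2.0, doc_freq=20772, max_docs=44218,
    query_norm=0.028978055, field_norm=0.078125,
    coords=(0.5, 1 / 15),
)
print(f"{score:.7e}")
```

The result should agree with the listed 3.2888478E-4 up to single-precision rounding, since the listing reports 32-bit floats. Only `freq` and `fieldNorm` differ from entry to entry on this page, which is why all entries share the same `idf` and `queryWeight`.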

Shafer, K.E.: Evaluating Scorpion results (1998) 0.00

3.2888478E-4 = product of:
  0.0049332716 = sum of:
    0.0049332716 = product of:
      0.009866543 = sum of:
        0.009866543 = weight(_text_:information in 1569) [ClassicSimilarity], result of:
          0.009866543 = score(doc=1569,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.19395474 = fieldWeight in 1569, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=1569)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Abstract: Scorpion is a research project at OCLC that builds tools for automatic subject assignment by combining library science and information retrieval techniques. A thesis of Scorpion is that the Dewey Decimal Classification (Dewey) can be used to perform automatic subject assignment for electronic items.

Lackes, R.; Mack, D.: Computer Based Training on neural nets : Basics, development, and practice (1998) 0.00
```
3.255793E-4 = product of:
  0.0048836893 = sum of:
    0.0048836893 = product of:
      0.009767379 = sum of:
        0.009767379 = weight(_text_:information in 964) [ClassicSimilarity], result of:
          0.009767379 = score(doc=964,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1920054 = fieldWeight in 964, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=964)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

Here is an interactive introduction to neural nets and how to apply them that is easy to understand and use. Neural nets are information processing systems that mimic the basic structure of the human brain. They learn by adjusting the interaction of their individual components (neurons). A neural net can learn from patterns of information supplied as input to generate useful output that can serve as a basis for decision making. Numerous multimedia and interactive components give the learning program an almost game-like feel as it takes the learner from the basics to the use of neural nets for real projects.
Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.00
```
2.9599632E-4 = product of:
  0.0044399444 = sum of:
    0.0044399444 = product of:
      0.008879889 = sum of:
        0.008879889 = weight(_text_:information in 1253) [ClassicSimilarity], result of:
          0.008879889 = score(doc=1253,freq=18.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.17455927 = fieldWeight in 1253, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1253)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increase, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC), within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR). Our work with the Alexandria Digital Library (ADL) Project focuses on geo-referenced information, whether text, maps, aerial photographs, or satellite images. As a result, we have emphasized techniques which work with both text and non-text, such as combined textual and graphical queries, multi-dimensional indexing, and IR methods which are not solely dependent on words or phrases. Part of this work involves locating relevant online sources of information. In particular, we have designed and are currently testing aspects of an architecture, Pharos, which we believe will scale up to 1,000,000 heterogeneous sources. Pharos accommodates heterogeneity in content and format, both among multiple sources as well as within a single source.
That is, we consider sources to include Web sites, FTP archives, newsgroups, and full digital libraries; all of these systems can include a wide variety of content and multimedia data formats. Pharos is based on the use of hierarchical classification schemes. These include not only well-known 'subject' (or 'concept') based schemes such as the Dewey Decimal System and the LCC, but also, for example, geographic classifications, which might be constructed as layers of smaller and smaller hierarchical longitude/latitude boxes. Pharos is designed to work with sophisticated queries which utilize subjects, geographical locations, temporal specifications, and other types of information domains. The Pharos architecture requires that hierarchically structured collection metadata be extracted so that it can be partitioned in such a way as to greatly enhance scalability. Automated classification is important to Pharos because it allows information sources to automatically extract the requisite collection metadata that must be distributed.
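The "smaller and smaller hierarchical longitude/latitude boxes" mentioned in the abstract can be pictured with a short sketch. This is only one possible construction, assumed here for illustration: each level splits the current box into four quadrants, and a location is classified by the sequence of quadrants that contain it. The function name `box_path` and the quadrant labels are hypothetical and do not come from the Pharos paper.

```python
def box_path(lat, lon, depth):
    """Return the quadrant labels that locate (lat, lon) in nested lat/long boxes."""
    south, north, west, east = -90.0, 90.0, -180.0, 180.0
    path = []
    for _ in range(depth):
        mid_lat = (south + north) / 2
        mid_lon = (west + east) / 2
        # Keep the half of the box that contains the point, north/south first.
        if lat >= mid_lat:
            ns, south = "N", mid_lat
        else:
            ns, north = "S", mid_lat
        # Then east/west.
        if lon >= mid_lon:
            ew, west = "E", mid_lon
        else:
            ew, east = "W", mid_lon
        path.append(ns + ew)
    return path

# A point near Santa Barbara, California (approximate coordinates, illustration only).
print(box_path(34.4, -119.7, 3))
```

A deeper path always extends a shallower one, so the labels behave like a hierarchical classification number: two sources that share a path prefix fall inside the same larger box.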
Liu, S.: Decomposing DDC synthesized numbers (1996) 0.00
```
2.848226E-4 = product of:
  0.004272339 = sum of:
    0.004272339 = product of:
      0.008544678 = sum of:
        0.008544678 = weight(_text_:information in 5969) [ClassicSimilarity], result of:
          0.008544678 = score(doc=5969,freq=6.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.16796975 = fieldWeight in 5969, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5969)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

Much literature has been written speculating upon how classification can be used in online catalogs to improve information retrieval. While some empirical studies have been done exploring whether the direct use of traditional classification schemes designed for a manual environment is effective and efficient in the online environment, none has manipulated these manual classifications in such a way as to take full advantage of the power of both the classification and computer. It has been suggested by some authors, such as Wajenberg and Drabenstott, that this power could be realized if the individual components of synthesized DDC numbers could be identified and indexed. This paper looks at the feasibility of automatically decomposing DDC synthesized numbers and the implications of such decomposition for information retrieval. Based on an analysis of the instructions for synthesizing numbers in the main class Arts (700) and all DDC Tables, 17 decomposition rules were defined, 13 covering the Add Notes and four the Standard Subdivisions. 1,701 DDC synthesized numbers were decomposed by a computer system called DND (Dewey Number Decomposer), developed by the author. From the 1,701 numbers, 600 were randomly selected for examination by three judges, each evaluating 200 numbers. The decomposition success rate was 100% and it was concluded that synthesized DDC numbers can be accurately decomposed automatically. The study has implications for information retrieval, expert systems for assigning DDC numbers, automatic indexing, switching language development, enhancing classifiers' work, teaching library school students, and providing quality control for DDC number assignments. These implications were explored using a prototype retrieval system.
Kirriemuir, J.; Brickley, D.; Welsh, S.; Knight, J.; Hamilton, M.: Cross-searching subject gateways : the query routing and forward knowledge approach (1998) 0.00
```
2.848226E-4 = product of:
  0.004272339 = sum of:
    0.004272339 = product of:
      0.008544678 = sum of:
        0.008544678 = weight(_text_:information in 1252) [ClassicSimilarity], result of:
          0.008544678 = score(doc=1252,freq=6.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.16796975 = fieldWeight in 1252, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1252)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

A subject gateway, in the context of network-based resource access, can be defined as some facility that allows easier access to network-based resources in a defined subject area. The simplest types of subject gateways are sets of Web pages containing lists of links to resources. Some gateways index their lists of links and provide a simple search facility. More advanced gateways offer a much enhanced service via a system consisting of a resource database and various indexes, which can be searched and/or browsed through a Web-based interface. Each entry in the database contains information about a network-based resource, such as a Web page, Web site, mailing list or document. Entries are usually created by a cataloguer manually identifying a suitable resource, describing the resource using a template, and submitting the template to the database for indexing. Subject gateways are also known as subject-based information gateways (SBIGs), subject-based gateways, subject index gateways, virtual libraries, clearing houses, subject trees, pathfinders and other variations thereof. This paper describes the characteristics of some of the subject gateways currently accessible through the Web, and compares them to automatic "vacuum cleaner" type search engines, such as AltaVista. The application of WHOIS++, centroids, query routing, and forward knowledge to searching several of these subject gateways simultaneously is outlined. The paper concludes with looking at some of the issues facing subject gateway development in the near future. The paper touches on many of the issues mentioned in a previous paper in D-Lib Magazine, especially regarding resource-discovery related initiatives and services.

Theme

Information Gateway
Chowdhury, A.; McCabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.00
```
2.79068E-4 = product of:
  0.0041860198 = sum of:
    0.0041860198 = product of:
      0.0083720395 = sum of:
        0.0083720395 = weight(_text_:information in 1061) [ClassicSimilarity], result of:
          0.0083720395 = score(doc=1061,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.16457605 = fieldWeight in 1061, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1061)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

The object of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In the paper we evaluate the use of Part of Speech Tagging to improve the index storage overhead and general speed of the system with only a minimal reduction to precision recall measurements. We tagged 500 MB of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant part of speech to index. We show that 90% of precision recall is achieved with 40% of the document collection's terms. We also show that this is an improvement in overhead with only a 1% reduction in precision recall.

Grant, S.: Developing cognitive architecture for modelling and simulation of cognition and error in complex tasks (1995) 0.00

2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 3288) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=3288,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 3288, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3288)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Theme: Information

Hert, C.A.: Information retrieval as situated action (1995) 0.00

2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 3317) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=3317,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 3317, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3317)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Tsvetkova, I.; Selivanova, J.: Probleme bei der Entwicklung einer nationalen russischen Schlagwortnormdatei (1999) 0.00

2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 4191) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=4191,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 4191, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4191)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Abstract: The report deals with the necessity for developing a National Subject Authority File in Russia. Basic approaches to creating such a file are introduced. The history and main features of the National Library of Russia Subject Authority File are described. The report includes information on a number of projects being undertaken both in NLR and in Russia.

Maple, A.: Faceted access : a review of the literature (1995) 0.00

2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 5099) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=5099,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 5099, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=5099)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)

Abstract: The purpose of this paper is to define what is meant by facet analysis, and to review briefly the history of facet analysis within the context of other types of subject analysis in libraries and within the context of information retrieval research.

Hill, L.L.; Frew, J.; Zheng, Q.: Geographic names : the implementation of a gazetteer in a georeferenced digital library (1999) 0.00
```
2.6310782E-4 = product of:
  0.0039466172 = sum of:
    0.0039466172 = product of:
      0.0078932345 = sum of:
        0.0078932345 = weight(_text_:information in 1240) [ClassicSimilarity], result of:
          0.0078932345 = score(doc=1240,freq=8.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1551638 = fieldWeight in 1240, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=1240)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

The Alexandria Digital Library (ADL) Project has developed a content standard for gazetteer objects and a hierarchical type scheme for geographic features. Both of these developments are based on ADL experience with an earlier gazetteer component for the Library, based on two gazetteers maintained by the U.S. federal government. We define the minimum components of a gazetteer entry as (1) a geographic name, (2) a geographic location represented by coordinates, and (3) a type designation. With these attributes, a gazetteer can function as a tool for indirect spatial location identification through names and types. The ADL Gazetteer Content Standard supports contribution and sharing of gazetteer entries with rich descriptions beyond the minimum requirements. This paper describes the content standard, the feature type thesaurus, and the implementation and research issues. A gazetteer is a list of geographic names, together with their geographic locations and other descriptive information. A geographic name is a proper name for a geographic place and feature, such as Santa Barbara County, Mount Washington, St. Francis Hospital, and Southern California. There are many types of printed gazetteers. For example, the New York Times Atlas has a gazetteer section that can be used to look up a geographic name and find the page(s) and grid reference(s) where the corresponding feature is shown. Some gazetteers provide information about places and features; for example, a history of the locale, population data, physical data such as elevation, or the pronunciation of the name. Some lists of geographic names are available as hierarchical term sets (thesauri) designed for information retrieval; these are used to describe bibliographic or museum materials. Examples include the authority files of the U.S. Library of Congress and the GeoRef Thesaurus produced by the American Geological Institute. The Getty Museum has recently made their Thesaurus of Geographic Names available online.
This is a major project to develop a controlled vocabulary of current and historical names to describe (i.e., catalog) art and architecture literature. U.S. federal government mapping agencies maintain gazetteers containing the official names of places and/or the names that appear on map series. Examples include the U.S. Geological Survey's Geographic Names Information System (GNIS) and the National Imagery and Mapping Agency's Geographic Names Processing System (GNPS). Both of these are maintained in cooperation with the U.S. Board of Geographic Names (BGN). Many other examples could be cited -- for local areas, for other countries, and for special purposes. There is remarkable diversity in approaches to the description of geographic places and no standardization beyond authoritative sources for the geographic names themselves.
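The three minimum components named in the abstract (a geographic name, a location given by coordinates, and a type designation) map naturally onto a small record type. The sketch below is an illustration only and is not the ADL Gazetteer Content Standard: the class `GazetteerEntry`, the function `lookup`, the feature type strings, and the rounded coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GazetteerEntry:
    name: str           # (1) geographic name
    latitude: float     # (2) location as coordinates, decimal degrees
    longitude: float
    feature_type: str   # (3) type designation (hypothetical labels)

def lookup(entries, name, feature_type=None):
    """Indirect spatial lookup: resolve a name (and optional type) to coordinates."""
    wanted = name.casefold()
    return [
        (e.latitude, e.longitude)
        for e in entries
        if e.name.casefold() == wanted
        and (feature_type is None or e.feature_type == feature_type)
    ]

# Approximate coordinates, for illustration only.
entries = [
    GazetteerEntry("Mount Washington", 44.27, -71.30, "mountain"),
    GazetteerEntry("Santa Barbara County", 34.7, -120.0, "county"),
]
print(lookup(entries, "mount washington", "mountain"))
```

Filtering by type matters because a single name can refer to several features; the type designation lets a query separate, say, a mountain from a hospital of the same name.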
Oard, D.W.: Alternative approaches for cross-language text retrieval (1997) 0.00
```
2.5739305E-4 = product of:
  0.0038608958 = sum of:
    0.0038608958 = product of:
      0.0077217915 = sum of:
        0.0077217915 = weight(_text_:information in 1164) [ClassicSimilarity], result of:
          0.0077217915 = score(doc=1164,freq=10.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1517936 = fieldWeight in 1164, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1164)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

The explosive growth of the Internet and other sources of networked information has made automatic mediation of access to networked information sources an increasingly important problem. Much of this information is expressed as electronic text, and it is becoming practical to automatically convert some printed documents and recorded speech to electronic text as well. Thus, automated systems capable of detecting useful documents are finding widespread application. With even a small number of languages it can be inconvenient to issue the same query repeatedly in every language, so users who are able to read more than one language will likely prefer a multilingual text retrieval system over a collection of monolingual systems. And since reading ability in a language does not always imply fluent writing ability in that language, such users will likely find cross-language text retrieval particularly useful for languages in which they are less confident of their ability to express their information needs effectively. The use of such systems can also be beneficial if the user is able to read only a single language. For example, when only a small portion of the document collection will ever be examined by the user, performing retrieval before translation can be significantly more economical than performing translation before retrieval. So when the application is sufficiently important to justify the time and effort required for translation, those costs can be minimized if an effective cross-language text retrieval system is available. Even when translation is not available, there are circumstances in which cross-language text retrieval could be useful to a monolingual user. For example, a researcher might find a paper published in an unfamiliar language useful if that paper contains references to works by the same author that are in the researcher's native language.
Multilingual text retrieval can be defined as selection of useful documents from collections that may contain several languages (English, French, Chinese, etc.). This formulation allows for the possibility that individual documents might contain more than one language, a common occurrence in some applications. Both cross-language and within-language retrieval are included in this formulation, but it is the cross-language aspect of the problem which distinguishes multilingual text retrieval from its well-studied monolingual counterpart. At the SIGIR 96 workshop on "Cross-Linguistic Information Retrieval" the participants discussed the proliferation of terminology being used to describe the field and settled on "Cross-Language" as the best single description of the salient aspect of the problem. "Multilingual" was felt to be too broad, since that term has also been used to describe systems able to perform within-language retrieval in more than one language but that lack any cross-language capability. "Cross-lingual" and "cross-linguistic" were felt to be equally good descriptions of the field, but "cross-language" was selected as the preferred term in the interest of standardization. Unfortunately, at about the same time the U.S. Defense Advanced Research Projects Agency (DARPA) introduced "translingual" as their preferred term, so we are still some distance from reaching consensus on this matter.
Brin, S.; Page, L.: The anatomy of a large-scale hypertextual Web search engine (1998) 0.00
```
2.3255666E-4 = product of:
  0.0034883497 = sum of:
    0.0034883497 = product of:
      0.0069766995 = sum of:
        0.0069766995 = weight(_text_:information in 947) [ClassicSimilarity], result of:
          0.0069766995 = score(doc=947,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13714671 = fieldWeight in 947, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=947)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Robertson, S.E.; Sparck Jones, K.: Simple, proven approaches to text retrieval (1997) 0.00
```
2.3255666E-4 = product of:
  0.0034883497 = sum of:
    0.0034883497 = product of:
      0.0069766995 = sum of:
        0.0069766995 = weight(_text_:information in 4532) [ClassicSimilarity], result of:
          0.0069766995 = score(doc=4532,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13714671 = fieldWeight in 4532, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4532)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

This technical note describes straightforward techniques for document indexing and retrieval that have been solidly established through extensive testing and are easy to apply. They are useful for many different types of text material, are viable for very large files, and have the advantage that they do not require special skills or training for searching, but are easy for end users. The document and text retrieval methods described here have a sound theoretical basis, are well established by extensive testing, and the ideas involved are now implemented in some commercial retrieval systems. Testing in the last few years has, in particular, shown that the methods presented here work very well with full texts, not only title and abstracts, and with large files of texts containing three quarters of a million documents. These tests, the TREC Tests (see Harman 1993 - 1997; IP&M 1995), have been rigorous comparative evaluations involving many different approaches to information retrieval. These techniques depend on the use of simple terms for indexing both request and document texts; on term weighting exploiting statistical information about term occurrences; on scoring for request-document matching, using these weights, to obtain a ranked search output; and on relevance feedback to modify request weights or term sets in iterative searching. The normal implementation is via an inverted file organisation using a term list with linked document identifiers, plus counting data, and pointers to the actual texts. The user's request can be a word list, phrases, sentences or extended text.
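The inverted file organisation described in the abstract (a term list with linked document identifiers plus counting data, used to score and rank documents against a request) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the weighting scheme of the technical note: the functions `build_index` and `search` are hypothetical, tokenization is a plain whitespace split, and the weight used is a simple count times ln(N/n), chosen only to show the mechanics.

```python
import math
from collections import Counter, defaultdict

def build_index(docs):
    """Inverted file: term -> {doc_id: occurrence count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index

def search(index, n_docs, request):
    """Score documents for a request and return (doc_id, score) pairs, best first."""
    scores = defaultdict(float)
    for term in set(request.lower().split()):
        postings = index.get(term, {})
        if not postings:
            continue
        # Assumed weighting: rarer terms count for more (ln(N / n)).
        idf = math.log(n_docs / len(postings))
        for doc_id, count in postings.items():
            scores[doc_id] += count * idf
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

docs = {
    1: "dewey classification numbers",
    2: "neural nets",
    3: "classification of neural nets classification",
}
index = build_index(docs)
print(search(index, len(docs), "classification"))
```

Only documents that share at least one term with the request are touched, which is what makes the inverted file viable for large collections. Relevance feedback, the fourth ingredient listed in the abstract, is left out of this sketch.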
Fife, E.D.; Husch, L.: The Mathematics Archives : making mathematics easy to find on the Web (1999) 0.00
```
2.3255666E-4 = product of:
  0.0034883497 = sum of:
    0.0034883497 = product of:
      0.0069766995 = sum of:
        0.0069766995 = weight(_text_:information in 1239) [ClassicSimilarity], result of:
          0.0069766995 = score(doc=1239,freq=4.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13714671 = fieldWeight in 1239, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1239)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

Do a search on AltaVista for "algebra". What do you get? Nearly 700,000 hits, of which AltaVista will allow you to view only what it determines is the top 200. Major search engines such as AltaVista, Excite, HotBot, Lycos, and the like continue to provide a valuable service, but with the recent growth of the Internet, topic-specific sites that provide some organization to the topic are increasingly important. It is the goal of the Mathematics Archives to make it easier for the ordinary user to find useful mathematical information on the Web. The Mathematics Archives (http://archives.math.utk.edu) is a multipurpose site for mathematics on the Internet. The focus is on materials which can be used in mathematics education (primarily at the undergraduate level). Resources available range from shareware and public domain software to electronic proceedings of various conferences, to an extensive collection of annotated links to other mathematical sites. All materials on the Archives are categorized and cross referenced for the convenience of the user. Several search mechanisms are provided. The Harvest search engine is implemented to provide a full text search of most of the pages on the Archives. The software we house and our list of annotated links to mathematical sites are both categorized by subject matter. Each of these collections has a specialized search engine to assist the user in locating desired material. Services at the Mathematics Archives are divided up into five broad topics: * Links organized by Mathematical Topics * Software * Teaching Materials * Other Math Archives Features * Other Links

Theme

Information Gateway

Mayes, T.: Hypermedia and cognitive tools (1995) 0.00

```
2.3021935E-4 = product of:
  0.00345329 = sum of:
    0.00345329 = product of:
      0.00690658 = sum of:
        0.00690658 = weight(_text_:information in 3289) [ClassicSimilarity], result of:
          0.00690658 = score(doc=3289,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13576832 = fieldWeight in 3289, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3289)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```

Theme: Information

Hirsch, C.C.: InterBRAIN : topographical atlas of the anatomy of the human CNS (1998) 0.00

```
2.3021935E-4 = product of:
  0.00345329 = sum of:
    0.00345329 = product of:
      0.00690658 = sum of:
        0.00690658 = weight(_text_:information in 822) [ClassicSimilarity], result of:
          0.00690658 = score(doc=822,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13576832 = fieldWeight in 822, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=822)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```

Abstract: The intricate 3D structure of the CNS lends itself to multimedia presentation, and is depicted here by way of dynamic 3D models that can be freely rotated, and in over 200 illustrations taken from the successful book "The Human Central Nervous System" by R. Nieuwenhuys et al., allowing the user to explore all aspects of this complex and fascinating subject. All this is fully hyperlinked with over 2000 specialist terms. Optimal exam revision is guaranteed with the self-study option. For further information please contact: http://www.brainmedia.de/html/frames/pr<BL>5/pr<BL>5<BL>02.html

Miller, E.: ¬An introduction to the Resource Description Framework (1998) 0.00
```
2.3021935E-4 = product of:
  0.00345329 = sum of:
    0.00345329 = product of:
      0.00690658 = sum of:
        0.00690658 = weight(_text_:information in 1231) [ClassicSimilarity], result of:
          0.00690658 = score(doc=1231,freq=2.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.13576832 = fieldWeight in 1231, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1231)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

The Resource Description Framework (RDF) is an infrastructure that enables the encoding, exchange and reuse of structured metadata. RDF is an application of XML that imposes needed structural constraints to provide unambiguous methods of expressing semantics. RDF additionally provides a means for publishing both human-readable and machine-processable vocabularies designed to encourage the reuse and extension of metadata semantics among disparate information communities. The structural constraints RDF imposes to support the consistent encoding and exchange of standardized metadata provide for the interchangeability of separate packages of metadata defined by different resource description communities.

Koch, T.; Ardö, A.; Brümmer, A.: ¬The building and maintenance of robot based internet search services : A review of current indexing and data collection methods. Prepared to meet the requirements of Work Package 3 of EU Telematics for Research, project DESIRE. Version D3.11v0.3 (Draft version 3) (1996) 0.00
```
2.2785808E-4 = product of:
  0.003417871 = sum of:
    0.003417871 = product of:
      0.006835742 = sum of:
        0.006835742 = weight(_text_:information in 1669) [ClassicSimilarity], result of:
          0.006835742 = score(doc=1669,freq=6.0), product of:
            0.050870337 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.028978055 = queryNorm
            0.1343758 = fieldWeight in 1669, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=1669)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
```
Abstract

After a short outline of problems, possibilities and difficulties of systematic information retrieval on the Internet and a description of efforts for development in this area, a specification of the terminology for this report is required. Although the process of retrieval is generally seen as an iterative process of browsing and information retrieval and several important services on the net have taken this fact into consideration, the emphasis of this report lies on the general retrieval tools for the whole of the Internet. In order to be able to evaluate the differences, possibilities and restrictions of the different services it is necessary to begin with organizing the existing varieties in a typological/taxonomical survey. The possibilities and weaknesses will be briefly compared and described for the most important services in the categories robot-based WWW-catalogues of different types, list- or form-based catalogues and simultaneous or collected search services respectively. It will however for different reasons not be possible to rank them in order of "best" services. Still more important are the weaknesses and problems common to all attempts at indexing the Internet. The problems of the quality of the input, the technical performance and the general problem of indexing virtual hypertext are shown to be at least as difficult as the different aspects of harvesting, indexing and information retrieval. Some of the attempts made in the area of further development of retrieval services will be mentioned in relation to descriptions of the contents of documents and standardization efforts. Internet harvesting and indexing technology and retrieval software are thoroughly reviewed. Details about all services and software are listed in analytical forms in Annex 1-3.
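
The score explanations listed with each result follow the Lucene ClassicSimilarity (TF-IDF) arithmetic, and the numbers can be checked by hand. The sketch below is an illustration, not the search system's own code: the function names are hypothetical, `queryNorm`, `fieldNorm` and the two `coord` factors are copied from the explain output rather than derived, and the calculation uses double precision, so the last digits may differ slightly from Lucene's single-precision values.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """ClassicSimilarity term frequency: square root of the raw term count."""
    return math.sqrt(freq)


def explain_score(
    freq: float,
    field_norm: float,
    doc_freq: int = 20772,           # docFreq for "information" in the output above
    max_docs: int = 44218,           # maxDocs in the output above
    query_norm: float = 0.028978055, # assumption: taken as given from the output
    coord_inner: float = 1 / 2,      # coord(1/2) from the output
    coord_outer: float = 1 / 15,     # coord(1/15) from the output
) -> float:
    """Rebuild the final score from the factors shown in an explain tree."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm               # queryWeight = idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm    # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight * coord_inner * coord_outer


if __name__ == "__main__":
    # doc 3289: freq=2.0, fieldNorm=0.0546875 (listed score 2.3021935E-4)
    print(f"{explain_score(2.0, 0.0546875):.4e}")
    # doc 1669: freq=6.0, fieldNorm=0.03125 (listed score 2.2785808E-4)
    print(f"{explain_score(6.0, 0.03125):.4e}")
```

This makes visible why doc 1669 ranks slightly below the others despite three times as many term occurrences: tf grows only as the square root of the frequency, while the smaller fieldNorm of a longer field reduces the weight linearly.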
