Search (91 results, page 1 of 5)

  • × theme_ss:"Automatisches Indexieren"
  1. Search Engines and Beyond : Developing efficient knowledge management systems, April 19-20 1999, Boston, Mass (1999) 0.12
    0.121988446 = product of:
      0.16265126 = sum of:
        0.023270661 = weight(_text_:web in 2596) [ClassicSimilarity], result of:
          0.023270661 = score(doc=2596,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.14422815 = fieldWeight in 2596, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=2596)
        0.09516762 = weight(_text_:search in 2596) [ClassicSimilarity], result of:
          0.09516762 = score(doc=2596,freq=26.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.55382955 = fieldWeight in 2596, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=2596)
        0.044212975 = product of:
          0.08842595 = sum of:
            0.08842595 = weight(_text_:engine in 2596) [ClassicSimilarity], result of:
              0.08842595 = score(doc=2596,freq=4.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.3343436 = fieldWeight in 2596, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2596)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    This series of meetings originated in Albuquerque, New Mexico in 1995. This inaugural meeting (part of an ASIDIC series) was transplanted to Bath in England (1996 and 1997) and then to Boston, Massachusetts (1998 and 1999). The Search Engines Meetings bring together commercial search engine developers, academics and corporate professionals to learn from each other. Infonortics, sponsor of meetings post-1995 with Ev Brenner, plans to continue the same success in Boston in 2000.
    Content
    Ramana Rao (Inxight, Palo Alto, CA) 7 ± 2 Insights on achieving Effective Information Access Session One: Updates and a twelve month perspective Danny Sullivan (Search Engine Watch, US / England) Portalization and other search trends Carol Tenopir (University of Tennessee) Search realities faced by end users and professional searchers Session Two: Today's search engines and beyond Daniel Hoogterp (Retrieval Technologies, McLean, VA) Effective presentation and utilization of search techniques Rick Kenny (Fulcrum Technologies, Ontario, Canada) Beyond document clustering: The knowledge impact statement Gary Stock (Ingenius, Kalamazoo, MI) Automated change monitoring Gary Culliss (Direct Hit, Wellesley Hills, MA) User popularity ranked search engines Byron Dom (IBM, CA) Automatically finding the best pages on the World Wide Web (CLEVER) Peter Tomassi (LookSmart, San Francisco, CA) Adding human intellect to search technology Session Three: Panel discussion: Human v automated categorization and editing Ev Brenner (New York, NY)- Chairman James Callan (University of Massachusetts, MA) Marc Krellenstein (Northern Light Technology, Cambridge, MA) Dan Miller (Ask Jeeves, Berkeley, CA) Session Four: Updates and a twelve month perspective Steve Arnold (AIT, Harrods Creek, KY) Review: The leading edge in search and retrieval software Ellen Voorhees (NIST, Gaithersburg, MD) TREC update Session Five: Search engines now and beyond Intelligent Agents John Snyder (Muscat, Cambridge, England) Practical issues behind intelligent agents Text summarization Therese Firmin, (Dept of Defense, Ft George G. Meade, MD) The TIPSTER/SUMMAC evaluation of automatic text summarization systems Cross language searching Elizabeth Liddy (TextWise, Syracuse, NY) A conceptual interlingua approach to cross-language retrieval. Video search and retrieval Armon Amir (IBM, Almaden, CA) CueVideo: Modular system for automatic indexing and browsing of video/audio Speech recognition Michael Witbrock (Lycos, Waltham, MA) Retrieval of spoken documents Visualization James A. Wise (Integral Visuals, Richland, WA) Information visualization in the new millennium: Emerging science or passing fashion? Text mining David Evans (Claritech, Pittsburgh, PA) Text mining - towards decision support
  2. Rasmussen, E.M.: Indexing and retrieval for the Web (2002) 0.12
    0.12008574 = product of:
      0.16011432 = sum of:
        0.07618698 = weight(_text_:web in 4285) [ClassicSimilarity], result of:
          0.07618698 = score(doc=4285,freq=28.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.47219574 = fieldWeight in 4285, product of:
              5.2915025 = tf(freq=28.0), with freq of:
                28.0 = termFreq=28.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
        0.056571957 = weight(_text_:search in 4285) [ClassicSimilarity], result of:
          0.056571957 = score(doc=4285,freq=12.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.32922143 = fieldWeight in 4285, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4285)
        0.027355384 = product of:
          0.05471077 = sum of:
            0.05471077 = weight(_text_:engine in 4285) [ClassicSimilarity], result of:
              0.05471077 = score(doc=4285,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.20686457 = fieldWeight in 4285, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4285)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    The introduction and growth of the World Wide Web (WWW, or Web) have resulted in a profound change in the way individuals and organizations access information. In terms of volume, nature, and accessibility, the characteristics of electronic information are significantly different from those of even five or six years ago. Control of, and access to, this flood of information rely heavily an automated techniques for indexing and retrieval. According to Gudivada, Raghavan, Grosky, and Kasanagottu (1997, p. 58), "The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential." Almost 93 percent of those surveyed consider the Web an "indispensable" Internet technology, second only to e-mail (Graphie, Visualization & Usability Center, 1998). Although there are other ways of locating information an the Web (browsing or following directory structures), 85 percent of users identify Web pages by means of a search engine (Graphie, Visualization & Usability Center, 1998). A more recent study conducted by the Stanford Institute for the Quantitative Study of Society confirms the finding that searching for information is second only to e-mail as an Internet activity (Nie & Ebring, 2000, online). In fact, Nie and Ebring conclude, "... the Internet today is a giant public library with a decidedly commercial tilt. The most widespread use of the Internet today is as an information search utility for products, travel, hobbies, and general information. Virtually all users interviewed responded that they engaged in one or more of these information gathering activities."
    Techniques for automated indexing and information retrieval (IR) have been developed, tested, and refined over the past 40 years, and are well documented (see, for example, Agosti & Smeaton, 1996; BaezaYates & Ribeiro-Neto, 1999a; Frakes & Baeza-Yates, 1992; Korfhage, 1997; Salton, 1989; Witten, Moffat, & Bell, 1999). With the introduction of the Web, and the capability to index and retrieve via search engines, these techniques have been extended to a new environment. They have been adopted, altered, and in some Gases extended to include new methods. "In short, search engines are indispensable for searching the Web, they employ a variety of relatively advanced IR techniques, and there are some peculiar aspects of search engines that make searching the Web different than more conventional information retrieval" (Gordon & Pathak, 1999, p. 145). The environment for information retrieval an the World Wide Web differs from that of "conventional" information retrieval in a number of fundamental ways. The collection is very large and changes continuously, with pages being added, deleted, and altered. Wide variability between the size, structure, focus, quality, and usefulness of documents makes Web documents much more heterogeneous than a typical electronic document collection. The wide variety of document types includes images, video, audio, and scripts, as well as many different document languages. Duplication of documents and sites is common. Documents are interconnected through networks of hyperlinks. Because of the size and dynamic nature of the Web, preprocessing all documents requires considerable resources and is often not feasible, certainly not an the frequent basis required to ensure currency. Query length is usually much shorter than in other environments-only a few words-and user behavior differs from that in other environments. These differences make the Web a novel environment for information retrieval (Baeza-Yates & Ribeiro-Neto, 1999b; Bharat & Henzinger, 1998; Huang, 2000).
  3. Milstead, J.L.: Thesauri in a full-text world (1998) 0.09
    0.088508844 = product of:
      0.17701769 = sum of:
        0.032993436 = weight(_text_:search in 2337) [ClassicSimilarity], result of:
          0.032993436 = score(doc=2337,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.19200584 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
        0.14402425 = sum of:
          0.11053244 = weight(_text_:engine in 2337) [ClassicSimilarity], result of:
            0.11053244 = score(doc=2337,freq=4.0), product of:
              0.26447627 = queryWeight, product of:
                5.349498 = idf(docFreq=570, maxDocs=44218)
                0.049439456 = queryNorm
              0.41792953 = fieldWeight in 2337, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.349498 = idf(docFreq=570, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
          0.03349182 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
            0.03349182 = score(doc=2337,freq=2.0), product of:
              0.17312855 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.049439456 = queryNorm
              0.19345059 = fieldWeight in 2337, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2337)
      0.5 = coord(2/4)
    
    Abstract
    Despite early claims to the contemporary, thesauri continue to find use as access tools for information in the full-text environment. Their mode of use is changing, but this change actually represents an expansion rather than a contrdiction of their utility. Thesauri and similar vocabulary tools can complement full-text access by aiding users in focusing their searches, by supplementing the linguistic analysis of the text search engine, and even by serving as one of the tools used by the linguistic engine for its analysis. While human indexing contunues to be used for many databases, the trend is to increase the use of machine aids for this purpose. All machine-aided indexing (MAI) systems rely on thesauri as the basis for term selection. In the 21st century, the balance of effort between human and machine will change at both input and output, but thesauri will continue to play an important role for the foreseeable future
    Date
    22. 9.1997 19:16:05
  4. Zhitomirsky-Geffet, M.; Prebor, G.; Bloch, O.: Improving proverb search and retrieval with a generic multidimensional ontology (2017) 0.07
    0.06982846 = product of:
      0.13965692 = sum of:
        0.03490599 = weight(_text_:web in 3320) [ClassicSimilarity], result of:
          0.03490599 = score(doc=3320,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.21634221 = fieldWeight in 3320, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3320)
        0.104750924 = weight(_text_:search in 3320) [ClassicSimilarity], result of:
          0.104750924 = score(doc=3320,freq=14.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.6095997 = fieldWeight in 3320, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=3320)
      0.5 = coord(2/4)
    
    Abstract
    The goal of this research is to develop a generic ontological model for proverbs that unifies potential classification criteria and various characteristics of proverbs to enable their effective retrieval and large-scale analysis. Because proverbs can be described and indexed by multiple characteristics and criteria, we built a multidimensional ontology suitable for proverb classification. To evaluate the effectiveness of the constructed ontology for improving search and retrieval of proverbs, a large-scale user experiment was arranged with 70 users who were asked to search a proverb repository using ontology-based and free-text search interfaces. The comparative analysis of the results shows that the use of this ontology helped to substantially improve the search recall, precision, user satisfaction, and efficiency and to minimize user effort during the search process. A practical contribution of this work is an automated web-based proverb search and retrieval system which incorporates the proposed ontological scheme and an initial corpus of ontology-based annotated proverbs.
  5. Kajanan, S.; Bao, Y.; Datta, A.; VanderMeer, D.; Dutta, K.: Efficient automatic search query formulation using phrase-level analysis (2014) 0.06
    0.0613487 = product of:
      0.1226974 = sum of:
        0.0914341 = weight(_text_:search in 1264) [ClassicSimilarity], result of:
          0.0914341 = score(doc=1264,freq=24.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.5321022 = fieldWeight in 1264, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.03125 = fieldNorm(doc=1264)
        0.031263296 = product of:
          0.06252659 = sum of:
            0.06252659 = weight(_text_:engine in 1264) [ClassicSimilarity], result of:
              0.06252659 = score(doc=1264,freq=2.0), product of:
                0.26447627 = queryWeight, product of:
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.049439456 = queryNorm
                0.23641664 = fieldWeight in 1264, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.349498 = idf(docFreq=570, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1264)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Over the past decade, the volume of information available digitally over the Internet has grown enormously. Technical developments in the area of search, such as Google's Page Rank algorithm, have proved so good at serving relevant results that Internet search has become integrated into daily human activity. One can endlessly explore topics of interest simply by querying and reading through the resulting links. Yet, although search engines are well known for providing relevant results based on users' queries, users do not always receive the results they are looking for. Google's Director of Research describes clickstream evidence of frustrated users repeatedly reformulating queries and searching through page after page of results. Given the general quality of search engine results, one must consider the possibility that the frustrated user's query is not effective; that is, it does not describe the essence of the user's interest. Indeed, extensive research into human search behavior has found that humans are not very effective at formulating good search queries that describe what they are interested in. Ideally, the user should simply point to a portion of text that sparked the user's interest, and a system should automatically formulate a search query that captures the essence of the text. In this paper, we describe an implemented system that provides this capability. We first describe how our work differs from existing work in automatic query formulation, and propose a new method for improved quantification of the relevance of candidate search terms drawn from input text using phrase-level analysis. We then propose an implementable method designed to provide relevant queries based on a user's text input. We demonstrate the quality of our results and performance of our system through experimental studies. Our results demonstrate that our system produces relevant search terms with roughly two-thirds precision and recall compared to search terms selected by experts, and that typical users find significantly more relevant results (31% more relevant) more quickly (64% faster) using our system than self-formulated search queries. Further, we show that our implementation can scale to request loads of up to 10 requests per second within current online responsiveness expectations (<2-second response times at the highest loads tested).
  6. Shafer, K.: Scorpion Project explores using Dewey to organize the Web (1996) 0.05
    0.05189138 = product of:
      0.10378276 = sum of:
        0.05759195 = weight(_text_:web in 6750) [ClassicSimilarity], result of:
          0.05759195 = score(doc=6750,freq=4.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.35694647 = fieldWeight in 6750, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6750)
        0.046190813 = weight(_text_:search in 6750) [ClassicSimilarity], result of:
          0.046190813 = score(doc=6750,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.2688082 = fieldWeight in 6750, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6750)
      0.5 = coord(2/4)
    
    Abstract
    As the amount of accessible information on the WWW increases, so will the cost of accessing it, even if search servcies remain free, due to the increasing amount of time users will have to spend to find needed items. Considers what the seemingly unorganized Web and the organized world of libraries can offer each other. The OCLC Scorpion Project is attempting to combine indexing and cataloguing, specifically focusing on building tools for automatic subject recognition using the technqiues of library science and information retrieval. If subject headings or concept domains can be automatically assigned to electronic items, improved filtering tools for searching can be produced
  7. Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.05
    0.049739346 = product of:
      0.09947869 = sum of:
        0.06598687 = weight(_text_:search in 2759) [ClassicSimilarity], result of:
          0.06598687 = score(doc=2759,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.3840117 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
        0.03349182 = product of:
          0.06698364 = sum of:
            0.06698364 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
              0.06698364 = score(doc=2759,freq=2.0), product of:
                0.17312855 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049439456 = queryNorm
                0.38690117 = fieldWeight in 2759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2759)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    1. 2.2016 18:25:22
    Source
    Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al
  8. Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.05
    0.04698986 = product of:
      0.09397972 = sum of:
        0.07053544 = weight(_text_:web in 2673) [ClassicSimilarity], result of:
          0.07053544 = score(doc=2673,freq=6.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.43716836 = fieldWeight in 2673, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2673)
        0.023444273 = product of:
          0.046888545 = sum of:
            0.046888545 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
              0.046888545 = score(doc=2673,freq=2.0), product of:
                0.17312855 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049439456 = queryNorm
                0.2708308 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Examines techniques that discover features in sets of pre-categorized documents, such that similar documents can be found on the WWW. Examines techniques which will classifiy training examples with high accuracy, then explains why this is not necessarily useful. Describes a method for extracting word clusters from the raw document features. Results show that the clustering technique is successful in discovering word groups in personal Web pages which can be used to find similar information on the WWW
    Date
    1. 8.1996 22:08:06
    Footnote
    Contribution to a special issue of papers from the 6th International World Wide Web conference, held 7-11 Apr 1997, Santa Clara, California
  9. Humphrey, S.M.: Automatic indexing of documents from journal descriptors : a preliminary investigation (1999) 0.04
    0.037249055 = product of:
      0.07449811 = sum of:
        0.03490599 = weight(_text_:web in 3769) [ClassicSimilarity], result of:
          0.03490599 = score(doc=3769,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.21634221 = fieldWeight in 3769, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3769)
        0.03959212 = weight(_text_:search in 3769) [ClassicSimilarity], result of:
          0.03959212 = score(doc=3769,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.230407 = fieldWeight in 3769, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.046875 = fieldNorm(doc=3769)
      0.5 = coord(2/4)
    
    Abstract
    A new, fully automated approach for indedexing documents is presented based on associating textwords in a training set of bibliographic citations with the indexing of journals. This journal-level indexing is in the form of a consistent, timely set of journal descriptors (JDs) indexing the individual journals themselves. This indexing is maintained in journal records in a serials authority database. The advantage of this novel approach is that the training set does not depend on previous manual indexing of thousands of documents (i.e., any such indexing already in the training set is not used), but rather the relatively small intellectual effort of indexing at the journal level, usually a matter of a few thousand unique journals for which retrospective indexing to maintain consistency and currency may be feasible. If successful, JD indexing would provide topical categorization of documents outside the training set, i.e., journal articles, monographs, Web documents, reports from the grey literature, etc., and therefore be applied in searching. Because JDs are quite general, corresponding to subject domains, their most problable use would be for improving or refining search results
  10. Hodges, P.R.: Keyword in title indexes : effectiveness of retrieval in computer searches (1983) 0.03
    0.034817543 = product of:
      0.069635086 = sum of:
        0.046190813 = weight(_text_:search in 5001) [ClassicSimilarity], result of:
          0.046190813 = score(doc=5001,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.2688082 = fieldWeight in 5001, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
        0.023444273 = product of:
          0.046888545 = sum of:
            0.046888545 = weight(_text_:22 in 5001) [ClassicSimilarity], result of:
              0.046888545 = score(doc=5001,freq=2.0), product of:
                0.17312855 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049439456 = queryNorm
                0.2708308 = fieldWeight in 5001, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5001)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    A study was done to test the effectiveness of retrieval using title word searching. It was based on actual search profiles used in the Mechanized Information Center at Ohio State University, in order ro replicate as closely as possible actual searching conditions. Fewer than 50% of the relevant titles were retrieved by keywords in titles. The low rate of retrieval can be attributes to three sources: titles themselves, user and information specialist ignorance of the subject vocabulary in use, and to general language problems. Across fields it was found that the social sciences had the best retrieval rate, with science having the next best, and arts and humanities the lowest. Ways to enhance and supplement keyword in title searching on the computer and in printed indexes are discussed.
    Date
    14. 3.1996 13:22:21
  11. Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.03
    0.03104088 = product of:
      0.06208176 = sum of:
        0.029088326 = weight(_text_:web in 3627) [ClassicSimilarity], result of:
          0.029088326 = score(doc=3627,freq=2.0), product of:
            0.16134618 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.049439456 = queryNorm
            0.18028519 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
        0.032993436 = weight(_text_:search in 3627) [ClassicSimilarity], result of:
          0.032993436 = score(doc=3627,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.19200584 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
      0.5 = coord(2/4)
    
    Abstract
    A very important extension of the traditional domain of knowledge organization (KO) arises from attempts to incorporate techniques devised in the computer science domain for automatic concept extraction and for grouping, categorizing, clustering and otherwise organizing knowledge using mechanical means. Four specific terms have emerged to identify the most prevalent techniques: machine learning, clustering, automatic indexing, and automatic classification. Our study presents three domain analytical case analyses in search of answers. The first case relies on citations located using the ISKO-supported "Knowledge Organization Bibliography." The second case relies on works in both Web of Science and SCOPUS. Case three applies co-word analysis and citation analysis to the contents of the papers in the present special issue. We observe scholars involved in "clustering" and "automatic classification" who share common thematic emphases. But we have found no coherence, no common activity and no social semantics. We have not found a research front, or a common teleology within the KO domain. We also have found a lively group of authors who have succeeded in submitting papers to this special issue, and their work quite interestingly aligns with the case studies we report. There is an emphasis on KO for information retrieval; there is much work on clustering (which involves conceptual points within texts) and automatic classification (which involves semantic groupings at the meta-document level).
  12. Salton, G.; Wong, A.: Generation and search of clustered files (1978) 0.03
    0.026394749 = product of:
      0.105578996 = sum of:
        0.105578996 = weight(_text_:search in 2411) [ClassicSimilarity], result of:
          0.105578996 = score(doc=2411,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.6144187 = fieldWeight in 2411, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.125 = fieldNorm(doc=2411)
      0.25 = coord(1/4)
    
  13. Sparck Jones, K.; Tait, J.I.: Automatic search term variant generation (1984) 0.03
    0.026394749 = product of:
      0.105578996 = sum of:
        0.105578996 = weight(_text_:search in 2918) [ClassicSimilarity], result of:
          0.105578996 = score(doc=2918,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.6144187 = fieldWeight in 2918, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.125 = fieldNorm(doc=2918)
      0.25 = coord(1/4)
    
  14. Molto, M.: Improving full text search performance through textual analysis (1993) 0.03
    0.026394749 = product of:
      0.105578996 = sum of:
        0.105578996 = weight(_text_:search in 5099) [ClassicSimilarity], result of:
          0.105578996 = score(doc=5099,freq=8.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.6144187 = fieldWeight in 5099, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=5099)
      0.25 = coord(1/4)
    
    Abstract
    Explores the potential of text analysis as a tool in full text search and design improvement. Reports on a trial analysis performed in the domain of family history. The findings offered insights into possible gains and losses in using one search or design strategy versus another and strong evidence was provided to the potential of text analysis. Makes search and design recommendation
  15. Stegentritt, E.: Evaluationsresultate des mehrsprachigen Suchsystems CANAL/LS (1998) 0.03
    0.026394749 = product of:
      0.105578996 = sum of:
        0.105578996 = weight(_text_:search in 7216) [ClassicSimilarity], result of:
          0.105578996 = score(doc=7216,freq=8.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.6144187 = fieldWeight in 7216, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=7216)
      0.25 = coord(1/4)
    
    Abstract
    The search system CANAL/LS simplifies the searching of library catalogues by analyzing search questions linguistically and translating them if required. The linguistic analysis reduces the search question words to their basic forms so that they can be compared with basic title forms. Consequently all variants of words and parts of compounds in German can be found. Presents the results of an analysis of search questions in a catalogue of 45.000 titles in the field of psychology
  16. Salton, G.; Araya, J.: On the use of clustered file organizations in information search and retrieval (1990) 0.02
    0.01979606 = product of:
      0.07918424 = sum of:
        0.07918424 = weight(_text_:search in 2409) [ClassicSimilarity], result of:
          0.07918424 = score(doc=2409,freq=2.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.460814 = fieldWeight in 2409, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.09375 = fieldNorm(doc=2409)
      0.25 = coord(1/4)
    
  17. Salton, G.: Fast document classification in automatic information retrieval (1978) 0.02
    0.018663906 = product of:
      0.07465562 = sum of:
        0.07465562 = weight(_text_:search in 2331) [ClassicSimilarity], result of:
          0.07465562 = score(doc=2331,freq=4.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.43445963 = fieldWeight in 2331, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=2331)
      0.25 = coord(1/4)
    
    Abstract
    A classified or clustered file is one where related or similar records are grouped into classes or clusters of items in such a way that all itmes within a cluster are jointly retrievable. Clustered files are easily adapted to to broad and narrow search strategies, and simple file updating methods are available. An inexpensive file clustering method applicable to large files is given together with appropriate file search methods
  18. Samstag-Schnock, U.; Meadow, C.T.: PBS: an ecomical natural language query interpreter (1993) 0.02
    0.018663906 = product of:
      0.07465562 = sum of:
        0.07465562 = weight(_text_:search in 5091) [ClassicSimilarity], result of:
          0.07465562 = score(doc=5091,freq=4.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.43445963 = fieldWeight in 5091, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0625 = fieldNorm(doc=5091)
      0.25 = coord(1/4)
    
    Abstract
    Reports on the design and implementation of the information searching and retrieval software, PBS (Parsing, Boolean recognition, Stemming) for the front end OAK 2, a new version of OAK developed at Toronto Univ. OAK 2 is a research tool for user behaviour studies. PBS receives natural language search statements from an end user and identifies search facets and implied Boolean logic operators
  19. Pfeifer, U.; Fuhr, N.; Huynh, T.: Searching structured documents with the enhanced retrieval functionality of freeWAIS-sf and SFgate (1995) 0.02
    0.016330918 = product of:
      0.06532367 = sum of:
        0.06532367 = weight(_text_:search in 2214) [ClassicSimilarity], result of:
          0.06532367 = score(doc=2214,freq=4.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.38015217 = fieldWeight in 2214, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2214)
      0.25 = coord(1/4)
    
    Abstract
    The original WAIS implementation by Thinking Machines and others treats documents as uniform bags of terms. Since most documents exhibit some internal structure, it is desirable to provide the user means to exploit this structure in his queries. Presents extensions to the freeWAIS indexer and server, which allows access to document structures using the original WAIS protocol. Major extensions include: arbitrary document formats, search in individual structure elements, stemming and phonetic search, support of 8-bit character sets, numeric concepts and operators. combination of Boolean and linear retrieval. Presents a WWW-WAIS gateway specially tailored for usage with freeWAIS-sf which transforms filled out HTML forms to the new query syntax
  20. Lepsky, K.; Siepmann, J.; Zimmermann, A.: Automatische Indexierung für Online-Kataloge : Ergebnisse eines Retrievaltests (1996) 0.02
    0.016330918 = product of:
      0.06532367 = sum of:
        0.06532367 = weight(_text_:search in 3251) [ClassicSimilarity], result of:
          0.06532367 = score(doc=3251,freq=4.0), product of:
            0.17183559 = queryWeight, product of:
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.049439456 = queryNorm
            0.38015217 = fieldWeight in 3251, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.475677 = idf(docFreq=3718, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3251)
      0.25 = coord(1/4)
    
    Abstract
    Examines the effectiveness of automated indexing and presents the results of a study of information retrieval from a segment (40.000 items) of the ULB Düsseldorf database. The segment was selected randomly and all the documents included were indexed automatically. The search topics included 50 subject areas ranging from economic growth to alternative energy sources. While there were 876 relevant documents in the database segment for each of the 50 search topics, the recall ranged from 1 to 244 references, with the average being 17.52 documents per topic. Therefore it seems that, in the immediate future, automatic indexing should be used in combination with intellectual indexing

Years

Languages

  • e 60
  • d 29
  • f 1
  • ru 1
  • More… Less…

Types

  • a 81
  • el 7
  • x 4
  • m 2
  • s 2
  • More… Less…