Search (11 results, page 1 of 1)

  • × language_ss:"e"
  • × theme_ss:"Volltextretrieval"
  • × year_i:[1990 TO 2000}
  1. Kristensen, J.; Järvelin, K.: ¬The effectiveness of a searching thesaurus in free-text searching in a full-text database (1990) 0.02
    0.015380906 = product of:
      0.107666336 = sum of:
        0.053833168 = weight(_text_:classification in 2043) [ClassicSimilarity], result of:
          0.053833168 = score(doc=2043,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.5629819 = fieldWeight in 2043, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.125 = fieldNorm(doc=2043)
        0.053833168 = weight(_text_:classification in 2043) [ClassicSimilarity], result of:
          0.053833168 = score(doc=2043,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.5629819 = fieldWeight in 2043, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.125 = fieldNorm(doc=2043)
      0.14285715 = coord(2/14)
    
    Source
    International classification. 17(1990), S.77-84
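    The score breakdowns shown for each result are Lucene "explain" output for ClassicSimilarity (TF-IDF). As a sanity check, the sketch below reproduces the numbers displayed for result 1 from the factors in its tree; it assumes the standard ClassicSimilarity definitions (tf = sqrt(termFreq), idf = 1 + ln(maxDocs/(docFreq+1))) and is not part of the catalogue itself.

        import math

        # Factors copied from the explain tree for doc 2043 above.
        idf = 3.1847067          # idf(docFreq=4974, maxDocs=44218) = 1 + ln(44218/4975)
        query_norm = 0.03002521  # queryNorm
        field_norm = 0.125       # fieldNorm(doc=2043)
        term_freq = 2.0          # "classification" occurs twice in the field

        tf = math.sqrt(term_freq)             # 1.4142135
        query_weight = idf * query_norm       # 0.09562149
        field_weight = tf * idf * field_norm  # 0.5629819
        weight = query_weight * field_weight  # 0.053833168

        # The clause appears twice in the query, so the sum doubles the weight;
        # coord(2/14) scales it because 2 of the 14 query clauses matched.
        score = (2 / 14) * (weight + weight)
        print(round(score, 9))                # ~0.015380906, the score shown for result 1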
  2. Paijmans, H.: Gravity wells of meaning : detecting information rich passages in scientific texts (1997) 0.01
    0.013485613 = product of:
      0.09439929 = sum of:
        0.03799808 = product of:
          0.07599616 = sum of:
            0.07599616 = weight(_text_:schemes in 7444) [ClassicSimilarity], result of:
              0.07599616 = score(doc=7444,freq=2.0), product of:
                0.16067243 = queryWeight, product of:
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.03002521 = queryNorm
                0.4729882 = fieldWeight in 7444, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.3512506 = idf(docFreq=569, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7444)
          0.5 = coord(1/2)
        0.056401204 = product of:
          0.11280241 = sum of:
            0.11280241 = weight(_text_:texts in 7444) [ClassicSimilarity], result of:
              0.11280241 = score(doc=7444,freq=4.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.6852849 = fieldWeight in 7444, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7444)
          0.5 = coord(1/2)
      0.14285715 = coord(2/14)
    
    Abstract
    Presents research in which 4 term weighting schemes were used to detect information-rich passages in texts, and the results were compared. Demonstrates that word categories and frequency-derived weights correlate closely, but that weighting according to the first-mention theory or the cue method shows no correlation with frequency-based weights.
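    As a rough illustration of one of the compared ideas (a frequency-derived weighting scheme scoring fixed-size passages), a hypothetical sketch follows; the function name and window parameters are invented here and do not represent Paijmans' method.

        from collections import Counter

        def frequency_passage_scores(tokens, window=50, step=25):
            """Score fixed-size passages by their average within-document term frequency.

            A toy stand-in for a frequency-derived weighting scheme: windows whose
            words are frequent in the document as a whole score higher and are
            treated as candidate information-rich passages.
            """
            doc_freq = Counter(tokens)
            scores = []
            for start in range(0, max(1, len(tokens) - window + 1), step):
                passage = tokens[start:start + window]
                scores.append((start, sum(doc_freq[t] for t in passage) / max(1, len(passage))))
            return sorted(scores, key=lambda s: s[1], reverse=True)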
  3. Voorbij, H.: Title keywords and subject descriptors : a comparison of subject search entries of books in the humanities and social sciences (1998) 0.01
    0.00525005 = product of:
      0.0735007 = sum of:
        0.0735007 = weight(_text_:subject in 4721) [ClassicSimilarity], result of:
          0.0735007 = score(doc=4721,freq=24.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.68444026 = fieldWeight in 4721, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4721)
      0.071428575 = coord(1/14)
    
    Abstract
    In order to compare the value of subject descriptors and title keywords as entries to subject searches, two studies were carried out. Both studies concentrated on monographs in the humanities and social sciences, held by the online public access catalogue of the National Library of the Netherlands. In the first study, a comparison was made by subject librarians between the subject descriptors and the title keywords of 475 records. They could express their opinion on a scale from 1 (descriptor is exactly or almost the same as word in title) to 7 (descriptor does not appear in title at all). It was concluded that 37 per cent of the records are considerably enhanced by a subject descriptor, and 49 per cent slightly or considerably enhanced. In the second study, subject librarians performed subject searches using title keywords and subject descriptors on the same topic. The relative recall amounted to 48 per cent and 86 per cent respectively. Failure analysis revealed the reasons why so many records that were found by subject descriptors were not found by title keywords. First, although completely meaningless titles hardly ever appear, the title of a publication does not always offer sufficient clues for title keyword searching. In those cases, descriptors may enhance the record of a publication. A second and even more important task of subject descriptors is controlling the vocabulary. Many relevant titles cannot be retrieved by title keyword searching because of the wide diversity of ways of expressing a topic. Descriptors take away the burden of vocabulary control from the user.
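    The 48 and 86 per cent figures are relative recall values. As a reminder of how relative recall is commonly computed in such comparisons (each method's share of all relevant records found by any compared method), here is a small illustration with hypothetical record IDs, not the study's data:

        # Hypothetical sets of relevant record IDs retrieved by each access point.
        found_by_title_keywords = {1, 2, 3, 4, 5}
        found_by_descriptors = {2, 3, 4, 5, 6, 7, 8, 9}

        pool = found_by_title_keywords | found_by_descriptors  # all relevant records found
        rel_recall_keywords = len(found_by_title_keywords) / len(pool)   # 5/9, about 0.56
        rel_recall_descriptors = len(found_by_descriptors) / len(pool)   # 8/9, about 0.89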
  4. Sclafani, F.: Controlled subject heading searching versus keyword searching (1999) 0.00
    0.004849789 = product of:
      0.067897044 = sum of:
        0.067897044 = weight(_text_:subject in 3790) [ClassicSimilarity], result of:
          0.067897044 = score(doc=3790,freq=2.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.63225883 = fieldWeight in 3790, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.125 = fieldNorm(doc=3790)
      0.071428575 = coord(1/14)
    
  5. Pirkola, A.; Järvelin, K.: ¬The effect of anaphor and ellipsis resolution on proximity searching in a text database (1995) 0.00
    0.004806533 = product of:
      0.03364573 = sum of:
        0.016822865 = weight(_text_:classification in 4088) [ClassicSimilarity], result of:
          0.016822865 = score(doc=4088,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 4088, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4088)
        0.016822865 = weight(_text_:classification in 4088) [ClassicSimilarity], result of:
          0.016822865 = score(doc=4088,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.17593184 = fieldWeight in 4088, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4088)
      0.14285715 = coord(2/14)
    
    Abstract
    So far, methods for ellipsis and anaphor resolution have been developed, and the effects of anaphor resolution analyzed, in the context of statistical information retrieval of scientific abstracts. No significant improvements have been observed. Analyzes the effects of ellipsis and anaphor resolution on proximity searching in a full text database. Anaphora and ellipses are classified on the basis of the type of their correlates/antecedents rather than, as is traditional, on the basis of their own linguistic type. The classification differentiates between proper names and common nouns for basic words, compound words, and phrases. The study was carried out in a newspaper article database containing 55,000 full text articles. A set of 154 keyword pairs in different categories was created. Human resolution of keyword ellipses and anaphora was performed to identify sentences and paragraphs which would match proximity searches after resolution. Findings indicate that ellipsis and anaphor resolution is most relevant for proper name phrases and only marginal in the other keyword categories. Therefore the recall effect of restricted resolution, covering proper name phrases only, was analyzed for keyword pairs containing at least 1 proper name phrase. Findings indicate a recall increase of 38.2% in sentence searches and 28.8% in paragraph searches when proper name ellipses were resolved. The recall increase was 17.6% in sentence searches and 19.8% in paragraph searches when proper name anaphora were resolved. A simple and computationally justifiable resolution method might therefore be developed for proper name phrases only, to support keyword-based full text information retrieval. Discusses elements of such a method.
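    A minimal sketch of the proximity criterion behind such searches (two keywords required to co-occur in the same sentence or paragraph) is given below; the helper is hypothetical and ignores the resolution step, tokenisation, and the proximity operators a real retrieval system would use.

        import re

        def cooccur(text, term_a, term_b, unit="sentence"):
            """True if both terms appear in the same sentence or paragraph of text.

            After ellipsis/anaphor resolution, more sentences contain both terms
            explicitly, which is why resolution raises recall for such searches.
            """
            splitter = r"\n\s*\n" if unit == "paragraph" else r"(?<=[.!?])\s+"
            for chunk in re.split(splitter, text):
                low = chunk.lower()
                if term_a.lower() in low and term_b.lower() in low:
                    return True
            return False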
  6. Quint, B.: Flipping for full-text (1991) 0.00
    0.002872974 = product of:
      0.04022163 = sum of:
        0.04022163 = weight(_text_:bibliographic in 4893) [ClassicSimilarity], result of:
          0.04022163 = score(doc=4893,freq=2.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.34409973 = fieldWeight in 4893, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0625 = fieldNorm(doc=4893)
      0.071428575 = coord(1/14)
    
    Abstract
    Provides tips for searchers of full text online databases and examines the coverage policies of full text database producers, which may change without notification to users. Most full text newspaper files do not even carry bibliographic listings for syndicated columns not created by their own staff. Looks at the development of full-text CD-ROM databases and claims that full text, though expensive, is the wave of the future.
  7. Tenopir, C.: Full-text retrieval : systems and files (1994) 0.00
    0.002872974 = product of:
      0.04022163 = sum of:
        0.04022163 = weight(_text_:bibliographic in 2424) [ClassicSimilarity], result of:
          0.04022163 = score(doc=2424,freq=2.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.34409973 = fieldWeight in 2424, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0625 = fieldNorm(doc=2424)
      0.071428575 = coord(1/14)
    
    Abstract
    State-of-the-art review of the development of full text databases, covering: types of commercially available full text databases; online systems for full text databases; full text databases on CD-ROM; full text databases on magnetic discs or tapes; creation of full text databases; and searching and display requirements for full text searching and software. Concludes that bibliographic information services without full text support solve only half of the retrieval problems.
  8. Couvreur, T.R.; Benzel, R.N.; Miller, S.F.; Zeitler, D.N.; Lee, D.L.; Singhal, M.; Shivaratri, N.; Wong, W.Y.P.: ¬An analysis of performance and cost factors in searching large text databases using parallel search systems (1994) 0.00
    0.002513852 = product of:
      0.035193928 = sum of:
        0.035193928 = weight(_text_:bibliographic in 7657) [ClassicSimilarity], result of:
          0.035193928 = score(doc=7657,freq=2.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.30108726 = fieldWeight in 7657, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7657)
      0.071428575 = coord(1/14)
    
    Abstract
    The results of modelling the performance of searching large text databases (>10 GBytes) via various parallel hardware architectures and search algorithms are discussed. The performance under load and the cost of each configuration are compared. Strengths, weaknesses, performance sensitivities, and search features supported for each configuration are also addressed. In addition, a common search workload used in the modelling is described. The search workload is derived from a set of searches run against the Chemical Abstracts file of bibliographic and abstract text available on STN International. This common workload is applied to all configurations modelled to provide a common basis of comparison
  9. Melucci, M.: Passage retrieval : a probabilistic technique (1998) 0.00
    0.0024926048 = product of:
      0.034896467 = sum of:
        0.034896467 = product of:
          0.069792934 = sum of:
            0.069792934 = weight(_text_:texts in 1150) [ClassicSimilarity], result of:
              0.069792934 = score(doc=1150,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.42399842 = fieldWeight in 1150, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1150)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    This paper presents a probabilistic technique to retrieve passages from texts of large size or heterogeneous semantic content. The proposed technique is independent of any supporting auxiliary data, such as text structure, topic organization, or pre-defined text segments. A Bayesian framework implements the probabilistic technique. We carried out experiments to compare the probabilistic technique to one based on a text segmentation algorithm. In particular, the probabilistic technique is more effective than, or as effective as, the one based on text segmentation at retrieving small passages. Results show that passage size affects passage retrieval performance. Results also suggest that text organization and query generality may have an impact on the difference in effectiveness between the two techniques.
  10. Pearce, C.; Nicholas, C.: TELLTALE: Experiments in a dynamic hypertext environment for degraded and multilingual data (1996) 0.00
    0.0021365185 = product of:
      0.029911257 = sum of:
        0.029911257 = product of:
          0.059822515 = sum of:
            0.059822515 = weight(_text_:texts in 4071) [ClassicSimilarity], result of:
              0.059822515 = score(doc=4071,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.36342722 = fieldWeight in 4071, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4071)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    Methods and tools for finding documents relevant to a user's needs in document corpora can be found in the information retrieval, library science, and hypertext communities. Typically, these systems provide retrieval capabilities for fairly static corpora, their algorithms are dependent on the language for which they are written, e.g. English, and they do not perform well when presented with misspelled words or text that has been degraded by OCR techniques. In this article, we present experimental results for the TELLTALE system. TELLTALE is a dynamic hypertext environment that provides full-text search from a hypertext-style user interface for text corpora that may be garbled by OCR or transmission errors, and that may contain languages other than English. TELLTALE uses several techniques based on n-grams (n-character sequences of text). With these results we show that the dynamic linkage mechanisms in TELLTALE are tolerant of garbles in up to 30% of the characters in the body of the texts.
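    A minimal sketch of the character n-gram idea the abstract mentions follows; the names and the similarity measure are illustrative, not TELLTALE's actual algorithms.

        def char_ngrams(text, n=5):
            """All character n-grams of a string; no word tokenisation is needed."""
            text = text.lower()
            return {text[i:i + n] for i in range(max(0, len(text) - n + 1))}

        def ngram_overlap(a, b, n=5):
            """Crude overlap of two strings' n-gram sets.

            Because matching works on character sequences rather than whole words,
            it degrades gracefully under OCR garbles and carries over to languages
            other than English.
            """
            sa, sb = char_ngrams(a, n), char_ngrams(b, n)
            return len(sa & sb) / max(1, min(len(sa), len(sb)))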
  11. Laegreid, J.A.: SIFT: a Norwegian information retrieval system (1993) 0.00
    0.0011622861 = product of:
      0.016272005 = sum of:
        0.016272005 = product of:
          0.03254401 = sum of:
            0.03254401 = weight(_text_:22 in 7701) [ClassicSimilarity], result of:
              0.03254401 = score(doc=7701,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.30952093 = fieldWeight in 7701, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7701)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    23. 1.1999 19:22:09