Search (395 results, page 1 of 20)

  • theme_ss:"Retrievalstudien"
  1. Ding, C.H.Q.: ¬A probabilistic model for Latent Semantic Indexing (2005) 0.05
    0.048486304 = product of:
      0.16970205 = sum of:
        0.018832127 = weight(_text_:of in 3459) [ClassicSimilarity], result of:
          0.018832127 = score(doc=3459,freq=14.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.2742677 = fieldWeight in 3459, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3459)
        0.15086992 = weight(_text_:distribution in 3459) [ClassicSimilarity], result of:
          0.15086992 = score(doc=3459,freq=6.0), product of:
            0.24019864 = queryWeight, product of:
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.043909185 = queryNorm
            0.6281048 = fieldWeight in 3459, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.046875 = fieldNorm(doc=3459)
      0.2857143 = coord(2/7)
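
    The explanation tree above is Lucene's ClassicSimilarity (TF-IDF) scoring breakdown. A minimal sketch of how its numbers fit together, assuming the standard ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the constants are copied from the tree for the "distribution" term in doc 3459:

      import math

      # tf and idf as defined by Lucene's ClassicSimilarity.
      def tf(freq):
          return math.sqrt(freq)

      def idf(doc_freq, max_docs):
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      query_norm = 0.043909185                   # queryNorm from the tree
      field_norm = 0.046875                      # fieldNorm(doc=3459)

      i = idf(505, 44218)                        # 5.4703507
      query_weight = i * query_norm              # 0.24019864
      field_weight = tf(6.0) * i * field_norm    # 0.6281048
      term_score = query_weight * field_weight   # 0.15086992

      # The two matching term scores are summed, then multiplied by
      # coord(2/7) because 2 of the 7 query terms matched the document.
      total = (0.018832127 + term_score) * 2.0 / 7.0
      print(total)                               # ~0.048486304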
    
    Abstract
    Latent Semantic Indexing (LSI), when applied to a semantic space built on text collections, improves information retrieval, information filtering, and word sense disambiguation. A new dual probability model based on the similarity concepts is introduced to provide a deeper understanding of LSI. Semantic associations can be quantitatively characterized by their statistical significance, the likelihood. Semantic dimensions containing redundant and noisy information can be separated out and should be ignored because of their negative contribution to the overall statistical significance. LSI is the optimal solution of the model. The peak in the likelihood curve indicates the existence of an intrinsic semantic dimension. The importance of LSI dimensions follows the Zipf distribution, indicating that LSI dimensions represent latent concepts. Document frequency of words follows the Zipf distribution, and the number of distinct words follows a log-normal distribution. Experiments on five standard document collections confirm and illustrate the analysis.
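    As background for the model described above, a minimal sketch of plain LSI, assuming scikit-learn and a toy corpus (the paper's experiments use five standard collections, not these documents):

      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.decomposition import TruncatedSVD

      # Toy document collection; contents are illustrative only.
      docs = [
          "latent semantic indexing improves information retrieval",
          "a probabilistic model of information retrieval",
          "word sense disambiguation in a semantic space",
          "document frequency of words follows the Zipf distribution",
      ]

      X = CountVectorizer().fit_transform(docs)  # term-document counts
      svd = TruncatedSVD(n_components=2)         # keep 2 latent dimensions
      Z = svd.fit_transform(X)                   # documents in LSI space

      # The decay of the singular values mirrors the "importance of LSI
      # dimensions" whose Zipf-like behavior the abstract describes.
      print(svd.singular_values_)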
    Source
    Journal of the American Society for Information Science and Technology. 56(2005) no.6, S.597-608
  2. Schabas, A.H.: ¬A comparative evaluation of the retrieval effectiveness of titles, Library of Congress Subject Headings and PRECIS strings for computer searching of UK MARC data (1979) 0.05
    0.04694498 = product of:
      0.16430742 = sum of:
        0.031832106 = weight(_text_:of in 5277) [ClassicSimilarity], result of:
          0.031832106 = score(doc=5277,freq=10.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.46359703 = fieldWeight in 5277, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.09375 = fieldNorm(doc=5277)
        0.1324753 = weight(_text_:congress in 5277) [ClassicSimilarity], result of:
          0.1324753 = score(doc=5277,freq=2.0), product of:
            0.20946044 = queryWeight, product of:
              4.7703104 = idf(docFreq=1018, maxDocs=44218)
              0.043909185 = queryNorm
            0.63245976 = fieldWeight in 5277, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7703104 = idf(docFreq=1018, maxDocs=44218)
              0.09375 = fieldNorm(doc=5277)
      0.2857143 = coord(2/7)
    
    Imprint
    London : University of London
  3. MacCain, K.W.: Descriptor and citation retrieval in the medical behavioral sciences literature : retrieval overlaps and novelty distribution (1989) 0.04
    0.040177125 = product of:
      0.14061993 = sum of:
        0.01743516 = weight(_text_:of in 2290) [ClassicSimilarity], result of:
          0.01743516 = score(doc=2290,freq=12.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.25392252 = fieldWeight in 2290, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2290)
        0.12318477 = weight(_text_:distribution in 2290) [ClassicSimilarity], result of:
          0.12318477 = score(doc=2290,freq=4.0), product of:
            0.24019864 = queryWeight, product of:
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.043909185 = queryNorm
            0.5128454 = fieldWeight in 2290, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.046875 = fieldNorm(doc=2290)
      0.2857143 = coord(2/7)
    
    Abstract
    Search results for nine topics in the medical behavioral sciences are reanalyzed to compare the overall performance of descriptor and citation search strategies in identifying relevant and novel documents. Overlap percentages between an aggregate "descriptor-based" database (MEDLINE, EXCERPTA MEDICA, PSYCINFO) and an aggregate "citation-based" database (SCISEARCH, SOCIAL SCISEARCH) ranged from 1% to 26%, with a median overlap of 8% of relevant retrievals found using both search strategies. For seven topics in which both descriptor and citation strategies produced reasonably substantial retrievals, two patterns of search performance and novelty distribution were observed: (1) where descriptor and citation retrieval showed little overlap, novelty retrieval percentages differed by 17-23% between the two strategies; (2) topics with a relatively high percentage retrieval overlap showed little difference (1-4%) in descriptor and citation novelty retrieval percentages. These results reflect the varying partial congruence of two literature networks and represent two different types of subject relevance.
    Source
    Journal of the American Society for Information Science. 40(1989), S.110-114
  4. Hood, W.W.; Wilson, C.S.: ¬The scatter of documents over databases in different subject domains : how many databases are needed? (2001) 0.04
    0.035200432 = product of:
      0.123201504 = sum of:
        0.020547535 = weight(_text_:of in 6936) [ClassicSimilarity], result of:
          0.020547535 = score(doc=6936,freq=24.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.2992506 = fieldWeight in 6936, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6936)
        0.102653965 = weight(_text_:distribution in 6936) [ClassicSimilarity], result of:
          0.102653965 = score(doc=6936,freq=4.0), product of:
            0.24019864 = queryWeight, product of:
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.043909185 = queryNorm
            0.42737114 = fieldWeight in 6936, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6936)
      0.2857143 = coord(2/7)
    
    Abstract
    The distribution of bibliographic records in on-line bibliographic databases is examined using 14 different search topics. These topics were searched using the DIALOG database host, using as many suitable databases as possible. The presence of duplicate records in the searches was taken into consideration in the analysis, and the problem of lexical ambiguity in at least one search topic is discussed. The study answers questions such as how many databases are needed in a multifile search for particular topics, and what coverage will be achieved using a certain number of databases. The distribution of the percentages of records retrieved over a number of databases for 13 of the 14 search topics fell roughly into three groups: (1) high concentration of records in one database, with about 80% coverage in five to eight databases; (2) moderate concentration in one database, with about 80% coverage in seven to 10 databases; and (3) low concentration in one database, with about 80% coverage in 16 to 19 databases. The study conforms with earlier results, but shows that the number of databases needed for searches with varying complexities of search strategies is much more topic-dependent than previous studies would indicate.
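    The 80%-coverage figures above amount to a simple cumulative calculation. A minimal sketch, with hypothetical per-database counts of unique (deduplicated) relevant records for one topic; the real figures are per-topic DIALOG results:

      # Hypothetical unique-record counts, one entry per database.
      unique_counts = [420, 130, 75, 40, 25, 18, 12, 9, 6, 5]

      total = sum(unique_counts)
      covered = 0
      # Add databases in decreasing order of yield until 80% is covered.
      for needed, n in enumerate(sorted(unique_counts, reverse=True), 1):
          covered += n
          if covered / total >= 0.80:
              break
      print(f"{needed} databases give {covered / total:.0%} coverage")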
    Source
    Journal of the American Society for Information Science and technology. 52(2001) no.14, S.1242-1254
  5. Spink, A.; Saracevic, T.: Interaction in information retrieval : selection and effectiveness of search terms (1997) 0.03
    0.031632032 = product of:
      0.1107121 = sum of:
        0.023607321 = weight(_text_:of in 206) [ClassicSimilarity], result of:
          0.023607321 = score(doc=206,freq=22.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.34381276 = fieldWeight in 206, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=206)
        0.08710478 = weight(_text_:distribution in 206) [ClassicSimilarity], result of:
          0.08710478 = score(doc=206,freq=2.0), product of:
            0.24019864 = queryWeight, product of:
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.043909185 = queryNorm
            0.36263645 = fieldWeight in 206, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.046875 = fieldNorm(doc=206)
      0.2857143 = coord(2/7)
    
    Abstract
    We investigated the sources and effectiveness of search terms used during mediated on-line searching under real-life (as opposed to laboratory) circumstances. A stratified model of information retrieval (IR) interaction served as a framework for the analysis. For the analysis, we used the on-line transaction logs, videotapes, and transcribed dialogue of the presearch and on-line interaction between 40 users and 4 professional intermediaries. Each user provided one question and interacted with one of the four intermediaries. Searching was done using DIALOG. Five sources of search terms were identified: (1) the users' written question statements, (2) terms derived from users' domain knowledge during the interaction, (3) terms extracted from retrieved items as relevance feedback, (4) the database thesaurus, and (5) terms derived by intermediaries during the interaction. Distribution, retrieval effectiveness, transition sequences, and correlation of search terms from different sources were investigated. Search terms from users' written question statements and term relevance feedback were the most productive sources of terms contributing to the retrieval of items judged relevant by users. Implications of the findings are discussed.
    Source
    Journal of the American Society for Information Science. 48(1997) no.8, S.741-761
  6. Shaw, W.M.; Burgin, R.; Howell, P.: Performance standards and evaluations in IR test collections : vector-space and other retrieval models (1997) 0.03
    0.03026769 = product of:
      0.10593691 = sum of:
        0.018832127 = weight(_text_:of in 7259) [ClassicSimilarity], result of:
          0.018832127 = score(doc=7259,freq=14.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.2742677 = fieldWeight in 7259, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=7259)
        0.08710478 = weight(_text_:distribution in 7259) [ClassicSimilarity], result of:
          0.08710478 = score(doc=7259,freq=2.0), product of:
            0.24019864 = queryWeight, product of:
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.043909185 = queryNorm
            0.36263645 = fieldWeight in 7259, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.046875 = fieldNorm(doc=7259)
      0.2857143 = coord(2/7)
    
    Abstract
    Computes low performance standards for each query and for the group of queries in 13 traditional and 4 TREC test collections. Predicted by the hypergeometric distribution, the standards represent the highest level of retrieval effectiveness attributable to chance. Compares operational levels of performance for vector-space, ad-hoc-feature-based, probabilistic, and other retrieval models to the standards. The effectiveness of these techniques in small, traditional test collections can be explained by retrieving a few more relevant documents for most queries than expected by chance. The effectiveness of retrieval techniques in the larger TREC test collections can only be explained by retrieving many more relevant documents for most queries than expected by chance. The discrepancy between deviations from chance in traditional and TREC test collections is due to a decrease in performance standards for large test collections, not to an increase in operational performance. The next generation of information retrieval systems would be enhanced by abandoning uninformative performance summaries and focusing on effectiveness, and improvements in effectiveness, of individual queries.
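    The chance standard described above can be reproduced directly from the hypergeometric distribution. A minimal sketch, assuming SciPy and hypothetical figures (a collection of 1,400 documents, 30 of them relevant, and 50 retrieved at random):

      from scipy.stats import hypergeom

      # SciPy's parameter convention: hypergeom(total, successes, draws).
      rv = hypergeom(1400, 30, 50)

      print(rv.mean())     # relevant documents expected by chance (~1.07)
      print(rv.ppf(0.95))  # a level of relevant retrieval rarely
                           # exceeded by chance alone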
  7. Pao, M.L.: Retrieval differences between term and citation indexing (1989) 0.03
    0.029929973 = product of:
      0.1047549 = sum of:
        0.016438028 = weight(_text_:of in 3566) [ClassicSimilarity], result of:
          0.016438028 = score(doc=3566,freq=6.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.23940048 = fieldWeight in 3566, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3566)
        0.08831687 = weight(_text_:congress in 3566) [ClassicSimilarity], result of:
          0.08831687 = score(doc=3566,freq=2.0), product of:
            0.20946044 = queryWeight, product of:
              4.7703104 = idf(docFreq=1018, maxDocs=44218)
              0.043909185 = queryNorm
            0.42163986 = fieldWeight in 3566, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7703104 = idf(docFreq=1018, maxDocs=44218)
              0.0625 = fieldNorm(doc=3566)
      0.2857143 = coord(2/7)
    
    Abstract
    A retrieval experiment was conducted to compare on-line searching using terms as opposed to citations. This is the first study in which a single database was used to retrieve two equivalent sets for each query, one using terms found in the bibliographic record to achieve higher recall, and the other using documents. Reports on the use of a second citation searching strategy. Overall, by using both types of search keys, the total recall is increased.
    Source
    Information, knowledge, evolution. Proceedings of the 44th FID congress, Helsinki, 28.8.-1.9.1988. Ed. by S. Koshiala and R. Launo
  8. Ruthven, I.; Lalmas, M.; Rijsbergen, K. van: Combining and selecting characteristics of information use (2002) 0.03
    0.029527023 = product of:
      0.103344575 = sum of:
        0.021221403 = weight(_text_:of in 5208) [ClassicSimilarity], result of:
          0.021221403 = score(doc=5208,freq=40.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.3090647 = fieldWeight in 5208, product of:
              6.3245554 = tf(freq=40.0), with freq of:
                40.0 = termFreq=40.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=5208)
        0.082123175 = weight(_text_:distribution in 5208) [ClassicSimilarity], result of:
          0.082123175 = score(doc=5208,freq=4.0), product of:
            0.24019864 = queryWeight, product of:
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.043909185 = queryNorm
            0.34189692 = fieldWeight in 5208, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.03125 = fieldNorm(doc=5208)
      0.2857143 = coord(2/7)
    
    Abstract
    Ruthven, Lalmas, and van Rijsbergen use traditional term importance measures like inverse document frequency; noise, based upon in-document frequency; and term frequency, supplemented by theme value, which is calculated from the differences between the expected and actual positions of words in a text, on the assumption that an even distribution indicates term association with a main topic; and context, which is based on a query term's distance from the nearest other query term relative to the average expected distribution of all query terms in the document. They then define document characteristics like specificity, the sum of all idf values in a document over the total terms in the document; document complexity, measured by the document's average idf value; and information-to-noise ratio, info-noise, tokens after stopping and stemming over tokens before these processes, measuring the ratio of useful to non-useful information in a document. Retrieval tests are then carried out using each characteristic, combinations of the characteristics, and relevance feedback to determine the correct combination of characteristics. A file ranks independently of query terms by both specificity and info-noise, but if the presence of a query term is required, unique rankings are generated. Tested on five standard collections, the traditional characteristics outperformed the new characteristics, which did, however, outperform random retrieval. All possible combinations of characteristics were also tested, both with and without a set of scaling weights applied. All characteristics can benefit from combination with another characteristic or set of characteristics, and performance as a single characteristic is a good indicator of performance in combination. Larger combinations tended to be more effective than smaller ones, and weighting increased precision measures of middle-ranking combinations but decreased the ranking of poorer combinations. The best combinations vary for each collection, and in some collections with the addition of weighting. Finally, with all documents ranked by the all-characteristics combination, they take the top 30 documents and calculate the characteristic scores for each term in both the relevant and the non-relevant sets. Then, taking for each query term the characteristics whose average was higher for relevant than for non-relevant documents, the documents are re-ranked. The relevance feedback method of selecting characteristics can select a good set of characteristics for query terms.
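    A minimal sketch of two of the document characteristics named above, specificity (the sum of idf values over total tokens) and info-noise (tokens kept after stopping over tokens before); the tokenization and stoplist here are ad hoc, and stemming is omitted:

      import math
      from collections import Counter

      docs = [
          "the cat sat on the mat",
          "the retrieval tests rank documents by combinations of characteristics",
      ]
      stop = {"the", "on", "of", "by"}

      N = len(docs)
      df = Counter()
      for d in docs:
          df.update(set(d.split()))

      def idf(term):
          return math.log(N / df[term])

      def specificity(doc):
          toks = doc.split()
          return sum(idf(t) for t in toks) / len(toks)  # mean idf per token

      def info_noise(doc):
          toks = doc.split()
          return len([t for t in toks if t not in stop]) / len(toks)

      for d in docs:
          print(f"specificity={specificity(d):.2f} info-noise={info_noise(d):.2f}")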
    Source
    Journal of the American Society for Information Science and technology. 53(2002) no.5, S.378-396
  9. Drabenstott, K.M.; Vizine-Goetz, D.: Using subject headings for online retrieval : theory, practice and potential (1994) 0.02
    0.0229924 = product of:
      0.0804734 = sum of:
        0.01423575 = weight(_text_:of in 386) [ClassicSimilarity], result of:
          0.01423575 = score(doc=386,freq=8.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.20732689 = fieldWeight in 386, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=386)
        0.06623765 = weight(_text_:congress in 386) [ClassicSimilarity], result of:
          0.06623765 = score(doc=386,freq=2.0), product of:
            0.20946044 = queryWeight, product of:
              4.7703104 = idf(docFreq=1018, maxDocs=44218)
              0.043909185 = queryNorm
            0.31622988 = fieldWeight in 386, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.7703104 = idf(docFreq=1018, maxDocs=44218)
              0.046875 = fieldNorm(doc=386)
      0.2857143 = coord(2/7)
    
    Abstract
    Using Subject Headings for Online Retrieval is an indispensable tool for online system designers who are developing new systems or refining existing ones. The book describes subject analysis and subject searching in online catalogs, including the limitations of retrieval, and demonstrates how such limitations can be overcome through system design and programming. The book describes the Library of Congress Subject Headings system and system characteristics, shows how information is stored in machine-readable files, and offers examples of and recommendations for successful methods. Tables are included to support these recommendations, and diagrams, graphs, and bar charts are used to present the results of data analyses.
  10. Leiva-Mederos, A.; Senso, J.A.; Hidalgo-Delgado, Y.; Hipola, P.: Working framework of semantic interoperability for CRIS with heterogeneous data sources (2017) 0.02
    0.022804378 = product of:
      0.07981532 = sum of:
        0.021745466 = weight(_text_:of in 3706) [ClassicSimilarity], result of:
          0.021745466 = score(doc=3706,freq=42.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.31669703 = fieldWeight in 3706, product of:
              6.4807405 = tf(freq=42.0), with freq of:
                42.0 = termFreq=42.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=3706)
        0.058069855 = weight(_text_:distribution in 3706) [ClassicSimilarity], result of:
          0.058069855 = score(doc=3706,freq=2.0), product of:
            0.24019864 = queryWeight, product of:
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.043909185 = queryNorm
            0.24175763 = fieldWeight in 3706, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4703507 = idf(docFreq=505, maxDocs=44218)
              0.03125 = fieldNorm(doc=3706)
      0.2857143 = coord(2/7)
    
    Abstract
    Purpose: Information from Current Research Information Systems (CRIS) is stored in different formats, on platforms that are not compatible, or even in independent networks. It would be helpful to have a well-defined methodology that allows management data processing from a single site, so as to take advantage of the capacity to link disperse data found in different systems, platforms, sources and/or formats. Based on functionalities and materials of the VLIR project, the purpose of this paper is to present a model that provides for interoperability by means of semantic alignment techniques and metadata crosswalks, and facilitates the fusion of information stored in diverse sources. Design/methodology/approach: After reviewing the state of the art regarding the diverse mechanisms for achieving semantic interoperability, the paper analyzes the following: the specific coverage of the data sets (type of data, thematic coverage and geographic coverage); the technical specifications needed to retrieve and analyze a distribution of the data set (format, protocol, etc.); the conditions of re-utilization (copyright and licenses); and the "dimensions" included in the data set, as well as the semantics of these dimensions (the syntax and the taxonomies of reference). The semantic interoperability framework presented here implements semantic alignment and metadata crosswalks to convert information from three different systems (ABCD, Moodle and DSpace) and to integrate all the databases in a single RDF file. Findings: The paper also includes an evaluation based on the comparison - by means of calculations of recall and precision - of the proposed model and identical consultations made on Open Archives Initiative and SQL, in order to estimate its efficiency. The results were satisfactory, since the semantic interoperability facilitates the exact retrieval of information. Originality/value: The proposed model enhances management of the syntactic and semantic interoperability of the CRIS system designed. In a real setting of use it achieves very positive results.
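    The integration step described under Design/methodology/approach comes down to merging converted records into one RDF graph. A minimal sketch of that final step only, assuming rdflib and hypothetical file names for the already-crosswalked exports (the crosswalks from ABCD, Moodle and DSpace themselves are not shown):

      from rdflib import Graph

      merged = Graph()
      for path in ["abcd.rdf", "moodle.rdf", "dspace.rdf"]:  # hypothetical files
          merged.parse(path)          # parsing into one graph merges the triples
      merged.serialize("cris.rdf", format="xml")  # the single RDF file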
    Source
    Journal of documentation. 73(2017) no.3, S.481-499
  11. Lancaster, F.W.: Evaluation within the environment of an operating information service (1981) 0.02
    0.022558236 = product of:
      0.078953825 = sum of:
        0.016608374 = weight(_text_:of in 3150) [ClassicSimilarity], result of:
          0.016608374 = score(doc=3150,freq=2.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.24188137 = fieldWeight in 3150, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.109375 = fieldNorm(doc=3150)
        0.06234545 = product of:
          0.1246909 = sum of:
            0.1246909 = weight(_text_:service in 3150) [ClassicSimilarity], result of:
              0.1246909 = score(doc=3150,freq=2.0), product of:
                0.18813887 = queryWeight, product of:
                  4.284727 = idf(docFreq=1655, maxDocs=44218)
                  0.043909185 = queryNorm
                0.6627599 = fieldWeight in 3150, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.284727 = idf(docFreq=1655, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3150)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
  12. Bodoff, D.; Kambil, A.: Partial coordination : II. A preliminary evaluation and failure analysis (1998) 0.02
    0.02233565 = product of:
      0.07817477 = sum of:
        0.01423575 = weight(_text_:of in 2323) [ClassicSimilarity], result of:
          0.01423575 = score(doc=2323,freq=8.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.20732689 = fieldWeight in 2323, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2323)
        0.06393902 = weight(_text_:cataloging in 2323) [ClassicSimilarity], result of:
          0.06393902 = score(doc=2323,freq=4.0), product of:
            0.17305137 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.043909185 = queryNorm
            0.36948 = fieldWeight in 2323, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.046875 = fieldNorm(doc=2323)
      0.2857143 = coord(2/7)
    
    Abstract
    Partial coordination is a new method for cataloging documents for subject access. It is especially designed to enhance the precision of document searches in online environments. This article reports a preliminary evaluation of partial coordination that shows promising results compared with full-text retrieval. We also report the difficulties in empirically evaluating the effectiveness of automatic full-text retrieval in contrast to mixed methods such as partial coordination, which combine human cataloging with computerized retrieval. Based on our study, we propose that research in this area will substantially benefit from a common framework for failure analysis and a common data set. This will allow information retrieval researchers adapting 'library-style' cataloging to large electronic document collections, as well as those developing automated or mixed methods, to directly compare their proposals for indexing and retrieval. The article concludes by suggesting guidelines for constructing such a testbed.
    Source
    Journal of the American Society for Information Science. 49(1998) no.14, S.1270-1282
  13. Lancaster, F.W.; Connell, T.H.; Bishop, N.; McCowan, S.: Identifying barriers to effective subject access in library catalogs (1991) 0.02
    0.02134795 = product of:
      0.07471782 = sum of:
        0.021970814 = weight(_text_:of in 2259) [ClassicSimilarity], result of:
          0.021970814 = score(doc=2259,freq=14.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.31997898 = fieldWeight in 2259, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2259)
        0.052747004 = weight(_text_:cataloging in 2259) [ClassicSimilarity], result of:
          0.052747004 = score(doc=2259,freq=2.0), product of:
            0.17305137 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.043909185 = queryNorm
            0.30480546 = fieldWeight in 2259, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2259)
      0.2857143 = coord(2/7)
    
    Abstract
    51 subject searches were performed in an online catalog containing about 4.5 million records. Their success was judged in terms of lists of items, known to be relevant to the various topics, compiled by subject specialists (faculty members or authors of articles in specialized encyclopedias). Many of the items known to be relevant were not retrieved, even in very broad searches that sometimes retrieved several hundred records, and very little could be done to make them retrievable within the constraints of present cataloging practice. Librarians should recognize that library catalogs, as now implemented, offer only the most primitive of subject access and should seek to develop different types of subject access tools. - See also the letter by B.H. Weinberg in: LTRS 36(1992) S.123-124.
  14. Schultz Jr., W.N.; Braddy, L.: ¬A librarian-centered study of perceptions of subject terms and controlled vocabulary (2017) 0.02
    0.02134795 = product of:
      0.07471782 = sum of:
        0.021970814 = weight(_text_:of in 5156) [ClassicSimilarity], result of:
          0.021970814 = score(doc=5156,freq=14.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.31997898 = fieldWeight in 5156, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5156)
        0.052747004 = weight(_text_:cataloging in 5156) [ClassicSimilarity], result of:
          0.052747004 = score(doc=5156,freq=2.0), product of:
            0.17305137 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.043909185 = queryNorm
            0.30480546 = fieldWeight in 5156, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5156)
      0.2857143 = coord(2/7)
    
    Abstract
    Controlled vocabulary and subject headings in OPAC records have proven to be useful in improving search results. The authors used a survey to gather information about librarian opinions and professional use of controlled vocabulary. Data from respondents with a range of backgrounds and expertise were examined, including academic and public libraries, and technical services as well as public services professionals. Responses overall demonstrated positive opinions of the value of controlled vocabulary, including in reference interactions as well as during bibliographic instruction sessions. Results are also examined based upon factors such as age and type of librarian.
    Source
    Cataloging and classification quarterly. 55(2017) no.7/8, S.456-466
  15. Wien, C.: Sample sizes and composition : their effect on recall and precision in IR experiments with OPACs (2000) 0.02
    0.020375926 = product of:
      0.071315736 = sum of:
        0.018568728 = weight(_text_:of in 5368) [ClassicSimilarity], result of:
          0.018568728 = score(doc=5368,freq=10.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.2704316 = fieldWeight in 5368, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5368)
        0.052747004 = weight(_text_:cataloging in 5368) [ClassicSimilarity], result of:
          0.052747004 = score(doc=5368,freq=2.0), product of:
            0.17305137 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.043909185 = queryNorm
            0.30480546 = fieldWeight in 5368, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5368)
      0.2857143 = coord(2/7)
    
    Abstract
    This article discusses how samples of records for laboratory IR experiments on OPACs can be constructed so that results obtained from different experiments can be compared. The literature on laboratory IR experiments indicates that retrieval effectiveness (recall and precision) is affected by the way the samples of records for such experiments are generated; in particular, the number of records and the subject-area coverage of the records seem to affect retrieval effectiveness. This article contains suggestions for the construction of samples for laboratory IR experiments on OPACs and demonstrates that retrieval effectiveness is affected by differences in sample size and composition.
    Source
    Cataloging and classification quarterly. 29(2000) no.4, S.73-86
  16. Wilkes, A.; Nelson, A.: Subject searching in two online catalogs : authority control vs. non authority control (1995) 0.02
    0.019815823 = product of:
      0.069355376 = sum of:
        0.016608374 = weight(_text_:of in 4450) [ClassicSimilarity], result of:
          0.016608374 = score(doc=4450,freq=8.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.24188137 = fieldWeight in 4450, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4450)
        0.052747004 = weight(_text_:cataloging in 4450) [ClassicSimilarity], result of:
          0.052747004 = score(doc=4450,freq=2.0), product of:
            0.17305137 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.043909185 = queryNorm
            0.30480546 = fieldWeight in 4450, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4450)
      0.2857143 = coord(2/7)
    
    Abstract
    Compares the results of subject searching in 2 online catalogue systems, one with authority control, the other without. Transaction logs from Library A (no authority control) were analyzed to identify the searching patterns of users; 885 searches were attempted, 351 (39.7%) by subject. 142 (40.6%) of these subject searches were unsuccessful. Identical searches were performed in a comparable library that has authority control, Library B. Terms identified in 'see' references at Library B were searched in Library A. 105 (73.9%) of the searches that appeared to fail would have retrieved at least one, and usually many, records if a link had been provided between the term chosen by the user and the term used by the system.
    Source
    Cataloging and classification quarterly. 20(1995) no.4, S.57-79
  17. Beall, J.; Kafadar, K.: Measuring typographical errors' impact on retrieval in bibliographic databases (2007) 0.02
    0.019662583 = product of:
      0.06881904 = sum of:
        0.023607321 = weight(_text_:of in 261) [ClassicSimilarity], result of:
          0.023607321 = score(doc=261,freq=22.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.34381276 = fieldWeight in 261, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=261)
        0.045211717 = weight(_text_:cataloging in 261) [ClassicSimilarity], result of:
          0.045211717 = score(doc=261,freq=2.0), product of:
            0.17305137 = queryWeight, product of:
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.043909185 = queryNorm
            0.26126182 = fieldWeight in 261, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9411201 = idf(docFreq=2334, maxDocs=44218)
              0.046875 = fieldNorm(doc=261)
      0.2857143 = coord(2/7)
    
    Abstract
    Typographical errors can block access to records in online catalogs, but when a word contains a typo and is also spelled correctly elsewhere in the same record, access may not be blocked. To quantify the effect of typographical errors in records on information retrieval, we conducted a study to measure the proportion of records that contain a typographical error but that do not also contain a correct spelling of the same word. This article presents the experimental design, the results of the study, and a statistical analysis of the results. We find that the average proportion of records that are blocked by the presence of a typo (that is, records in which a correct spelling of the word does not also occur) ranges from 35% to 99%, depending upon the frequency of the word being searched and the likelihood of the word being misspelled.
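    A minimal sketch of the blocking measure described above: among records containing a given typo, the share in which the correctly spelled word does not also occur; the records and word pair here are hypothetical:

      records = [
          "retreival of bibliographic records",
          "information retreival and retrieval evaluation",
          "typographical errors block retreival",
      ]
      typo, correct = "retreival", "retrieval"

      with_typo = [r for r in records if typo in r.split()]
      blocked = [r for r in with_typo if correct not in r.split()]
      print(f"{len(blocked) / len(with_typo):.0%} of typo records are blocked")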
    Footnote
    Simultaneously published as Cataloger, Editor, and Scholar: Essays in Honor of Ruth C. Carter
    Source
    Cataloging and classification quarterly. 44(2007) nos.3/4, S.197-211
  18. Serrano Cobos, J.; Quintero Orta, A.: Design, development and management of an information recovery system for an Internet Website : from documentary theory to practice (2003) 0.02
    0.01932375 = product of:
      0.06763312 = sum of:
        0.021353623 = weight(_text_:of in 2726) [ClassicSimilarity], result of:
          0.021353623 = score(doc=2726,freq=18.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.3109903 = fieldWeight in 2726, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2726)
        0.046279497 = product of:
          0.092558995 = sum of:
            0.092558995 = weight(_text_:service in 2726) [ClassicSimilarity], result of:
              0.092558995 = score(doc=2726,freq=6.0), product of:
                0.18813887 = queryWeight, product of:
                  4.284727 = idf(docFreq=1655, maxDocs=44218)
                  0.043909185 = queryNorm
                0.49197167 = fieldWeight in 2726, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.284727 = idf(docFreq=1655, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2726)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    A real case study is presented, explaining in a timeline the whole process of design, development and evaluation of a search engine used as a navigational help tool for end users and clients on a content, e-commerce-driven website. The site is a community website, which determines the core design of the information service. The study involves several steps, such as information recovery system analysis, comparative analysis of other commercial search engines, service design, functionalities and scope, software selection, project design, project management, future service administration, and conclusions.
    Source
    Challenges in knowledge representation and organization for the 21st century: Integration of knowledge across boundaries. Proceedings of the 7th ISKO International Conference Granada, Spain, July 10-13, 2002. Ed.: M. López-Huertas
  19. Dalrymple, P.W.: Retrieval by reformulation in two library catalogs : toward a cognitive model of searching behavior (1990) 0.02
    0.018608976 = product of:
      0.06513141 = sum of:
        0.02348779 = weight(_text_:of in 5089) [ClassicSimilarity], result of:
          0.02348779 = score(doc=5089,freq=4.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.34207192 = fieldWeight in 5089, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.109375 = fieldNorm(doc=5089)
        0.04164362 = product of:
          0.08328724 = sum of:
            0.08328724 = weight(_text_:22 in 5089) [ClassicSimilarity], result of:
              0.08328724 = score(doc=5089,freq=2.0), product of:
                0.15376249 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043909185 = queryNorm
                0.5416616 = fieldWeight in 5089, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=5089)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Date
    22. 7.2006 18:43:54
    Source
    Journal of the American Society for Information Science. 41(1990) no.4, S.272-281
  20. Voorhees, E.M.; Harman, D.: Overview of the Sixth Text REtrieval Conference (TREC-6) (2000) 0.02
    0.016643427 = product of:
      0.058251992 = sum of:
        0.016608374 = weight(_text_:of in 6438) [ClassicSimilarity], result of:
          0.016608374 = score(doc=6438,freq=2.0), product of:
            0.06866331 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.043909185 = queryNorm
            0.24188137 = fieldWeight in 6438, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.109375 = fieldNorm(doc=6438)
        0.04164362 = product of:
          0.08328724 = sum of:
            0.08328724 = weight(_text_:22 in 6438) [ClassicSimilarity], result of:
              0.08328724 = score(doc=6438,freq=2.0), product of:
                0.15376249 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043909185 = queryNorm
                0.5416616 = fieldWeight in 6438, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6438)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Date
    11. 8.2001 16:22:19

Types

  • a 367
  • s 14
  • m 8
  • el 6
  • r 4
  • x 2
  • d 1
  • p 1