Search (950 results, page 2 of 48)

Robertson, S.: The state of information retrieval : a researcher's view 0.05

0.051434852 = product of:
  0.102869704 = sum of:
    0.033397563 = weight(_text_:retrieval in 1944) [ClassicSimilarity], result of:
      0.033397563 = score(doc=1944,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.26736724 = fieldWeight in 1944, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=1944)
    0.03422346 = weight(_text_:use in 1944) [ClassicSimilarity], result of:
      0.03422346 = score(doc=1944,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.27065295 = fieldWeight in 1944, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0625 = fieldNorm(doc=1944)
    0.019957775 = weight(_text_:of in 1944) [ClassicSimilarity], result of:
      0.019957775 = score(doc=1944,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.3090647 = fieldWeight in 1944, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=1944)
    0.015290912 = product of:
      0.030581824 = sum of:
        0.030581824 = weight(_text_:on in 1944) [ClassicSimilarity], result of:
          0.030581824 = score(doc=1944,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.33671528 = fieldWeight in 1944, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=1944)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: For the last ten years Stephen Robertson has been a researcher at the Microsoft Research Laboratory. He previously spent twenty years at City University, where he started the Centre for Interactive Systems Research and still retains a part-time professorship. His work on probabilistic theory underpins the algorithms behind every serious search engine today. In his talk, he gave a non-technical overview of some current concerns of core IR research, in particular on the use of different kinds of evidence in searching and ranking.
Content: Presentation at ISKO meeting on June 26, 2008 at University College, London.
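
The per-term figures in the score explanation for this record follow TF-IDF arithmetic of the kind Lucene's ClassicSimilarity reports, and a single term weight can be reproduced by hand. The sketch below is only an illustration, not the search engine's own code: the function names are hypothetical, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption chosen because it reproduces the idf values printed in the explanation.

```python
import math


def idf(doc_freq, max_docs):
    # Assumed formula; it reproduces idf=3.024915 for docFreq=5836, maxDocs=44218.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # tf is the square root of the raw term frequency (1.4142135 for freq=2.0).
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    # queryWeight = idf * queryNorm
    query_weight = term_idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * term_idf * field_norm
    # The term score is the product of both weights.
    return query_weight * field_weight


# Values copied from the "retrieval" clause of doc 1944.
score = term_score(
    freq=2.0,
    doc_freq=5836,
    max_docs=44218,
    query_norm=0.041294612,
    field_norm=0.0625,
)
print(round(score, 6))  # compare with 0.033397563 in the explanation
```

Small differences in the last digits are to be expected, because the explanation prints single-precision values while Python computes in double precision.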

Francu, V.: Does convenience trump accuracy? : the avatars of the UDC in Romania (2007) 0.05

0.05142991 = product of:
  0.10285982 = sum of:
    0.041327372 = weight(_text_:retrieval in 544) [ClassicSimilarity], result of:
      0.041327372 = score(doc=544,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33085006 = fieldWeight in 544, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=544)
    0.029945528 = weight(_text_:use in 544) [ClassicSimilarity], result of:
      0.029945528 = score(doc=544,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23682132 = fieldWeight in 544, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=544)
    0.020662563 = weight(_text_:of in 544) [ClassicSimilarity], result of:
      0.020662563 = score(doc=544,freq=14.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.31997898 = fieldWeight in 544, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=544)
    0.010924355 = product of:
      0.02184871 = sum of:
        0.02184871 = weight(_text_:on in 544) [ClassicSimilarity], result of:
          0.02184871 = score(doc=544,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24056101 = fieldWeight in 544, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=544)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: This paper will concentrate on some major issues regarding the potential of UDC and the current controversy about its use in Romania: i) the importance of hierarchical structures in controlled vocabularies with a direct impact on improved information retrieval given by the browsing function which enables visualizing the hierarchies in subject areas rather than just locating a particular topic; ii) the lack of popularity of the UDC as an indexing and information retrieval language among its users, be they librarians or end users of library OPACs; and iii) the situation of UDC teachers and teaching in Romanian universities.

Styltsvig, H.B.: Ontology-based information retrieval (2006) 0.05
```
0.050694928 = product of:
  0.101389855 = sum of:
    0.047231287 = weight(_text_:retrieval in 1154) [ClassicSimilarity], result of:
      0.047231287 = score(doc=1154,freq=16.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.37811437 = fieldWeight in 1154, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=1154)
    0.024199642 = weight(_text_:use in 1154) [ClassicSimilarity], result of:
      0.024199642 = score(doc=1154,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.19138055 = fieldWeight in 1154, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.03125 = fieldNorm(doc=1154)
    0.02231347 = weight(_text_:of in 1154) [ClassicSimilarity], result of:
      0.02231347 = score(doc=1154,freq=50.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.34554482 = fieldWeight in 1154, product of:
          7.071068 = tf(freq=50.0), with freq of:
            50.0 = termFreq=50.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=1154)
    0.007645456 = product of:
      0.015290912 = sum of:
        0.015290912 = weight(_text_:on in 1154) [ClassicSimilarity], result of:
          0.015290912 = score(doc=1154,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.16835764 = fieldWeight in 1154, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.03125 = fieldNorm(doc=1154)
      0.5 = coord(1/2)
  0.5 = coord(4/8)
```
Abstract

In this thesis, we will present methods for introducing ontologies in information retrieval. The main hypothesis is that the inclusion of conceptual knowledge such as ontologies in the information retrieval process can contribute to the solution of major problems currently found in information retrieval. This utilization of ontologies has a number of challenges. Our focus is on the use of similarity measures derived from the knowledge about relations between concepts in ontologies, the recognition of semantic information in texts and the mapping of this knowledge into the ontologies in use, as well as how to fuse together the ideas of ontological similarity and ontological indexing into a realistic information retrieval scenario. To achieve the recognition of semantic knowledge in a text, shallow natural language processing is used during indexing that reveals knowledge to the level of noun phrases. Furthermore, we briefly cover the identification of semantic relations inside and between noun phrases, as well as discuss which kind of problems are caused by an increase in compoundness with respect to the structure of concepts in the evaluation of queries. Measuring similarity between concepts based on distances in the structure of the ontology is discussed. In addition, a shared nodes measure is introduced and, based on a set of intuitive similarity properties, compared to a number of different measures. In this comparison the shared nodes measure appears to be superior, though more computationally complex. Some of the major problems of shared nodes which relate to the way relations differ with respect to the degree they bring the concepts they connect closer are discussed. A generalized measure called weighted shared nodes is introduced to deal with these problems. Finally, the utilization of concept similarity in query evaluation is discussed. 
A semantic expansion approach that incorporates concept similarity is introduced and a generalized fuzzy set retrieval model that applies expansion during query evaluation is presented. While not commonly used in present information retrieval systems, it appears that the fuzzy set model comprises the flexibility needed when generalizing to an ontology-based retrieval model and, with the introduction of a hierarchical fuzzy aggregation principle, compound concepts can be handled in a straightforward and natural manner.

Content

A dissertation presented to the Faculties of Roskilde University in partial fulfillment of the requirement for the degree of Doctor of Philosophy. Cf.: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.117.987 or http://coitweb.uncc.edu/~ras/RS/Onto-Retrieval.pdf.

Gödert, W.: Detecting multiword phrases in mathematical text corpora (2012) 0.05

0.050031796 = product of:
  0.10006359 = sum of:
    0.033397563 = weight(_text_:retrieval in 466) [ClassicSimilarity], result of:
      0.033397563 = score(doc=466,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.26736724 = fieldWeight in 466, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=466)
    0.03422346 = weight(_text_:use in 466) [ClassicSimilarity], result of:
      0.03422346 = score(doc=466,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.27065295 = fieldWeight in 466, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0625 = fieldNorm(doc=466)
    0.023614356 = weight(_text_:of in 466) [ClassicSimilarity], result of:
      0.023614356 = score(doc=466,freq=14.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.36569026 = fieldWeight in 466, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=466)
    0.008828212 = product of:
      0.017656423 = sum of:
        0.017656423 = weight(_text_:on in 466) [ClassicSimilarity], result of:
          0.017656423 = score(doc=466,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.19440265 = fieldWeight in 466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=466)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: We present an approach for detecting multiword phrases in mathematical text corpora. The method used is based on characteristic features of mathematical terminology. It makes use of a software tool named Lingo which makes it possible to identify words by means of previously defined dictionaries for specific word classes such as adjectives, personal names or nouns. The detection of multiword groups is done algorithmically. Possible advantages of the method for indexing and information retrieval and conclusions for applying dictionary-based methods of automatic indexing instead of stemming procedures are discussed.

Hjoerland, B.: Information retrieval and knowledge organization : a perspective from the philosophy of science 0.05

0.049509183 = product of:
  0.099018365 = sum of:
    0.035423465 = weight(_text_:retrieval in 206) [ClassicSimilarity], result of:
      0.035423465 = score(doc=206,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 206, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=206)
    0.016396983 = weight(_text_:of in 206) [ClassicSimilarity], result of:
      0.016396983 = score(doc=206,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25392252 = fieldWeight in 206, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=206)
    0.009363732 = product of:
      0.018727465 = sum of:
        0.018727465 = weight(_text_:on in 206) [ClassicSimilarity], result of:
          0.018727465 = score(doc=206,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.20619515 = fieldWeight in 206, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=206)
      0.5 = coord(1/2)
    0.037834182 = product of:
      0.075668365 = sum of:
        0.075668365 = weight(_text_:computers in 206) [ClassicSimilarity], result of:
          0.075668365 = score(doc=206,freq=2.0), product of:
            0.21710795 = queryWeight, product of:
              5.257537 = idf(docFreq=625, maxDocs=44218)
              0.041294612 = queryNorm
            0.34852874 = fieldWeight in 206, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.257537 = idf(docFreq=625, maxDocs=44218)
              0.046875 = fieldNorm(doc=206)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Information retrieval (IR) is about making systems for finding documents or information. Knowledge organization (KO) is the field concerned with indexing, classification, and representing documents for IR, browsing, and related processes, whether performed by humans or computers. The field of IR is today dominated by search engines like Google. An important difference between KO and IR as research fields is that KO attempts to reflect knowledge as depicted by contemporary scholarship, in contrast to IR, which is based on, for example, "match" techniques, popularity measures or personalization principles. The classification of documents in KO mostly aims at reflecting the classification of knowledge in the sciences. Books about birds, for example, mostly reflect (or aim at reflecting) how birds are classified in ornithology. KO therefore requires access to the adequate subject knowledge; however, this is often characterized by disagreements. At the deepest layer, such disagreements are based on philosophical issues best characterized as "paradigms". No IR technology and no system of knowledge organization can ever be neutral in relation to paradigmatic conflicts, and therefore such philosophical problems represent the basis for the study of IR and KO.

Tomassen, S.L.: Research on ontology-driven information retrieval (2006 (?)) 0.05

0.049391042 = product of:
  0.098782085 = sum of:
    0.050096344 = weight(_text_:retrieval in 4328) [ClassicSimilarity], result of:
      0.050096344 = score(doc=4328,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.40105087 = fieldWeight in 4328, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4328)
    0.025667597 = weight(_text_:use in 4328) [ClassicSimilarity], result of:
      0.025667597 = score(doc=4328,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 4328, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=4328)
    0.016396983 = weight(_text_:of in 4328) [ClassicSimilarity], result of:
      0.016396983 = score(doc=4328,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25392252 = fieldWeight in 4328, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=4328)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 4328) [ClassicSimilarity], result of:
          0.013242318 = score(doc=4328,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 4328, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=4328)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: An increasing number of recent information retrieval systems make use of ontologies to help the users clarify their information needs and come up with semantic representations of documents. A particular concern here is the integration of these semantic approaches with traditional search technology. The research presented in this paper examines how ontologies can be efficiently applied to large-scale search systems for the web. We describe how these systems can be enriched with adapted ontologies to provide both an in-depth understanding of the user's needs as well as an easy integration with standard vector-space retrieval systems. The ontology concepts are adapted to the domain terminology by computing a feature vector for each concept. Later, the feature vectors are used to enrich a provided query. The whole retrieval system is under development as part of a larger Semantic Web standardization project for the Norwegian oil & gas sector.

Svensson, L.G.; Jahns, Y.: PDF, CSV, RSS and other Acronyms : redefining the bibliographic services in the German National Library (2010) 0.05

0.04892317 = product of:
  0.13046178 = sum of:
    0.020873476 = weight(_text_:retrieval in 3970) [ClassicSimilarity], result of:
      0.020873476 = score(doc=3970,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.16710453 = fieldWeight in 3970, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3970)
    0.015778005 = weight(_text_:of in 3970) [ClassicSimilarity], result of:
      0.015778005 = score(doc=3970,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.24433708 = fieldWeight in 3970, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3970)
    0.0938103 = sum of:
      0.022070529 = weight(_text_:on in 3970) [ClassicSimilarity], result of:
        0.022070529 = score(doc=3970,freq=8.0), product of:
          0.090823986 = queryWeight, product of:
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.041294612 = queryNorm
          0.24300331 = fieldWeight in 3970, product of:
            2.828427 = tf(freq=8.0), with freq of:
              8.0 = termFreq=8.0
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3970)
      0.07173977 = weight(_text_:line in 3970) [ClassicSimilarity], result of:
        0.07173977 = score(doc=3970,freq=2.0), product of:
          0.23157367 = queryWeight, product of:
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.041294612 = queryNorm
          0.30979243 = fieldWeight in 3970, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3970)
  0.375 = coord(3/8)

Abstract: In January 2010, the German National Library discontinued the print version of the national bibliography and replaced it with an online journal. This was the first step in a longer process of redefining the National Library's bibliographic services, leaving the field of traditional media - e. g. paper or CD-ROM databases - and focusing on publishing its data over the WWW. A new business model was set up - all web resources are now published in an extra bibliography series and the bibliographic data are freely available. Step by step the prices of the other bibliographic data will also be reduced. In the second stage of the project, the focus is on value-added services based on the National Library's catalogue. The main purpose is to introduce alerting services based on the user's search criteria offering different access methods such as RSS feeds, integration with e. g. Zotero, or export of the bibliographic data as a CSV or PDF file. Current standards of cataloguing remain a guide line to offer high-value end-user retrieval but they will be supplemented by automated indexing procedures to find & browse the growing number of documents. A transparent cataloguing policy and well-arranged selection menus are aimed for.
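
The way the clause scores for this record roll up into its overall score of 0.04892317 can be checked with a few lines of arithmetic. The sketch below is a hedged illustration only: `coord` and `combine` are hypothetical helper names, and the reading that `coord(3/8)` means 3 of 8 query clauses matched is an assumption drawn from the explanation tree, in which the nested "on"/"line" group is a plain sum with no coord factor of its own.

```python
def coord(overlap, max_overlap):
    # Fraction of query clauses that matched, e.g. coord(3/8) = 0.375.
    return overlap / max_overlap


def combine(clause_scores, overlap, max_overlap):
    # Sum the matching clause scores, then scale by the coord factor.
    return sum(clause_scores) * coord(overlap, max_overlap)


# Nested group for doc 3970: "on" + "line" (plain sum in the explanation).
nested = sum([0.022070529, 0.07173977])

# Top level for doc 3970: "retrieval" + "of" + nested group, scaled by coord(3/8).
total = combine([0.020873476, 0.015778005, nested], overlap=3, max_overlap=8)
print(round(total, 6))  # compare with 0.04892317 in the explanation
```

In records whose nested group carries `0.5 = coord(1/2)`, the same `combine` helper applies to the inner group with `overlap=1, max_overlap=2` before its result enters the outer sum.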

Rozman, D.; Rifl, B.: Universal Decimal Classification in Slovenia (2007) 0.05

0.048300974 = product of:
  0.1288026 = sum of:
    0.030249555 = weight(_text_:use in 2528) [ClassicSimilarity], result of:
      0.030249555 = score(doc=2528,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23922569 = fieldWeight in 2528, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2528)
    0.015778005 = weight(_text_:of in 2528) [ClassicSimilarity], result of:
      0.015778005 = score(doc=2528,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.24433708 = fieldWeight in 2528, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2528)
    0.082775034 = sum of:
      0.0110352645 = weight(_text_:on in 2528) [ClassicSimilarity], result of:
        0.0110352645 = score(doc=2528,freq=2.0), product of:
          0.090823986 = queryWeight, product of:
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.041294612 = queryNorm
          0.121501654 = fieldWeight in 2528, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2528)
      0.07173977 = weight(_text_:line in 2528) [ClassicSimilarity], result of:
        0.07173977 = score(doc=2528,freq=2.0), product of:
          0.23157367 = queryWeight, product of:
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.041294612 = queryNorm
          0.30979243 = fieldWeight in 2528, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2528)
  0.375 = coord(3/8)

Abstract: In Slovenia, most libraries use the UDC system for cataloguing purposes. Open-access shelving with UDC also has a long tradition in Slovenian public libraries and in some academic libraries. The last printed Slovenian UDC edition dates from 1991. This outdated edition included a very short guide to the use of UDC and about 11000 notations. In the National and University Library, Ljubljana, Slovenia, a team of a coordinator, editors, translators and a computer programmer has been formed to prepare a Slovenian translation of UDC version UDC MRF 2001. The on line edition in the format ISO 2709 has kept the original data structure. Searching by UDC numbers, precise searching and full text searching of UDC explanations, notes, examples, etc. have been provided. There are many links in the application which guide the users to UDC numbers. Thus, the appropriate UDC number can be recognized and chosen. Those parts of the application superstructure are especially user friendly and reviewable. Access to the UDC database is controlled. The basics of UDC are explained in the new Slovenian manual »Univerzalna decimalna klasifikacija« published by the National and University Library in Ljubljana in 2006. The authors have created a short, clear and useful manual for beginners as well as for experienced librarians who want to classify and arrange their library holdings in new and innovative ways. In the paper, a description of the characteristics of the Slovenian UDC manual is presented and also some proposals for future developments in UDC are expressed.

Oard, D.W.: Alternative approaches for cross-language text retrieval (1997) 0.05
```
0.048154086 = product of:
  0.09630817 = sum of:
    0.056589838 = weight(_text_:retrieval in 1164) [ClassicSimilarity], result of:
      0.056589838 = score(doc=1164,freq=30.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.45303512 = fieldWeight in 1164, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1164)
    0.014972764 = weight(_text_:use in 1164) [ClassicSimilarity], result of:
      0.014972764 = score(doc=1164,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.11841066 = fieldWeight in 1164, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1164)
    0.017020877 = weight(_text_:of in 1164) [ClassicSimilarity], result of:
      0.017020877 = score(doc=1164,freq=38.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2635841 = fieldWeight in 1164, product of:
          6.164414 = tf(freq=38.0), with freq of:
            38.0 = termFreq=38.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1164)
    0.007724685 = product of:
      0.01544937 = sum of:
        0.01544937 = weight(_text_:on in 1164) [ClassicSimilarity], result of:
          0.01544937 = score(doc=1164,freq=8.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.17010231 = fieldWeight in 1164, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1164)
      0.5 = coord(1/2)
  0.5 = coord(4/8)
```
Abstract

The explosive growth of the Internet and other sources of networked information have made automatic mediation of access to networked information sources an increasingly important problem. Much of this information is expressed as electronic text, and it is becoming practical to automatically convert some printed documents and recorded speech to electronic text as well. Thus, automated systems capable of detecting useful documents are finding widespread application. With even a small number of languages it can be inconvenient to issue the same query repeatedly in every language, so users who are able to read more than one language will likely prefer a multilingual text retrieval system over a collection of monolingual systems. And since reading ability in a language does not always imply fluent writing ability in that language, such users will likely find cross-language text retrieval particularly useful for languages in which they are less confident of their ability to express their information needs effectively. The use of such systems can also be beneficial if the user is able to read only a single language. For example, when only a small portion of the document collection will ever be examined by the user, performing retrieval before translation can be significantly more economical than performing translation before retrieval. So when the application is sufficiently important to justify the time and effort required for translation, those costs can be minimized if an effective cross-language text retrieval system is available. Even when translation is not available, there are circumstances in which cross-language text retrieval could be useful to a monolingual user. For example, a researcher might find a paper published in an unfamiliar language useful if that paper contains references to works by the same author that are in the researcher's native language.
Multilingual text retrieval can be defined as selection of useful documents from collections that may contain several languages (English, French, Chinese, etc.). This formulation allows for the possibility that individual documents might contain more than one language, a common occurrence in some applications. Both cross-language and within-language retrieval are included in this formulation, but it is the cross-language aspect of the problem which distinguishes multilingual text retrieval from its well studied monolingual counterpart. At the SIGIR 96 workshop on "Cross-Linguistic Information Retrieval" the participants discussed the proliferation of terminology being used to describe the field and settled on "Cross-Language" as the best single description of the salient aspect of the problem. "Multilingual" was felt to be too broad, since that term has also been used to describe systems able to perform within-language retrieval in more than one language but that lack any cross-language capability. "Cross-lingual" and "cross-linguistic" were felt to be equally good descriptions of the field, but "crosslanguage" was selected as the preferred term in the interest of standardization. Unfortunately, at about the same time the U.S. Defense Advanced Research Projects Agency (DARPA) introduced "translingual" as their preferred term, so we are still some distance from reaching consensus on this matter.
I will not attempt to draw a sharp distinction between retrieval and filtering in this survey. Although my own work on adaptive cross-language text filtering has led me to make this distinction fairly carefully in other presentations (c.f., (Oard 1997b)), such an proach does little to help understand the fundamental techniques which have been applied or the results that have been obtained in this case. Since it is still common to view filtering (detection of useful documents in dynamic document streams) as a kind of retrieval, will simply adopt that perspective here.

Theme: Semantic context in indexing and retrieval

Rindflesch, T.C.; Aronson, A.R.: Semantic processing in information retrieval (1993) 0.05

0.047558818 = product of:
  0.12682351 = sum of:
    0.06534432 = weight(_text_:retrieval in 4121) [ClassicSimilarity], result of:
      0.06534432 = score(doc=4121,freq=10.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.5231199 = fieldWeight in 4121, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4121)
    0.042349376 = weight(_text_:use in 4121) [ClassicSimilarity], result of:
      0.042349376 = score(doc=4121,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.33491597 = fieldWeight in 4121, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4121)
    0.019129815 = weight(_text_:of in 4121) [ClassicSimilarity], result of:
      0.019129815 = score(doc=4121,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.29624295 = fieldWeight in 4121, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4121)
  0.375 = coord(3/8)
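
The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the values printed in the listing. The function names are illustrative only and are not part of any Lucene API. The frequencies, document frequencies, fieldNorm, queryNorm and coord factor are copied from the tree for doc 4121.

```python
import math

MAX_DOCS = 44218          # maxDocs, as printed in the explain output
QUERY_NORM = 0.041294612  # queryNorm, as printed in the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# (termFreq, docFreq) for "retrieval", "use" and "of" in doc 4121
terms = [(10.0, 5836), (4.0, 5623), (12.0, 25162)]
field_norm = 0.0546875

total = sum(term_score(freq, df, field_norm) for freq, df in terms)
score = total * (3 / 8)  # coord(3/8): 3 of the 8 query clauses matched

print(f"sum of term scores: {total:.6f}")
print(f"final score:        {score:.6f}")
```

Run as written, the result should agree with the listed score of 0.047558818 up to floating-point rounding, since Lucene computes these values in single precision while Python uses double precision.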

Abstract: Intuition suggests that one way to enhance the information retrieval process would be the use of phrases to characterize the contents of text. A number of researchers, however, have noted that phrases alone do not improve retrieval effectiveness. In this paper we briefly review the use of phrases in information retrieval and then suggest extensions to this paradigm using semantic information. We claim that semantic processing, which can be viewed as expressing relations between the concepts represented by phrases, will in fact enhance retrieval effectiveness. The availability of the UMLS® domain model, which we exploit extensively, significantly contributes to the feasibility of this processing.

Sojka, P.; Liska, M.: ¬The art of mathematics retrieval (2011) 0.05

0.046735823 = product of:
  0.09347165 = sum of:
    0.041327372 = weight(_text_:retrieval in 3450) [ClassicSimilarity], result of:
      0.041327372 = score(doc=3450,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33085006 = fieldWeight in 3450, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3450)
    0.013526822 = weight(_text_:of in 3450) [ClassicSimilarity], result of:
      0.013526822 = score(doc=3450,freq=6.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.20947541 = fieldWeight in 3450, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3450)
    0.010924355 = product of:
      0.02184871 = sum of:
        0.02184871 = weight(_text_:on in 3450) [ClassicSimilarity], result of:
          0.02184871 = score(doc=3450,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24056101 = fieldWeight in 3450, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3450)
      0.5 = coord(1/2)
    0.027693095 = product of:
      0.05538619 = sum of:
        0.05538619 = weight(_text_:22 in 3450) [ClassicSimilarity], result of:
          0.05538619 = score(doc=3450,freq=4.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.38301262 = fieldWeight in 3450, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3450)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, is presented, and design decisions are discussed. We argue for an approach based on Presentation MathML using a similarity of math subformulae. The system was implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene. Scalability issues were checked against more than 400,000 arXiv documents with 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene.
Content: Cf.: DocEng2011, September 19-22, 2011, Mountain View, California, USA. Copyright 2011 ACM 978-1-4503-0863-2/11/09
Date: 22. 2.2017 13:00:42

Campbell, D.G.; Mayhew, A.: ¬A phylogenetic approach to bibliographic families and relationships (2017) 0.05

0.04660417 = product of:
  0.124277785 = sum of:
    0.021389665 = weight(_text_:use in 3875) [ClassicSimilarity], result of:
      0.021389665 = score(doc=3875,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 3875, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3875)
    0.02011309 = weight(_text_:of in 3875) [ClassicSimilarity], result of:
      0.02011309 = score(doc=3875,freq=26.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.31146988 = fieldWeight in 3875, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3875)
    0.082775034 = sum of:
      0.0110352645 = weight(_text_:on in 3875) [ClassicSimilarity], result of:
        0.0110352645 = score(doc=3875,freq=2.0), product of:
          0.090823986 = queryWeight, product of:
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.041294612 = queryNorm
          0.121501654 = fieldWeight in 3875, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3875)
      0.07173977 = weight(_text_:line in 3875) [ClassicSimilarity], result of:
        0.07173977 = score(doc=3875,freq=2.0), product of:
          0.23157367 = queryWeight, product of:
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.041294612 = queryNorm
          0.30979243 = fieldWeight in 3875, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.0390625 = fieldNorm(doc=3875)
  0.375 = coord(3/8)

Abstract: This presentation applies the principles of phylogenetic classification to the phenomenon of bibliographic relationships in library catalogues. We argue that while the FRBR paradigm supports hierarchical bibliographic relationships between works and their various expressions and manifestations, we need a different paradigm to support associative bibliographic relationships of the kind detected in previous research. Numerous studies have shown the existence and importance of bibliographic relationships that lie outside that hierarchical FRBR model: particularly the importance of bibliographic families. We would like to suggest phylogenetics as a potential means of gaining access to those more elusive and ephemeral relationships. Phylogenetic analysis does not follow the Platonic conception of an abstract work that gives rise to specific instantiations; rather, it tracks relationships of kinship as they evolve over time. We use two examples to suggest ways in which phylogenetic trees could be represented in future library catalogues. The novels of Jane Austen are used to indicate how phylogenetic trees can represent, with greater accuracy, the line of Jane Austen adaptations, ranging from contemporary efforts to complete her unfinished work, through to the more recent efforts to graft horror memes onto the original text. Stanley Kubrick's 2001: A Space Odyssey provides an example of charting relationships both backwards and forwards in time, across different media and genres. We suggest three possible means of applying phylogenetics in the future: enhancement of the relationship designators in RDA, crowdsourcing user tags, and extracting relationship trees through big data analysis.
Content: Contribution to: NASKO 2017: Visualizing Knowledge Organization: Bringing Focus to Abstract Realities. The sixth North American Symposium on Knowledge Organization (NASKO 2017), June 15-16, 2017, in Champaign, IL, USA.

Danowski, P.: Authority files and Web 2.0 : Wikipedia and the PND. An Example (2007) 0.05

0.046060055 = product of:
  0.07369609 = sum of:
    0.020873476 = weight(_text_:retrieval in 1291) [ClassicSimilarity], result of:
      0.020873476 = score(doc=1291,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.16710453 = fieldWeight in 1291, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1291)
    0.021389665 = weight(_text_:use in 1291) [ClassicSimilarity], result of:
      0.021389665 = score(doc=1291,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 1291, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1291)
    0.007889003 = weight(_text_:of in 1291) [ClassicSimilarity], result of:
      0.007889003 = score(doc=1291,freq=4.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.12216854 = fieldWeight in 1291, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1291)
    0.00955682 = product of:
      0.01911364 = sum of:
        0.01911364 = weight(_text_:on in 1291) [ClassicSimilarity], result of:
          0.01911364 = score(doc=1291,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.21044704 = fieldWeight in 1291, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1291)
      0.5 = coord(1/2)
    0.013987125 = product of:
      0.02797425 = sum of:
        0.02797425 = weight(_text_:22 in 1291) [ClassicSimilarity], result of:
          0.02797425 = score(doc=1291,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.19345059 = fieldWeight in 1291, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1291)
      0.5 = coord(1/2)
  0.625 = coord(5/8)

Abstract: More and more users index everything on their own in Web 2.0. There are services for links, videos, pictures, books, encyclopaedic articles and scientific articles. All these services are library independent. But must that really be? Can't libraries help with their experience and tools to make user indexing better? Based on the experience of a project of the German-language Wikipedia together with the German person authority files (Personen Namen Datei - PND) located at the German National Library (Deutsche Nationalbibliothek), I would like to show what is possible: how users can and will use the authority files, if we let them. We will take a look at how the project worked and what we can learn for future projects. Conclusions: authority files can have a role in Web 2.0; there must be an open interface/service for retrieval; everything that is indexed on the net with authority files can be easily integrated in a federated search; O'Reilly: you have to find ways for your data to become more important the more it is used.
Content: Talk given at the workshop: "Extending the multilingual capacity of The European Library in the EDL project Stockholm, Swedish National Library, 22-23 November 2007".

Liu, S.: Decomposing DDC synthesized numbers (1996) 0.05

0.045158453 = product of:
  0.09031691 = sum of:
    0.046674512 = weight(_text_:retrieval in 5969) [ClassicSimilarity], result of:
      0.046674512 = score(doc=5969,freq=10.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.37365708 = fieldWeight in 5969, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5969)
    0.021389665 = weight(_text_:use in 5969) [ClassicSimilarity], result of:
      0.021389665 = score(doc=5969,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 5969, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5969)
    0.0167351 = weight(_text_:of in 5969) [ClassicSimilarity], result of:
      0.0167351 = score(doc=5969,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25915858 = fieldWeight in 5969, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5969)
    0.0055176322 = product of:
      0.0110352645 = sum of:
        0.0110352645 = weight(_text_:on in 5969) [ClassicSimilarity], result of:
          0.0110352645 = score(doc=5969,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.121501654 = fieldWeight in 5969, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5969)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Much literature has been written speculating upon how classification can be used in online catalogs to improve information retrieval. While some empirical studies have been done exploring whether the direct use of traditional classification schemes designed for a manual environment is effective and efficient in the online environment, none has manipulated these manual classifications in such a way as to take full advantage of the power of both the classification and computer. It has been suggested by some authors, such as Wajenberg and Drabenstott, that this power could be realized if the individual components of synthesized DDC numbers could be identified and indexed. This paper looks at the feasibility of automatically decomposing DDC synthesized numbers and the implications of such decomposition for information retrieval. Based on an analysis of the instructions for synthesizing numbers in the main class Arts (700) and all DDC Tables, 17 decomposition rules were defined, 13 covering the Add Notes and four the Standard Subdivisions. 1,701 DDC synthesized numbers were decomposed by a computer system called DND (Dewey Number Decomposer), developed by the author. From the 1,701 numbers, 600 were randomly selected for examination by three judges, each evaluating 200 numbers. The decomposition success rate was 100% and it was concluded that synthesized DDC numbers can be accurately decomposed automatically. The study has implications for information retrieval, expert systems for assigning DDC numbers, automatic indexing, switching language development, enhancing classifiers' work, teaching library school students, and providing quality control for DDC number assignments. These implications were explored using a prototype retrieval system.
Content: Related to: Liu, Songqiao. "The Automatic Decomposition of DDC Synthesized Numbers." Ph.D. diss., University of California, Los Angeles, 1993.
Theme: Classification systems in online retrieval

Chowdhury, A.; Mccabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.04

0.044440318 = product of:
  0.088880636 = sum of:
    0.035423465 = weight(_text_:retrieval in 1061) [ClassicSimilarity], result of:
      0.035423465 = score(doc=1061,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 1061, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1061)
    0.025667597 = weight(_text_:use in 1061) [ClassicSimilarity], result of:
      0.025667597 = score(doc=1061,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 1061, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=1061)
    0.021168415 = weight(_text_:of in 1061) [ClassicSimilarity], result of:
      0.021168415 = score(doc=1061,freq=20.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.32781258 = fieldWeight in 1061, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=1061)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 1061) [ClassicSimilarity], result of:
          0.013242318 = score(doc=1061,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 1061, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=1061)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: The object of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In this paper we evaluate the use of part-of-speech tagging to improve the index storage overhead and general speed of the system with only a minimal reduction in precision/recall measurements. We tagged 500 MB of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant part of speech to index. We show that 90% of precision/recall is achieved with 40% of the document collection's terms. We also show that this is an improvement in overhead with only a 1% reduction in precision/recall.

EndNote Plus 2.3 : Enhanced reference database and bibliography maker. With EndLink 2.1, link to on-line and CD-ROM databases (1997) 0.04

0.0441767 = product of:
  0.1767068 = sum of:
    0.011156735 = weight(_text_:of in 1717) [ClassicSimilarity], result of:
      0.011156735 = score(doc=1717,freq=2.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.17277241 = fieldWeight in 1717, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=1717)
    0.16555007 = sum of:
      0.022070529 = weight(_text_:on in 1717) [ClassicSimilarity], result of:
        0.022070529 = score(doc=1717,freq=2.0), product of:
          0.090823986 = queryWeight, product of:
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.041294612 = queryNorm
          0.24300331 = fieldWeight in 1717, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            2.199415 = idf(docFreq=13325, maxDocs=44218)
            0.078125 = fieldNorm(doc=1717)
      0.14347954 = weight(_text_:line in 1717) [ClassicSimilarity], result of:
        0.14347954 = score(doc=1717,freq=2.0), product of:
          0.23157367 = queryWeight, product of:
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.041294612 = queryNorm
          0.61958486 = fieldWeight in 1717, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.6078424 = idf(docFreq=440, maxDocs=44218)
            0.078125 = fieldNorm(doc=1717)
  0.25 = coord(2/8)

Footnote: Review in: International journal of information management. 17(1997) no.6, pp. 470-472 (T. Wilson)

Bradford, R.B.: Relationship discovery in large text collections using Latent Semantic Indexing (2006) 0.04

0.04386019 = product of:
  0.0701763 = sum of:
    0.016698781 = weight(_text_:retrieval in 1163) [ClassicSimilarity], result of:
      0.016698781 = score(doc=1163,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.13368362 = fieldWeight in 1163, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=1163)
    0.01711173 = weight(_text_:use in 1163) [ClassicSimilarity], result of:
      0.01711173 = score(doc=1163,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.13532647 = fieldWeight in 1163, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.03125 = fieldNorm(doc=1163)
    0.018933605 = weight(_text_:of in 1163) [ClassicSimilarity], result of:
      0.018933605 = score(doc=1163,freq=36.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2932045 = fieldWeight in 1163, product of:
          6.0 = tf(freq=36.0), with freq of:
            36.0 = termFreq=36.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=1163)
    0.0062424885 = product of:
      0.012484977 = sum of:
        0.012484977 = weight(_text_:on in 1163) [ClassicSimilarity], result of:
          0.012484977 = score(doc=1163,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.13746344 = fieldWeight in 1163, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.03125 = fieldNorm(doc=1163)
      0.5 = coord(1/2)
    0.0111897 = product of:
      0.0223794 = sum of:
        0.0223794 = weight(_text_:22 in 1163) [ClassicSimilarity], result of:
          0.0223794 = score(doc=1163,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.15476047 = fieldWeight in 1163, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1163)
      0.5 = coord(1/2)
  0.625 = coord(5/8)

Abstract: This paper addresses the problem of information discovery in large collections of text. For users, one of the key problems in working with such collections is determining where to focus their attention. In selecting documents for examination, users must be able to formulate reasonably precise queries. Queries that are too broad will greatly reduce the efficiency of information discovery efforts by overwhelming the users with peripheral information. In order to formulate efficient queries, a mechanism is needed to automatically alert users regarding potentially interesting information contained within the collection. This paper presents the results of an experiment designed to test one approach to generation of such alerts. The technique of latent semantic indexing (LSI) is used to identify relationships among entities of interest. Entity extraction software is used to pre-process the text of the collection so that the LSI space contains representation vectors for named entities in addition to those for individual terms. In the LSI space, the cosine of the angle between the representation vectors for two entities captures important information regarding the degree of association of those two entities. For appropriate choices of entities, determining the entity pairs with the highest mutual cosine values yields valuable information regarding the contents of the text collection. The test database used for the experiment consists of 150,000 news articles. The proposed approach for alert generation is tested using a counterterrorism analysis example. The approach is shown to have significant potential for aiding users in rapidly focusing on information of potential importance in large text collections. The approach also has value in identifying possible use of aliases.
Source: Proceedings of the Fourth Workshop on Link Analysis, Counterterrorism, and Security, SIAM Data Mining Conference, Bethesda, MD, 20-22 April, 2006. [http://www.siam.org/meetings/sdm06/workproceed/Link%20Analysis/15.pdf]
Theme: Semantic context in indexing and retrieval

Hollink, L.; Assem, M. van: Estimating the relevance of search results in the Culture-Web : a study of semantic distance measures (2010) 0.04

0.042324685 = product of:
  0.08464937 = sum of:
    0.036299463 = weight(_text_:use in 4649) [ClassicSimilarity], result of:
      0.036299463 = score(doc=4649,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.2870708 = fieldWeight in 4649, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=4649)
    0.022201622 = weight(_text_:of in 4649) [ClassicSimilarity], result of:
      0.022201622 = score(doc=4649,freq=22.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.34381276 = fieldWeight in 4649, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=4649)
    0.009363732 = product of:
      0.018727465 = sum of:
        0.018727465 = weight(_text_:on in 4649) [ClassicSimilarity], result of:
          0.018727465 = score(doc=4649,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.20619515 = fieldWeight in 4649, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=4649)
      0.5 = coord(1/2)
    0.016784549 = product of:
      0.033569098 = sum of:
        0.033569098 = weight(_text_:22 in 4649) [ClassicSimilarity], result of:
          0.033569098 = score(doc=4649,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.23214069 = fieldWeight in 4649, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=4649)
      0.5 = coord(1/2)
  0.5 = coord(4/8)
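
The leaf values in the tree above can be rechecked by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions that the printed numbers are consistent with: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative, not part of any library; the inputs are copied from the "use" clause for doc 4649.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed formula: idf = 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                   # tf(freq=4.0) = 2.0
    idf = classic_idf(doc_freq, max_docs)  # roughly 3.0620887
    query_weight = idf * query_norm        # "queryWeight" line in the tree
    field_weight = tf * idf * field_norm   # "fieldWeight" line in the tree
    return query_weight * field_weight     # "score" line in the tree


# Values taken from the "use" clause of doc 4649.
score = classic_term_score(freq=4.0, doc_freq=5623, max_docs=44218,
                           query_norm=0.041294612, field_norm=0.046875)
print(f"{score:.9f}")
```

The printed value should agree with the listed 0.036299463 up to single-precision rounding, since the listing reports 32-bit floats while Python computes in double precision.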

Abstract: More and more cultural heritage institutions publish their collections, vocabularies and metadata on the Web. The resulting Web of linked cultural data opens up exciting new possibilities for searching and browsing through these cultural heritage collections. We report on ongoing work in which we investigate the estimation of relevance in this Web of Culture. We study existing measures of semantic distance and how they apply to two use cases. The use cases relate to the structured, multilingual and multimodal nature of the Culture Web. We distinguish between measures using the Web, such as Google distance and PMI, and measures using the Linked Data Web, i.e. the semantic structure of metadata vocabularies. We perform a small study in which we compare these semantic distance measures to human judgements of relevance. Although it is too early to draw any definitive conclusions, the study provides new insights into the applicability of semantic distance measures to the Web of Culture, and clear starting points for further research.
Date: 26.12.2011 13:40:22

Wenige, L.; Ruhland, J.: Similarity-based knowledge graph queries for recommendation retrieval (2019) 0.04

0.04230194 = product of:
  0.08460388 = sum of:
    0.04174695 = weight(_text_:retrieval in 5864) [ClassicSimilarity], result of:
      0.04174695 = score(doc=5864,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33420905 = fieldWeight in 5864, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5864)
    0.021389665 = weight(_text_:use in 5864) [ClassicSimilarity], result of:
      0.021389665 = score(doc=5864,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 5864, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5864)
    0.013664153 = weight(_text_:of in 5864) [ClassicSimilarity], result of:
      0.013664153 = score(doc=5864,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.21160212 = fieldWeight in 5864, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5864)
    0.007803111 = product of:
      0.015606222 = sum of:
        0.015606222 = weight(_text_:on in 5864) [ClassicSimilarity], result of:
          0.015606222 = score(doc=5864,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.1718293 = fieldWeight in 5864, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5864)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Current retrieval and recommendation approaches rely on hard-wired data models. This hinders personalized customizations to meet information needs of users in a more flexible manner. Therefore, the paper investigates how similarity-based retrieval strategies can be combined with graph queries to enable users or system providers to explore repositories in the Linked Open Data (LOD) cloud more thoroughly. For this purpose, we developed novel content-based recommendation approaches. They rely on concept annotations of Simple Knowledge Organization System (SKOS) vocabularies and a SPARQL-based query language that facilitates advanced and personalized requests for openly available knowledge graphs. We have comprehensively evaluated the novel search strategies in several test cases and example application domains (i.e., travel search and multimedia retrieval). The results of the web-based online experiments showed that our approaches increase the recall and diversity of recommendations or at least provide a competitive alternative strategy of resource access when conventional methods do not provide helpful suggestions. The findings may be of use for Linked Data-enabled recommender systems (LDRS) as well as for semantic search engines that can consume LOD resources.
Content: Cf.: https://www.researchgate.net/publication/333358714_Similarity-based_knowledge_graph_queries_for_recommendation_retrieval. See also: http://semantic-web-journal.net/content/similarity-based-knowledge-graph-queries-recommendation-retrieval-1.

Ding, L.; Finin, T.; Joshi, A.; Peng, Y.; Cost, R.S.; Sachs, J.; Pan, R.; Reddivari, P.; Doshi, V.: Swoogle : a Semantic Web search and metadata engine (2004) 0.04

0.041340277 = product of:
  0.08268055 = sum of:
    0.035423465 = weight(_text_:retrieval in 4704) [ClassicSimilarity], result of:
      0.035423465 = score(doc=4704,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 4704, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4704)
    0.025667597 = weight(_text_:use in 4704) [ClassicSimilarity], result of:
      0.025667597 = score(doc=4704,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 4704, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=4704)
    0.014968331 = weight(_text_:of in 4704) [ClassicSimilarity], result of:
      0.014968331 = score(doc=4704,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23179851 = fieldWeight in 4704, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=4704)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 4704) [ClassicSimilarity], result of:
          0.013242318 = score(doc=4704,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 4704, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=4704)
      0.5 = coord(1/2)
  0.5 = coord(4/8)
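
The way the clause weights roll up into the document score can also be rechecked. This is a small sketch, assuming each "product of / sum of" pair means: add the matching clause weights, then multiply by the coord factor (matched clauses divided by total clauses). The helper name `combine` is hypothetical; the numbers are the four clause weights listed for doc 4704.

```python
def combine(clause_scores, matched: int, total: int) -> float:
    # Sum the matching clause weights, then scale by coord(matched/total).
    return sum(clause_scores) * matched / total


# Nested clause: only "on" matched, with coord(1/2).
on_clause = combine([0.013242318], matched=1, total=2)

# Top level: retrieval, use, of, and the nested clause, with coord(4/8).
doc_score = combine(
    [0.035423465, 0.025667597, 0.014968331, on_clause],
    matched=4,
    total=8,
)
print(f"{doc_score:.9f}")
```

The result should land within rounding distance of the listed 0.041340277, which is displayed as 0.04 in the result heading.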

Abstract: Swoogle is a crawler-based indexing and retrieval system for the Semantic Web, i.e., for Web documents in RDF or OWL. It extracts metadata for each discovered document, and computes relations between documents. Discovered documents are also indexed by an information retrieval system which can use either character N-Gram or URIrefs as keywords to find relevant documents and to compute the similarity among a set of documents. One of the interesting properties we compute is rank, a measure of the importance of a Semantic Web document.
Source: CIKM '04 Proceedings of the thirteenth ACM international conference on Information and knowledge management
