Search (13 results, page 1 of 1)

  • × theme_ss:"Semantisches Umfeld in Indexierung u. Retrieval"
  • × type_ss:"a"
  • × year_i:[2010 TO 2020}
  1. Thenmalar, S.; Geetha, T.V.: Enhanced ontology-based indexing and searching (2014) 0.04
    0.043791123 = product of:
      0.1094778 = sum of:
        0.08505092 = weight(_text_:index in 1633) [ClassicSimilarity], result of:
          0.08505092 = score(doc=1633,freq=10.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.37784708 = fieldWeight in 1633, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1633)
        0.024426885 = weight(_text_:22 in 1633) [ClassicSimilarity], result of:
          0.024426885 = score(doc=1633,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.1354154 = fieldWeight in 1633, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1633)
      0.4 = coord(2/5)
    
    Abstract
    Purpose - The purpose of this paper is to improve the conceptual-based search by incorporating structural ontological information such as concepts and relations. Generally, semantic-based information retrieval aims to identify relevant information based on the meanings of the query terms or on the context of the terms, and the performance of semantic information retrieval is assessed through standard measures: precision and recall. Higher precision means that more of the documents obtained are (meaningfully) relevant, and lower recall means less coverage of the concepts. Design/methodology/approach - In this paper, the authors enhance the existing ontology-based indexing proposed by Kohler et al. by incorporating sibling information into the index. The index designed by Kohler et al. contains only super- and sub-concepts from the ontology. In addition, in our approach, we focus on two tasks, query expansion and ranking of the expanded queries, to improve the efficiency of the ontology-based search. The aforementioned tasks make use of ontological concepts, and relations existing between those concepts, so as to obtain semantically more relevant search results for a given query. Findings - The proposed ontology-based indexing technique is investigated by analysing the coverage of concepts that are being populated in the index. Here, we introduce a new measure called index enhancement measure, to estimate the coverage of ontological concepts being indexed. We have evaluated the ontology-based search for the tourism domain with the tourism documents and tourism-specific ontology. The comparison of search results based on the use of ontology "with and without query expansion" is examined to estimate the efficiency of the proposed query expansion task. The ranking is compared with the ORank system to evaluate the performance of our ontology-based search. From these analyses, the ontology-based search results show better recall when compared to the other concept-based search systems. The mean average precision of the ontology-based search is found to be 0.79 and the recall is found to be 0.65; the ORank system has a mean average precision of 0.62 and a recall of 0.51, while the concept-based search has a mean average precision of 0.56 and a recall of 0.42. Practical implications - When the concept is not present in the domain-specific ontology, the concept cannot be indexed. When the given query term is not available in the ontology, the term-based results are retrieved. Originality/value - In addition to super- and sub-concepts, we incorporate the concepts present at the same level (siblings) into the ontological index. The structural information from the ontology is determined for the query expansion. The ranking of the documents depends on the type of the query (single concept query, multiple concept queries and concept with relation queries) and the ontological relations that exist in the query and the documents. With this ontological structural information, the search results showed better coverage of concepts with respect to the query.
    Date
    20. 1.2015 18:30:22
  2. Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.02
    0.018822905 = product of:
      0.09411452 = sum of:
        0.09411452 = weight(_text_:index in 3223) [ClassicSimilarity], result of:
          0.09411452 = score(doc=3223,freq=6.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.418113 = fieldWeight in 3223, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3223)
      0.2 = coord(1/5)
    
    Abstract
    The aim of the study was to test whether query expansion by approximate string matching methods is beneficial in retrieval from historical newspaper collections in a language rich with compounds and inflectional forms (Finnish). First, approximate string matching methods were used to generate lists of index words most similar to contemporary query terms in a digitized newspaper collection from the 1800s. Top index word variants were categorized to estimate the appropriate query expansion ranges in the retrieval test. Second, the effectiveness of approximate string matching methods, automatically generated inflectional forms, and their combinations were measured in a Cranfield-style test. Finally, a detailed topic-level analysis of test results was conducted. In the index of historical newspaper collection the occurrences of a word typically spread to many linguistic and historical variants along with optical character recognition (OCR) errors. All query expansion methods improved the baseline results. Extensive expansion of around 30 variants for each query word was required to achieve the highest performance improvement. Query expansion based on approximate string matching was superior to using the inflectional forms of the query words, showing that coverage of the different types of variation is more important than precision in handling one type of variation.
  3. Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.01
    0.01395822 = product of:
      0.0697911 = sum of:
        0.0697911 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
          0.0697911 = score(doc=2751,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.38690117 = fieldWeight in 2751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2751)
      0.2 = coord(1/5)
    
    Date
    1. 2.2016 18:25:22
  4. Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.01
    0.01395822 = product of:
      0.0697911 = sum of:
        0.0697911 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
          0.0697911 = score(doc=2754,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.38690117 = fieldWeight in 2754, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2754)
      0.2 = coord(1/5)
    
    Date
    1. 2.2016 18:25:22
  5. Marx, E. et al.: Exploring term networks for semantic search over RDF knowledge graphs (2016) 0.01
    0.01395822 = product of:
      0.0697911 = sum of:
        0.0697911 = weight(_text_:22 in 3279) [ClassicSimilarity], result of:
          0.0697911 = score(doc=3279,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.38690117 = fieldWeight in 3279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3279)
      0.2 = coord(1/5)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  6. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.01
    0.01395822 = product of:
      0.0697911 = sum of:
        0.0697911 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
          0.0697911 = score(doc=3280,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.38690117 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
      0.2 = coord(1/5)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  7. Chebil, W.; Soualmia, L.F.; Omri, M.N.; Darmoni, S.F.: Indexing biomedical documents with a possibilistic network (2016) 0.01
    0.010867408 = product of:
      0.054337036 = sum of:
        0.054337036 = weight(_text_:index in 2854) [ClassicSimilarity], result of:
          0.054337036 = score(doc=2854,freq=2.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.24139762 = fieldWeight in 2854, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2854)
      0.2 = coord(1/5)
    
    Abstract
    In this article, we propose a new approach for indexing biomedical documents based on a possibilistic network that carries out partial matching between documents and biomedical vocabulary. The main contribution of our approach is to deal with the imprecision and uncertainty of the indexing task using possibility theory. We enhance estimation of the similarity between a document and a given concept using the two measures of possibility and necessity. Possibility estimates the extent to which a document is not similar to the concept. The second measure can provide confirmation that the document is similar to the concept. Our contribution also reduces the limitation of partial matching. Although the latter allows extracting from the document other variants of terms than those in dictionaries, it also generates irrelevant information. Our objective is to filter the index using the knowledge provided by the Unified Medical Language System®. Experiments were carried out on different corpora, showing encouraging results (the improvement rate is +26.37% in terms of mean average precision when compared with the baseline).
  8. Salaba, A.; Zeng, M.L.: Extending the "Explore" user task beyond subject authority data into the linked data sphere (2014) 0.01
    0.009770754 = product of:
      0.04885377 = sum of:
        0.04885377 = weight(_text_:22 in 1465) [ClassicSimilarity], result of:
          0.04885377 = score(doc=1465,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.2708308 = fieldWeight in 1465, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1465)
      0.2 = coord(1/5)
    
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  9. Mlodzka-Stybel, A.: Towards continuous improvement of users' access to a library catalogue (2014) 0.01
    0.009770754 = product of:
      0.04885377 = sum of:
        0.04885377 = weight(_text_:22 in 1466) [ClassicSimilarity], result of:
          0.04885377 = score(doc=1466,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.2708308 = fieldWeight in 1466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1466)
      0.2 = coord(1/5)
    
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  10. Layfield, C.; Azzopardi, J.; Staff, C.: Experiments with document retrieval from small text collections using Latent Semantic Analysis or term similarity with query coordination and automatic relevance feedback (2017) 0.01
    0.008693925 = product of:
      0.043469626 = sum of:
        0.043469626 = weight(_text_:index in 3478) [ClassicSimilarity], result of:
          0.043469626 = score(doc=3478,freq=2.0), product of:
            0.2250935 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.051511593 = queryNorm
            0.1931181 = fieldWeight in 3478, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.03125 = fieldNorm(doc=3478)
      0.2 = coord(1/5)
    
    Abstract
    One of the problems faced by users of databases containing textual documents is the difficulty in retrieving relevant results due to the diverse vocabulary used in queries and contained in relevant documents, especially when there are only a small number of relevant documents. This problem is known as the Vocabulary Gap. The PIKES team have constructed a small test collection of 331 articles extracted from a blog and a Gold Standard for 35 queries selected from the blog's search log so the results of different approaches to semantic search can be compared. So far, prior approaches include recognising Named Entities in documents and queries, and relations including temporal relations, and represent them as 'semantic layers' in a retrieval system index. In this work, we take two different approaches that do not involve Named Entity Recognition. In the first approach, we process an unannotated version of the PIKES document collection using Latent Semantic Analysis and use a combination of query coordination and automatic relevance feedback with which we outperform prior work. However, this approach is highly dependent on the underlying collection, and is not necessarily scalable to massive collections. In our second approach, we use an LSA Model generated by SEMILAR from a Wikipedia dump to generate a Term Similarity Matrix (TSM). We automatically expand the queries in the PIKES test collection with related terms from the TSM and submit them to a term-by-document matrix derived by indexing the PIKES collection using the Vector Space Model. Coupled with a combination of query coordination and automatic relevance feedback we also outperform prior work with this approach. The advantage of the second approach is that it is independent of the underlying document collection.
  11. Zeng, M.L.; Gracy, K.F.; Zumer, M.: Using a semantic analysis tool to generate subject access points : a study using Panofsky's theory and two research samples (2014) 0.01
    0.008374932 = product of:
      0.04187466 = sum of:
        0.04187466 = weight(_text_:22 in 1464) [ClassicSimilarity], result of:
          0.04187466 = score(doc=1464,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.23214069 = fieldWeight in 1464, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1464)
      0.2 = coord(1/5)
    
    Source
    Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
  12. Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.01
    0.00697911 = product of:
      0.03489555 = sum of:
        0.03489555 = weight(_text_:22 in 1343) [ClassicSimilarity], result of:
          0.03489555 = score(doc=1343,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.19345059 = fieldWeight in 1343, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1343)
      0.2 = coord(1/5)
    
    Date
    22. 8.2014 17:07:50
  13. Brunetti, J.M.; Roberto García, R.: User-centered design and evaluation of overview components for semantic data exploration (2014) 0.01
    0.005583288 = product of:
      0.02791644 = sum of:
        0.02791644 = weight(_text_:22 in 1626) [ClassicSimilarity], result of:
          0.02791644 = score(doc=1626,freq=2.0), product of:
            0.18038483 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051511593 = queryNorm
            0.15476047 = fieldWeight in 1626, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1626)
      0.2 = coord(1/5)
    
    Date
    20. 1.2015 18:30:22
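
Note on the score explanations: each result above carries a [ClassicSimilarity] breakdown. As a rough sketch, the figures shown for result 1 (doc=1633) can be recomputed from the inputs printed in the listing, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and variable names below are illustrative only and are not part of any search engine API. The engine works in single precision, so the last digits may differ slightly from the listing.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # tf(freq)
    term_idf = idf(doc_freq, max_docs)         # idf(docFreq, maxDocs)
    query_weight = term_idf * query_norm       # queryWeight
    field_weight = tf * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight

# Inputs copied from the explanation of result 1 (doc=1633).
QUERY_NORM = 0.051511593
MAX_DOCS = 44218
FIELD_NORM = 0.02734375

index_score = term_score(10.0, 1520, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "index"
term22_score = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "22"

# coord(2/5): two of the five query clauses matched this document.
coord = 2 / 5
total = (index_score + term22_score) * coord

print(f"index: {index_score:.8f}  22: {term22_score:.8f}  total: {total:.8f}")
```

The same calculation applies to the other results; only freq, fieldNorm and the coord fraction change. Results that match a single clause use coord(1/5), which is why they rank below result 1 even when their per-term weight is higher.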