Search (19 results, page 1 of 1)

Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.05

```
0.04925008 = product of:
  0.14775024 = sum of:
    0.14775024 = sum of:
      0.08043434 = weight(_text_:indexing in 2751) [ClassicSimilarity], result of:
        0.08043434 = score(doc=2751,freq=2.0), product of:
          0.19018644 = queryWeight, product of:
            3.8278677 = idf(docFreq=2614, maxDocs=44218)
            0.049684696 = queryNorm
          0.42292362 = fieldWeight in 2751, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.8278677 = idf(docFreq=2614, maxDocs=44218)
            0.078125 = fieldNorm(doc=2751)
      0.06731591 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
        0.06731591 = score(doc=2751,freq=2.0), product of:
          0.17398734 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.049684696 = queryNorm
          0.38690117 = fieldWeight in 2751, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.078125 = fieldNorm(doc=2751)
  0.33333334 = coord(1/3)
```
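
The score explanation for the Rekabsaz et al. record can be reproduced by hand. The sketch below assumes the usual ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the numbers printed in the listing. The function name `classic_term_score` is illustrative and not part of any library; `queryNorm` and `fieldNorm` are copied from the explanation rather than derived.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, field_norm, query_norm):
    """Score of one query term in one document (assumed ClassicSimilarity formulas)."""
    tf = math.sqrt(freq)                               # tf(freq) = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = tf * idf * field_norm               # fieldWeight
    return query_weight * field_weight

QUERY_NORM = 0.049684696  # copied from the explanation, not recomputed here

# Values for doc 2751, taken from the listing
indexing = classic_term_score(2.0, 2614, 44218, 0.078125, QUERY_NORM)
term_22 = classic_term_score(2.0, 3622, 44218, 0.078125, QUERY_NORM)

# coord(1/3): one of three query clauses matched
total = (indexing + term_22) * (1.0 / 3.0)
print(indexing, term_22, total)
```

The listing reports single-precision values, so the double-precision results here should agree with it only up to rounding in the last printed digits.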

Date: 1. 2.2016 18:25:22

Roy, R.S.; Agarwal, S.; Ganguly, N.; Choudhury, M.: Syntactic complexity of Web search queries through the lenses of language models, networks and users (2016) 0.03
```
0.029886894 = product of:
  0.08966068 = sum of:
    0.08966068 = weight(_text_:systematic in 3188) [ClassicSimilarity], result of:
      0.08966068 = score(doc=3188,freq=2.0), product of:
        0.28397155 = queryWeight, product of:
          5.715473 = idf(docFreq=395, maxDocs=44218)
          0.049684696 = queryNorm
        0.31573826 = fieldWeight in 3188, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          5.715473 = idf(docFreq=395, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3188)
  0.33333334 = coord(1/3)
```
Abstract

Across the world, millions of users interact with search engines every day to satisfy their information needs. As the Web grows bigger over time, such information needs, manifested through user search queries, also become more complex. However, there has been no systematic study that quantifies the structural complexity of Web search queries. In this research, we make an attempt towards understanding and characterizing the syntactic complexity of search queries using a multi-pronged approach. We use traditional statistical language modeling techniques to quantify and compare the perplexity of queries with natural language (NL). We then use complex network analysis for a comparative analysis of the topological properties of queries issued by real Web users and those generated by statistical models. Finally, we conduct experiments to study whether search engine users are able to identify real queries, when presented along with model-generated ones. The three complementary studies show that the syntactic structure of Web queries is more complex than what n-grams can capture, but simpler than NL. Queries, thus, seem to represent an intermediate stage between syntactic and non-syntactic communication.
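
The Roy et al. abstract refers to comparing the perplexity of queries under statistical language models. As a rough illustration only, the sketch below computes perplexity under a bigram model with add-one smoothing. The toy corpus, the smoothing choice, and the function names are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Collect context and bigram counts from whitespace-tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"</s>"}
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        contexts.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(s.split())
    return contexts, bigrams, len(vocab) + 1      # +1 reserves a slot for unseen words

def perplexity(sentence, model):
    """Per-token perplexity with add-one (Laplace) smoothing."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))

# Hypothetical toy query log
model = train_bigram(["cheap flights to london",
                      "cheap hotels in london",
                      "flights to paris"])
ordered = perplexity("cheap flights to paris", model)
shuffled = perplexity("paris to flights cheap", model)
print(ordered, shuffled)
```

On this toy data the query in a familiar word order receives lower perplexity than the same words shuffled, which is the kind of contrast a perplexity comparison relies on.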
Thenmalar, S.; Geetha, T.V.: Enhanced ontology-based indexing and searching (2014) 0.02
```
0.024107099 = product of:
  0.072321296 = sum of:
    0.072321296 = sum of:
      0.04876073 = weight(_text_:indexing in 1633) [ClassicSimilarity], result of:
        0.04876073 = score(doc=1633,freq=6.0), product of:
          0.19018644 = queryWeight, product of:
            3.8278677 = idf(docFreq=2614, maxDocs=44218)
            0.049684696 = queryNorm
          0.25638384 = fieldWeight in 1633, product of:
            2.4494898 = tf(freq=6.0), with freq of:
              6.0 = termFreq=6.0
            3.8278677 = idf(docFreq=2614, maxDocs=44218)
            0.02734375 = fieldNorm(doc=1633)
      0.023560567 = weight(_text_:22 in 1633) [ClassicSimilarity], result of:
        0.023560567 = score(doc=1633,freq=2.0), product of:
          0.17398734 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.049684696 = queryNorm
          0.1354154 = fieldWeight in 1633, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.02734375 = fieldNorm(doc=1633)
  0.33333334 = coord(1/3)
```
Abstract

Purpose - The purpose of this paper is to improve concept-based search by incorporating structural ontological information such as concepts and relations. Generally, semantic-based information retrieval aims to identify relevant information based on the meanings of the query terms or on the context of the terms, and its performance is evaluated through the standard measures of precision and recall. Higher precision means that more of the retrieved documents are (meaningfully) relevant, and lower recall means less coverage of the concepts. Design/methodology/approach - In this paper, the authors enhance the existing ontology-based indexing proposed by Kohler et al. by incorporating sibling information into the index. The index designed by Kohler et al. contains only super- and sub-concepts from the ontology. In addition, in our approach, we focus on two tasks, query expansion and ranking of the expanded queries, to improve the efficiency of the ontology-based search. These tasks make use of ontological concepts and the relations existing between those concepts so as to obtain semantically more relevant search results for a given query. Findings - The proposed ontology-based indexing technique is investigated by analysing the coverage of the concepts that are populated in the index. Here, we introduce a new measure, called the index enhancement measure, to estimate the coverage of the ontological concepts being indexed. We have evaluated the ontology-based search for the tourism domain with tourism documents and a tourism-specific ontology. The search results based on the use of the ontology "with and without query expansion" are compared to estimate the efficiency of the proposed query expansion task. The ranking is compared with the ORank system to evaluate the performance of our ontology-based search.
From these analyses, the ontology-based search results show better recall when compared to the other concept-based search systems. The mean average precision of the ontology-based search is found to be 0.79 and the recall 0.65; the ORank system has a mean average precision of 0.62 and a recall of 0.51, while the concept-based search has a mean average precision of 0.56 and a recall of 0.42. Practical implications - When a concept is not present in the domain-specific ontology, it cannot be indexed. When a given query term is not available in the ontology, term-based results are retrieved. Originality/value - In addition to super- and sub-concepts, we incorporate the concepts present at the same level (siblings) into the ontological index. The structural information from the ontology is determined for the query expansion. The ranking of the documents depends on the type of the query (single-concept queries, multiple-concept queries and concept-with-relation queries) and the ontological relations that exist in the query and the documents. With this ontological structural information, the search results showed better coverage of concepts with respect to the query.

Date: 20. 1.2015 18:30:22
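
The Thenmalar and Geetha abstract reports its results as mean average precision and recall. For readers unfamiliar with these measures, the following is a minimal sketch of their standard definitions. The document identifiers and relevance judgments are invented for illustration and have nothing to do with the paper's tourism collection.

```python
def precision_recall(ranked, relevant):
    """Set-based precision and recall of a result list."""
    hits = sum(1 for doc in ranked if doc in relevant)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant document appears."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Average of average_precision over (ranked, relevant) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical single query: two of three relevant documents are retrieved
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d9"}
precision, recall = precision_recall(ranked, relevant)
ap = average_precision(ranked, relevant)
print(precision, recall, ap)
```

Average precision rewards placing relevant documents early in the ranking, whereas recall only counts how many of them were retrieved at all, which is why the two figures are reported side by side.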
Zhang, W.; Yoshida, T.; Tang, X.: A comparative study of TF*IDF, LSI and multi-words for text classification (2011) 0.01
```
0.013931636 = product of:
  0.041794907 = sum of:
    0.041794907 = product of:
      0.083589815 = sum of:
        0.083589815 = weight(_text_:indexing in 1165) [ClassicSimilarity], result of:
          0.083589815 = score(doc=1165,freq=6.0), product of:
            0.19018644 = queryWeight, product of:
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.049684696 = queryNorm
            0.4395151 = fieldWeight in 1165, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.046875 = fieldNorm(doc=1165)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

One of the main themes in text mining is text representation, which is fundamental and indispensable for text-based intelligent information processing. Generally, text representation includes two tasks: indexing and weighting. This paper comparatively studies TF*IDF, LSI and multi-words for text representation. We used a Chinese and an English document collection to evaluate each of the three methods in information retrieval and text categorization. Experimental results demonstrate that in text categorization, LSI performs better than the other methods on both document collections. LSI also produced the best performance in retrieving English documents. This outcome shows that LSI has both favorable semantic and statistical quality, which differs from the claim that LSI cannot produce discriminative power for indexing.

Object

Latent Semantic Indexing
Ma, N.; Zheng, H.T.; Xiao, X.: An ontology-based latent semantic indexing approach using long short-term memory networks (2017) 0.01
```
0.0134057235 = product of:
  0.04021717 = sum of:
    0.04021717 = product of:
      0.08043434 = sum of:
        0.08043434 = weight(_text_:indexing in 3810) [ClassicSimilarity], result of:
          0.08043434 = score(doc=3810,freq=8.0), product of:
            0.19018644 = queryWeight, product of:
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.049684696 = queryNorm
            0.42292362 = fieldWeight in 3810, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3810)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Nowadays, online data shows an astonishing increase and the issue of semantic indexing remains an open question. Ontologies and knowledge bases have been widely used to optimize performance. However, researchers are placing increased emphasis on internal relations of ontologies but neglect latent semantic relations between ontologies and documents. They generally annotate instances mentioned in documents, which are related to concepts in ontologies. In this paper, we propose an Ontology-based Latent Semantic Indexing approach utilizing Long Short-Term Memory networks (LSTM-OLSI). We utilize an importance-aware topic model to extract document-level semantic features and leverage ontologies to extract word-level contextual features. Then we encode the above two levels of features and match their embedding vectors utilizing LSTM networks. Finally, the experimental results reveal that LSTM-OLSI outperforms existing techniques and demonstrates deep comprehension of instances and articles.

Object

Latent Semantic Indexing
Chebil, W.; Soualmia, L.F.; Omri, M.N.; Darmoni, S.F.: Indexing biomedical documents with a possibilistic network (2016) 0.01
```
0.011609698 = product of:
  0.03482909 = sum of:
    0.03482909 = product of:
      0.06965818 = sum of:
        0.06965818 = weight(_text_:indexing in 2854) [ClassicSimilarity], result of:
          0.06965818 = score(doc=2854,freq=6.0), product of:
            0.19018644 = queryWeight, product of:
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.049684696 = queryNorm
            0.3662626 = fieldWeight in 2854, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2854)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

In this article, we propose a new approach for indexing biomedical documents based on a possibilistic network that carries out partial matching between documents and biomedical vocabulary. The main contribution of our approach is to deal with the imprecision and uncertainty of the indexing task using possibility theory. We enhance estimation of the similarity between a document and a given concept using the two measures of possibility and necessity. Possibility estimates the extent to which a document is not similar to the concept. The second measure can provide confirmation that the document is similar to the concept. Our contribution also reduces the limitation of partial matching. Although the latter allows extracting from the document other variants of terms than those in dictionaries, it also generates irrelevant information. Our objective is to filter the index using the knowledge provided by the Unified Medical Language System®. Experiments were carried out on different corpora, showing encouraging results (the improvement rate is +26.37% in terms of main average precision when compared with the baseline).

Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.01

```
0.011219318 = product of:
  0.033657953 = sum of:
    0.033657953 = product of:
      0.06731591 = sum of:
        0.06731591 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
          0.06731591 = score(doc=2754,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.38690117 = fieldWeight in 2754, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2754)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 1. 2.2016 18:25:22

Marx, E. et al.: Exploring term networks for semantic search over RDF knowledge graphs (2016) 0.01

```
0.011219318 = product of:
  0.033657953 = sum of:
    0.033657953 = product of:
      0.06731591 = sum of:
        0.06731591 = weight(_text_:22 in 3279) [ClassicSimilarity], result of:
          0.06731591 = score(doc=3279,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.38690117 = fieldWeight in 3279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3279)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou

Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.01

```
0.011219318 = product of:
  0.033657953 = sum of:
    0.033657953 = product of:
      0.06731591 = sum of:
        0.06731591 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
          0.06731591 = score(doc=3280,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.38690117 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou

Salaba, A.; Zeng, M.L.: Extending the "Explore" user task beyond subject authority data into the linked data sphere (2014) 0.01

```
0.007853523 = product of:
  0.023560567 = sum of:
    0.023560567 = product of:
      0.047121134 = sum of:
        0.047121134 = weight(_text_:22 in 1465) [ClassicSimilarity], result of:
          0.047121134 = score(doc=1465,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.2708308 = fieldWeight in 1465, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1465)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Mlodzka-Stybel, A.: Towards continuous improvement of users' access to a library catalogue (2014) 0.01

```
0.007853523 = product of:
  0.023560567 = sum of:
    0.023560567 = product of:
      0.047121134 = sum of:
        0.047121134 = weight(_text_:22 in 1466) [ClassicSimilarity], result of:
          0.047121134 = score(doc=1466,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.2708308 = fieldWeight in 1466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1466)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Semantic search over the Web (2012) 0.01
```
0.0075834226 = product of:
  0.022750268 = sum of:
    0.022750268 = product of:
      0.045500536 = sum of:
        0.045500536 = weight(_text_:indexing in 411) [ClassicSimilarity], result of:
          0.045500536 = score(doc=411,freq=4.0), product of:
            0.19018644 = queryWeight, product of:
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.049684696 = queryNorm
            0.23924173 = fieldWeight in 411, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.03125 = fieldNorm(doc=411)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

The Web has become the world's largest database, with search being the main tool that allows organizations and individuals to exploit its huge amount of information. Search on the Web has traditionally been based on textual and structural similarities, ignoring to a large degree the semantic dimension, i.e., understanding the meaning of the query and of the document content. Combining search and semantics gives birth to the idea of semantic search. Traditional search engines have already advertised some semantic dimensions. Some of them, for instance, can enhance their generated result sets with documents that are semantically related to the query terms even though they may not include these terms. Nevertheless, the exploitation of semantic search has not yet reached its full potential. In this book, Roberto De Virgilio, Francesco Guerra and Yannis Velegrakis present an extensive overview of the work done in Semantic Search and other related areas. They explore different technologies and solutions in depth, making their collection valuable and stimulating reading for both academic and industrial researchers. The book is divided into three parts. The first introduces the readers to the basic notions of the Web of Data. It describes the different kinds of data that exist, their topology, and their storing and indexing techniques. The second part is dedicated to Web Search. It presents different types of search, such as exploratory or path-oriented search, alongside methods for their efficient and effective implementation. Other related topics included in this part are the use of uncertainty in query answering, the exploitation of ontologies, and the use of semantics in mashup design and operation. The focus of the third part is on linked data, and more specifically, on applying ideas originating in recommender systems to linked data management, and on techniques for efficient query answering on linked data.

Content

Introduction.- Part I Introduction to Web of Data.- Topology of the Web of Data.- Storing and Indexing Massive RDF Data Sets.- Designing Exploratory Search Applications upon Web Data Sources.- Part II Search over the Web.- Path-oriented Keyword Search query over RDF.- Interactive Query Construction for Keyword Search on the Semantic Web.- Understanding the Semantics of Keyword Queries on Relational Data Without Accessing the Instance.- Keyword-Based Search over Semantic Data.- Semantic Link Discovery over Relational Data.- Embracing Uncertainty in Entity Linking.- The Return of the Entity-Relationship Model: Ontological Query Answering.- Linked Data Services and Semantics-enabled Mashup.- Part III Linked Data Search engines.- A Recommender System for Linked Data.- Flint: from Web Pages to Probabilistic Semantic Data.- Searching and Browsing Linked Data with SWSE.

Zeng, M.L.; Gracy, K.F.; Zumer, M.: Using a semantic analysis tool to generate subject access points : a study using Panofsky's theory and two research samples (2014) 0.01

```
0.0067315903 = product of:
  0.02019477 = sum of:
    0.02019477 = product of:
      0.04038954 = sum of:
        0.04038954 = weight(_text_:22 in 1464) [ClassicSimilarity], result of:
          0.04038954 = score(doc=1464,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.23214069 = fieldWeight in 1464, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1464)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Bergamaschi, S.; Domnori, E.; Guerra, F.; Rota, S.; Lado, R.T.; Velegrakis, Y.: Understanding the semantics of keyword queries on relational data without accessing the instance (2012) 0.01
```
0.0067028617 = product of:
  0.020108584 = sum of:
    0.020108584 = product of:
      0.04021717 = sum of:
        0.04021717 = weight(_text_:indexing in 431) [ClassicSimilarity], result of:
          0.04021717 = score(doc=431,freq=2.0), product of:
            0.19018644 = queryWeight, product of:
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.049684696 = queryNorm
            0.21146181 = fieldWeight in 431, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.0390625 = fieldNorm(doc=431)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

The birth of the Web has brought an exponential growth to the amount of the information that is freely available to the Internet population, overloading users and entangling their efforts to satisfy their information needs. Web search engines such as Google, Yahoo, or Bing have become popular mainly due to the fact that they offer an easy-to-use query interface (i.e., based on keywords) and an effective and efficient query execution mechanism. The majority of these search engines do not consider information stored on the deep or hidden Web [9,28], despite the fact that the size of the deep Web is estimated to be much bigger than the surface Web [9,47]. There have been a number of systems that record interactions with the deep Web sources or automatically submit queries to them (mainly through their Web form interfaces) in order to index their context. Unfortunately, this technique only partially indexes the data instance. Moreover, it is not possible to take advantage of the query capabilities of data sources, for example, of the relational query features, because their interface is often restricted from the Web form. Besides, Web search engines focus on retrieving documents and not on querying structured sources, so they are unable to access information based on concepts.

Brandão, W.C.; Santos, R.L.T.; Ziviani, N.; Moura, E.S. de; Silva, A.S. da: Learning to expand queries using entities (2014) 0.01

```
0.005609659 = product of:
  0.016828977 = sum of:
    0.016828977 = product of:
      0.033657953 = sum of:
        0.033657953 = weight(_text_:22 in 1343) [ClassicSimilarity], result of:
          0.033657953 = score(doc=1343,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.19345059 = fieldWeight in 1343, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1343)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 22. 8.2014 17:07:50

Layfield, C.; Azzopardi, J.; Staff, C.: Experiments with document retrieval from small text collections using Latent Semantic Analysis or term similarity with query coordination and automatic relevance feedback (2017) 0.01
```
0.00536229 = product of:
  0.016086869 = sum of:
    0.016086869 = product of:
      0.032173738 = sum of:
        0.032173738 = weight(_text_:indexing in 3478) [ClassicSimilarity], result of:
          0.032173738 = score(doc=3478,freq=2.0), product of:
            0.19018644 = queryWeight, product of:
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.049684696 = queryNorm
            0.16916946 = fieldWeight in 3478, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.03125 = fieldNorm(doc=3478)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

One of the problems faced by users of databases containing textual documents is the difficulty in retrieving relevant results due to the diverse vocabulary used in queries and contained in relevant documents, especially when there is only a small number of relevant documents. This problem is known as the Vocabulary Gap. The PIKES team have constructed a small test collection of 331 articles extracted from a blog and a Gold Standard for 35 queries selected from the blog's search log so the results of different approaches to semantic search can be compared. So far, prior approaches include recognising Named Entities in documents and queries, and relations including temporal relations, and representing them as 'semantic layers' in a retrieval system index. In this work, we take two different approaches that do not involve Named Entity Recognition. In the first approach, we process an unannotated version of the PIKES document collection using Latent Semantic Analysis and use a combination of query coordination and automatic relevance feedback with which we outperform prior work. However, this approach is highly dependent on the underlying collection, and is not necessarily scalable to massive collections. In our second approach, we use an LSA Model generated by SEMILAR from a Wikipedia dump to generate a Term Similarity Matrix (TSM). We automatically expand the queries in the PIKES test collection with related terms from the TSM and submit them to a term-by-document matrix derived by indexing the PIKES collection using the Vector Space Model. Coupled with a combination of query coordination and automatic relevance feedback, we also outperform prior work with this approach. The advantage of the second approach is that it is independent of the underlying document collection.
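
The second approach in the Layfield et al. abstract expands queries with related terms from a Term Similarity Matrix and matches them against a vector space index. The sketch below illustrates that general idea only. The matrix contents, the similarity threshold, the raw term-count document vectors, and all function names are assumptions for this example, not details of the authors' system.

```python
import math
from collections import Counter

# Hypothetical term similarity matrix: term -> {related term: similarity}
TSM = {"car": {"automobile": 0.9, "vehicle": 0.7, "road": 0.3}}

def expand_query(terms, tsm, threshold=0.5):
    """Add related terms whose similarity meets the threshold, weighted by similarity."""
    weights = {t: 1.0 for t in terms}
    for t in terms:
        for related, sim in tsm.get(t, {}).items():
            if sim >= threshold:
                weights[related] = max(weights.get(related, 0.0), sim)
    return weights

def cosine(query_weights, doc_text):
    """Cosine similarity between a weighted query and a raw term-count document vector."""
    doc = Counter(doc_text.lower().split())
    dot = sum(w * doc[t] for t, w in query_weights.items())
    q_norm = math.sqrt(sum(w * w for w in query_weights.values()))
    d_norm = math.sqrt(sum(c * c for c in doc.values()))
    return dot / (q_norm * d_norm) if q_norm and d_norm else 0.0

doc_text = "automobile repair guide"
plain = cosine({"car": 1.0}, doc_text)                  # no shared vocabulary
expanded = cosine(expand_query(["car"], TSM), doc_text)
print(plain, expanded)
```

Without expansion the query and the document share no terms, so the score is zero; after expansion the related term "automobile" bridges the vocabulary gap the abstract describes.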

Brunetti, J.M.; Roberto García, R.: User-centered design and evaluation of overview components for semantic data exploration (2014) 0.00

```
0.0044877273 = product of:
  0.013463181 = sum of:
    0.013463181 = product of:
      0.026926363 = sum of:
        0.026926363 = weight(_text_:22 in 1626) [ClassicSimilarity], result of:
          0.026926363 = score(doc=1626,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.15476047 = fieldWeight in 1626, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1626)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 20. 1.2015 18:30:22

Gillitzer, B.: Yewno (2017) 0.00

```
0.0044877273 = product of:
  0.013463181 = sum of:
    0.013463181 = product of:
      0.026926363 = sum of:
        0.026926363 = weight(_text_:22 in 3447) [ClassicSimilarity], result of:
          0.026926363 = score(doc=3447,freq=2.0), product of:
            0.17398734 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049684696 = queryNorm
            0.15476047 = fieldWeight in 3447, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=3447)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 22. 2.2017 10:16:49

Hannech, A.: Système de recherche d'information étendue basé sur une projection multi-espaces (2018) 0.00
```
0.002681145 = product of:
  0.0080434345 = sum of:
    0.0080434345 = product of:
      0.016086869 = sum of:
        0.016086869 = weight(_text_:indexing in 4472) [ClassicSimilarity], result of:
          0.016086869 = score(doc=4472,freq=2.0), product of:
            0.19018644 = queryWeight, product of:
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.049684696 = queryNorm
            0.08458473 = fieldWeight in 4472, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.8278677 = idf(docFreq=2614, maxDocs=44218)
              0.015625 = fieldNorm(doc=4472)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

However, this assumption does not hold in all cases: the needs of the user evolve over time and can move away from the previous interests stored in his profile. In other cases, the user's profile may be misused to extract or infer new information needs. This problem is much more accentuated with ambiguous queries. When multiple POIs linked to a search query are identified in the user's profile, the system is unable to select the relevant data from that profile to respond to that request. This has a direct impact on the quality of the results provided to this user. In order to overcome some of these limitations, in this research thesis, we have been interested in the development of techniques aimed mainly at improving the relevance of the results of current information retrieval systems and facilitating the exploration of major collections of documents. To do this, we propose a solution based on a new concept and model of indexing and information retrieval called multi-spaces projection. This proposal is based on the exploitation of different categories of semantic and social information that enrich the universe of document representation and search queries in several dimensions of interpretations. The originality of this representation is to be able to distinguish between the different interpretations used for the description and the search for documents. This gives better visibility of the results returned and helps to provide greater flexibility of search and exploration, giving the user the ability to navigate one or more views of data that interest him the most. In addition, the proposed multidimensional representation universes for document description and search query interpretation help to improve the relevance of the user's results by providing a diversity of research / exploration that helps meet his diverse needs and those of other different users.
This study exploits different aspects that are related to personalized search and aims to solve the problems caused by the evolution of the information needs of the user. Thus, when the profile of this user is used by our system, a technique is proposed and used to identify the interests most representative of his current needs in his profile. This technique is based on the combination of three influential factors: the contextual, frequency and temporal factors of the data. The ability of users to interact, exchange ideas and opinions, and form social networks on the Web has led systems to focus on the types of interactions these users have with one another as well as their social roles in the system. This social information is discussed and integrated into this research work. Its impact and the way it is integrated into the IR process are studied to improve the relevance of the results.
