Search (96 results, page 2 of 5)

Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.03

0.030329227 = product of:
  0.08087794 = sum of:
    0.04174695 = weight(_text_:retrieval in 3280) [ClassicSimilarity], result of:
      0.04174695 = score(doc=3280,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33420905 = fieldWeight in 3280, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=3280)
    0.011156735 = weight(_text_:of in 3280) [ClassicSimilarity], result of:
      0.011156735 = score(doc=3280,freq=2.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.17277241 = fieldWeight in 3280, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=3280)
    0.02797425 = product of:
      0.0559485 = sum of:
        0.0559485 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
          0.0559485 = score(doc=3280,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.38690117 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
Theme: Semantisches Umfeld in Indexierung u. Retrieval
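
The score tree above follows the TF-IDF explain layout of ClassicSimilarity. As a rough sketch, the `retrieval` leaf for doc 3280 can be recomputed from the printed numbers. The function names are illustrative and not part of any library. The idf formula, 1 + ln(maxDocs / (docFreq + 1)), is an assumption chosen because it reproduces the idf values shown in the listing.

```python
import math

def tf(freq: float) -> float:
    # Term-frequency factor as printed in the tree: tf = sqrt(freq).
    return math.sqrt(freq)

def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it matches the printed value 3.024915 for docFreq=5836.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm                # "queryWeight" line
    field_weight = tf(freq) * term_idf * field_norm     # "fieldWeight" line
    return query_weight * field_weight                  # "score" line

# Values copied from the `retrieval` clause of doc 3280.
retrieval_3280 = term_score(
    freq=2.0, doc_freq=5836, max_docs=44218,
    query_norm=0.041294612, field_norm=0.078125,
)
print(retrieval_3280)  # should be close to the 0.04174695 shown in the listing
```

Small differences in the last digits are expected, since the listing appears to use single-precision arithmetic while Python uses doubles.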

Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.03

0.030283675 = product of:
  0.08075646 = sum of:
    0.04174695 = weight(_text_:retrieval in 2751) [ClassicSimilarity], result of:
      0.04174695 = score(doc=2751,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33420905 = fieldWeight in 2751, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=2751)
    0.0110352645 = product of:
      0.022070529 = sum of:
        0.022070529 = weight(_text_:on in 2751) [ClassicSimilarity], result of:
          0.022070529 = score(doc=2751,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24300331 = fieldWeight in 2751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.078125 = fieldNorm(doc=2751)
      0.5 = coord(1/2)
    0.02797425 = product of:
      0.0559485 = sum of:
        0.0559485 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
          0.0559485 = score(doc=2751,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.38690117 = fieldWeight in 2751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2751)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Date: 1. 2.2016 18:25:22
Source: Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al.
Theme: Semantisches Umfeld in Indexierung u. Retrieval
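
The way the clause scores in this record roll up into the final value can be sketched as follows: each "product of / sum of" pair adds the matching clause scores and multiplies by a coord factor (matched clauses divided by total clauses). The helper name `combine` is hypothetical; the numbers are copied from the tree for doc 2751.

```python
def combine(clause_scores, matched, total):
    """Sum the matching clause scores and scale by coord = matched / total."""
    return sum(clause_scores) * (matched / total)

# Nested groups: in each, one of two sub-clauses matched, hence coord(1/2).
on_clause = combine([0.022070529], matched=1, total=2)
clause_22 = combine([0.0559485], matched=1, total=2)

# Top level: three of eight query clauses matched, hence coord(3/8).
doc_2751 = combine([0.04174695, on_clause, clause_22], matched=3, total=8)
print(doc_2751)  # should be close to the 0.030283675 shown in the listing
```

The same roll-up applies to every record on this page; only the matched clauses and the coord fractions differ.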

Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.03

0.029587397 = product of:
  0.078899726 = sum of:
    0.051129367 = weight(_text_:retrieval in 1338) [ClassicSimilarity], result of:
      0.051129367 = score(doc=1338,freq=12.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.40932083 = fieldWeight in 1338, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1338)
    0.0167351 = weight(_text_:of in 1338) [ClassicSimilarity], result of:
      0.0167351 = score(doc=1338,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25915858 = fieldWeight in 1338, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1338)
    0.0110352645 = product of:
      0.022070529 = sum of:
        0.022070529 = weight(_text_:on in 1338) [ClassicSimilarity], result of:
          0.022070529 = score(doc=1338,freq=8.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24300331 = fieldWeight in 1338, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations that infer two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
Source: Journal of the Association for Information Science and Technology. 65(2014) no.8, S.1577-1596
Theme: Semantisches Umfeld in Indexierung u. Retrieval
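
Comparing this record with the first one illustrates how the field norm offsets raw term frequency: `retrieval` occurs 12 times in doc 1338 but only twice in doc 3280, yet the clause score is only about 22 percent higher rather than six times higher. A small sketch using only values printed in the listing (names are illustrative):

```python
import math

# (freq, fieldNorm) for the `retrieval` clause, copied from the listing.
doc_3280 = (2.0, 0.078125)
doc_1338 = (12.0, 0.0390625)

def tf_norm(freq: float, field_norm: float) -> float:
    # Document-dependent part of fieldWeight: sqrt(freq) * fieldNorm.
    return math.sqrt(freq) * field_norm

# idf and queryNorm are the same for both documents, so they cancel out.
ratio = tf_norm(*doc_1338) / tf_norm(*doc_3280)
print(ratio)
```

Because idf and queryNorm are identical for the same term and query, this ratio should match the ratio of the two printed clause scores, 0.051129367 / 0.04174695.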

Jindal, V.; Bawa, S.; Batra, S.: A review of ranking approaches for semantic search on Web (2014) 0.03

0.028500058 = product of:
  0.076000154 = sum of:
    0.04338471 = weight(_text_:retrieval in 2799) [ClassicSimilarity], result of:
      0.04338471 = score(doc=2799,freq=6.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.34732026 = fieldWeight in 2799, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2799)
    0.016396983 = weight(_text_:of in 2799) [ClassicSimilarity], result of:
      0.016396983 = score(doc=2799,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25392252 = fieldWeight in 2799, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2799)
    0.016218461 = product of:
      0.032436922 = sum of:
        0.032436922 = weight(_text_:on in 2799) [ClassicSimilarity], result of:
          0.032436922 = score(doc=2799,freq=12.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.35714048 = fieldWeight in 2799, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2799)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: With ever increasing information being available to the end users, search engines have become the most powerful tools for obtaining useful information scattered on the Web. However, even the most renowned search engines commonly return result sets containing pages of little use to the user. Research on semantic search aims to improve traditional information search and retrieval methods where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work is an attempt to explore different relevancy ranking approaches based on semantics which are considered appropriate for the retrieval of relevant information. In this paper, various pilot projects and their corresponding outcomes have been investigated based on methodologies adopted and their most distinctive characteristics towards ranking. An overview of selected approaches and their comparison by means of the classification criteria has been presented. With the help of this comparison, some common concepts and outstanding features have been identified.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Koopman, B.; Zuccon, G.; Bruza, P.; Sitbon, L.; Lawley, M.: Information retrieval as semantic inference : a graph Inference model applied to medical search (2016) 0.03
```
0.027163785 = product of:
  0.07243676 = sum of:
    0.052806184 = weight(_text_:retrieval in 3260) [ClassicSimilarity], result of:
      0.052806184 = score(doc=3260,freq=20.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.42274472 = fieldWeight in 3260, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=3260)
    0.0133880805 = weight(_text_:of in 3260) [ClassicSimilarity], result of:
      0.0133880805 = score(doc=3260,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.20732687 = fieldWeight in 3260, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=3260)
    0.0062424885 = product of:
      0.012484977 = sum of:
        0.012484977 = weight(_text_:on in 3260) [ClassicSimilarity], result of:
          0.012484977 = score(doc=3260,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.13746344 = fieldWeight in 3260, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.03125 = fieldNorm(doc=3260)
      0.5 = coord(1/2)
  0.375 = coord(3/8)
```

Abstract: This paper presents a Graph Inference retrieval model that integrates structured knowledge resources, statistical information retrieval methods and inference in a unified framework. Key components of the model are a graph-based representation of the corpus and retrieval driven by an inference mechanism achieved as a traversal over the graph. The model is proposed to tackle the semantic gap problem: the mismatch between the raw data and the way a human being interprets it. We break down the semantic gap problem into five core issues, each requiring a specific type of inference in order to be overcome. Our model and evaluation are applied to the medical domain because search within this domain is particularly challenging and, as we show, often requires inference. In addition, this domain features both structured knowledge resources and unstructured text. Our evaluation shows that inference can be effective, retrieving many new relevant documents that are not retrieved by state-of-the-art information retrieval models. We show that many retrieved documents were not pooled by keyword-based search methods, prompting us to perform additional relevance assessment on these new documents. A third of the newly retrieved documents judged were found to be relevant. Our analysis provides a thorough understanding of when and how to apply inference for retrieval, including a categorisation of queries according to the effect of inference. The inference mechanism promoted recall by retrieving new relevant documents not found by previous keyword-based approaches. In addition, it promoted precision by an effective reranking of documents. When inference is used, performance gains can generally be expected on hard queries. However, inference should not be applied universally: for easy, unambiguous queries and queries with few relevant documents, inference did adversely affect effectiveness. These conclusions reflect the fact that for retrieval as inference to be effective, a careful balancing act is involved. Finally, although the Graph Inference model is developed and applied to medical search, it is a general retrieval model applicable to other areas such as web search, where an emerging research trend is to utilise structured knowledge resources for more effective semantic search.
Source: Information Retrieval Journal. 19(2016) no.1, S.6-37
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Selvaretnam, B.; Belkhatir, M.: A linguistically driven framework for query expansion via grammatical constituent highlighting and role-based concept weighting (2016) 0.03

0.026182959 = product of:
  0.06982122 = sum of:
    0.04338471 = weight(_text_:retrieval in 2876) [ClassicSimilarity], result of:
      0.04338471 = score(doc=2876,freq=6.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.34732026 = fieldWeight in 2876, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2876)
    0.014968331 = weight(_text_:of in 2876) [ClassicSimilarity], result of:
      0.014968331 = score(doc=2876,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23179851 = fieldWeight in 2876, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2876)
    0.011468184 = product of:
      0.022936368 = sum of:
        0.022936368 = weight(_text_:on in 2876) [ClassicSimilarity], result of:
          0.022936368 = score(doc=2876,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.25253648 = fieldWeight in 2876, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2876)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: In this paper, we propose a linguistically-motivated query expansion framework that recognizes and encodes significant query constituents characterizing query intent in order to improve retrieval performance. Concepts-of-Interest are recognized as the core concepts that represent the gist of the search goal whilst the remaining query constituents which serve to specify the search goal and complete the query structure are classified as descriptive, relational or structural. Acknowledging the need to form semantically-associated base pairs for the purpose of extracting related potential expansion concepts, an algorithm which capitalizes on syntactical dependencies to capture relationships between adjacent and non-adjacent query concepts is proposed. Lastly, a robust weighting scheme that duly emphasizes the importance of query constituents based on their linguistic role within the expanded query is presented. We demonstrate improvements in retrieval effectiveness in terms of increased mean average precision garnered by the proposed linguistic-based query expansion framework through experimentation on the TREC ad hoc test collections.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Ahn, J.-w.; Brusilovsky, P.: Adaptive visualization for exploratory information retrieval (2013) 0.03
```
0.025827788 = product of:
  0.0688741 = sum of:
    0.04174695 = weight(_text_:retrieval in 2717) [ClassicSimilarity], result of:
      0.04174695 = score(doc=2717,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33420905 = fieldWeight in 2717, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2717)
    0.019324033 = weight(_text_:of in 2717) [ClassicSimilarity], result of:
      0.019324033 = score(doc=2717,freq=24.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2992506 = fieldWeight in 2717, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2717)
    0.007803111 = product of:
      0.015606222 = sum of:
        0.015606222 = weight(_text_:on in 2717) [ClassicSimilarity], result of:
          0.015606222 = score(doc=2717,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.1718293 = fieldWeight in 2717, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2717)
      0.5 = coord(1/2)
  0.375 = coord(3/8)
```

Abstract: As the volume and breadth of online information are rapidly increasing, ad hoc search systems become less and less efficient at answering the information needs of modern users. To support the growing complexity of search tasks, researchers in the field of information developed and explored a range of approaches that extend the traditional ad hoc retrieval paradigm. Among these approaches, personalized search systems and exploratory search systems attracted many followers. Personalized search explored the power of artificial intelligence techniques to provide tailored search results according to different user interests, contexts, and tasks. In contrast, exploratory search capitalized on the power of human intelligence by providing users with more powerful interfaces to support the search process. As these approaches are not contradictory, we believe that they can reinforce each other. We argue that the effectiveness of personalized search systems may be increased by allowing users to interact with the system and learn/investigate the problem in order to reach the final goal. We also suggest that an interactive visualization approach could offer a good ground to combine the strong sides of personalized and exploratory search approaches. This paper proposes a specific way to integrate interactive visualization and personalized search and introduces an adaptive visualization-based search system, Adaptive VIBE, that implements it. We tested the effectiveness of Adaptive VIBE and investigated its strengths and weaknesses by conducting a full-scale user study. The results show that Adaptive VIBE can improve the precision and the productivity of the personalized search system while helping users to discover more diverse sets of information.
Footnote: Contribution to a special section on Human-computer Information Retrieval.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Mlodzka-Stybel, A.: Towards continuous improvement of users' access to a library catalogue (2014) 0.03

0.025475495 = product of:
  0.067934655 = sum of:
    0.029222867 = weight(_text_:retrieval in 1466) [ClassicSimilarity], result of:
      0.029222867 = score(doc=1466,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.23394634 = fieldWeight in 1466, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1466)
    0.019129815 = weight(_text_:of in 1466) [ClassicSimilarity], result of:
      0.019129815 = score(doc=1466,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.29624295 = fieldWeight in 1466, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1466)
    0.019581974 = product of:
      0.039163947 = sum of:
        0.039163947 = weight(_text_:22 in 1466) [ClassicSimilarity], result of:
          0.039163947 = score(doc=1466,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.2708308 = fieldWeight in 1466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1466)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: The paper discusses the issue of increasing users' access to library records by their publication in Google. Data from the records, converted into html format, have been indexed by Google. The process covered basic formal description fields of the records, description of the content, supported with a thesaurus, as well as an abstract, if present in the record. In addition to monitoring the end users' statistics, the pilot testing covered visibility of library records in Google search results.
Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Roy, R.S.; Agarwal, S.; Ganguly, N.; Choudhury, M.: Syntactic complexity of Web search queries through the lenses of language models, networks and users (2016) 0.03

0.025446799 = product of:
  0.06785813 = sum of:
    0.020873476 = weight(_text_:retrieval in 3188) [ClassicSimilarity], result of:
      0.020873476 = score(doc=3188,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.16710453 = fieldWeight in 3188, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3188)
    0.030249555 = weight(_text_:use in 3188) [ClassicSimilarity], result of:
      0.030249555 = score(doc=3188,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23922569 = fieldWeight in 3188, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3188)
    0.0167351 = weight(_text_:of in 3188) [ClassicSimilarity], result of:
      0.0167351 = score(doc=3188,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25915858 = fieldWeight in 3188, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3188)
  0.375 = coord(3/8)

Abstract: Across the world, millions of users interact with search engines every day to satisfy their information needs. As the Web grows bigger over time, such information needs, manifested through user search queries, also become more complex. However, there has been no systematic study that quantifies the structural complexity of Web search queries. In this research, we make an attempt towards understanding and characterizing the syntactic complexity of search queries using a multi-pronged approach. We use traditional statistical language modeling techniques to quantify and compare the perplexity of queries with natural language (NL). We then use complex network analysis for a comparative analysis of the topological properties of queries issued by real Web users and those generated by statistical models. Finally, we conduct experiments to study whether search engine users are able to identify real queries, when presented along with model-generated ones. The three complementary studies show that the syntactic structure of Web queries is more complex than what n-grams can capture, but simpler than NL. Queries, thus, seem to represent an intermediate stage between syntactic and non-syntactic communication.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Brambilla, M.; Ceri, S.: Designing exploratory search applications upon Web data sources (2012) 0.03

0.025423512 = product of:
  0.050847024 = sum of:
    0.016698781 = weight(_text_:retrieval in 428) [ClassicSimilarity], result of:
      0.016698781 = score(doc=428,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.13368362 = fieldWeight in 428, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=428)
    0.01711173 = weight(_text_:use in 428) [ClassicSimilarity], result of:
      0.01711173 = score(doc=428,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.13532647 = fieldWeight in 428, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.03125 = fieldNorm(doc=428)
    0.012622404 = weight(_text_:of in 428) [ClassicSimilarity], result of:
      0.012622404 = score(doc=428,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.19546966 = fieldWeight in 428, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=428)
    0.004414106 = product of:
      0.008828212 = sum of:
        0.008828212 = weight(_text_:on in 428) [ClassicSimilarity], result of:
          0.008828212 = score(doc=428,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.097201325 = fieldWeight in 428, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.03125 = fieldNorm(doc=428)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Search is the preferred method to access information in today's computing systems. The Web, accessed through search engines, is universally recognized as the source for answering users' information needs. However, offering a link to a Web page does not cover all information needs. Even simple problems, such as "Which theater offers an at least three-stars action movie in London close to a good Italian restaurant," can only be solved by searching the Web multiple times, e.g., by extracting a list of the recent action movies filtered by ranking, then looking for movie theaters, then looking for Italian restaurants close to them. While search engines hint at useful information, the user's brain is the fundamental platform for information integration. An important trend is the availability of new, specialized data sources, the so-called "long tail" of the Web of data. Such carefully collected and curated data sources can be much more valuable than information currently available in Web pages; however, many sources remain hidden or insulated, for lack of software solutions for bringing them to surface and making them usable in the search context. A new class of tailor-made systems, designed to satisfy the needs of users with specific aims, will support the publishing and integration of data sources for vertical domains; the user will be able to select sources based on individual or collective trust, and systems will be able to route queries to such sources and to provide easy-to-use interfaces for combining them within search strategies, at the same time rewarding the data source owners for each contribution to effective search. Efforts such as Google's Fusion Tables show that the technology for bringing hidden data sources to surface is feasible.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Wang, Z.; Khoo, C.S.G.; Chaudhry, A.S.: Evaluation of the navigation effectiveness of an organizational taxonomy built on a general classification scheme and domain thesauri (2014) 0.03

0.025120806 = product of:
  0.06698882 = sum of:
    0.035423465 = weight(_text_:retrieval in 1251) [ClassicSimilarity], result of:
      0.035423465 = score(doc=1251,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 1251, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1251)
    0.022201622 = weight(_text_:of in 1251) [ClassicSimilarity], result of:
      0.022201622 = score(doc=1251,freq=22.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.34381276 = fieldWeight in 1251, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=1251)
    0.009363732 = product of:
      0.018727465 = sum of:
        0.018727465 = weight(_text_:on in 1251) [ClassicSimilarity], result of:
          0.018727465 = score(doc=1251,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.20619515 = fieldWeight in 1251, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=1251)
      0.5 = coord(1/2)
  0.375 = coord(3/8)
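
The score breakdown above can be recomputed by hand. The sketch below assumes the standard Lucene ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes queryNorm and fieldNorm as given from the listing rather than deriving them. The function and variable names are illustrative only and are not part of any Lucene API.

```python
import math

MAX_DOCS = 44218          # maxDocs, as shown in the listing
QUERY_NORM = 0.041294612  # queryNorm, taken as given from the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Values read from the explanation for doc=1251.
field_norm = 0.046875
retrieval = term_score(freq=4.0, doc_freq=5836, field_norm=field_norm)
of = term_score(freq=22.0, doc_freq=25162, field_norm=field_norm)
# The "on" clause sits in a nested query where 1 of 2 clauses matched.
on = term_score(freq=4.0, doc_freq=13325, field_norm=field_norm) * (1 / 2)

# 3 of the 8 top-level query clauses matched, hence coord(3/8).
score = (retrieval + of + on) * (3 / 8)
print(f"{score:.9f}")
```

Up to floating-point rounding (Lucene computes in 32-bit floats), the result should agree with the 0.025120806 shown for this record. The same function reproduces the other records when their freq, docFreq, fieldNorm, and coord values are substituted.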

Abstract: This paper presents an evaluation study of the navigation effectiveness of a multifaceted organizational taxonomy that was built on the Dewey Decimal Classification and several domain thesauri in the area of library and information science education. The objective of the evaluation was to detect deficiencies in the taxonomy and to infer problems of applied construction steps from users' navigation difficulties. The evaluation approach included scenario-based navigation exercises and postexercise interviews. Navigation exercise errors and underlying reasons were analyzed in relation to specific components of the taxonomy and applied construction steps. Guidelines for the construction of the hierarchical structure and categories of an organizational taxonomy using existing general classification schemes and domain thesauri were derived from the evaluation results.
Source: Journal of the Association for Information Science and Technology. 65(2014) no.5, S.948-963
Theme: Klassifikationssysteme im Online-Retrieval
Semantisches Umfeld in Indexierung u. Retrieval

Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.02
```
0.024970733 = product of:
  0.06658862 = sum of:
    0.04174695 = weight(_text_:retrieval in 3223) [ClassicSimilarity], result of:
      0.04174695 = score(doc=3223,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33420905 = fieldWeight in 3223, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3223)
    0.019324033 = weight(_text_:of in 3223) [ClassicSimilarity], result of:
      0.019324033 = score(doc=3223,freq=24.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2992506 = fieldWeight in 3223, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3223)
    0.0055176322 = product of:
      0.0110352645 = sum of:
        0.0110352645 = weight(_text_:on in 3223) [ClassicSimilarity], result of:
          0.0110352645 = score(doc=3223,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.121501654 = fieldWeight in 3223, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3223)
      0.5 = coord(1/2)
  0.375 = coord(3/8)
```
Abstract: The aim of the study was to test whether query expansion by approximate string matching methods is beneficial in retrieval from historical newspaper collections in a language rich with compounds and inflectional forms (Finnish). First, approximate string matching methods were used to generate lists of index words most similar to contemporary query terms in a digitized newspaper collection from the 1800s. Top index word variants were categorized to estimate the appropriate query expansion ranges in the retrieval test. Second, the effectiveness of approximate string matching methods, automatically generated inflectional forms, and their combinations was measured in a Cranfield-style test. Finally, a detailed topic-level analysis of test results was conducted. In the index of a historical newspaper collection, the occurrences of a word are typically spread across many linguistic and historical variants, along with optical character recognition (OCR) errors. All query expansion methods improved the baseline results. Extensive expansion of around 30 variants for each query word was required to achieve the highest performance improvement. Query expansion based on approximate string matching was superior to using the inflectional forms of the query words, showing that coverage of the different types of variation is more important than precision in handling one type of variation.
Source: Journal of the Association for Information Science and Technology. 67(2016) no.12, S.2928-2946
Theme: Semantisches Umfeld in Indexierung u. Retrieval
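
The general idea of expanding a query term with its closest index words can be sketched as follows. This is only an illustration under stated assumptions: it uses the similarity ratio from Python's standard difflib module, which is not necessarily the matching method used in the study, and the toy index words (including the OCR-style misspelling) and the cutoff value are hypothetical. Only the limit of about 30 variants per query word comes from the abstract.

```python
import difflib


def expand_query_term(term, index_words, max_variants=30, cutoff=0.7):
    """Return up to max_variants index words that are most similar to term.

    Similarity is difflib's ratio (an assumption made for this sketch),
    and cutoff is a hypothetical threshold, not a value from the study.
    """
    return difflib.get_close_matches(
        term, index_words, n=max_variants, cutoff=cutoff
    )


# Hypothetical index vocabulary: inflected forms, an OCR-style error,
# and unrelated words.
index_words = [
    "sanomalehti",
    "sanomalehden",
    "sanomalehdet",
    "sanomalehtj",
    "kirja",
    "lehti",
]

variants = expand_query_term("sanomalehti", index_words)
# The expanded query is the disjunction of all selected variants.
expanded_query = " OR ".join(variants)
print(expanded_query)
```

A single similarity threshold picks up inflected forms and OCR errors alike, which matches the abstract's point that covering several types of variation at once matters more than handling one type precisely.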

Vidinli, I.B.; Ozcan, R.: New query suggestion framework and algorithms : a case study for an educational search engine (2016) 0.02

0.024631536 = product of:
  0.065684095 = sum of:
    0.025048172 = weight(_text_:retrieval in 3185) [ClassicSimilarity], result of:
      0.025048172 = score(doc=3185,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.20052543 = fieldWeight in 3185, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=3185)
    0.025667597 = weight(_text_:use in 3185) [ClassicSimilarity], result of:
      0.025667597 = score(doc=3185,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 3185, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=3185)
    0.014968331 = weight(_text_:of in 3185) [ClassicSimilarity], result of:
      0.014968331 = score(doc=3185,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23179851 = fieldWeight in 3185, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=3185)
  0.375 = coord(3/8)

Abstract: Query suggestion is generally an integrated part of web search engines. In this study, we first redefine and reduce the query suggestion problem as "comparison of queries". We then propose a general modular framework for query suggestion algorithm development. We also develop new query suggestion algorithms which are used in our proposed framework, exploiting query, session and user features. As a case study, we use query logs of a real educational search engine that targets K-12 students in Turkey. We also exploit educational features (course, grade) in our query suggestion algorithms. We test our framework and algorithms over a set of queries by an experiment and demonstrate a 66-90% statistically significant increase in relevance of query suggestions compared to a baseline method.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Li, N.; Sun, J.: Improving Chinese term association from the linguistic perspective (2017) 0.02

0.024631536 = product of:
  0.065684095 = sum of:
    0.025048172 = weight(_text_:retrieval in 3381) [ClassicSimilarity], result of:
      0.025048172 = score(doc=3381,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.20052543 = fieldWeight in 3381, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=3381)
    0.025667597 = weight(_text_:use in 3381) [ClassicSimilarity], result of:
      0.025667597 = score(doc=3381,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 3381, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=3381)
    0.014968331 = weight(_text_:of in 3381) [ClassicSimilarity], result of:
      0.014968331 = score(doc=3381,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23179851 = fieldWeight in 3381, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=3381)
  0.375 = coord(3/8)

Abstract: The study examines how to construct the semantic relations of domain-specific terms by applying linguistic rules. Semantic structure analysis at the morpheme level was used as the semantic measure, and a morpheme-based term association model was proposed by improving and combining the literal-based similarity algorithm and co-occurrence relatedness methods. This study provides a novel insight into the method of semantic analysis and calculation by morpheme parsing, and the proposed solution is feasible for the automatic association of compound terms. The results show that this approach could be used to construct appropriate term associations and form a reasonable structural knowledge graph. However, due to linguistic differences, the viability and effectiveness of our method in non-Chinese linguistic environments should be verified.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Looking for information : a survey on research on information seeking, needs, and behavior (2016) 0.02

0.023719013 = product of:
  0.0632507 = sum of:
    0.036153924 = weight(_text_:retrieval in 3803) [ClassicSimilarity], result of:
      0.036153924 = score(doc=3803,freq=6.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.28943354 = fieldWeight in 3803, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3803)
    0.014758972 = weight(_text_:of in 3803) [ClassicSimilarity], result of:
      0.014758972 = score(doc=3803,freq=14.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.22855641 = fieldWeight in 3803, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3803)
    0.012337802 = product of:
      0.024675604 = sum of:
        0.024675604 = weight(_text_:on in 3803) [ClassicSimilarity], result of:
          0.024675604 = score(doc=3803,freq=10.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.271686 = fieldWeight in 3803, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3803)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: The 4th edition of this popular and well-cited text is now co-authored, and includes significant changes from earlier texts. Presenting a comprehensive review of over a century of research on information behavior (IB), this book is intended for students in information studies and disciplines interested in research on information activities. The initial two chapters introduce IB as a multi-disciplinary topic; the 3rd provides a brief history of research on information seeking. Chapter four discusses what is meant by the terms "information" and "knowledge." Chapter five discusses "information needs," and how they are addressed. The 6th chapter identifies many related concepts. Twelve models of information behavior (expanded from earlier editions) are illustrated in chapter seven. Chapter eight reviews various paradigms and theories informing IB research. Chapter nine examines research methods invoked in IB studies and includes a discussion of qualitative and mixed approaches. The 10th chapter gives examples of IB studies by context. The final chapter looks at strengths and weaknesses, recent trends, and future development.
RSWK: Information Retrieval
Subject: Information Retrieval
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Blanco, R.; Matthews, M.; Mika, P.: Ranking of daily deals with concept expansion (2015) 0.02

0.023704888 = product of:
  0.063213035 = sum of:
    0.035423465 = weight(_text_:retrieval in 2663) [ClassicSimilarity], result of:
      0.035423465 = score(doc=2663,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 2663, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2663)
    0.021168415 = weight(_text_:of in 2663) [ClassicSimilarity], result of:
      0.021168415 = score(doc=2663,freq=20.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.32781258 = fieldWeight in 2663, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2663)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 2663) [ClassicSimilarity], result of:
          0.013242318 = score(doc=2663,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 2663, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2663)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Daily deals have emerged in the last three years as a successful form of online advertising. The downside of this success is that users are increasingly overloaded by the many thousands of deals offered each day by dozens of deal providers and aggregators. The challenge is thus offering the right deals to the right users, i.e., the relevance ranking of deals. This is the problem we address in our paper. Exploiting the characteristics of deals data, we propose a combination of a term- and a concept-based retrieval model that closes the semantic gap between queries and documents by expanding both of them with category information. The method consistently outperforms state-of-the-art methods based on term-matching alone and existing approaches for ad classification and ranking.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Vechtomova, O.; Robertson, S.E.: A domain-independent approach to finding related entities (2012) 0.02

0.02329753 = product of:
  0.062126745 = sum of:
    0.035423465 = weight(_text_:retrieval in 2733) [ClassicSimilarity], result of:
      0.035423465 = score(doc=2733,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 2733, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2733)
    0.02008212 = weight(_text_:of in 2733) [ClassicSimilarity], result of:
      0.02008212 = score(doc=2733,freq=18.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.3109903 = fieldWeight in 2733, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2733)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 2733) [ClassicSimilarity], result of:
          0.013242318 = score(doc=2733,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 2733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2733)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: We propose an approach to the retrieval of entities that have a specific relationship with the entity given in a query. Our research goal is to investigate whether the related entity finding problem can be addressed by combining a measure of relatedness of candidate answer entities to the query with the likelihood that the candidate answer entity belongs to the target entity category specified in the query. An initial list of candidate entities, extracted from top ranked documents retrieved for the query, is refined using a number of statistical and linguistic methods. The proposed method extracts the category of the target entity from the query, identifies instances of this category as seed entities, and computes similarity between candidate and seed entities. The evaluation was conducted on the Related Entity Finding task of the Entity Track of TREC 2010, as well as the QA list questions from TREC 2005 and 2006. Evaluation results demonstrate that the proposed methods are effective in finding related entities.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Celik, I.; Abel, F.; Siehndel, P.: Adaptive faceted search on Twitter (2011) 0.02

0.022991579 = product of:
  0.06131088 = sum of:
    0.033397563 = weight(_text_:retrieval in 2221) [ClassicSimilarity], result of:
      0.033397563 = score(doc=2221,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.26736724 = fieldWeight in 2221, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=2221)
    0.012622404 = weight(_text_:of in 2221) [ClassicSimilarity], result of:
      0.012622404 = score(doc=2221,freq=4.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.19546966 = fieldWeight in 2221, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=2221)
    0.015290912 = product of:
      0.030581824 = sum of:
        0.030581824 = weight(_text_:on in 2221) [ClassicSimilarity], result of:
          0.030581824 = score(doc=2221,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.33671528 = fieldWeight in 2221, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=2221)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: In the last few years, Twitter has become a powerful tool for publishing and discussing information. Yet, content exploration in Twitter requires substantial effort and users often have to scan information streams by hand. In this paper, we approach this problem by means of faceted search. We propose strategies for inferring facets and facet values on Twitter by enriching the semantics of individual Twitter messages and present different methods, including personalized and context-adaptive methods, for making faceted search on Twitter more effective.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Hannech, A.: Système de recherche d'information étendue basé sur une projection multi-espaces (2018) 0.02
```
0.022777371 = product of:
  0.045554742 = sum of:
    0.016698781 = weight(_text_:retrieval in 4472) [ClassicSimilarity], result of:
      0.016698781 = score(doc=4472,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.13368362 = fieldWeight in 4472, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.015625 = fieldNorm(doc=4472)
    0.008555865 = weight(_text_:use in 4472) [ClassicSimilarity], result of:
      0.008555865 = score(doc=4472,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.06766324 = fieldWeight in 4472, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.015625 = fieldNorm(doc=4472)
    0.014460781 = weight(_text_:of in 4472) [ClassicSimilarity], result of:
      0.014460781 = score(doc=4472,freq=84.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.22393863 = fieldWeight in 4472, product of:
          9.165152 = tf(freq=84.0), with freq of:
            84.0 = termFreq=84.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.015625 = fieldNorm(doc=4472)
    0.005839314 = product of:
      0.011678628 = sum of:
        0.011678628 = weight(_text_:on in 4472) [ClassicSimilarity], result of:
          0.011678628 = score(doc=4472,freq=14.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.12858528 = fieldWeight in 4472, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.015625 = fieldNorm(doc=4472)
      0.5 = coord(1/2)
  0.5 = coord(4/8)
```
Abstract: Since its appearance in the early 1990s, the World Wide Web (WWW or Web) has provided universal access to knowledge, and the world of information has witnessed a great revolution (the digital revolution). The Web quickly became very popular, making it the largest and most comprehensive database and knowledge base thanks to the amount and diversity of data it contains. However, the considerable increase and evolution of these data raise important problems for users, in particular for accessing the documents most relevant to their search queries. In order to cope with this exponential explosion of data volume and facilitate access by users, information retrieval systems (IRSs) offer various models for the representation and retrieval of web documents. Traditional IRSs use simple keywords that are not semantically linked to index and retrieve these documents. This creates limitations in terms of the relevance and ease of exploration of results. To overcome these limitations, existing techniques enrich documents by integrating external keywords from different sources. However, these systems still suffer from limitations related to the way these sources of enrichment are exploited. When the different sources are used in such a way that the system cannot distinguish between them, this limits the flexibility of the exploration models that can be applied to the results returned by the system. Users then feel lost among these results and find themselves forced to filter them manually to select the relevant information. If they want to go further, they must reformulate and target their search queries even more until they reach the documents that best meet their expectations. In this way, even if the systems manage to find more relevant results, their presentation remains problematic.
In order to target search at more user-specific information needs and to improve the relevance and exploration of search results, advanced IRSs adopt different data personalization techniques that assume that the user's current search is directly related to his profile and/or previous browsing and search experiences.
However, this assumption does not hold in all cases: the needs of the user evolve over time and can move away from his previous interests stored in his profile. In other cases, the user's profile may be misused to extract or infer new information needs. This problem is much more accentuated with ambiguous queries. When multiple POIs linked to a search query are identified in the user's profile, the system is unable to select the relevant data from that profile to respond to that request. This has a direct impact on the quality of the results provided to this user. In order to overcome some of these limitations, this thesis focuses on the development of techniques aimed mainly at improving the relevance of the results of current IRSs and facilitating the exploration of large collections of documents. To do this, we propose a solution based on a new concept and model of indexing and information retrieval called multi-spaces projection. This proposal is based on the exploitation of different categories of semantic and social information that enrich the universe of document representation and search queries in several dimensions of interpretation. The originality of this representation is that it can distinguish between the different interpretations used for the description of and the search for documents. This gives better visibility of the results returned and helps to provide greater flexibility of search and exploration, giving the user the ability to navigate one or more views of the data that interest him the most. In addition, the proposed multidimensional representation universes for document description and search query interpretation help to improve the relevance of the user's results by providing a diversity of search and exploration that helps meet his diverse needs and those of other users.
This study exploits different aspects related to personalized search and aims to solve the problems caused by the evolution of the user's information needs. Thus, when the profile of this user is used by our system, a technique is applied to identify the interests in his profile that are most representative of his current needs. This technique is based on the combination of three influential factors: the contextual, frequency, and temporal factors of the data. The ability of users to interact, exchange ideas and opinions, and form social networks on the Web has led systems to focus on the types of interactions among these users as well as their social roles in the system. This social information is discussed and integrated into this research work. Its impact and the way it is integrated into the IR process are studied in order to improve the relevance of the results.

Theme: Semantisches Umfeld in Indexierung u. Retrieval

Melucci, M.: Contextual search : a computational framework (2012) 0.02

0.022265585 = product of:
  0.059374895 = sum of:
    0.036153924 = weight(_text_:retrieval in 4913) [ClassicSimilarity], result of:
      0.036153924 = score(doc=4913,freq=6.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.28943354 = fieldWeight in 4913, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4913)
    0.013664153 = weight(_text_:of in 4913) [ClassicSimilarity], result of:
      0.013664153 = score(doc=4913,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.21160212 = fieldWeight in 4913, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4913)
    0.00955682 = product of:
      0.01911364 = sum of:
        0.01911364 = weight(_text_:on in 4913) [ClassicSimilarity], result of:
          0.01911364 = score(doc=4913,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.21044704 = fieldWeight in 4913, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4913)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: The growing availability of data in electronic form, the expansion of the World Wide Web and the accessibility of computational methods for large-scale data processing have allowed researchers in Information Retrieval (IR) to design systems which can effectively and efficiently constrain search within the boundaries given by context, thus transforming classical search into contextual search. Contextual Search: A Computational Framework introduces contextual search within a computational framework based on contextual variables, contextual factors and statistical models. It describes how statistical models can process contextual variables to infer the contextual factors underlying the current search context. It also provides background to the subject by: placing it among other surveys on relevance, interaction, context, and behaviour; providing a description of the contextual variables used for implementing the statistical models which represent and predict relevance and contextual factors; and providing an overview of the evaluation methodologies and findings relevant to this subject. Contextual Search: A Computational Framework is a highly recommended read, both for beginners who are embarking on research in this area and as a useful reference for established IR researchers.
Content: Table of contents 1. Introduction 2. Query Intent 3. Personal Interest 4. Document Quality 5. Contextual Search Evaluation 6. Conclusions Acknowledgements References A. Implementations
Series: Foundations and Trends® in Information Retrieval; 6, 4/5
Theme: Semantisches Umfeld in Indexierung u. Retrieval
