Search (25 results, page 1 of 2)

Gnoli, C.; Santis, R. de; Pusterla, L.: Commerce, see also Rhetoric : cross-discipline relationships as authority data for enhanced retrieval (2015) 0.01
```
0.005023338 = product of:
  0.035163365 = sum of:
    0.02942922 = weight(_text_:representation in 2299) [ClassicSimilarity], result of:
      0.02942922 = score(doc=2299,freq=2.0), product of:
        0.11578492 = queryWeight, product of:
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.025165197 = queryNorm
        0.25417143 = fieldWeight in 2299, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2299)
    0.005734144 = product of:
      0.017202431 = sum of:
        0.017202431 = weight(_text_:29 in 2299) [ClassicSimilarity], result of:
          0.017202431 = score(doc=2299,freq=2.0), product of:
            0.08852329 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.025165197 = queryNorm
            0.19432661 = fieldWeight in 2299, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
      0.33333334 = coord(1/3)
  0.14285715 = coord(2/14)
```
Abstract

Subjects in a classification scheme are often related to other subjects belonging to different hierarchies. This problem was identified already by Hugh of Saint Victor (1096?-1141). Still with present-time bibliographic classifications, a user browsing the class of architecture under the hierarchy of arts may miss relevant items classified in building or in civil engineering under the hierarchy of applied sciences. To face these limitations we have developed SciGator, a browsable interface to explore the collections of all scientific libraries at the University of Pavia. Besides showing subclasses of a given class, the interface points users to related classes in the Dewey Decimal Classification, or in other local schemes, and allows for expanded queries that include them. This is made possible by using a special field for related classes in the database structure which models classification authority data. Ontologically, many relationships between classes in different hierarchies are cases of existential dependence. Dependence can occur between disciplines in such disciplinary classifications as Dewey (e.g. architecture existentially depends on building), or between phenomena in such phenomenon-based classifications as the Integrative Levels Classification (e.g. fishing as a human activity existentially depends on fish as a class of organisms). We provide an example of its representation in OWL and discuss some details of it.

Source

Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro
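
The explain tree for doc 2299 above can be re-derived by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function and variable names are illustrative and not part of the search system. All numeric inputs are copied from the tree. Lucene computes in single precision, so the last digits may differ slightly.

```python
import math

MAX_DOCS = 44218          # maxDocs from the explain tree
QUERY_NORM = 0.025165197  # queryNorm from the explain tree


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """weight = queryWeight * fieldWeight for a single matching term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Clause 1: the term "representation" (docFreq=1206, freq=2.0).
representation = term_weight(freq=2.0, doc_freq=1206, field_norm=0.0390625)

# Clause 2: the term "29" (docFreq=3565), scaled by its inner coord(1/3).
term_29 = term_weight(freq=2.0, doc_freq=3565, field_norm=0.0390625) * (1 / 3)

# Outer coord(2/14): 2 of the 14 top-level query clauses matched.
score_2299 = (representation + term_29) * (2 / 14)
print(score_2299)
```

The printed value should agree with the 0.005023338 shown at the top of the tree to within rounding.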
Zhang, W.; Yoshida, T.; Tang, X.: ¬A comparative study of TF*IDF, LSI and multi-words for text classification (2011) 0.00
```
0.004369106 = product of:
  0.061167482 = sum of:
    0.061167482 = weight(_text_:representation in 1165) [ClassicSimilarity], result of:
      0.061167482 = score(doc=1165,freq=6.0), product of:
        0.11578492 = queryWeight, product of:
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.025165197 = queryNorm
        0.5282854 = fieldWeight in 1165, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.046875 = fieldNorm(doc=1165)
  0.071428575 = coord(1/14)
```
Abstract

One of the main themes in text mining is text representation, which is fundamental and indispensable for text-based intelligent information processing. Generally, text representation includes two tasks: indexing and weighting. This paper has comparatively studied TF*IDF, LSI and multi-word for text representation. We used a Chinese and an English document collection to respectively evaluate the three methods in information retrieval and text categorization. Experimental results have demonstrated that in text categorization, LSI has better performance than other methods in both document collections. Also, LSI has produced the best performance in retrieving English documents. This outcome has shown that LSI has both favorable semantic and statistical quality and contradicts the claim that LSI cannot produce discriminative power for indexing.
Jiang, Y.; Zhang, X.; Tang, Y.; Nie, R.: Feature-based approaches to semantic similarity assessment of concepts using Wikipedia (2015) 0.00
```
0.0029727998 = product of:
  0.041619197 = sum of:
    0.041619197 = weight(_text_:representation in 2682) [ClassicSimilarity], result of:
      0.041619197 = score(doc=2682,freq=4.0), product of:
        0.11578492 = queryWeight, product of:
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.025165197 = queryNorm
        0.35945266 = fieldWeight in 2682, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2682)
  0.071428575 = coord(1/14)
```
Abstract

Semantic similarity assessment between concepts is an important task in many language-related applications. In the past, several approaches to assess similarity by evaluating the knowledge modeled in one or more ontologies have been proposed. However, the existing measures have some limitations, such as relying on predefined ontologies and fitting only non-dynamic domains. Wikipedia provides a very large domain-independent encyclopedic repository and semantic network for computing semantic similarity of concepts with more coverage than usual ontologies. In this paper, we propose some novel feature-based similarity assessment methods that are fully dependent on Wikipedia and can avoid most of the limitations and drawbacks introduced above. To implement feature-based similarity assessment using Wikipedia, we first present a formal representation of Wikipedia concepts. We then give a framework for feature-based similarity based on the formal representation of Wikipedia concepts. Lastly, we investigate several feature-based approaches to semantic similarity measures resulting from instantiations of the framework. The evaluation, based on several widely used benchmarks and a benchmark developed by ourselves, sustains the intuitions with respect to human judgements. Overall, several methods proposed in this paper have good human correlation and constitute some effective ways of determining similarity between Wikipedia concepts.
Qu, R.; Fang, Y.; Bai, W.; Jiang, Y.: Computing semantic similarity based on novel models of semantic representation using Wikipedia (2018) 0.00
```
0.0029727998 = product of:
  0.041619197 = sum of:
    0.041619197 = weight(_text_:representation in 5052) [ClassicSimilarity], result of:
      0.041619197 = score(doc=5052,freq=4.0), product of:
        0.11578492 = queryWeight, product of:
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.025165197 = queryNorm
        0.35945266 = fieldWeight in 5052, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5052)
  0.071428575 = coord(1/14)
```
Abstract

Computing Semantic Similarity (SS) between concepts is one of the most critical issues in many domains such as Natural Language Processing and Artificial Intelligence. Over the years, several SS measurement methods have been proposed by exploiting different knowledge resources. Wikipedia provides a large domain-independent encyclopedic repository and a semantic network for computing SS between concepts. Traditional feature-based measures rely on linear combinations of different properties with two main limitations, the insufficient information and the loss of semantic information. In this paper, we propose several hybrid SS measurement approaches by using the Information Content (IC) and features of concepts, which avoid the limitations introduced above. Considering integrating discrete properties into one component, we present two models of semantic representation, called CORM and CARM. Then, we compute SS based on these models and take the IC of categories as a supplement of SS measurement. The evaluation, based on several widely used benchmarks and a benchmark developed by ourselves, sustains the intuitions with respect to human judgments. In summary, our approaches are more efficient in determining SS between concepts and have a better human correlation than previous methods such as Word2Vec and NASARI.
Colace, F.; Santo, M. De; Greco, L.; Napoletano, P.: Weighted word pairs for query expansion (2015) 0.00
```
0.0029429218 = product of:
  0.041200902 = sum of:
    0.041200902 = weight(_text_:representation in 2687) [ClassicSimilarity], result of:
      0.041200902 = score(doc=2687,freq=2.0), product of:
        0.11578492 = queryWeight, product of:
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.025165197 = queryNorm
        0.35583997 = fieldWeight in 2687, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2687)
  0.071428575 = coord(1/14)
```
Abstract

This paper proposes a novel query expansion method to improve accuracy of text retrieval systems. Our method makes use of a minimal relevance feedback to expand the initial query with a structured representation composed of weighted pairs of words. Such a structure is obtained from the relevance feedback through a method for pairs of words selection based on the Probabilistic Topic Model. We compared our method with other baseline query expansion schemes and methods. Evaluations performed on TREC-8 demonstrated the effectiveness of the proposed method with respect to the baseline.
Colace, F.; Santo, M. de; Greco, L.; Napoletano, P.: Improving relevance feedback-based query expansion by the use of a weighted word pairs approach (2015) 0.00
```
0.0025225044 = product of:
  0.03531506 = sum of:
    0.03531506 = weight(_text_:representation in 2263) [ClassicSimilarity], result of:
      0.03531506 = score(doc=2263,freq=2.0), product of:
        0.11578492 = queryWeight, product of:
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.025165197 = queryNorm
        0.3050057 = fieldWeight in 2263, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.046875 = fieldNorm(doc=2263)
  0.071428575 = coord(1/14)
```
Abstract

In this article, the use of a new term extraction method for query expansion (QE) in text retrieval is investigated. The new method expands the initial query with a structured representation made of weighted word pairs (WWP) extracted from a set of training documents (relevance feedback). Standard text retrieval systems can handle a WWP structure through custom Boolean weighted models. We experimented with both the explicit and pseudorelevance feedback schemas and compared the proposed term extraction method with others in the literature, such as KLD and RM3. Evaluations have been conducted on a number of test collections (Text REtrieval Conference [TREC]-6, -7, -8, -9, and -10). Results demonstrated that the QE method based on this new structure outperforms the baseline.
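
The two Colace et al. entries above (docs 2687 and 2263) match the same term with the same frequency, yet they are ranked differently. A minimal sketch, using only values copied from the two explain trees, suggests that the difference comes entirely from fieldNorm, which in ClassicSimilarity is understood to reflect field length (shorter fields get larger norms). The variable names are illustrative.

```python
# Values copied from the explain trees for docs 2687 and 2263.
score_2687, norm_2687 = 0.041200902, 0.0546875
score_2263, norm_2263 = 0.03531506, 0.046875

# With idf, tf and queryNorm identical, the term score is linear in fieldNorm,
# so the two ratios below should coincide (up to float rounding).
score_ratio = score_2687 / score_2263
norm_ratio = norm_2687 / norm_2263

print(score_ratio, norm_ratio)
```

Both ratios come out at roughly 7/6, which is consistent with the term score being proportional to fieldNorm when everything else is held equal.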
Wongthontham, P.; Abu-Salih, B.: Ontology-based approach for semantic data extraction from social big data : state-of-the-art and research directions (2018) 0.00
```
0.0025225044 = product of:
  0.03531506 = sum of:
    0.03531506 = weight(_text_:representation in 4097) [ClassicSimilarity], result of:
      0.03531506 = score(doc=4097,freq=2.0), product of:
        0.11578492 = queryWeight, product of:
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.025165197 = queryNorm
        0.3050057 = fieldWeight in 4097, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.046875 = fieldNorm(doc=4097)
  0.071428575 = coord(1/14)
```
Abstract

The challenge of managing and extracting useful knowledge from social media data sources has attracted much attention from academia and industry. To address this challenge, this paper focuses on semantic analysis of textual data. We propose an ontology-based approach to extract semantics of textual data and define the domain of data. In other words, we semantically analyse the social data at two levels, i.e. the entity level and the domain level. We have chosen Twitter as the social channel for the purpose of a proof of concept. Domain knowledge is captured in ontologies, which are then used to enrich the semantics of tweets with a specific semantic conceptual representation of the entities that appear in the tweets. Case studies are used to demonstrate this approach. We experiment with and evaluate our proposed approach on a public dataset collected from Twitter in the politics domain. The ontology-based approach leverages entity extraction and concept mappings in terms of quantity and accuracy of concept identification.
Koopman, B.; Zuccon, G.; Bruza, P.; Sitbon, L.; Lawley, M.: Information retrieval as semantic inference : a graph Inference model applied to medical search (2016) 0.00
```
0.0016816697 = product of:
  0.023543375 = sum of:
    0.023543375 = weight(_text_:representation in 3260) [ClassicSimilarity], result of:
      0.023543375 = score(doc=3260,freq=2.0), product of:
        0.11578492 = queryWeight, product of:
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.025165197 = queryNorm
        0.20333713 = fieldWeight in 3260, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.600994 = idf(docFreq=1206, maxDocs=44218)
          0.03125 = fieldNorm(doc=3260)
  0.071428575 = coord(1/14)
```
Abstract

This paper presents a Graph Inference retrieval model that integrates structured knowledge resources, statistical information retrieval methods and inference in a unified framework. Key components of the model are a graph-based representation of the corpus and retrieval driven by an inference mechanism achieved as a traversal over the graph. The model is proposed to tackle the semantic gap problem: the mismatch between the raw data and the way a human being interprets it. We break down the semantic gap problem into five core issues, each requiring a specific type of inference in order to be overcome. Our model and evaluation are applied to the medical domain because search within this domain is particularly challenging and, as we show, often requires inference. In addition, this domain features both structured knowledge resources and unstructured text. Our evaluation shows that inference can be effective, retrieving many new relevant documents that are not retrieved by state-of-the-art information retrieval models. We show that many retrieved documents were not pooled by keyword-based search methods, prompting us to perform additional relevance assessment on these new documents. A third of the newly retrieved documents judged were found to be relevant. Our analysis provides a thorough understanding of when and how to apply inference for retrieval, including a categorisation of queries according to the effect of inference. The inference mechanism promoted recall by retrieving new relevant documents not found by previous keyword-based approaches. In addition, it promoted precision by an effective reranking of documents. When inference is used, performance gains can generally be expected on hard queries. However, inference should not be applied universally: for easy, unambiguous queries and queries with few relevant documents, inference did adversely affect effectiveness.
These conclusions reflect the fact that for retrieval as inference to be effective, a careful balancing act is involved. Finally, although the Graph Inference model is developed and applied to medical search, it is a general retrieval model applicable to other areas such as web search, where an emerging research trend is to utilise structured knowledge resources for more effective semantic search.

Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.00

```
8.1179454E-4 = product of:
  0.011365123 = sum of:
    0.011365123 = product of:
      0.03409537 = sum of:
        0.03409537 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
          0.03409537 = score(doc=2751,freq=2.0), product of:
            0.08812423 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.025165197 = queryNorm
            0.38690117 = fieldWeight in 2751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2751)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Date: 1. 2.2016 18:25:22

Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.00

```
8.1179454E-4 = product of:
  0.011365123 = sum of:
    0.011365123 = product of:
      0.03409537 = sum of:
        0.03409537 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
          0.03409537 = score(doc=2754,freq=2.0), product of:
            0.08812423 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.025165197 = queryNorm
            0.38690117 = fieldWeight in 2754, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2754)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Date: 1. 2.2016 18:25:22

Marx, E. et al.: Exploring term networks for semantic search over RDF knowledge graphs (2016) 0.00

```
8.1179454E-4 = product of:
  0.011365123 = sum of:
    0.011365123 = product of:
      0.03409537 = sum of:
        0.03409537 = weight(_text_:22 in 3279) [ClassicSimilarity], result of:
          0.03409537 = score(doc=3279,freq=2.0), product of:
            0.08812423 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.025165197 = queryNorm
            0.38690117 = fieldWeight in 3279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3279)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou

Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.00

```
8.1179454E-4 = product of:
  0.011365123 = sum of:
    0.011365123 = product of:
      0.03409537 = sum of:
        0.03409537 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
          0.03409537 = score(doc=3280,freq=2.0), product of:
            0.08812423 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.025165197 = queryNorm
            0.38690117 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou

Atanassova, I.; Bertin, M.: Semantic facets for scientific information retrieval (2014) 0.00

```
5.734144E-4 = product of:
  0.008027801 = sum of:
    0.008027801 = product of:
      0.024083402 = sum of:
        0.024083402 = weight(_text_:29 in 4471) [ClassicSimilarity], result of:
          0.024083402 = score(doc=4471,freq=2.0), product of:
            0.08852329 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.025165197 = queryNorm
            0.27205724 = fieldWeight in 4471, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4471)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Source: Semantic Web Evaluation Challenge. SemWebEval 2014 at ESWC 2014, Anissaras, Crete, Greece, May 25-29, 2014, Revised Selected Papers. Eds.: V. Presutti et al

Salaba, A.; Zeng, M.L.: Extending the "Explore" user task beyond subject authority data into the linked data sphere (2014) 0.00

```
5.6825613E-4 = product of:
  0.007955586 = sum of:
    0.007955586 = product of:
      0.023866756 = sum of:
        0.023866756 = weight(_text_:22 in 1465) [ClassicSimilarity], result of:
          0.023866756 = score(doc=1465,freq=2.0), product of:
            0.08812423 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.025165197 = queryNorm
            0.2708308 = fieldWeight in 1465, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1465)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Mlodzka-Stybel, A.: Towards continuous improvement of users' access to a library catalogue (2014) 0.00

```
5.6825613E-4 = product of:
  0.007955586 = sum of:
    0.007955586 = product of:
      0.023866756 = sum of:
        0.023866756 = weight(_text_:22 in 1466) [ClassicSimilarity], result of:
          0.023866756 = score(doc=1466,freq=2.0), product of:
            0.08812423 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.025165197 = queryNorm
            0.2708308 = fieldWeight in 1466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1466)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Vechtomova, O.; Robertson, S.E.: ¬A domain-independent approach to finding related entities (2012) 0.00

```
4.9149804E-4 = product of:
  0.006880972 = sum of:
    0.006880972 = product of:
      0.020642916 = sum of:
        0.020642916 = weight(_text_:29 in 2733) [ClassicSimilarity], result of:
          0.020642916 = score(doc=2733,freq=2.0), product of:
            0.08852329 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.025165197 = queryNorm
            0.23319192 = fieldWeight in 2733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=2733)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Date: 27. 1.2016 18:44:29

Zeng, M.L.; Gracy, K.F.; Zumer, M.: Using a semantic analysis tool to generate subject access points : a study using Panofsky's theory and two research samples (2014) 0.00

```
4.8707667E-4 = product of:
  0.006819073 = sum of:
    0.006819073 = product of:
      0.02045722 = sum of:
        0.02045722 = weight(_text_:22 in 1464) [ClassicSimilarity], result of:
          0.02045722 = score(doc=1464,freq=2.0), product of:
            0.08812423 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.025165197 = queryNorm
            0.23214069 = fieldWeight in 1464, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1464)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Source: Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Bando, L.L.; Scholer, F.; Turpin, A.: Query-biased summary generation assisted by query expansion : temporality (2015) 0.00
```
4.0958173E-4 = product of:
  0.005734144 = sum of:
    0.005734144 = product of:
      0.017202431 = sum of:
        0.017202431 = weight(_text_:29 in 1820) [ClassicSimilarity], result of:
          0.017202431 = score(doc=1820,freq=2.0), product of:
            0.08852329 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.025165197 = queryNorm
            0.19432661 = fieldWeight in 1820, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1820)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```
Abstract

Query-biased summaries help users to identify which items returned by a search system should be read in full. In this article, we study the generation of query-biased summaries as a sentence ranking approach, and methods to evaluate their effectiveness. Using sentence-level relevance assessments from the TREC Novelty track, we gauge the benefits of query expansion to minimize the vocabulary mismatch problem between informational requests and sentence ranking methods. Our results from an intrinsic evaluation show that query expansion significantly improves the selection of short relevant sentences (5-13 words) between 7% and 11%. However, query expansion does not lead to improvements for sentences of medium (14-20 words) and long (21-29 words) lengths. In a separate crowdsourcing study, we analyze whether a summary composed of sentences ranked using query expansion was preferred over summaries not assisted by query expansion, rather than assessing sentences individually. We found that participants chose summaries aided by query expansion around 60% of the time over summaries using an unexpanded query. We conclude that query expansion techniques can benefit the selection of sentences for the construction of query-biased summaries at the summary level rather than at the sentence ranking level.

Jiang, Y.; Bai, W.; Zhang, X.; Hu, J.: Wikipedia-based information content and semantic similarity computation (2017) 0.00

```
4.0958173E-4 = product of:
  0.005734144 = sum of:
    0.005734144 = product of:
      0.017202431 = sum of:
        0.017202431 = weight(_text_:29 in 2877) [ClassicSimilarity], result of:
          0.017202431 = score(doc=2877,freq=2.0), product of:
            0.08852329 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.025165197 = queryNorm
            0.19432661 = fieldWeight in 2877, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2877)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Date: 23. 1.2017 14:06:29

Athukorala, K.; Glowacka, D.; Jacucci, G.; Oulasvirta, A.; Vreeken, J.: Is exploratory search different? : a comparison of information search behavior for exploratory and lookup tasks (2016) 0.00

```
4.0958173E-4 = product of:
  0.005734144 = sum of:
    0.005734144 = product of:
      0.017202431 = sum of:
        0.017202431 = weight(_text_:29 in 3150) [ClassicSimilarity], result of:
          0.017202431 = score(doc=3150,freq=2.0), product of:
            0.08852329 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.025165197 = queryNorm
            0.19432661 = fieldWeight in 3150, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3150)
      0.33333334 = coord(1/3)
  0.071428575 = coord(1/14)
```

Date: 18.10.2016 13:52:29
