Search (96 results, page 2 of 5)

Bernier-Colborne, G.: Identifying semantic relations in a specialized corpus through distributional analysis of a cooccurrence tensor (2014) 0.00

```
0.0026606917 = product of:
  0.01596415 = sum of:
    0.01596415 = weight(_text_:in in 2153) [ClassicSimilarity], result of:
      0.01596415 = score(doc=2153,freq=10.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.26884392 = fieldWeight in 2153, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=2153)
  0.16666667 = coord(1/6)
```

Abstract

We describe a method of encoding cooccurrence information in a three-way tensor from which HAL-style word space models can be derived. We use these models to identify semantic relations in a specialized corpus. Results suggest that the tensor-based methods we propose are more robust than the basic HAL model in some respects.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
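
The score breakdown above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function name and its parameters are illustrative and are not part of any library API. All input values are copied from the explain tree for doc 2153.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    """Recompute a single-term score from the values shown in an explain tree.

    Hypothetical helper: the formulas follow Lucene's ClassicSimilarity
    defaults, but this function itself is only an illustration.
    """
    tf = math.sqrt(freq)                               # tf(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))    # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                    # queryWeight
    field_weight = tf * idf * field_norm               # fieldWeight
    return query_weight * field_weight * coord         # final product


# Values taken from the explain output for doc 2153.
score = classic_similarity_score(
    freq=10.0,
    doc_freq=30841,
    max_docs=44218,
    query_norm=0.043654136,
    field_norm=0.0625,
    coord=1 / 6,
)
print(f"{score:.10f}")
```

The result should agree with the listed 0.0026606917 up to small rounding differences, since Lucene computes these values in single precision. The coord(1/6) factor indicates that only one of six query clauses matched, which is why every score on this page is scaled down by the same 0.16666667.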

Oh, K.E.; Joo, S.; Jeong, E.-J.: Online consumer health information organization : users' perspectives on faceted navigation (2015) 0.00
```
0.0024665273 = product of:
  0.014799163 = sum of:
    0.014799163 = weight(_text_:in in 2197) [ClassicSimilarity], result of:
      0.014799163 = score(doc=2197,freq=22.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.24922498 = fieldWeight in 2197, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2197)
  0.16666667 = coord(1/6)
```
Abstract

We investigate facets of online health information that are preferred, easy-to-use and useful in accessing online consumer health information from a user's perspective. In this study, the existing classification structures of 20 top ranked consumer health information websites in South Korea were analyzed, and nine facets that are used in organizing health information in those websites were identified. Based on the identified facets, an online survey, which asked participants' preferences for as well as perceived ease-of-use and usefulness of each facet in accessing online health information, was conducted. The analysis of the survey results showed that among the nine facets, the "diseases & conditions" and "body part" facets were most preferred, and perceived as easy-to-use and useful in accessing online health information. In contrast, "age," "gender," and "alternative medicine" facets were perceived as relatively less preferred, easy-to-use and useful. This research study has direct implications for organization and design of health information websites in that it suggests facets to include and avoid in organizing and providing access points to online health information.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Gnoli, C.; Santis, R. de; Pusterla, L.: Commerce, see also Rhetoric : cross-discipline relationships as authority data for enhanced retrieval (2015) 0.00
```
0.0024665273 = product of:
  0.014799163 = sum of:
    0.014799163 = weight(_text_:in in 2299) [ClassicSimilarity], result of:
      0.014799163 = score(doc=2299,freq=22.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.24922498 = fieldWeight in 2299, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2299)
  0.16666667 = coord(1/6)
```
Abstract

Subjects in a classification scheme are often related to other subjects belonging to different hierarchies. This problem was identified already by Hugh of Saint Victor (1096?-1141). Still with present-time bibliographic classifications, a user browsing the class of architecture under the hierarchy of arts may miss relevant items classified in building or in civil engineering under the hierarchy of applied sciences. To face these limitations we have developed SciGator, a browsable interface to explore the collections of all scientific libraries at the University of Pavia. Besides showing subclasses of a given class, the interface points users to related classes in the Dewey Decimal Classification, or in other local schemes, and allows for expanded queries that include them. This is made possible by using a special field for related classes in the database structure which models classification authority data. Ontologically, many relationships between classes in different hierarchies are cases of existential dependence. Dependence can occur between disciplines in such disciplinary classifications as Dewey (e.g. architecture existentially depends on building), or between phenomena in such phenomenon-based classifications as the Integrative Levels Classification (e.g. fishing as a human activity existentially depends on fish as a class of organisms). We provide an example of its representation in OWL and discuss some details of it.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Ru, C.; Tang, J.; Li, S.; Xie, S.; Wang, T.: Using semantic similarity to reduce wrong labels in distant supervision for relation extraction (2018) 0.00
```
0.0024665273 = product of:
  0.014799163 = sum of:
    0.014799163 = weight(_text_:in in 5055) [ClassicSimilarity], result of:
      0.014799163 = score(doc=5055,freq=22.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.24922498 = fieldWeight in 5055, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5055)
  0.16666667 = coord(1/6)
```
Abstract

Distant supervision (DS) has the advantage of automatically generating large amounts of labelled training data and has been widely used for relation extraction. However, there are usually many wrong labels in the automatically labelled data in distant supervision (Riedel, Yao, & McCallum, 2010). This paper presents a novel method to reduce the wrong labels. The proposed method uses the semantic Jaccard with word embedding to measure the semantic similarity between the relation phrase in the knowledge base and the dependency phrases between two entities in a sentence to filter the wrong labels. In the process of reducing wrong labels, the semantic Jaccard algorithm selects a core dependency phrase to represent the candidate relation in a sentence, which can capture features for relation classification and avoid the negative impact from irrelevant term sequences that previous neural network models of relation extraction often suffer. In the process of relation classification, the core dependency phrases are also used as the input of a convolutional neural network (CNN) for relation classification. The experimental results show that compared with the methods using original DS data, the methods using filtered DS data performed much better in relation extraction. It indicates that the semantic similarity based method is effective in reducing wrong labels. The relation extraction performance of the CNN model using the core dependency phrases as input is the best of all, which indicates that using the core dependency phrases as input of CNN is enough to capture the features for relation classification and could avoid negative impact from irrelevant terms.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Celik, I.; Abel, F.; Siehndel, P.: Adaptive faceted search on Twitter (2011) 0.00
```
0.0023797948 = product of:
  0.014278769 = sum of:
    0.014278769 = weight(_text_:in in 2221) [ClassicSimilarity], result of:
      0.014278769 = score(doc=2221,freq=8.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.24046129 = fieldWeight in 2221, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=2221)
  0.16666667 = coord(1/6)
```
Abstract

In the last few years, Twitter has become a powerful tool for publishing and discussing information. Yet, content exploration in Twitter requires substantial efforts and users often have to scan information streams by hand. In this paper, we approach this problem by means of faceted search. We propose strategies for inferring facets and facet values on Twitter by enriching the semantics of individual Twitter messages and present different methods, including personalized and context-adaptive methods, for making faceted search on Twitter more effective.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Bräscher, M.: Semantic relations in knowledge organization systems (2014) 0.00
```
0.0023611297 = product of:
  0.014166778 = sum of:
    0.014166778 = weight(_text_:in in 1380) [ClassicSimilarity], result of:
      0.014166778 = score(doc=1380,freq=14.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.23857531 = fieldWeight in 1380, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=1380)
  0.16666667 = coord(1/6)
```
Abstract

Semantic relations in knowledge organization systems (KOS) are discussed as well as the need to analyze and systematize the contributions from different areas of knowledge that are devoted to semantic studies in order to collaborate in the definition of a theoretical framework for the study of types of relations included in KOS. Partial results of a survey reveal that, in general, standards and guidelines for developing thesauri are limited to defining and exemplifying types of relationships without guidance concerning the theoretical underpinning of these definitions. The possibilities of a compositional approach to defining the meaning of syntagmatic relations is discussed. Studies on the theoretical foundations that guide the establishment of semantic relations and approaches to be adopted for the preparation of KOS certainly contribute to consolidating a theoretical framework for the area of knowledge organization.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Green, R.: See-also relationships in the Dewey Decimal Classification (2011) 0.00
```
0.0023281053 = product of:
  0.013968632 = sum of:
    0.013968632 = weight(_text_:in in 4615) [ClassicSimilarity], result of:
      0.013968632 = score(doc=4615,freq=10.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.23523843 = fieldWeight in 4615, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4615)
  0.16666667 = coord(1/6)
```
Abstract

This paper investigates the semantics of topical, associative see-also relationships in schedule and table entries of the Dewey Decimal Classification (DDC) system. Based on the see-also relationships in a random sample of 100 classes containing one or more of these relationships, a semi-structured inventory of sources of see-also relationships is generated, of which the most important are lexical similarity, complementarity, facet difference, and relational configuration difference. The premise that see-also relationships based on lexical similarity may be language-specific is briefly examined. The paper concludes with recommendations on the continued use of see-also relationships in the DDC.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Kim, H.H.: Toward video semantic search based on a structured folksonomy (2011) 0.00
```
0.0022310577 = product of:
  0.0133863455 = sum of:
    0.0133863455 = weight(_text_:in in 4350) [ClassicSimilarity], result of:
      0.0133863455 = score(doc=4350,freq=18.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22543246 = fieldWeight in 4350, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4350)
  0.16666667 = coord(1/6)
```
Abstract

This study investigated the effectiveness of query expansion using synonymous and co-occurrence tags in users' video searches as well as the effect of visual storyboard surrogates on users' relevance judgments when browsing videos. To do so, we designed a structured folksonomy-based system in which tag queries can be expanded via synonyms or co-occurrence words, based on the use of WordNet 2.1 synonyms and Flickr's related tags. To evaluate the structured folksonomy-based system, we conducted an experiment, the results of which suggest that the mean recall rate in the structured folksonomy-based system is statistically higher than that in a tag-based system without query expansion; however, the mean precision rate in the structured folksonomy-based system is not statistically higher than that in the tag-based system. Next, we compared the precision rates of the proposed system with storyboards (SB), in which SB and text metadata are shown to users when they browse video search results, with those of the proposed system without SB, in which only text metadata are shown. Our result showed that browsing only text surrogates, including tags without multimedia surrogates, is not sufficient for users' relevance judgments.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.00
```
0.0022310577 = product of:
  0.0133863455 = sum of:
    0.0133863455 = weight(_text_:in in 3223) [ClassicSimilarity], result of:
      0.0133863455 = score(doc=3223,freq=18.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22543246 = fieldWeight in 3223, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3223)
  0.16666667 = coord(1/6)
```
Abstract

The aim of the study was to test whether query expansion by approximate string matching methods is beneficial in retrieval from historical newspaper collections in a language rich with compounds and inflectional forms (Finnish). First, approximate string matching methods were used to generate lists of index words most similar to contemporary query terms in a digitized newspaper collection from the 1800s. Top index word variants were categorized to estimate the appropriate query expansion ranges in the retrieval test. Second, the effectiveness of approximate string matching methods, automatically generated inflectional forms, and their combinations were measured in a Cranfield-style test. Finally, a detailed topic-level analysis of test results was conducted. In the index of historical newspaper collection the occurrences of a word typically spread to many linguistic and historical variants along with optical character recognition (OCR) errors. All query expansion methods improved the baseline results. Extensive expansion of around 30 variants for each query word was required to achieve the highest performance improvement. Query expansion based on approximate string matching was superior to using the inflectional forms of the query words, showing that coverage of the different types of variation is more important than precision in handling one type of variation.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Hazrina, S.; Sharef, N.M.; Ibrahim, H.; Murad, M.A.A.; Noah, S.A.M.: Review on the advancements of disambiguation in semantic question answering system (2017) 0.00
```
0.0022260942 = product of:
  0.013356565 = sum of:
    0.013356565 = weight(_text_:in in 3292) [ClassicSimilarity], result of:
      0.013356565 = score(doc=3292,freq=28.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22493094 = fieldWeight in 3292, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=3292)
  0.16666667 = coord(1/6)
```
Abstract

Ambiguity is a potential problem in any semantic question answering (SQA) system due to the nature of idiosyncrasy in composing natural language (NL) question and semantic resources. Thus, disambiguation of SQA systems is a field of ongoing research. Ambiguity occurs in SQA because a word or a sentence can have more than one meaning or multiple words in the same language can share the same meaning. Therefore, an SQA system needs disambiguation solutions to select the correct meaning when the linguistic triples matched with multiple KB concepts, and enumerate similar words especially when linguistic triples do not match with any KB concept. The latest development in this field is a solution for SQA systems that is able to process a complex NL question while accessing open-domain data from linked open data (LOD). The contributions in this paper include (1) formulating an SQA conceptual framework based on an in-depth study of existing SQA processes; (2) identifying the ambiguity types, specifically in English based on an interdisciplinary literature review; (3) highlighting the ambiguity types that had been resolved by the previous SQA studies; and (4) analysing the results of the existing SQA disambiguation solutions, the complexity of NL question processing, and the complexity of data retrieval from KB(s) or LOD. The results of this review demonstrated that out of thirteen types of ambiguity identified in the literature, only six types had been successfully resolved by the previous studies. Efforts to improve the disambiguation are in progress for the remaining unresolved ambiguity types to improve the accuracy of the formulated answers by the SQA system. The remaining ambiguity types are potentially resolved in the identified SQA process based on ambiguity scenarios elaborated in this paper. 
The results of this review also demonstrated that most existing research on SQA systems have treated the processing of the NL question complexity separate from the processing of the KB structure complexity.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Surfing versus Drilling for knowledge in science : When should you use your computer? When should you use your brain? (2018) 0.00
```
0.0022260942 = product of:
  0.013356565 = sum of:
    0.013356565 = weight(_text_:in in 4564) [ClassicSimilarity], result of:
      0.013356565 = score(doc=4564,freq=28.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22493094 = fieldWeight in 4564, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=4564)
  0.16666667 = coord(1/6)
```
Abstract

For this second Special Issue of Infozine, we have invited students, teachers, researchers, and software developers to share their opinions about one or the other aspect of this broad topic: how to balance drilling (for depth) vs. surfing (for breadth) in scientific learning, teaching, research, and software design - and how the modern digital-liberal system affects our ability to strike this balance. This special issue is meant to provide a wide and unbiased spectrum of possible viewpoints on the topic, helping readers to define lucidly their own position and information use behavior.

Content

Editorial: Surfing versus Drilling for Knowledge in Science: When should you use your computer? When should you use your brain? Blaise Pascal: Les deux infinis - The two infinities / Philippe Hünenberger and Oliver Renn - "Surfing" vs. "drilling" in the modern scientific world / Antonio Loprieno - Of millimeter paper and machine learning / Philippe Hünenberger - From one to many, from breadth to depth - industrializing research / Janne Soetbeer - "Deep drilling" requires "surfing" / Gerd Folkers and Laura Folkers - Surfing vs. drilling in science: A delicate balance / Alzbeta Kubincová - Digital trends in academia - for the sake of critical thinking or comfort? / Leif-Thore Deck - I diagnose, therefore I am a Doctor? Will drilling computer software replace human doctors in the future? / Yi Zheng - Surfing versus drilling in fundamental research / Wilfred van Gunsteren - Using brain vs. brute force in computational studies of biological systems / Arieh Warshel - Laboratory literature boards in the digital age / Jeffrey Bode - Research strategies in computational chemistry / Sereina Riniker - Surfing on the hype waves or drilling deep for knowledge? A perspective from industry / Nadine Schneider and Nikolaus Stiefl - The use and purpose of articles and scientists / Philip Mark Lund - Can you look at papers like artwork? / Oliver Renn - Dynamite fishing in the data swamp / Frank Perabo - Streetlights, augmented intelligence, and information discovery / Jeffrey Saffer and Vicki Burnett - "Yes Dave. Happy to do that for you." Why AI, machine learning, and blockchain will lead to deeper "drilling" / Michiel Kolman and Sjors de Heuvel - Trends in scientific document search / Stefan Geißler - Power tools for text mining / Jane Reed - Publishing and patenting: Navigating the differences to ensure search success / Paul Peters

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Zhang, W.; Yoshida, T.; Tang, X.: A comparative study of TF*IDF, LSI and multi-words for text classification (2011) 0.00
```
0.0021859813 = product of:
  0.013115887 = sum of:
    0.013115887 = weight(_text_:in in 1165) [ClassicSimilarity], result of:
      0.013115887 = score(doc=1165,freq=12.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22087781 = fieldWeight in 1165, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=1165)
  0.16666667 = coord(1/6)
```
Abstract

One of the main themes in text mining is text representation, which is fundamental and indispensable for text-based intelligent information processing. Generally, text representation includes two tasks: indexing and weighting. This paper has comparatively studied TF*IDF, LSI and multi-word for text representation. We used a Chinese and an English document collection to respectively evaluate the three methods in information retrieval and text categorization. Experimental results have demonstrated that in text categorization, LSI has better performance than other methods in both document collections. Also, LSI has produced the best performance in retrieving English documents. This outcome has shown that LSI has both favorable semantic and statistical quality and differs from the claim that LSI cannot produce discriminative power for indexing.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Vidinli, I.B.; Ozcan, R.: New query suggestion framework and algorithms : a case study for an educational search engine (2016) 0.00
```
0.0021859813 = product of:
  0.013115887 = sum of:
    0.013115887 = weight(_text_:in in 3185) [ClassicSimilarity], result of:
      0.013115887 = score(doc=3185,freq=12.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22087781 = fieldWeight in 3185, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3185)
  0.16666667 = coord(1/6)
```
Abstract

Query suggestion is generally an integrated part of web search engines. In this study, we first redefine and reduce the query suggestion problem as "comparison of queries". We then propose a general modular framework for query suggestion algorithm development. We also develop new query suggestion algorithms which are used in our proposed framework, exploiting query, session and user features. As a case study, we use query logs of a real educational search engine that targets K-12 students in Turkey. We also exploit educational features (course, grade) in our query suggestion algorithms. We test our framework and algorithms over a set of queries by an experiment and demonstrate a 66-90% statistically significant increase in relevance of query suggestions compared to a baseline method.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Wongthontham, P.; Abu-Salih, B.: Ontology-based approach for semantic data extraction from social big data : state-of-the-art and research directions (2018) 0.00
```
0.0021859813 = product of:
  0.013115887 = sum of:
    0.013115887 = weight(_text_:in in 4097) [ClassicSimilarity], result of:
      0.013115887 = score(doc=4097,freq=12.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.22087781 = fieldWeight in 4097, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=4097)
  0.16666667 = coord(1/6)
```
Abstract

The challenge of managing and extracting useful knowledge from social media data sources has attracted much attention from academia and industry. To address this challenge, this paper focuses on semantic analysis of textual data. We propose an ontology-based approach to extract semantics of textual data and define the domain of data. In other words, we semantically analyse the social data at two levels, i.e. the entity level and the domain level. We have chosen Twitter as a social channel for the purpose of proof of concept. Domain knowledge is captured in ontologies which are then used to enrich the semantics of tweets provided with specific semantic conceptual representation of entities that appear in the tweets. Case studies are used to demonstrate this approach. We experiment and evaluate our proposed approach with a public dataset collected from Twitter and from the politics domain. The ontology-based approach leverages entity extraction and concept mappings in terms of quantity and accuracy of concept identification.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
Jiang, Y.; Zhang, X.; Tang, Y.; Nie, R.: Feature-based approaches to semantic similarity assessment of concepts using Wikipedia (2015) 0.00
```
0.0021034614 = product of:
  0.012620768 = sum of:
    0.012620768 = weight(_text_:in in 2682) [ClassicSimilarity], result of:
      0.012620768 = score(doc=2682,freq=16.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.21253976 = fieldWeight in 2682, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2682)
  0.16666667 = coord(1/6)
```
Abstract

Semantic similarity assessment between concepts is an important task in many language related applications. In the past, several approaches to assess similarity by evaluating the knowledge modeled in an (or multiple) ontology (or ontologies) have been proposed. However, there are some limitations such as the facts of relying on predefined ontologies and fitting non-dynamic domains in the existing measures. Wikipedia provides a very large domain-independent encyclopedic repository and semantic network for computing semantic similarity of concepts with more coverage than usual ontologies. In this paper, we propose some novel feature based similarity assessment methods that are fully dependent on Wikipedia and can avoid most of the limitations and drawbacks introduced above. To implement similarity assessment based on feature by making use of Wikipedia, firstly a formal representation of Wikipedia concepts is presented. We then give a framework for feature based similarity based on the formal representation of Wikipedia concepts. Lastly, we investigate several feature based approaches to semantic similarity measures resulting from instantiations of the framework. The evaluation, based on several widely used benchmarks and a benchmark developed by ourselves, sustains the intuitions with respect to human judgements. Overall, several methods proposed in this paper have good human correlation and constitute some effective ways of determining similarity between Wikipedia concepts.

Theme

Semantisches Umfeld in Indexierung u. Retrieval

Looking for information : a survey on research on information seeking, needs, and behavior (2012) 0.00

```
0.0021034614 = product of:
  0.012620768 = sum of:
    0.012620768 = weight(_text_:in in 3802) [ClassicSimilarity], result of:
      0.012620768 = score(doc=3802,freq=4.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.21253976 = fieldWeight in 3802, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.078125 = fieldNorm(doc=3802)
  0.16666667 = coord(1/6)
```

Footnote

Review in: JASIST 63(2012) no.12, pp. 2557-2558 (Heidi Julien)

Theme

Semantisches Umfeld in Indexierung u. Retrieval
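
The two preceding results (doc 2682 with freq=16.0 and doc 3802 with freq=4.0) receive the same score of 0.0021034614 despite very different term frequencies. A short sketch of why, assuming the ClassicSimilarity convention that tf is the square root of the term frequency and that fieldNorm shrinks as the field gets longer; the helper name is hypothetical and the inputs are copied from the two explain trees.

```python
import math


def tf_times_norm(freq, field_norm):
    """Document-dependent part of fieldWeight: tf(freq) * fieldNorm.

    Hypothetical helper for illustration; idf, queryNorm and coord are
    identical for both documents, so they do not affect the comparison.
    """
    return math.sqrt(freq) * field_norm


long_doc = tf_times_norm(freq=16.0, field_norm=0.0390625)   # doc 2682
short_doc = tf_times_norm(freq=4.0, field_norm=0.078125)    # doc 3802
print(long_doc, short_doc)  # prints "0.15625 0.15625"
```

Four times as many occurrences only double tf, and the halved fieldNorm of the longer record cancels that gain exactly, so the two documents tie.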

Atanassova, I.; Bertin, M.: Semantic facets for scientific information retrieval (2014) 0.00
```
0.0020823204 = product of:
  0.012493922 = sum of:
    0.012493922 = weight(_text_:in in 4471) [ClassicSimilarity], result of:
      0.012493922 = score(doc=4471,freq=8.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.21040362 = fieldWeight in 4471, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4471)
  0.16666667 = coord(1/6)
```
Abstract

We present an Information Retrieval System for scientific publications that provides the possibility to filter results according to semantic facets. We use sentence-level semantic annotations that identify specific semantic relations in texts, such as methods, definitions, hypotheses, that correspond to common information needs related to scientific literature. The semantic annotations are obtained using a rule-based method that identifies linguistic clues organized into a linguistic ontology. The system is implemented using Solr Search Server and offers efficient search and navigation in scientific papers.

Series

Communications in computer and information science; vol.475

Theme

Semantisches Umfeld in Indexierung u. Retrieval

Moreira, W.; Martínez-Ávila, D.: Concept relationships in knowledge organization systems : elements for analysis and common research among fields (2018) 0.00
```
0.0020823204 = product of:
  0.012493922 = sum of:
    0.012493922 = weight(_text_:in in 5166) [ClassicSimilarity], result of:
      0.012493922 = score(doc=5166,freq=8.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.21040362 = fieldWeight in 5166, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5166)
  0.16666667 = coord(1/6)
```
Abstract

Knowledge organization systems have been studied in several fields and for different and complementary aspects. Among the aspects that concentrate common interests, in this article we highlight those related to the terminological and conceptual relationships among the components of any knowledge organization system. This research aims to contribute to the critical analysis of knowledge organization systems, especially ontologies, thesauri, and classification systems, by understanding their similarities and differences when dealing with concepts and their ways of relating to each other as well as to the conceptual design that is adopted.

Theme

Semantisches Umfeld in Indexierung u. Retrieval

Agarwal, N.K.: Exploring context in information behavior : seeker, situation, surroundings, and shared identities (2018) 0.00
```
0.0020609628 = product of:
  0.012365777 = sum of:
    0.012365777 = weight(_text_:in in 4992) [ClassicSimilarity], result of:
      0.012365777 = score(doc=4992,freq=24.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.2082456 = fieldWeight in 4992, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=4992)
  0.16666667 = coord(1/6)
```
Abstract

The field of human information behavior runs the gamut of processes from the realization of a need or gap in understanding, to the search for information from one or more sources to fill that gap, to the use of that information to complete a task at hand or to satisfy a curiosity, as well as other behaviors such as avoiding information or finding information serendipitously. Designers of mechanisms, tools, and computer-based systems to facilitate this seeking and search process often lack a full knowledge of the context surrounding the search. This context may vary depending on the job or role of the person; individual characteristics such as personality, domain knowledge, age, gender, perception of self, etc.; the task at hand; the source and the channel and their degree of accessibility and usability; and the relationship that the seeker shares with the source. Researchers, however, have yet to agree on what context really means. While there have been various research studies incorporating context, and biennial conferences on context in information behavior, there is still no clear definition of what context is, what its boundaries are, and what elements and variables comprise context. In this book, we look at the many definitions of and the theoretical and empirical studies on context, and I attempt to map the conceptual space of context in information behavior. I propose theoretical frameworks to map the boundaries, elements, and variables of context. I then discuss how to incorporate these frameworks and variables in the design of research studies on context. We then arrive at a unified definition of context. This book should provide designers of search systems a better understanding of context as they seek to meet the needs and demands of information seekers.
It will be an important resource for researchers in Library and Information Science, especially doctoral students looking for one resource that covers an exhaustive range of the most current literature related to context, the best selection of classics, and a synthesis of these into theoretical frameworks and a unified definition. The book should help to move forward research in the field by clarifying the elements, variables, and views that are pertinent. In particular, the list of elements to be considered, and the variables associated with each element will be extremely useful to researchers wanting to include the influences of context in their studies.

Footnote

Review in: JASIST 70(2019) no.3, pp.301-303 (Ina Fourie)

Theme

Semantisches Umfeld in Indexierung u. Retrieval

Selvaretnam, B.; Belkhatir, M.: A linguistically driven framework for query expansion via grammatical constituent highlighting and role-based concept weighting (2016) 0.00
```
0.0019955188 = product of:
  0.011973113 = sum of:
    0.011973113 = weight(_text_:in in 2876) [ClassicSimilarity], result of:
      0.011973113 = score(doc=2876,freq=10.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.20163295 = fieldWeight in 2876, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2876)
  0.16666667 = coord(1/6)
```
Abstract

In this paper, we propose a linguistically-motivated query expansion framework that recognizes and encodes significant query constituents characterizing query intent in order to improve retrieval performance. Concepts-of-Interest are recognized as the core concepts that represent the gist of the search goal whilst the remaining query constituents which serve to specify the search goal and complete the query structure are classified as descriptive, relational or structural. Acknowledging the need to form semantically-associated base pairs for the purpose of extracting related potential expansion concepts, an algorithm which capitalizes on syntactical dependencies to capture relationships between adjacent and non-adjacent query concepts is proposed. Lastly, a robust weighting scheme that duly emphasizes the importance of query constituents based on their linguistic role within the expanded query is presented. We demonstrate improvements in retrieval effectiveness in terms of increased mean average precision garnered by the proposed linguistic-based query expansion framework through experimentation on the TREC ad hoc test collections.

Theme

Semantisches Umfeld in Indexierung u. Retrieval
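
The score explanations listed with each record appear to follow Lucene's ClassicSimilarity (TF-IDF) scheme, where the final value is coord × queryWeight × fieldWeight. The sketch below recomputes the figure for doc 2876 from the numbers printed in its explanation. The function names are hypothetical, the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption that happens to reproduce the listed idf, and `queryNorm` is simply copied from the listing because it depends on the full query, which is not shown here.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  field_norm: float, query_norm: float,
                  matched_clauses: int, total_clauses: int) -> float:
    """Recompute a single-term score from the parts shown in an explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                     # tf(freq) = sqrt(termFreq)
    query_weight = idf * query_norm          # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm     # fieldWeight = tf * idf * fieldNorm
    coord = matched_clauses / total_clauses  # coord(1/6): 1 of 6 clauses matched
    return coord * query_weight * field_weight


# Values copied from the explanation for doc 2876 in the listing.
score = classic_score(
    freq=10.0,
    doc_freq=30841,
    max_docs=44218,
    field_norm=0.046875,
    query_norm=0.043654136,  # taken as given; not derivable from one clause
    matched_clauses=1,
    total_clauses=6,
)
print(f"{score:.10f}")
```

The result should land very close to the 0.0019955188 listed for that record; small differences in the last digits are expected because Lucene computes in 32-bit floats. Since idf and queryNorm are identical across all records on this page, the ranking differences come only from the term frequency and the field-length norm.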
