Search (15 results, page 1 of 1)

  • language_ss:"e"
  • type_ss:"x"
  • year_i:[2010 TO 2020}
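The relevance figure after each result title is a Lucene/Solr score; the engine's ranking model is ClassicSimilarity (TF-IDF). As a rough sketch of how such a score is assembled (the exact factors vary by Lucene version, so treat this as illustrative rather than authoritative):

$$\mathrm{score}(q,d)=\mathrm{coord}(q,d)\cdot\mathrm{queryNorm}(q)\cdot\sum_{t\in q}\sqrt{f_{t,d}}\cdot\mathrm{idf}(t)^{2}\cdot\mathrm{norm}(t,d),\qquad \mathrm{idf}(t)=1+\ln\frac{N}{n_t+1}$$

where $f_{t,d}$ is the frequency of term $t$ in document $d$, $N$ the number of documents in the index, $n_t$ the number of documents containing $t$, $\mathrm{coord}$ the fraction of query terms matched, and $\mathrm{norm}$ a field-length normalization.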
  1. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.39
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with the LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language- and domain-independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human-written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the alignment process from a training set and focuses on selecting high-quality multi-word terms from human-written summaries to generate suitable results for web-page summarization.
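    The extraction pipeline the abstract describes, an association "glue" measure combined with LocalMaxs selection, can be sketched in a few lines. This is a minimal, hedged reconstruction: the SCP glue and the simplified LocalMaxs test below are generic textbook variants, not the thesis's three new measures.

```python
# A minimal sketch, assuming an SCP-style "glue" and a simplified LocalMaxs
# test; the thesis's three new association measures are not reproduced here.
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_counts(tokens, max_n):
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts

def scp(gram, counts, total):
    # Symmetric Conditional Probability: p(gram)^2 over the average product
    # of the probabilities of its two halves, across all binary split points.
    p = counts[gram] / total
    splits = [(counts[gram[:i]] / total) * (counts[gram[i:]] / total)
              for i in range(1, len(gram))]
    avg = sum(splits) / len(splits)
    return p * p / avg if avg else 0.0

def local_maxs(tokens, max_n=4, min_freq=2):
    counts = build_counts(tokens, max_n + 1)
    total = len(tokens)
    glue = {g: scp(g, counts, total)
            for g, c in counts.items() if len(g) >= 2 and c >= min_freq}
    terms = []
    for g, s in glue.items():
        if len(g) > max_n:
            continue  # (max_n+1)-grams exist only to serve as "supers"
        supers = [v for h, v in glue.items()
                  if len(h) == len(g) + 1 and (h[:-1] == g or h[1:] == g)]
        subs = [glue.get(g[:-1], 0.0), glue.get(g[1:], 0.0)]
        # LocalMaxs: glue must dominate both shorter and longer neighbours.
        if (len(g) == 2 or s >= max(subs)) and all(s > v for v in supers):
            terms.append((g, s))
    return sorted(terms, key=lambda t: -t[1])
```

    Calling local_maxs(text.split()) on any tokenized corpus yields candidate multi-word terms whose glue is a local maximum relative to their sub- and super-n-grams.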
    Content
    A thesis presented to the University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10.01.2013 19:22:47
  2. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.33
    
    Abstract
    The successes of information retrieval (IR) in recent decades were built upon bag-of-words representations. Effective as it is, bag-of-words is only a shallow text understanding; there is a limited amount of information for document ranking in the word space. This dissertation goes beyond words and builds knowledge based text representations, which embed the external and carefully curated information from knowledge bases, and provide richer and structured evidence for more advanced information retrieval systems. This thesis research first builds query representations with entities associated with the query. Entities' descriptions are used by query expansion techniques that enrich the query with explanation terms. Then we present a general framework that represents a query with entities that appear in the query, are retrieved by the query, or frequently show up in the top retrieved documents. A latent space model is developed to jointly learn the connections from query to entities and the ranking of documents, modeling the external evidence from knowledge bases and internal ranking features cooperatively. To further improve the quality of relevant entities, a defining factor of our query representations, we introduce learning to rank to entity search and retrieve better entities from knowledge bases. In the document representation part, this thesis research also moves one step forward with a bag-of-entities model, in which documents are represented by their automatic entity annotations, and the ranking is performed in the entity space.
    This proposal includes plans to improve the quality of relevant entities with a co-learning framework that learns from both entity labels and document labels. We also plan to develop a hybrid ranking system that combines word-based and entity-based representations, with their uncertainties taken into account. Finally, we plan to enrich the text representations with connections between entities. We propose several ways to infer entity graph representations for texts and to rank documents using their structure representations. This dissertation overcomes the limitation of word-based representations with external and carefully curated information from knowledge bases. We believe this thesis research is a solid start towards a new generation of intelligent, semantic, and structured information retrieval.
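    A hedged sketch of the bag-of-entities idea described above: documents are represented by their automatic entity annotations and ranked in entity space. The coordinate-match scoring and the entity IDs below are illustrative inventions, not the dissertation's actual model.

```python
# A hedged sketch of ranking in entity space; the entity IDs and the
# coordinate-match scoring are illustrative, not the dissertation's model.
from collections import Counter

def bag_of_entities(annotations):
    # annotations: entity IDs emitted by an automatic entity linker
    return Counter(annotations)

def score(query_entities, doc_bag):
    # Sum of query-entity frequencies in the document's entity bag.
    return sum(doc_bag[e] for e in query_entities)

docs = {
    "d1": bag_of_entities(["Q11016", "Q816826", "Q11016", "Q11016"]),
    "d2": bag_of_entities(["Q2539", "Q11016"]),
}
query = ["Q11016", "Q2539"]
ranked = sorted(docs, key=lambda d: score(query, docs[d]), reverse=True)
print(ranked)  # ['d1', 'd2']: d1 scores 3, d2 scores 2
```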
    Content
    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Cf.: https://www.cs.cmu.edu/~cx/papers/knowledge_based_text_representation.pdf.
  3. Farazi, M.: Faceted lightweight ontologies : a formalization and some experiments (2010) 0.26
    
    Content
    PhD dissertation at the International Doctorate School in Information and Communication Technology. Cf.: https://core.ac.uk/download/pdf/150083013.pdf.
  4. Kara, S.: An ontology-based retrieval system using semantic indexing (2012) 0.01
    
    Abstract
    In this thesis, we present an ontology-based information extraction and retrieval system and its application to the soccer domain. In general, we deal with three issues in semantic search, namely usability, scalability and retrieval performance. We propose a keyword-based semantic retrieval approach. The performance of the system is improved considerably using domain-specific information extraction, inference and rules. Scalability is achieved by adapting a semantic indexing approach. The system is implemented using state-of-the-art technologies in the Semantic Web and its performance is evaluated against traditional systems as well as query expansion methods. Furthermore, a detailed evaluation is provided to observe the performance gain due to domain-specific information extraction and inference. Finally, we show how we use semantic indexing to solve simple structural ambiguities.
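    A minimal sketch of the keyword-to-concept indexing flow the abstract outlines, assuming a toy lexicon and subclass hierarchy in place of the thesis's soccer ontology; "inference" here is just superclass closure, far simpler than the thesis's rule-based inference.

```python
# A minimal sketch, assuming a toy lexicon and subclass hierarchy in place
# of the thesis's soccer ontology; "inference" here is superclass closure.
lexicon = {"goalkeeper": "Goalkeeper", "keeper": "Goalkeeper", "match": "Match"}
subclass_of = {"Goalkeeper": "Player", "Player": "Agent"}

def concepts_for(keyword):
    concept, out = lexicon.get(keyword.lower()), []
    while concept:                      # walk up the class hierarchy
        out.append(concept)
        concept = subclass_of.get(concept)
    return out

index = {}                              # concept -> set of document ids
def index_doc(doc_id, text):
    for token in text.split():
        for concept in concepts_for(token):
            index.setdefault(concept, set()).add(doc_id)

index_doc("d1", "the keeper saved the match")
print(concepts_for("goalkeeper"))       # ['Goalkeeper', 'Player', 'Agent']
print(index["Player"])                  # {'d1'}: inferred, not literal
```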
  5. Líska, M.: Evaluation of mathematics retrieval (2013) 0.00
    
    Abstract
    The thesis deals with the evaluation of mathematics information retrieval (IR). It gives an overview of the history of regular IR evaluation, of the initiatives engaged in this field of research, and of the most common methods and measures used for evaluation. The findings are applied to the specifics of mathematics retrieval. The thesis also summarizes the state of the art of the MIaS math search system, which is already being used in an international web portal, and describes the latest developments aiming towards the second version of the system. In addition to participating in the international evaluation conference and workshop, this work tests MIaS for effectiveness and efficiency. The measured performance indicators are evaluated and future work is suggested accordingly.
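    Two of the standard effectiveness measures such an evaluation relies on, precision at k and average precision, are easy to state in code; the toy ranking and relevance judgments below are invented for illustration.

```python
# Toy illustration of two standard IR effectiveness measures; the ranking
# and relevance judgments are invented.
def precision_at_k(ranked, relevant, k):
    return sum(1 for d in ranked[:k] if d in relevant) / k

def average_precision(ranked, relevant):
    score, hits = 0.0, 0
    for i, d in enumerate(ranked, start=1):
        if d in relevant:
            hits += 1
            score += hits / i          # precision at each relevant rank
    return score / len(relevant) if relevant else 0.0

ranked = ["m4", "m1", "m7", "m2"]      # system output, best first
relevant = {"m1", "m2"}                # ground-truth judgments
print(precision_at_k(ranked, relevant, 2))   # 0.5
print(average_precision(ranked, relevant))   # (1/2 + 2/4) / 2 = 0.5
```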
  6. Kiren, T.: A clustering based indexing technique of modularized ontologies for information retrieval (2017) 0.00
    
    Abstract
    Indexing plays a vital role in information retrieval. With the availability of huge volumes of information, it has become necessary to index the information in such a way that end users can find the information they want efficiently and accurately. Keyword-based indexing uses words as indexing terms. It is not capable of capturing the implicit relations among terms or the semantics of the words in the document. To eliminate this limitation, ontology-based indexing came into existence, which allows semantic indexing that can resolve complex and indirect user queries. Ontologies are used for document indexing, which allows semantics-based information retrieval. Either existing ontologies or ones constructed from scratch are presently used for indexing. Constructing ontologies from scratch is a labor-intensive task and requires extensive domain knowledge, whereas use of an existing ontology may leave some important concepts in documents un-annotated. Using multiple ontologies can overcome the problem of missing concepts to a great extent, but multiple ontologies are difficult to manage (their developers change them over time), and ontology heterogeneity arises because the ontologies are constructed by different developers. One possible solution to both managing multiple ontologies and avoiding construction from scratch is to use modular ontologies for indexing.
    Modular ontologies are built by combining modules from multiple relevant ontologies. Ontology heterogeneity also arises during modular ontology construction, because multiple ontologies are being dealt with during this process, so the ontologies need to be aligned before they are used for modular ontology construction. The existing approaches for ontology alignment compare all the concepts of each ontology to be aligned and are hence not optimized in terms of time and search-space utilization. A new indexing technique based on modular ontologies is proposed, together with an efficient ontology alignment technique that solves the heterogeneity problem during the construction of the modular ontology. Results are satisfactory, as precision and recall are improved by 8% and 10%, respectively. The values of Pearson's correlation coefficient for degree of similarity, time, search-space requirement, precision and recall are close to 1, which shows that the results are significant. Further research can apply the modular-ontology-based indexing technique to multimedia information retrieval and biomedical information retrieval.
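    The "compare all the concepts of each ontology" baseline that the thesis improves on can be sketched as an all-pairs label match; the Jaccard token similarity and the threshold are my assumptions, chosen only to make the quadratic search space visible.

```python
# The naive all-pairs baseline the abstract criticizes, under assumed
# Jaccard token similarity and an arbitrary threshold.
def jaccard(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def align(onto_a, onto_b, threshold=0.5):
    # O(|A| x |B|): every concept label is compared with every other one;
    # this is the time/search-space cost the thesis's technique reduces.
    return [(ca, cb, round(jaccard(ca, cb), 2))
            for ca in onto_a for cb in onto_b
            if jaccard(ca, cb) >= threshold]

A = ["information retrieval", "index term"]
B = ["retrieval of information", "term"]
print(align(A, B))
# [('information retrieval', 'retrieval of information', 0.67),
#  ('index term', 'term', 0.5)]
```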
    Date
    20.01.2015 18:30:22
  7. Ziemba, L.: Information retrieval with concept discovery in digital collections for agriculture and natural resources (2011) 0.00
    
    Abstract
    The amount and complexity of information available in digital form is already huge, and new information is being produced every day. Retrieving information relevant to a particular need becomes a significant issue. This work utilizes knowledge organization systems (KOS), such as thesauri and ontologies, and applies information extraction (IE) and computational linguistics (CL) techniques to organize, manage and retrieve information stored in digital collections in the agricultural domain. Two real-world applications of the approach have been developed and are available and actively used by the public. An ontology is used to manage the Water Conservation Digital Library, holding a dynamic collection of various types of digital resources in the domain of urban water conservation in Florida, USA. The ontology-based back-end powers a fully operational web interface, available at http://library.conservefloridawater.org. The system has demonstrated numerous benefits of the ontology application, including accurate retrieval of resources and information sharing and reuse, and has proved to effectively facilitate information management. The major difficulty encountered with the approach is that the large and dynamic number of concepts makes it difficult to keep the ontology consistent and to accurately catalog resources manually. To address these issues, a combination of IE and CL techniques, such as the Vector Space Model and probabilistic parsing, together with the Agricultural Thesaurus, was adapted to automatically extract the concepts important for each of the texts in the Best Management Practices (BMP) Publication Library, a collection of documents in the domain of agricultural BMPs in Florida available at http://lyra.ifas.ufl.edu/LIB. A new approach to domain-specific concept discovery using an Internet search engine was developed. Initial evaluation of the results indicates significant improvement in the precision of information extraction. The approach presented in this work focuses on problems unique to the agriculture and natural resources domain, such as domain-specific concepts and vocabularies, but should be applicable to any collection of texts in digital format. It may be of interest to anyone who needs to effectively manage a collection of digital resources.
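    A hedged sketch of the Vector Space Model step mentioned above: thesaurus concepts are ranked against a document by cosine similarity of term vectors. The mini-thesaurus and the raw term-frequency weighting are illustrative simplifications.

```python
# A hedged sketch: thesaurus concepts ranked against a document by cosine
# similarity of raw term-frequency vectors; the mini-thesaurus is invented.
import math
from collections import Counter

def vec(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

thesaurus = ["water conservation", "irrigation scheduling", "soil erosion"]
doc = "drip irrigation scheduling reduces water use on sandy soil"

ranked = sorted(thesaurus, key=lambda c: cosine(vec(doc), vec(c)), reverse=True)
print(ranked[0])  # 'irrigation scheduling' (two shared terms)
```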
  8. Schmolz, H.: Anaphora resolution and text retrieval : a linguistic analysis of hypertexts (2015) 0.00
    
    RSWK
    Englisch / Anapher <Syntax> / Hypertext / Information Retrieval / Korpus <Linguistik>
    Subject
    Englisch / Anapher <Syntax> / Hypertext / Information Retrieval / Korpus <Linguistik>
  9. Geisriegler, E.: Enriching electronic texts with semantic metadata : a use case for the historical Newspaper Collection ANNO (Austrian Newspapers Online) of the Austrian National Library (2012) 0.00
    
    Date
    03.02.2013 18:00:22
  10. Schmolz, H.: Anaphora resolution and text retrieval : a linguistic analysis of hypertexts (2013) 0.00
    
    Content
    Winner of the 2014 VFI Dissertation Prize: "A convincing and thorough linguistic and quantitative analysis of a text element that has so far received little attention in information retrieval, based on a large, purpose-built hypertext corpus, including the evaluation of the author's own resolution rules for use in future IR systems."
  11. Vocht, L. De: Exploring semantic relationships in the Web of Data : Semantische relaties verkennen in data op het web (2017) 0.00
    
    Abstract
    This PhD thesis describes how to effectively explore linked data on the Web. The main focus is on scenarios where users want to discover relationships between resources rather than find out more about something specific. Searching for a specific document or piece of information fits the theoretical framework of information retrieval; exploring beyond it is associated with exploratory search. Exploratory search goes beyond "looking something up" when users seek deeper understanding, further investigation or navigation of the initial search results. The ideas behind exploratory search and querying linked data merge when it comes to the way knowledge is represented and indexed by machines: how data is structured and stored for optimal searchability. Queries and information should be aligned so that searches also reveal connections between results. This implies that they take into account the same semantic entities, relevant at that moment. To realize this, we research three techniques that are evaluated one by one in an experimental set-up to assess how well they succeed in their goals. In the end, the techniques are applied to a practical use case that focuses on forming a bridge between the Web and the use of digital libraries in scientific research. Our first technique focuses on the interactive visualization of search results. Linked-data resources can be brought in relation with each other at will, which leads to complex and diverse graph structures. Our technique facilitates navigation and supports a workflow that starts from a broad overview of the data, allows narrowing down to the desired level of detail, and then broadens again. To validate the flow, two visualizations were implemented and presented to test users. The users judged the usability of the visualizations, how the visualizations fit in the workflow and to what degree their features seemed useful for the exploration of linked data.
    When we speak about finding relationships between resources, it is necessary to dive deeper into the structure. The graph structure of linked data, where semantics give meaning to the relationships between resources, enables the execution of pathfinding algorithms. The assigned weights and heuristics are base components of such algorithms and ultimately define which resources (and in which order) are included in a path. These paths explain indirect connections between resources. Our third technique proposes an algorithm that optimizes the choice of resources in terms of serendipity. Some optimizations guard the consistency of candidate paths: the coherence of consecutive connections is maximized to avoid trivial or overly arbitrary paths. The implementation uses the A* algorithm, the de facto reference for heuristically optimized minimal-cost paths. The effectiveness of paths was measured with common automatic metrics and with surveys in which users could indicate their preference among paths generated in different ways. Finally, all our techniques are applied to a use case about publications in digital libraries, where they are aligned with information about scientific conferences and researchers. The application to this use case is a practical example because the different aspects of exploratory search come together; in fact, the techniques also evolved from the experience of implementing the use case. Practical details about the semantic model are explained and the implementation of the search system is clarified module by module. The evaluation positions the result, a prototype tool for exploring scientific publications, researchers and conferences, against some important alternatives.
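    A minimal sketch of the pathfinding core described above, using A* over a toy linked-data graph. The graph, edge costs, and the trivial zero heuristic (which reduces A* to Dijkstra) are assumptions; the thesis's actual weights and serendipity heuristics are what make its paths interesting.

```python
# A minimal sketch of A* over a toy linked-data graph; edge costs and the
# zero heuristic (which reduces A* to Dijkstra) are assumptions.
import heapq

def a_star(graph, start, goal, h):
    # graph: node -> list of (neighbour, edge_cost); h estimates cost-to-go
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if node in seen:
            continue
        seen.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(frontier,
                               (g + cost + h(nxt), g + cost, nxt, path + [nxt]))
    return None, float("inf")

# Lower edge cost stands in for a semantically "tighter" relation.
graph = {
    "dbr:Paper_A": [("dbr:Author_X", 1), ("dbr:Conf_B", 1)],
    "dbr:Author_X": [("dbr:Conf_B", 1)],
    "dbr:Conf_B": [("dbr:Paper_C", 1)],
}
path, cost = a_star(graph, "dbr:Paper_A", "dbr:Paper_C", h=lambda n: 0)
print(path, cost)  # ['dbr:Paper_A', 'dbr:Conf_B', 'dbr:Paper_C'] 2
```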
  12. Thornton, K.: Powerful structure : inspecting infrastructures of information organization in Wikimedia Foundation projects (2016) 0.00
    
    Abstract
    This dissertation investigates the social and technological factors of collaboratively organizing information in commons-based peer production systems. To do so, it analyzes the diverse strategies that members of Wikimedia Foundation (WMF) project communities use to organize information. Key findings from this dissertation show that conceptual structures of information organization are encoded into the infrastructure of WMF projects. The fact that WMF projects are commons-based peer production systems means that we can inspect the code that enables these systems, but a specific type of technical literacy is required to do so. I use three methods in this dissertation: a qualitative content analysis of the discussions surrounding the design, implementation and evaluation of the category system; a quantitative analysis, using descriptive statistics, of editing patterns among editors who contributed to the code of infobox templates; and a close reading of the infrastructure used to create the category system, the infobox templates, and the knowledge base of structured data.
  13. Castellanos Ardila, J.P.: Investigation of an OSLC-domain targeting ISO 26262 : focus on the left side of the software V-model (2016) 0.00
    
    Abstract
    Industries have adopted a standardized set of practices for developing their products. In the automotive domain, the provision of safety-compliant systems is guided by ISO 26262, a standard that specifies a set of requirements and recommendations for developing automotive safety-critical systems. To comply with ISO 26262, the safety lifecycle proposed by the standard must be included in the development process of a vehicle, and a safety case that shows that the system is acceptably safe has to be provided. The provision of a safety case implies the execution of a precise documentation process, which makes sure that the work products are available and traceable. Further, documentation management is defined in the standard as a mandatory activity, and guidelines are prescribed for its elaboration. It is worth pointing out that a well-documented safety lifecycle will provide the necessary inputs for the generation of an ISO 26262-compliant safety case. The OSLC (Open Services for Lifecycle Collaboration) standard and the maturing stack of Semantic Web technologies represent a promising integration platform for enabling semantic interoperability between the tools involved in the safety lifecycle. Tools for requirements, architecture and development management, among others, are expected to interact and share data with the help of domain specifications created in OSLC. This thesis proposes the creation of an OSLC tool-chain infrastructure for sharing safety-related information, in which fragments of safety information can be generated. The steps carried out during the elaboration of this master thesis consist of the identification, representation and shaping of the RDF resources needed for the creation of a safety case. The focus of the thesis is limited to a small portion of the left-hand side of the ISO 26262 V-model, namely part 6, clause 8 of the standard: software unit design and implementation. Although only a restricted portion of the standard is used, the findings can be extended to other parts and the conclusions can be generalized. This master thesis is considered one of the first steps towards the provision of an OSLC-based, ISO 26262-compliant methodological approach for representing and shaping the work products that result from the execution of the safety lifecycle, documentation required for the formation of an ISO-compliant safety case.
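    A hedged sketch, using the rdflib library, of what "identification, representation, and shaping of RDF resources" for a part 6, clause 8 work product could look like. The ex: vocabulary and all resource URIs are invented for illustration; a real tool chain would conform to published OSLC resource shapes.

```python
# A hedged sketch using rdflib; the ex: vocabulary and URIs are invented,
# and a real tool chain would conform to published OSLC resource shapes.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

OSLC = Namespace("http://open-services.net/ns/core#")   # real OSLC core ns
EX = Namespace("http://example.org/safety#")            # hypothetical terms

g = Graph()
g.bind("oslc", OSLC)
g.bind("ex", EX)
g.bind("dcterms", DCTERMS)

unit = URIRef("http://example.org/units/brake_ctrl_01")
g.add((unit, RDF.type, EX.SoftwareUnitDesign))          # part 6, clause 8
g.add((unit, DCTERMS.title, Literal("Brake controller unit design")))
g.add((unit, EX.asil, Literal("D")))                    # integrity level
g.add((unit, EX.satisfies, URIRef("http://example.org/reqs/REQ-042")))

print(g.serialize(format="turtle"))
```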
  14. Sebastian, Y.: Literature-based discovery by learning heterogeneous bibliographic information networks (2017) 0.00
    
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  15. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.00
    
    Abstract
    Multiword expressions (MWEs) are lexical items that can be decomposed into single words and display lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasy (Sag et al., 2002; Kim, 2008; Calzolari et al., 2002). The proper treatment of multiword expressions such as rock 'n' roll and make a decision is essential for many natural language processing (NLP) applications like information extraction and retrieval, terminology extraction and machine translation, and it is important to identify multiword expressions in context. For example, in machine translation we must know that MWEs form one semantic unit, hence their parts should not be translated separately; for this, multiword expressions should first be identified in the text to be translated. The chief aim of this thesis is to develop machine learning-based approaches for the automatic detection of different types of multiword expressions in English and Hungarian natural language texts. In our investigations, we pay attention to the characteristics of different types of multiword expressions, such as nominal compounds, multiword named entities and light verb constructions, and we apply novel methods to identify MWEs in raw texts. The thesis demonstrates that nominal compounds and multiword named entities may require a similar approach for their automatic detection, as they behave in the same way from a linguistic point of view. Furthermore, it shows that the automatic detection of light verb constructions can be carried out using two effective machine learning-based approaches.
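    As a contrast to the machine-learned models the thesis develops, a simple association-based baseline for two-word MWE candidates (like the "rock 'n' roll" example above) can be written with pointwise mutual information; the toy corpus and the frequency threshold are illustrative assumptions.

```python
# A PMI-based baseline for two-word MWE candidates; the corpus and the
# frequency threshold (which counters PMI's low-frequency bias) are toys.
import math
from collections import Counter

tokens = "rock n roll is here to stay rock n roll will never die".split()
uni = Counter(tokens)
bi = Counter(zip(tokens, tokens[1:]))
N = len(tokens)

def pmi(w1, w2):
    p12 = bi[(w1, w2)] / (N - 1)
    return math.log2(p12 / ((uni[w1] / N) * (uni[w2] / N)))

ranked = sorted((b for b in bi if bi[b] >= 2), key=lambda b: pmi(*b),
                reverse=True)
print(ranked)  # [('rock', 'n'), ('n', 'roll')]: the recurring compound
```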