Search (25 results, page 1 of 2)

Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.19

0.18843201 = product of:
  0.37686402 = sum of:
    0.18322212 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
      0.18322212 = score(doc=563,freq=2.0), product of:
        0.32600754 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.038453303 = queryNorm
        0.56201804 = fieldWeight in 563, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=563)
    0.18322212 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
      0.18322212 = score(doc=563,freq=2.0), product of:
        0.32600754 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.038453303 = queryNorm
        0.56201804 = fieldWeight in 563, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=563)
    0.010419784 = product of:
      0.03125935 = sum of:
        0.03125935 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
          0.03125935 = score(doc=563,freq=2.0), product of:
            0.13465692 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.038453303 = queryNorm
            0.23214069 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)
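
The arithmetic in the explain tree above can be re-derived by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names `idf` and `term_score` are illustrative and not part of any Lucene API. The inputs (freq, docFreq, maxDocs, queryNorm, fieldNorm) are copied from the tree for doc 563.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as laid out in the explain tree
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values taken from the explain output for doc 563
MAX_DOCS = 44218
QUERY_NORM = 0.038453303
FIELD_NORM = 0.046875

score_2f = term_score(2.0, 24, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "2f" clause matches twice; the "22" clause sits in a nested
# query with coord(1/3); the outer query applies coord(3/6).
total = 0.5 * (score_2f + score_2f + score_22 / 3)
print(round(total, 6))
```

Lucene computes these values in 32-bit floats, so the result of this double-precision sketch may differ from the listed score in the last digits.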

Content: A thesis presented to the University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. See: http://www.inf.ufrgs.br%2F~ceramisch%2Fdownload_files%2Fpublications%2F2009%2Fp01.pdf.
Date: 10. 1.2013 19:22:47

Rettinger, A.; Schumilin, A.; Thoma, S.; Ell, B.: Learning a cross-lingual semantic representation of relations expressed in text (2015) 0.01

0.0061715 = product of:
  0.037028998 = sum of:
    0.037028998 = weight(_text_:internet in 2027) [ClassicSimilarity], result of:
      0.037028998 = score(doc=2027,freq=2.0), product of:
        0.11352337 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.038453303 = queryNorm
        0.3261795 = fieldWeight in 2027, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.078125 = fieldNorm(doc=2027)
  0.16666667 = coord(1/6)

Series: Information Systems and Applications, incl. Internet/Web, and HCI; vol. 9088

Gencosman, B.C.; Ozmutlu, H.C.; Ozmutlu, S.: Character n-gram application for automatic new topic identification (2014) 0.00
```
0.0043639094 = product of:
  0.026183454 = sum of:
    0.026183454 = weight(_text_:internet in 2688) [ClassicSimilarity], result of:
      0.026183454 = score(doc=2688,freq=4.0), product of:
        0.11352337 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.038453303 = queryNorm
        0.23064373 = fieldWeight in 2688, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2688)
  0.16666667 = coord(1/6)
```
Abstract

The widespread availability of the Internet and the variety of Internet-based applications have resulted in a significant increase in the number of web pages. Determining the behaviors of search engine users has become a critical step in enhancing search engine performance. Search engine user behaviors can be determined by content-based or content-ignorant algorithms. Although many content-ignorant studies have been performed to automatically identify new topics, previous results have demonstrated that spelling errors can cause significant errors in topic shift estimates. In this study, we focused on minimizing the number of wrong estimates that were based on spelling errors. We developed a new hybrid algorithm combining character n-gram and neural network methodologies, and compared the experimental results with results from previous studies. For the FAST and Excite datasets, the proposed algorithm improved topic shift estimates by 6.987% and 2.639%, respectively. Moreover, we analyzed the performance of the character n-gram method from several angles, including a comparison with the Levenshtein edit-distance method. The experimental results demonstrated that the character n-gram method outperformed the Levenshtein edit-distance method in terms of topic identification.

Kajanan, S.; Bao, Y.; Datta, A.; VanderMeer, D.; Dutta, K.: Efficient automatic search query formulation using phrase-level analysis (2014) 0.00
```
0.0034911274 = product of:
  0.020946763 = sum of:
    0.020946763 = weight(_text_:internet in 1264) [ClassicSimilarity], result of:
      0.020946763 = score(doc=1264,freq=4.0), product of:
        0.11352337 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.038453303 = queryNorm
        0.18451498 = fieldWeight in 1264, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03125 = fieldNorm(doc=1264)
  0.16666667 = coord(1/6)
```
Abstract

Over the past decade, the volume of information available digitally over the Internet has grown enormously. Technical developments in the area of search, such as Google's PageRank algorithm, have proved so good at serving relevant results that Internet search has become integrated into daily human activity. One can endlessly explore topics of interest simply by querying and reading through the resulting links. Yet, although search engines are well known for providing relevant results based on users' queries, users do not always receive the results they are looking for. Google's Director of Research describes clickstream evidence of frustrated users repeatedly reformulating queries and searching through page after page of results. Given the general quality of search engine results, one must consider the possibility that the frustrated user's query is not effective; that is, it does not describe the essence of the user's interest. Indeed, extensive research into human search behavior has found that humans are not very effective at formulating good search queries that describe what they are interested in. Ideally, the user should simply point to a portion of text that sparked the user's interest, and a system should automatically formulate a search query that captures the essence of the text. In this paper, we describe an implemented system that provides this capability. We first describe how our work differs from existing work in automatic query formulation, and propose a new method for improved quantification of the relevance of candidate search terms drawn from input text using phrase-level analysis. We then propose an implementable method designed to provide relevant queries based on a user's text input. We demonstrate the quality of our results and performance of our system through experimental studies.
Our results demonstrate that our system produces relevant search terms with roughly two-thirds precision and recall compared to search terms selected by experts, and that typical users find significantly more relevant results (31% more relevant) more quickly (64% faster) using our system than self-formulated search queries. Further, we show that our implementation can scale to request loads of up to 10 requests per second within current online responsiveness expectations (<2-second response times at the highest loads tested).

Luo, Z.; Yu, Y.; Osborne, M.; Wang, T.: Structuring tweets for improving Twitter search (2015) 0.00

0.00308575 = product of:
  0.018514499 = sum of:
    0.018514499 = weight(_text_:internet in 2335) [ClassicSimilarity], result of:
      0.018514499 = score(doc=2335,freq=2.0), product of:
        0.11352337 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.038453303 = queryNorm
        0.16308975 = fieldWeight in 2335, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2335)
  0.16666667 = coord(1/6)

Theme: Internet

Renker, L.: Exploration von Textkorpora : Topic Models als Grundlage der Interaktion (2015) 0.00
```
0.00308575 = product of:
  0.018514499 = sum of:
    0.018514499 = weight(_text_:internet in 2380) [ClassicSimilarity], result of:
      0.018514499 = score(doc=2380,freq=2.0), product of:
        0.11352337 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.038453303 = queryNorm
        0.16308975 = fieldWeight in 2380, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2380)
  0.16666667 = coord(1/6)
```
Abstract

The Internet holds a seemingly endless amount of information. A central problem today is making it accessible. Formulating the right queries in a full-text search requires solid domain knowledge. This knowledge is often lacking, so a great deal of time must be spent to gain an overview of the topic at hand. In such situations, users find themselves in an exploratory search process in which they must approach a topic step by step. Machine learning methods are now routinely used to organize data. In most cases, however, they remain invisible to the user. Using them interactively in exploratory search processes could link human judgment more closely with the machine processing of large volumes of data. Topic models are one such method. They find hidden topics in a text corpus that humans can interpret relatively well, which makes them promising for use in exploratory search processes. They can help users understand unfamiliar sources. A review of the relevant research showed that topic models are used predominantly to generate static visualizations. Sensemaking is an essential component of exploratory search, yet it is used only to a very limited extent to justify algorithmic innovations and to place them in a broader context. This leads to the hypothesis that using models of sensemaking and designing exploratory search in a user-centered way can yield new functions for interacting with topic models and provide a context for related research.

Rozinajová, V.; Macko, P.: Using natural language to search linked data (2017) 0.00

0.00308575 = product of:
  0.018514499 = sum of:
    0.018514499 = weight(_text_:internet in 3488) [ClassicSimilarity], result of:
      0.018514499 = score(doc=3488,freq=2.0), product of:
        0.11352337 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.038453303 = queryNorm
        0.16308975 = fieldWeight in 3488, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3488)
  0.16666667 = coord(1/6)

Series: Information Systems and Applications, incl. Internet/Web, and HCI; 10151

Die Bibel als Stilkompass (2019) 0.00
```
0.00308575 = product of:
  0.018514499 = sum of:
    0.018514499 = weight(_text_:internet in 5331) [ClassicSimilarity], result of:
      0.018514499 = score(doc=5331,freq=2.0), product of:
        0.11352337 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.038453303 = queryNorm
        0.16308975 = fieldWeight in 5331, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5331)
  0.16666667 = coord(1/6)
```
Content

"The Holy Scriptures exist not only in several hundred languages, but often in several variants within a single language area. British readers can choose, among others, between the deliberately very simply written Bible in Basic English and the linguistically complex King James Version from the 17th century. The versions differ in sentence length, word choice, and formality, and thus appeal to people from different cultures and with different levels of education. A team led by Keith Carlson of Dartmouth College now wants to use the 34 English-language versions of the Bible to teach computers different styles. So far, such programs do translate foreign languages, in some cases with impressive accuracy. But they often fail when asked to change the style of a text in a targeted way, especially when more than a single feature, such as complexity, is involved. With its roughly 31,000 verses, the Bible is better suited than any other work for training translation programs, argues Carlson's team. After all, every version has been translated very conscientiously by humans and, moreover, numbered verse by verse. This makes alignment easier for a machine and is not necessarily the case for other extensive written sources such as the works of William Shakespeare or Wikipedia. As a first demonstration, the researchers trained two algorithms, one of which was based on neural networks, on eight Bible versions freely available on the Internet. They then tested how well the two programs rendered verses from the source texts in a desired style without the software having access to the targeted version of the Bible. Overall, the automatic translators have already come quite close to the goal, the researchers report. However, they see their work only as a starting point in the development of an artificial intelligence that can switch confidently between different language styles."

Snajder, J.: Distributional semantics of multi-word expressions (2013) 0.00

0.0029206579 = product of:
  0.017523946 = sum of:
    0.017523946 = product of:
      0.052571837 = sum of:
        0.052571837 = weight(_text_:29 in 2868) [ClassicSimilarity], result of:
          0.052571837 = score(doc=2868,freq=2.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.38865322 = fieldWeight in 2868, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.078125 = fieldNorm(doc=2868)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 29. 4.2016 12:04:50

Engerer, V.: Informationswissenschaft und Linguistik : kurze Geschichte eines fruchtbaren interdisziplinären Verhältnisses in drei Akten (2012) 0.00

0.0029206579 = product of:
  0.017523946 = sum of:
    0.017523946 = product of:
      0.052571837 = sum of:
        0.052571837 = weight(_text_:29 in 3376) [ClassicSimilarity], result of:
          0.052571837 = score(doc=3376,freq=2.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.38865322 = fieldWeight in 3376, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.078125 = fieldNorm(doc=3376)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 19. 2.2017 13:29:08

Clark, M.; Kim, Y.; Kruschwitz, U.; Song, D.; Albakour, D.; Dignum, S.; Beresi, U.C.; Fasli, M.; Roeck, A De: Automatically structuring domain knowledge from text : an overview of current research (2012) 0.00

0.0024782603 = product of:
  0.014869561 = sum of:
    0.014869561 = product of:
      0.044608682 = sum of:
        0.044608682 = weight(_text_:29 in 2738) [ClassicSimilarity], result of:
          0.044608682 = score(doc=2738,freq=4.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.3297832 = fieldWeight in 2738, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=2738)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 29. 1.2016 18:29:51

Belbachir, F.; Boughanem, M.: Using language models to improve opinion detection (2018) 0.00
```
0.0024685997 = product of:
  0.014811598 = sum of:
    0.014811598 = weight(_text_:internet in 5044) [ClassicSimilarity], result of:
      0.014811598 = score(doc=5044,freq=2.0), product of:
        0.11352337 = queryWeight, product of:
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.038453303 = queryNorm
        0.1304718 = fieldWeight in 5044, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.9522398 = idf(docFreq=6276, maxDocs=44218)
          0.03125 = fieldNorm(doc=5044)
  0.16666667 = coord(1/6)
```
Abstract

Opinion mining is one of the most important research tasks in the information retrieval research community. With the huge volume of opinionated data available on the Web, approaches must be developed to differentiate opinion from fact. In this paper, we present a lexicon-based approach for opinion retrieval. Generally, opinion retrieval consists of two stages: relevance to the query and opinion detection. In our work, we focus on the second stage, which focuses on detecting opinionated documents. We compare the document to be analyzed with opinionated sources that contain subjective information. We hypothesize that a document with a strong similarity to opinionated sources is more likely to be opinionated itself. Typical lexicon-based approaches treat and choose their opinion sources according to their test collection, then calculate the opinion score based on the frequency of subjective terms in the document. In our work, we use different open opinion collections without any specific treatment and consider them as a reference collection. We then use language models to determine opinion scores. The analysis document and reference collection are represented by different language models (i.e., Dirichlet, Jelinek-Mercer and two-stage models). These language models are generally used in information retrieval to represent the relationship between documents and queries. However, in our study, we modify these language models to represent opinionated documents. We carry out several experiments using Text REtrieval Conference (TREC) Blogs 06 as our analysis collection and Internet Movie Database (IMDb), Multi-Perspective Question Answering (MPQA) and CHESLY as our reference collection. To improve opinion detection, we study the impact of using different language models to represent the document and reference collection alongside different combinations of opinion and retrieval scores. We then use this data to deduce the best opinion detection models. Using the best models, our approach improves on the best baseline of TREC Blog (baseline4) by 30%.

Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.00

0.0023155077 = product of:
  0.0138930455 = sum of:
    0.0138930455 = product of:
      0.041679136 = sum of:
        0.041679136 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
          0.041679136 = score(doc=1490,freq=2.0), product of:
            0.13465692 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.038453303 = queryNorm
            0.30952093 = fieldWeight in 1490, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1490)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 22. 3.2015 9:30:24

Stoykova, V.; Petkova, E.: Automatic extraction of mathematical terms for precalculus (2012) 0.00

0.0020444603 = product of:
  0.012266762 = sum of:
    0.012266762 = product of:
      0.036800284 = sum of:
        0.036800284 = weight(_text_:29 in 156) [ClassicSimilarity], result of:
          0.036800284 = score(doc=156,freq=2.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.27205724 = fieldWeight in 156, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 29. 5.2012 10:17:08

Rayson, P.; Piao, S.; Sharoff, S.; Evert, S.; Moiron, B.V.: Multiword expressions : hard going or plain sailing? (2015) 0.00

0.0020444603 = product of:
  0.012266762 = sum of:
    0.012266762 = product of:
      0.036800284 = sum of:
        0.036800284 = weight(_text_:29 in 2918) [ClassicSimilarity], result of:
          0.036800284 = score(doc=2918,freq=2.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.27205724 = fieldWeight in 2918, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2918)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 29. 4.2016 12:05:56

Babik, W.: Keywords as linguistic tools in information and knowledge organization (2017) 0.00
```
0.0020444603 = product of:
  0.012266762 = sum of:
    0.012266762 = product of:
      0.036800284 = sum of:
        0.036800284 = weight(_text_:29 in 3510) [ClassicSimilarity], result of:
          0.036800284 = score(doc=3510,freq=2.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.27205724 = fieldWeight in 3510, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3510)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)
```
Source

Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Ed. by W. Babik, H.P. Ohly and K. Weber

Schöneberg, U.; Sperber, W.: POS tagging and its applications for mathematics (2014) 0.00

0.0017523945 = product of:
  0.010514366 = sum of:
    0.010514366 = product of:
      0.0315431 = sum of:
        0.0315431 = weight(_text_:29 in 1748) [ClassicSimilarity], result of:
          0.0315431 = score(doc=1748,freq=2.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.23319192 = fieldWeight in 1748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=1748)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 29. 3.2015 19:34:37

Snajder, J.; Almic, P.: Modeling semantic compositionality of Croatian multiword expressions (2015) 0.00

0.0017523945 = product of:
  0.010514366 = sum of:
    0.010514366 = product of:
      0.0315431 = sum of:
        0.0315431 = weight(_text_:29 in 2920) [ClassicSimilarity], result of:
          0.0315431 = score(doc=2920,freq=2.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.23319192 = fieldWeight in 2920, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=2920)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 29. 4.2016 12:42:17

Geißler, S.: Maschinelles Lernen und NLP : Reif für die industrielle Anwendung! (2019) 0.00

0.0017523945 = product of:
  0.010514366 = sum of:
    0.010514366 = product of:
      0.0315431 = sum of:
        0.0315431 = weight(_text_:29 in 3547) [ClassicSimilarity], result of:
          0.0315431 = score(doc=3547,freq=2.0), product of:
            0.13526669 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.038453303 = queryNorm
            0.23319192 = fieldWeight in 3547, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=3547)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)

Date: 2. 9.2019 19:29:24

Lawrie, D.; Mayfield, J.; McNamee, P.; Oard, P.W.: Cross-language person-entity linking from 20 languages (2015) 0.00
```
0.0017366307 = product of:
  0.010419784 = sum of:
    0.010419784 = product of:
      0.03125935 = sum of:
        0.03125935 = weight(_text_:22 in 1848) [ClassicSimilarity], result of:
          0.03125935 = score(doc=1848,freq=2.0), product of:
            0.13465692 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.038453303 = queryNorm
            0.23214069 = fieldWeight in 1848, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1848)
      0.33333334 = coord(1/3)
  0.16666667 = coord(1/6)
```
Abstract

The goal of entity linking is to associate references to an entity that is found in unstructured natural language content to an authoritative inventory of known entities. This article describes the construction of 6 test collections for cross-language person-entity linking that together span 22 languages. Fully automated components were used together with 2 crowdsourced validation stages to affordably generate ground-truth annotations with an accuracy comparable to that of a completely manual process. The resulting test collections each contain between 642 (Arabic) and 2,361 (Romanian) person references in non-English texts for which the correct resolution in English Wikipedia is known, plus a similar number of references for which no correct resolution into English Wikipedia is believed to exist. Fully automated cross-language person-name linking experiments with 20 non-English languages yielded a resolution accuracy of between 0.84 (Serbian) and 0.98 (Romanian), which compares favorably with previously reported cross-language entity linking results for Spanish.
