Search (158 results, page 2 of 8)

  • Filter: theme_ss:"Computerlinguistik"
  1. Granitzer, M.: Statistische Verfahren der Textanalyse (2006)
    Abstract
    This article provides an overview of statistical methods of text analysis in the context of the Semantic Web. By way of introduction, it discusses methods and common techniques for preprocessing texts, such as stemming and part-of-speech tagging. The representations introduced in this way serve as the basis for statistical feature analyses and for more advanced techniques such as information extraction and machine learning. These specialized techniques are presented in overview form, with the most important aspects relating to the Semantic Web treated in detail. The article closes with the application of the presented techniques to the construction and maintenance of ontologies and with pointers to further literature.
    Source
    Semantic Web: Wege zur vernetzten Wissensgesellschaft. Hrsg.: T. Pellegrini, u. A. Blumauer
    Theme
    Semantic Web
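    The pipeline sketched in the abstract above runs from preprocessing (stemming, part-of-speech tagging) to statistical feature analysis. A minimal sketch of one such feature weighting, tf-idf over already-stemmed token lists; the tokens below are invented examples, not taken from the article:

```python
import math
from collections import Counter

def tf_idf(docs):
    """tf-idf weights for a list of tokenized (and ideally stemmed) documents."""
    n = len(docs)
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)               # raw term frequency in this document
        weights.append({t: (1 + math.log(c)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights

docs = [["semant", "web", "ontolog"],   # already-stemmed toy documents
        ["web", "mine", "web"],
        ["ontolog", "learn"]]
for w in tf_idf(docs):
    print(w)
```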
  2. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007)
    Abstract
    Modern information retrieval systems use keywords within documents as indexing terms for retrieving relevant documents. As Chinese is an ideographic, character-based language, the words in the texts are not delimited by white spaces, and indexing of Chinese documents is impossible without a proper segmentation algorithm. Many Chinese segmentation algorithms have been proposed in the past, but traditional segmentation algorithms cannot operate without a large dictionary or a large corpus of training data. Nowadays, the Web has become the largest corpus, one that is ideal for Chinese segmentation. Although most search engines have problems in segmenting texts into proper words, they maintain huge databases of documents and of the frequencies of character sequences in those documents. These databases are important potential resources for segmentation. In this paper, we propose a segmentation algorithm that mines Web data with the help of search engines. In addition, the Romanized pinyin of the Chinese language indicates word boundaries in the text, and our algorithm is the first to utilize Romanized pinyin for segmentation. It is the first unified segmentation algorithm for the Chinese language from different geographical areas, and it is also domain independent because of the nature of the Web. Experiments have been conducted on the datasets of a recent Chinese segmentation competition. The results show that our algorithm outperforms the traditional algorithms in terms of precision and recall. Moreover, our algorithm can effectively deal with the problems of segmentation ambiguity, new word (unknown word) detection, and stop words.
    Footnote
    Contribution to a thematic section "Mining Web resources for enhancing information retrieval"
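    A minimal sketch of the frequency-driven idea behind such segmenters: a Viterbi-style dynamic program that picks the segmentation maximizing the summed log-probability of its words. The toy frequency table stands in for the character-sequence counts mined from search engines; the pinyin cue described in the abstract is not modeled here:

```python
import math

# Toy frequency table standing in for the character-sequence counts that a
# search engine's document database would supply (values are hypothetical).
FREQ = {"中国": 50000, "人民": 30000, "中国人": 15000,
        "人": 12000, "中": 9000, "国": 8000, "民": 7000}
TOTAL = sum(FREQ.values())

def log_p(word):
    # Unseen character sequences get a small smoothed count.
    return math.log(FREQ.get(word, 0.5) / TOTAL)

def segment(text, max_len=4):
    """Viterbi-style segmentation: maximize the summed log-probability."""
    n = len(text)
    best = [0.0] + [float("-inf")] * n   # best[i]: best score of text[:i]
    back = [0] * (n + 1)                 # back[i]: start of the last word
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            score = best[j] + log_p(text[j:i])
            if score > best[i]:
                best[i], back[i] = score, j
    words, i = [], n
    while i > 0:                         # recover words by backtracking
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]

print(segment("中国人民"))  # -> ['中国', '人民']
```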
  3. Hahn, U.; Reimer, U.: Informationslinguistische Konzepte der Volltextverarbeitung in TOPIC (1983)
    Source
    Deutscher Dokumentartag 1982, Lübeck-Travemünde, 29.-30.9.1982: Fachinformation im Zeitalter der Informationsindustrie. Bearb.: H. Strohl-Goebel
  4. Proszeky, G.: Language technology tools in the translator's practice (1999)
    Date
    30. 3.2002 18:29:40
  5. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatically generated word hierarchies (1996)
    Source
    Computational linguistics. 22(1996) no.2, S.217-248
  6. Ruge, G.: A spreading activation network for automatic generation of thesaurus relationships (1991)
    Date
    8.10.2000 11:52:22
  7. Somers, H.: Example-based machine translation : review article (1999)
    Date
    31. 7.1996 9:22:19
  8. New tools for human translators (1997)
    Date
    31. 7.1996 9:22:19
  9. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997)
    Date
    28. 2.1999 10:48:22
  10. Der Student aus dem Computer (2023)
    Date
    27. 1.2023 16:22:55
  11. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005)
    Abstract
    Web-based search engines such as Google and Northern Light return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 on the TREC-8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
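    A toy sketch of phrase reranking in the spirit of PPR, combining a question-type match with a crude overlap-based stand-in for the proximity feature; the weights and the type mapping are invented, not the paper's learned values:

```python
# Maps a question word to the expected answer entity type (illustrative).
TYPE_OF_WH = {"who": "PERSON", "when": "DATE", "where": "LOCATION"}

def rerank(question, candidates):
    """Order candidate answer phrases by a weighted mix of (a) whether the
    phrase's entity type matches the question word and (b) how much of the
    query appears in the phrase's source passage."""
    words = question.lower().rstrip("?").split()
    qtype = TYPE_OF_WH.get(words[0])
    qterms = set(words[1:])

    def score(cand):
        phrase, etype, passage = cand
        # Fraction of query terms found in the passage: a rough proxy
        # for the proximity feature used in the paper.
        overlap = len(qterms & set(passage.lower().split())) / max(len(qterms), 1)
        return 0.6 * overlap + 0.4 * (etype == qtype)

    return sorted(candidates, key=score, reverse=True)

cands = [("Paris", "LOCATION", "Napoleon was exiled after Paris fell"),
         ("Napoleon", "PERSON", "Napoleon was born in 1769 in Corsica")]
print(rerank("Who was born in Corsica?", cands)[0][0])  # -> Napoleon
```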
  12. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word (2006)
    Abstract
    Word usage is of interest to linguists for its own sake as well as to social scientists and others who seek to track the spread of ideas, for example, in public debates over political decisions. The historical evolution of language can be analyzed with the tools of corpus linguistics through evolving corpora and the Web. But word usage statistics can only be gathered for known words. In this article, techniques are described and tested for identifying new words from the Web, focusing on the case when the words are related to a topic and have a hybrid form with a common sequence of letters. The results highlight the need to employ a combination of search techniques and show the wide potential of hybrid word family investigations in linguistics and social science.
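    A minimal sketch of the hybrid-word detection step: scan Web page text for unknown words that embed a topic-related letter sequence. The reference word list and the pages are invented; a real run would use a large lexicon and Web-scale text:

```python
import re

KNOWN = {"web", "weblog", "website"}  # stand-in for a reference word list

def hybrid_candidates(texts, stem="blog"):
    """Collect unknown words that embed a topic-related letter sequence,
    with their occurrence counts, most frequent first."""
    pattern = re.compile(r"[a-z]*%s[a-z]*" % re.escape(stem))
    found = {}
    for text in texts:
        for w in pattern.findall(text.lower()):
            if w != stem and w not in KNOWN:
                found[w] = found.get(w, 0) + 1
    return sorted(found.items(), key=lambda kv: -kv[1])

pages = ["The blogosphere reacted quickly",
         "a new moblog and photoblog trend",
         "photoblog images on the website"]
print(hybrid_candidates(pages))  # -> [('photoblog', 2), ('blogosphere', 1), ('moblog', 1)]
```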
  13. Jensen, N.: Evaluierung von mehrsprachigem Web-Retrieval : Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF) (2006)
    Abstract
    This article describes the experiments of the University of Hildesheim in the first Web Track of the CLEF initiative (WebCLEF) in 2005. Participation provided experience with a multilingual Web corpus (EuroGOV) in preprocessing, in topic and query development, and with language-independent indexing methods and multilingual retrieval strategies. Owing to the large size of the corpus and to time constraints, multilingual indices were built. The article describes the University of Hildesheim's approach and the results of the officially submitted runs as well as of further experiments. For the Multilingual Task, the best result in CLEF was achieved.
  14. Airio, E.: Who benefits from CLIR in web retrieval? (2008)
    Abstract
    Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries, so participants were asked to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary-based system. For English-German, machine translation was also utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query translation were better than in the traditional laboratory tests. Originality/value - This research shows that query translation in web retrieval is beneficial especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.
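    A minimal sketch of dictionary-based query translation of the kind used in the study: every translation alternative is kept, and out-of-vocabulary terms (often proper names) pass through untranslated. The dictionary entries are invented:

```python
# Toy Finnish-English bilingual dictionary (hypothetical entries).
FI_EN = {"kissa": ["cat"], "kauppa": ["shop", "store"]}

def translate_query(source_terms, dictionary):
    """Replace each source term by all of its target-language alternatives;
    untranslatable terms are passed through unchanged."""
    target = []
    for term in source_terms:
        target.extend(dictionary.get(term, [term]))
    return target

print(translate_query(["kissa", "kauppa", "nokia"], FI_EN))
# -> ['cat', 'shop', 'store', 'nokia']
```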
  15. Dreehsen, B.: Der PC als Dolmetscher (1998)
    Abstract
    For English Web pages and foreign-language correspondence, translation software is helpful: at the click of a mouse it transfers the text into German, and vice versa. The latest versions already render the gist of the content well. CHIP tested the performance of five such programs.
  16. Li, Q.; Chen, Y.P.; Myaeng, S.-H.; Jin, Y.; Kang, B.-Y.: Concept unification of terms in different languages via web mining for Information Retrieval (2009)
    Abstract
    For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalences in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalences in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries which can not be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.
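    A minimal sketch of indexing with unified concepts: English terms extracted from native-language text and their native equivalents are mapped to one shared index unit. The unification table is invented; in the paper it is produced by the Web-mining step:

```python
import re

# Hypothetical unification table: an extracted English term and its native
# equivalent share one index unit (identifiers are illustrative).
UNIFIED = {"google": "IDX_GOOGLE", "구글": "IDX_GOOGLE"}

def index_units(text):
    """Tokenize mixed-script text and map unified terms to one index unit."""
    tokens = re.findall(r"[A-Za-z]+|[^\sA-Za-z]+", text)
    return [UNIFIED.get(t.lower(), t) for t in tokens]

print(index_units("구글 Google 검색"))
# -> ['IDX_GOOGLE', 'IDX_GOOGLE', '검색']
```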
  17. Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012)
    Abstract
    Today's conventional search engines hardly provide the content that is truly relevant to the user's search query, because the context and semantics of the request are not analyzed to the full extent. Hence the need for semantic web search (SWS), an emerging area of web search that combines Natural Language Processing and Artificial Intelligence. The objective of the work presented here is to design, develop and implement a semantic search engine, SIEU (Semantic Information Extraction in University Domain), confined to the university domain. SIEU uses an ontology as the knowledge base for the information retrieval process. It is not a mere keyword search: it works one layer above what Google or any other search engine retrieves by analyzing just the keywords, because the query is analyzed both syntactically and semantically. The developed system retrieves web results that are more relevant to the user query through keyword expansion, and the accuracy of the results is enhanced by the semantic analysis of the query. The system will be of great use to developers and researchers who work on the web. The Google results are re-ranked and optimized to provide the relevant links; for the ranking, an algorithm is applied that fetches more apt results for the user query.
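    A minimal sketch of the keyword-expansion step: each query term is enriched with synonyms and broader concepts from a domain ontology before retrieval. The ontology fragment below is invented:

```python
# Tiny hand-made ontology fragment for a university domain (hypothetical).
ONTOLOGY = {
    "lecturer": {"synonyms": ["instructor", "teacher"], "broader": "staff"},
    "course":   {"synonyms": ["subject", "module"],     "broader": "curriculum"},
}

def expand_query(terms):
    """Expand each query keyword with ontology synonyms and its broader
    concept, so retrieval also matches documents using different wording."""
    expanded = []
    for t in terms:
        expanded.append(t)
        node = ONTOLOGY.get(t)
        if node:
            expanded += node["synonyms"] + [node["broader"]]
    return expanded

print(expand_query(["lecturer", "course"]))
# -> ['lecturer', 'instructor', 'teacher', 'staff',
#     'course', 'subject', 'module', 'curriculum']
```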
  18. Rozinajová, V.; Macko, P.: Using natural language to search linked data (2017)
    Abstract
    There are many endeavors aiming to offer users more effective ways of getting relevant information from the web. One of them is the concept of Linked Data, which provides interconnected data sources. But querying these types of data is difficult not only for conventional web users but also for experts in this field; a more comfortable way of querying would therefore be of great value. One direction is to allow the user to use natural language. To make this task easier we have proposed a method for translating a natural language query into a SPARQL query. It is based on sentence structure, utilizing dependencies between the words in user queries. The dependencies are used to map the query onto the semantic web structure, which is in the next step translated into a SPARQL query. According to our first experiments we are able to answer a significant group of user queries.
    Series
    Information Systems and Applications, incl. Internet/Web, and HCI; 10151
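    A minimal sketch of the final mapping step: once the dependency analysis has reduced a question to an (entity class, property, value) relation, that relation is serialized as SPARQL. The URIs and the example relation are illustrative, not the authors' vocabulary:

```python
def to_sparql(entity_class, prop, value):
    """Build a SPARQL query from one extracted relation."""
    return (
        "SELECT ?s WHERE {\n"
        f"  ?s a <http://example.org/{entity_class}> .\n"
        f"  ?s <http://example.org/{prop}> \"{value}\" .\n"
        "}"
    )

# "Which cities lie on the Danube?" -> (City, river, Danube)
print(to_sparql("City", "river", "Danube"))
```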
  19. Kuo, J.-S.; Li, H.; Yang, Y.-K.: Active learning for constructing transliteration lexicons from the Web (2008)
    Abstract
    This article presents an adaptive learning framework for Phonetic Similarity Modeling (PSM) that supports the automatic construction of transliteration lexicons. The learning algorithm starts with minimum prior knowledge about machine transliteration and acquires knowledge iteratively from the Web. We study the unsupervised learning and the active learning strategies that minimize human supervision in terms of data labeling. The learning process refines the PSM and constructs a transliteration lexicon at the same time. We evaluate the proposed PSM and its learning algorithm through a series of systematic experiments, which show that the proposed framework is reliably effective on two independent databases.
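    A generic sketch of one round of the active-learning loop described above: confidently scored transliteration pairs enter the lexicon automatically, borderline pairs are routed to a human within a labeling budget, and the similarity model would then be retrained on the grown lexicon (omitted here). The scorer, oracle, and thresholds are stand-ins, not the paper's PSM:

```python
import random

def active_learning(pairs, score, label, budget=10, lo=0.4, hi=0.6):
    """One round: accept confident pairs, query a human on uncertain ones."""
    lexicon, queries = [], []
    for pair in pairs:
        s = score(pair)            # phonetic-similarity confidence in [0, 1]
        if s >= hi:
            lexicon.append(pair)   # confident: accept without labeling
        elif s >= lo and len(queries) < budget:
            queries.append(pair)   # uncertain: worth a human label
    lexicon += [p for p in queries if label(p)]
    return lexicon

# Toy run with a random scorer and an oracle that accepts everything.
random.seed(0)
pairs = [("Clinton", "克林顿"), ("Boston", "波士顿"), ("random", "字串")]
print(active_learning(pairs, lambda p: random.random(), lambda p: True))
```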
  20. Wong, W.; Liu, W.; Bennamoun, M.: Ontology learning from text : a look back and into the future (2010)
    Abstract
    Ontologies are often viewed as the answer to the need for inter-operable semantics in modern information systems. The explosion of textual information on the "Read/Write" Web coupled with the increasing demand for ontologies to power the Semantic Web have made (semi-)automatic ontology learning from text a very promising research area. This together with the advanced state in related areas such as natural language processing have fuelled research into ontology learning over the past decade. This survey looks at how far we have come since the turn of the millennium, and discusses the remaining challenges that will define the research directions in this area in the near future.

Languages

  • e 111
  • d 43
  • m 2
  • ru 2

Types

  • a 128
  • el 20
  • m 16
  • s 8
  • x 4
  • p 2
  • d 1
