Search (459 results, page 23 of 23)

Escolano, C.; Costa-Jussà, M.R.; Fonollosa, J.A.: From bilingual to multilingual neural-based machine translation by incremental training (2021) 0.00

0.00294135 = product of:
  0.0088240495 = sum of:
    0.0088240495 = weight(_text_:information in 97) [ClassicSimilarity], result of:
      0.0088240495 = score(doc=97,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.09697737 = fieldWeight in 97, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=97)
  0.33333334 = coord(1/3)
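
The explain tree above can be recomputed by hand. The following is a minimal sketch in Python, not Lucene's own code. It assumes the classic TF-IDF form in which tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the idf value printed above. queryNorm, fieldNorm and coord are copied from the output rather than derived, and the function names are illustrative.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf form: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    # tf is the square root of the raw term frequency.
    tf = math.sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    # queryWeight = idf * queryNorm
    query_weight = idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * idf * field_norm
    # Final score: one matching clause, scaled by the coord factor.
    return query_weight * field_weight * coord


# Values copied from the explain output for doc 97.
score_doc_97 = classic_score(
    freq=2.0,
    doc_freq=20772,
    max_docs=44218,
    query_norm=0.0518325,
    field_norm=0.0390625,
    coord=1 / 3,
)
print(score_doc_97)
```

With these inputs the result agrees with the listed 0.00294135 up to floating-point rounding. The entries further down differ only in freq and fieldNorm, so the same function should reproduce their scores as well.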

Source: Journal of the Association for Information Science and Technology. 72(2021) no.2, S.190-203

Giesselbach, S.; Estler-Ziegler, T.: Dokumente schneller analysieren mit Künstlicher Intelligenz (2021) 0.00

0.00294135 = product of:
  0.0088240495 = sum of:
    0.0088240495 = weight(_text_:information in 128) [ClassicSimilarity], result of:
      0.0088240495 = score(doc=128,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.09697737 = fieldWeight in 128, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=128)
  0.33333334 = coord(1/3)

Footnote: Talk given at the Berliner Arbeitskreis Information (BAK) on 25 February 2021.

Lee, G.E.; Sun, A.: Understanding the stability of medical concept embeddings (2021) 0.00

0.00294135 = product of:
  0.0088240495 = sum of:
    0.0088240495 = weight(_text_:information in 159) [ClassicSimilarity], result of:
      0.0088240495 = score(doc=159,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.09697737 = fieldWeight in 159, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=159)
  0.33333334 = coord(1/3)

Source: Journal of the Association for Information Science and Technology. 72(2021) no.3, S.346-356

Soni, S.; Lerman, K.; Eisenstein, J.: Follow the leader : documents on the leading edge of semantic change get more citations (2021) 0.00

0.00294135 = product of:
  0.0088240495 = sum of:
    0.0088240495 = weight(_text_:information in 169) [ClassicSimilarity], result of:
      0.0088240495 = score(doc=169,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.09697737 = fieldWeight in 169, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=169)
  0.33333334 = coord(1/3)

Source: Journal of the Association for Information Science and Technology. 72(2021) no.4, S.478-492

Tao, J.; Zhou, L.; Hickey, K.: Making sense of the black-boxes : toward interpretable text classification using deep learning models (2023) 0.00

0.00294135 = product of:
  0.0088240495 = sum of:
    0.0088240495 = weight(_text_:information in 990) [ClassicSimilarity], result of:
      0.0088240495 = score(doc=990,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.09697737 = fieldWeight in 990, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=990)
  0.33333334 = coord(1/3)

Source: Journal of the Association for Information Science and Technology. 74(2023) no.6, S.685-700

Laparra, E.; Binford-Walsh, A.; Emerson, K.; Miller, M.L.; López-Hoffman, L.; Currim, F.; Bethard, S.: Addressing structural hurdles for metadata extraction from environmental impact statements (2023) 0.00

0.00294135 = product of:
  0.0088240495 = sum of:
    0.0088240495 = weight(_text_:information in 1042) [ClassicSimilarity], result of:
      0.0088240495 = score(doc=1042,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.09697737 = fieldWeight in 1042, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1042)
  0.33333334 = coord(1/3)

Source: Journal of the Association for Information Science and Technology. 74(2023) no.9, S.1124-1139

Needham, R.M.; Sparck Jones, K.: Keywords and clumps (1985) 0.00
```
0.002911788 = product of:
  0.008735363 = sum of:
    0.008735363 = weight(_text_:information in 3645) [ClassicSimilarity], result of:
      0.008735363 = score(doc=3645,freq=4.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0960027 = fieldWeight in 3645, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.02734375 = fieldNorm(doc=3645)
  0.33333334 = coord(1/3)
```
Abstract

The selection that follows was chosen as it represents "a very early paper on the possibilities allowed by computers in documentation." In the early 1960s computers were being used to provide simple automatic indexing systems wherein keywords were extracted from documents. The problem with such systems was that they lacked vocabulary control, thus documents related in subject matter were not always collocated in retrieval. To improve retrieval by improving recall is the raison d'être of vocabulary control tools such as classifications and thesauri. The question arose whether it was possible by automatic means to construct classes of terms which, when substituted one for another, could be used to improve retrieval performance. One of the first theoretical approaches to this question was initiated by R. M. Needham and Karen Sparck Jones at the Cambridge Language Research Institute in England. The question was later pursued using experimental methodologies by Sparck Jones, who, as a Senior Research Associate in the Computer Laboratory at the University of Cambridge, has devoted her life's work to research in information retrieval and automatic natural language processing. Based on the principles of numerical taxonomy, automatic classification techniques start from the premise that two objects are similar to the degree that they share attributes in common. When these two objects are keywords, their similarity is measured in terms of the number of documents they index in common. Step 1 in automatic classification is to compute mathematically the degree to which two terms are similar. Step 2 is to group together those terms that are "most similar" to each other, forming equivalence classes of intersubstitutable terms. The technique for forming such classes varies and is the factor that characteristically distinguishes different approaches to automatic classification.
The technique used by Needham and Sparck Jones, that of clumping, is described in the selection that follows. Questions that must be asked are whether the use of automatically generated classes really does improve retrieval performance and whether there is a true economic advantage in substituting mechanical for manual labor. Several years after her work with clumping, Sparck Jones was to observe that while it was not wholly satisfactory in itself, it was valuable in that it stimulated research into automatic classification. To this it might be added that it was valuable in that it introduced to library/information science the methods of numerical taxonomy, thus stimulating us to think again about the fundamental nature and purpose of classification. In this connection it might be useful to review how automatically derived classes differ from those of manually constructed classifications: 1) the manner of their derivation is purely a posteriori, the ultimate operationalization of the principle of literary warrant; 2) the relationship between members forming such classes is essentially statistical; the members of a given class are similar to each other not because they possess the class-defining characteristic but by virtue of sharing a family resemblance; and finally, 3) automatically derived classes are not related meaningfully one to another, that is, they are not ordered in traditional hierarchical and precedence relationships.
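
The two steps described in the abstract can be sketched as follows. This is a minimal illustration in Python under stated assumptions: similarity is the raw count of shared documents, and grouping is a simple single-link threshold rule. It is not the clumping technique of Needham and Sparck Jones, which the abstract does not specify. The index contents and the threshold `min_shared` are hypothetical.

```python
from itertools import combinations


def shared_document_counts(index):
    """Step 1: for each keyword pair, count the documents both keywords index.

    `index` maps each keyword to the set of document ids it indexes.
    """
    return {
        (a, b): len(index[a] & index[b])
        for a, b in combinations(sorted(index), 2)
    }


def group_terms(index, min_shared):
    """Step 2: merge keywords that share at least `min_shared` documents.

    Assumption: single-link grouping via union-find, chosen for brevity.
    """
    parent = {term: term for term in index}

    def find(term):
        # Follow parent links to the group representative, compressing the path.
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for (a, b), shared in shared_document_counts(index).items():
        if shared >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for term in index:
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy index: keyword -> ids of the documents it indexes.
toy_index = {
    "horse": {1, 2, 3},
    "rider": {2, 3, 4},
    "beard": {5},
}
toy_groups = group_terms(toy_index, min_shared=2)
print(toy_groups)
```

A raw shared-document count favors frequent keywords, so a real system would probably normalize it. The abstract leaves that choice open.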
Rötzer, F.: Computer ergooglen die Bedeutung von Worten (2005) 0.00
```
0.0024958183 = product of:
  0.0074874545 = sum of:
    0.0074874545 = weight(_text_:information in 3385) [ClassicSimilarity], result of:
      0.0074874545 = score(doc=3385,freq=4.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.08228803 = fieldWeight in 3385, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0234375 = fieldNorm(doc=3385)
  0.33333334 = coord(1/3)
```
Content

Using a method previously developed by Paul Vitanyi and others that measures the relatedness of objects (normalized information distance, NID), the closeness between particular objects (images, words, patterns, intervals, genomes, programs, etc.) can be analyzed on the basis of all their properties and determined from the dominant shared property. In a similar way, the commonly used, not necessarily "true" meanings of names can be inferred with Google search. 'At this moment one database stands out as the pinnacle of computer-accessible human knowledge and the most inclusive summary of statistical information: the Google search engine. There can be no doubt that Google has already enabled science to accelerate tremendously and revolutionized the research process. It has dominated the attention of internet users for years, and has recently attracted substantial attention of many Wall Street investors, even reshaping their ideas of company financing.' (Paul Vitanyi and Rudi Cilibrasi) Entering a word such as "Pferd" (horse) returns 4,310,000 indexed pages on Google. For "Reiter" (rider) there are 3,400,000 pages. Combining the two terms still yields 315,000 pages. For the joint occurrence of, say, "Pferd" and "Bart" (beard), a still astonishing 67,100 pages are listed, but it is already apparent that "Pferd" and "Reiter" are more closely related. This yields a certain probability for the joint occurrence of terms. From this frequency, taken relative to the maximum number of indexed pages (5,000,000,000), the two researchers developed a statistical measure that they call the "normalised Google distance" (NGD), which normally lies between 0 and 1. The smaller the NGD, the more closely two terms are related. "This is automatic generation of meaning," Vitanyi told New Scientist (4).
"It could well be a way to have a computer understand things and act semi-intelligently." If such searches are carried out repeatedly, a map of the connections between words can be built. From this map, so the hope goes, a computer can in turn grasp the meaning of individual words in different natural languages and contexts. Through a number of searches it was reportedly shown that a computer can distinguish between colors and numbers, tell apart 17th-century Dutch painters as well as emergencies and near-emergencies, and understand electrical or religious terms. In addition, a simple automatic English-Spanish translation was reportedly achieved. In this way, the researchers hope, the meaning of words could be learned, speech recognition improved, a semantic web built and, of course, better automatic translation from one language into another finally realized.

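The page counts quoted above for "Pferd" and "Reiter" are enough for a worked example. The text does not state the formula, so this minimal sketch in Python assumes the definition usually given for the normalised Google distance by Cilibrasi and Vitanyi: NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f is a page count and N the total number of indexed pages. The function and variable names are illustrative.

```python
import math


def normalized_google_distance(f_x, f_y, f_xy, n_pages):
    """Assumed NGD definition; all arguments are page counts greater than zero."""
    log_x = math.log(f_x)
    log_y = math.log(f_y)
    log_xy = math.log(f_xy)
    numerator = max(log_x, log_y) - log_xy
    denominator = math.log(n_pages) - min(log_x, log_y)
    return numerator / denominator


# Counts quoted in the text: "Pferd", "Reiter", both together, total index size.
ngd_horse_rider = normalized_google_distance(
    f_x=4_310_000,
    f_y=3_400_000,
    f_xy=315_000,
    n_pages=5_000_000_000,
)
print(round(ngd_horse_rider, 2))
```

For "Pferd" and "Reiter" this gives a value of roughly 0.36. The pair "Pferd" and "Bart" cannot be evaluated from the text alone, because the page count for "Bart" by itself is not given.
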
Olsen, K.A.; Williams, J.G.: Spelling and grammar checking using the Web as a text repository (2004) 0.00

0.00235308 = product of:
  0.00705924 = sum of:
    0.00705924 = weight(_text_:information in 2891) [ClassicSimilarity], result of:
      0.00705924 = score(doc=2891,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0775819 = fieldWeight in 2891, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=2891)
  0.33333334 = coord(1/3)

Source: Journal of the American Society for Information Science and Technology. 55(2004) no.11, S.1020-1023

Humphrey, S.M.; Rogers, W.J.; Kilicoglu, H.; Demner-Fushman, D.; Rindflesch, T.C.: Word sense disambiguation by selecting the best semantic type based on journal descriptor indexing : preliminary experiment (2006) 0.00

0.00235308 = product of:
  0.00705924 = sum of:
    0.00705924 = weight(_text_:information in 4912) [ClassicSimilarity], result of:
      0.00705924 = score(doc=4912,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0775819 = fieldWeight in 4912, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=4912)
  0.33333334 = coord(1/3)

Source: Journal of the American Society for Information Science and Technology. 57(2006) no.1, S.96-113

Kim, W.; Wilbur, W.J.: Corpus-based statistical screening for content-bearing terms (2001) 0.00

0.00235308 = product of:
  0.00705924 = sum of:
    0.00705924 = weight(_text_:information in 5188) [ClassicSimilarity], result of:
      0.00705924 = score(doc=5188,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0775819 = fieldWeight in 5188, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=5188)
  0.33333334 = coord(1/3)

Source: Journal of the American Society for Information Science and Technology. 52(2001) no.3, S.247-259

Thiel, M.: Bedingt wahrscheinliche Syntaxbäume (2006) 0.00

0.00235308 = product of:
  0.00705924 = sum of:
    0.00705924 = weight(_text_:information in 6069) [ClassicSimilarity], result of:
      0.00705924 = score(doc=6069,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0775819 = fieldWeight in 6069, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=6069)
  0.33333334 = coord(1/3)

Source: Information und Sprache: Beiträge zu Informationswissenschaft, Computerlinguistik, Bibliothekswesen und verwandten Fächern. Festschrift für Harald H. Zimmermann. Edited by Ilse Harms, Heinz-Dirk Luckhardt and Hans W. Giessen

Semantic role universals and argument linking : theoretical, typological, and psycholinguistic perspectives (2006) 0.00
```
0.00235308 = product of:
  0.00705924 = sum of:
    0.00705924 = weight(_text_:information in 3670) [ClassicSimilarity], result of:
      0.00705924 = score(doc=3670,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0775819 = fieldWeight in 3670, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=3670)
  0.33333334 = coord(1/3)
```
Abstract

The concept of semantic roles has been central to linguistic theory for many decades. More specifically, the assumption of such representations as mediators in the correspondence between a linguistic form and its associated meaning has helped to address a number of critical issues related to grammatical phenomena. Furthermore, in addition to featuring in all major theories of grammar, semantic (or 'thematic') roles have been referred to extensively within a wide range of other linguistic subdisciplines, including language typology and psycho-/neurolinguistics. This volume brings together insights from these different perspectives and thereby, for the first time, seeks to build upon the obvious potential for cross-fertilisation between hitherto autonomous approaches to a common theme. To this end, a view on semantic roles is adopted that goes beyond the mere assumption of generalised roles, but also focuses on their hierarchical organisation. The book is thus centred around the interdisciplinary examination of how these hierarchical dependencies subserve argument linking - both in terms of linguistic theory and with respect to real-time language processing - and how they interact with other information types in this process. Furthermore, the contributions examine the interaction between the role hierarchy and the conceptual content of (generalised) semantic roles and investigate their cross-linguistic applicability and psychological reality, as well as their explanatory potential in accounting for phenomena in the domain of language disorders. In bridging the gap between different disciplines, the book provides a valuable overview of current thought on semantic roles and argument linking, and may further serve as a point of departure for future interdisciplinary research in this area. As such, it will be of interest to scientists and advanced students in all domains of linguistics and cognitive science.
Spitkovsky, V.; Norvig, P.: From words to concepts and back : dictionaries for linking text, entities and ideas (2012) 0.00
```
0.00235308 = product of:
  0.00705924 = sum of:
    0.00705924 = weight(_text_:information in 337) [ClassicSimilarity], result of:
      0.00705924 = score(doc=337,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0775819 = fieldWeight in 337, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=337)
  0.33333334 = coord(1/3)
```
Abstract

Human language is both rich and ambiguous. When we hear or read words, we resolve meanings to mental representations, for example recognizing and linking names to the intended persons, locations or organizations. Bridging words and meaning - from turning search queries into relevant results to suggesting targeted keywords for advertisers - is also Google's core competency, and important for many other tasks in information retrieval and natural language processing. We are happy to release a resource, spanning 7,560,141 concepts and 175,100,788 unique text strings, that we hope will help everyone working in these areas. How do we represent concepts? Our approach piggybacks on the unique titles of entries from an encyclopedia, which are mostly proper and common noun phrases. We consider each individual Wikipedia article as representing a concept (an entity or an idea), identified by its URL. Text strings that refer to concepts were collected using the publicly available hypertext of anchors (the text you click on in a web link) that point to each Wikipedia page, thus drawing on the vast link structure of the web. For every English article we harvested the strings associated with its incoming hyperlinks from the rest of Wikipedia, the greater web, and also anchors of parallel, non-English Wikipedia pages. Our dictionaries are cross-lingual, and any concept deemed too fine can be broadened to a desired level of generality using Wikipedia's groupings of articles into hierarchical categories. The data set contains triples, each consisting of (i) text, a short, raw natural language string; (ii) url, a related concept, represented by an English Wikipedia article's canonical location; and (iii) count, an integer indicating the number of times text has been observed connected with the concept's url. Our database thus includes weights that measure degrees of association. 
For example, the top two entries for football indicate that it is an ambiguous term, which is almost twice as likely to refer to what we in the US call soccer. See also: Spitkovsky, V.I., A.X. Chang: A cross-lingual dictionary for english Wikipedia concepts. In: http://nlp.stanford.edu/pubs/crosswikis.pdf.
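
One way to read the (text, url, count) triples as association weights is to normalize the counts per text string. This is a minimal sketch in Python under that assumption. The abstract does not say how the released weights are computed. The article names and counts below are hypothetical and are not taken from the data set.

```python
from collections import defaultdict


def association_weights(triples):
    """Turn (text, url, count) triples into per-text relative frequencies.

    Assumption: weight = count / total count observed for that text string.
    """
    totals = defaultdict(int)
    for text, _url, count in triples:
        totals[text] += count

    weights = defaultdict(dict)
    for text, url, count in triples:
        weights[text][url] = count / totals[text]
    return dict(weights)


# Hypothetical triples for one ambiguous string.
example_triples = [
    ("football", "Association_football", 60),
    ("football", "American_football", 40),
]
example_weights = association_weights(example_triples)
print(example_weights["football"])
```

Under this reading, an ambiguous string shows up as several concepts with comparable weights, and the ratio between the top two weights expresses how much more likely one reading is than the other.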

Menge-Sonnentag, R.: Google veröffentlicht einen Parser für natürliche Sprache (2016) 0.00

0.00235308 = product of:
  0.00705924 = sum of:
    0.00705924 = weight(_text_:information in 2941) [ClassicSimilarity], result of:
      0.00705924 = score(doc=2941,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0775819 = fieldWeight in 2941, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=2941)
  0.33333334 = coord(1/3)

Footnote: Download at: https://github.com/tensorflow/models/tree/master/syntaxnet. Further information on the model and comparative figures for its recognition rate are also available there.

Aydin, Ö.; Karaarslan, E.: OpenAI ChatGPT generated literature review : digital twin in healthcare (2022) 0.00
```
0.00235308 = product of:
  0.00705924 = sum of:
    0.00705924 = weight(_text_:information in 851) [ClassicSimilarity], result of:
      0.00705924 = score(doc=851,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.0775819 = fieldWeight in 851, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=851)
  0.33333334 = coord(1/3)
```
Abstract

Literature review articles are essential to summarize the related work in the selected field. However, covering all related studies takes too much time and effort. This study questions how Artificial Intelligence can be used in this process. We used ChatGPT to create a literature review article to show the stage of the OpenAI ChatGPT artificial intelligence application. As the subject, the applications of Digital Twin in the health field were chosen. Abstracts of papers from the last three years (2020, 2021 and 2022) were obtained from the keyword "Digital twin in healthcare" search results on Google Scholar and paraphrased by ChatGPT. Later on, we asked ChatGPT questions. The results are promising; however, the paraphrased parts had significant matches when checked with the iThenticate tool. This article is the first attempt to show that the compilation and expression of knowledge will be accelerated with the help of artificial intelligence. We are still at the beginning of such advances. The future academic publishing process will require less human effort, which in turn will allow academics to focus on their studies. In future studies, we will monitor citations to this study to evaluate the academic validity of the content produced by ChatGPT. 1. Introduction OpenAI ChatGPT (ChatGPT, 2022) is a chatbot based on the OpenAI GPT-3 language model. It is designed to generate human-like text responses to user input in a conversational context. OpenAI ChatGPT is trained on a large dataset of human conversations and can be used to create responses to a wide range of topics and prompts. The chatbot can be used for customer service, content creation, and language translation tasks, creating replies in multiple languages. OpenAI ChatGPT is available through the OpenAI API, which allows developers to access and integrate the chatbot into their applications and systems. OpenAI ChatGPT is a variant of the GPT (Generative Pre-trained Transformer) language model developed by OpenAI.
It is designed to generate human-like text, allowing it to engage in conversation with users naturally and intuitively. OpenAI ChatGPT is trained on a large dataset of human conversations, allowing it to understand and respond to a wide range of topics and contexts. It can be used in various applications, such as chatbots, customer service agents, and language translation systems. OpenAI ChatGPT is a state-of-the-art language model able to generate coherent and natural text that can be indistinguishable from text written by a human. As an artificial intelligence, ChatGPT may need help to change academic writing practices. However, it can provide information and guidance on ways to improve people's academic writing skills.
Artemenko, O.; Shramko, M.: Entwicklung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten (2005) 0.00
```
0.0020589451 = product of:
  0.006176835 = sum of:
    0.006176835 = weight(_text_:information in 572) [ClassicSimilarity], result of:
      0.006176835 = score(doc=572,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.06788416 = fieldWeight in 572, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.02734375 = fieldNorm(doc=572)
  0.33333334 = coord(1/3)
```
Abstract

With the spread of the Internet, the number of documents available on the World Wide Web is growing. Ensuring that Internet users have efficient access to the information they want is becoming a major challenge for the modern information society. A variety of tools are already in use to help users find their way in the growing flood of information. However, the enormous volume of unstructured and distributed information is not the only difficulty to be overcome in developing tools of this kind. The increasing multilingualism of web content creates a need for language identification software that identifies the language or languages of electronic documents for targeted further processing. Such language identifiers can, for example, be used effectively in multilingual information retrieval, since automatic indexing processes such as stemming and stop word extraction build on the language identification results. The present work introduces the new system "LangIdent" for language identification of electronic text documents, intended primarily for teaching and research at the University of Hildesheim. "LangIdent" contains a selection of common algorithms for monolingual language identification, which the user can select and configure interactively. In addition, a new algorithm was implemented in the system that identifies the languages in which a multilingual document is written. The identification is not limited to listing the languages found; rather, the text is divided into monolingual sections, each labeled with the identified language.
Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.00
```
0.0020589451 = product of:
  0.006176835 = sum of:
    0.006176835 = weight(_text_:information in 1536) [ClassicSimilarity], result of:
      0.006176835 = score(doc=1536,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.06788416 = fieldWeight in 1536, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1536)
  0.33333334 = coord(1/3)
```
Abstract

Multiword expressions (MWEs) are lexical items that can be decomposed into single words and display lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasy (Sag et al., 2002; Kim, 2008; Calzolari et al., 2002). The proper treatment of multiword expressions such as rock 'n' roll and make a decision is essential for many natural language processing (NLP) applications like information extraction and retrieval, terminology extraction and machine translation, and it is important to identify multiword expressions in context. For example, in machine translation we must know that MWEs form one semantic unit, hence their parts should not be translated separately. For this, multiword expressions should be identified first in the text to be translated. The chief aim of this thesis is to develop machine learning-based approaches for the automatic detection of different types of multiword expressions in English and Hungarian natural language texts. In our investigations, we pay attention to the characteristics of different types of multiword expressions such as nominal compounds, multiword named entities and light verb constructions, and we apply novel methods to identify MWEs in raw texts. In the thesis it will be demonstrated that nominal compounds and multiword named entities may require a similar approach for their automatic detection as they behave in the same way from a linguistic point of view. Furthermore, it will be shown that the automatic detection of light verb constructions can be carried out using two effective machine learning-based approaches.
Sprachtechnologie, mobile Kommunikation und linguistische Ressourcen : Beiträge zur GLDV Tagung 2005 in Bonn (2005) 0.00
```
0.0017648101 = product of:
  0.00529443 = sum of:
    0.00529443 = weight(_text_:information in 3578) [ClassicSimilarity], result of:
      0.00529443 = score(doc=3578,freq=2.0), product of:
        0.09099081 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0518325 = queryNorm
        0.058186423 = fieldWeight in 3578, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0234375 = fieldNorm(doc=3578)
  0.33333334 = coord(1/3)
```
Content

CONTENTS: Chris Biemann/Rainer Osswald: Automatische Erweiterung eines semantikbasierten Lexikons durch Bootstrapping auf großen Korpora - Ernesto William De Luca/Andreas Nürnberger: Supporting Mobile Web Search by Ontology-based Categorization - Rüdiger Gleim: HyGraph - Ein Framework zur Extraktion, Repräsentation und Analyse webbasierter Hypertextstrukturen - Felicitas Haas/Bernhard Schröder: Freges Grundgesetze der Arithmetik: Dokumentbaum und Formelwald - Ulrich Held/Andre Blessing/Bettina Säuberlich/Jürgen Sienel/Horst Rößler/Dieter Kopp: A personalized multimodal news service - Jürgen Hermes/Christoph Benden: Fusion von Annotation und Präprozessierung als Vorschlag zur Behebung des Rohtextproblems - Sonja Hüwel/Britta Wrede/Gerhard Sagerer: Semantisches Parsing mit Frames für robuste multimodale Mensch-Maschine-Kommunikation - Brigitte Krenn/Stefan Evert: Separating the wheat from the chaff - Corpus-driven evaluation of statistical association measures for collocation extraction - Jörn Kreutel: An application-centered Perspective on Multimodal Dialogue Systems - Jonas Kuhn: An Architecture for Parallel Corpus-based Grammar Learning - Thomas Mandl/Rene Schneider/Pia Schnetzler/Christa Womser-Hacker: Evaluierung von Systemen für die Eigennamenerkennung im crosslingualen Information Retrieval - Alexander Mehler/Matthias Dehmer/Rüdiger Gleim: Zur Automatischen Klassifikation von Webgenres - Charlotte Merz/Martin Volk: Requirements for a Parallel Treebank Search Tool - Sally Y.K. Mok: Multilingual Text Retrieval on the Web: The Case of a Cantonese-Dagaare-English Trilingual e-Lexicon -
