Search (20 results, page 1 of 1)

Rötzer, F.: KI-Programm besser als Menschen im Verständnis natürlicher Sprache (2018) 0.02
```
0.018499283 = product of:
  0.07399713 = sum of:
    0.07399713 = sum of:
      0.053608507 = weight(_text_:intelligenz in 4217) [ClassicSimilarity], result of:
        0.053608507 = score(doc=4217,freq=2.0), product of:
          0.21362439 = queryWeight, product of:
            5.678294 = idf(docFreq=410, maxDocs=44218)
            0.037621226 = queryNorm
          0.2509475 = fieldWeight in 4217, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            5.678294 = idf(docFreq=410, maxDocs=44218)
            0.03125 = fieldNorm(doc=4217)
      0.020388626 = weight(_text_:22 in 4217) [ClassicSimilarity], result of:
        0.020388626 = score(doc=4217,freq=2.0), product of:
          0.13174312 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.037621226 = queryNorm
          0.15476047 = fieldWeight in 4217, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.03125 = fieldNorm(doc=4217)
  0.25 = coord(1/4)
```
Abstract

Now things seem to be gradually getting down to the nitty-gritty. An AI program developed by China's Alibaba Group has, for the first time, beaten humans at answering questions and understanding text. The Chinese government wants to make the country a leader in the development of artificial intelligence and has drawn up a national strategy for this purpose. As part of it, the Ministry of Science and Technology appointed the internet companies Baidu, Alibaba and Tencent, along with iFlyTek, as the first national team for developing next-generation AI technology. Baidu is responsible for developing autonomous vehicles, Alibaba for developing clouds for "city brains" (smart cities are meant to adapt to their inhabitants and their surroundings), Tencent for developing computer vision for medical applications, and iFlyTek for "voice intelligence". The four companies are to build open platforms that other companies and start-ups can also use. In addition, a technology park for AI development is being built near Beijing for one billion US dollars. Naturally, this is not only about civilian applications but also military ones. The USA still has more AI companies, but China is already in second place. The Pentagon is worried. China is evidently making rapid progress. At the end of 2017, the AI company iFlyTek, which initially specialized in voice recognition and digital assistants, presented a robot that had passed the written test of the national medical examination. The robot had not only been fed immense knowledge from 53 medical textbooks, 2 million medical records and 400,000 medical texts and reports; it is also said to have taken over clinical experience and case diagnoses from medical experts. Because China has a shortage of doctors, especially in rural areas, the robot is to be deployed as an assistant that produces an initial diagnosis by automatically evaluating patient data and otherwise supports doctors with suggestions.

Date

22. 1.2018 11:32:44
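
The score trees in this listing are labeled `ClassicSimilarity`, Lucene's TF-IDF model. As a rough sketch, the arithmetic for the first result (doc 4217) can be reproduced from the numbers shown above. The function names are illustrative only, and `queryNorm` is simply copied from the explain output because it depends on the full query, which is not shown here.

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # tf is the square root of the raw term frequency
    tf = math.sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm        # "queryWeight" in the explain tree
    field_weight = tf * idf * field_norm   # "fieldWeight" in the explain tree
    return query_weight * field_weight

# Values copied from the explain output for doc 4217
QUERY_NORM = 0.037621226
MAX_DOCS = 44218

score_intelligenz = classic_term_score(2.0, 410, MAX_DOCS, QUERY_NORM, 0.03125)
score_22 = classic_term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, 0.03125)

# coord(1/4): only one of the four top-level query clauses matched
total_score = (score_intelligenz + score_22) * (1 / 4)
print(round(total_score, 9))
```

Results whose tree contains an extra `coord(1/2)` or `coord(1/3)` line multiply the term score by that factor as well before the outer `coord(1/4)` is applied.
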
Strube, M.: Kreativ durch Analogien (2011) 0.01
```
0.011726861 = product of:
  0.046907444 = sum of:
    0.046907444 = product of:
      0.09381489 = sum of:
        0.09381489 = weight(_text_:intelligenz in 4805) [ClassicSimilarity], result of:
          0.09381489 = score(doc=4805,freq=2.0), product of:
            0.21362439 = queryWeight, product of:
              5.678294 = idf(docFreq=410, maxDocs=44218)
              0.037621226 = queryNorm
            0.43915814 = fieldWeight in 4805, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.678294 = idf(docFreq=410, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4805)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Content

"Computational linguistics combines elements of computer science and linguistics; it also uses methods from other fields such as mathematics, psychology, statistics and artificial intelligence. The appeal and the challenge of such an interdisciplinary science lie in recognizing and exploiting analogies between concepts from widely separated subfields. A prime example is one of the decisive breakthroughs that shaped computational linguistics. It concerns 'parsing': a computer program, more precisely a compiler, takes the user's input character by character, which in this case itself consists of the text of a computer program, and determines its structure. In principle, a person who hears and understands a spoken sentence does the same thing."
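
The parsing idea described in this excerpt can be illustrated with a minimal sketch: a recursive-descent parser that reads an arithmetic expression character by character and returns its structure as nested tuples. The grammar and all names here are assumptions chosen for illustration and are not taken from the article.

```python
class Parser:
    """Tiny recursive-descent parser for +, -, *, / and parentheses."""

    def __init__(self, text):
        self.text = text.replace(" ", "")
        self.pos = 0

    def peek(self):
        # Look at the next character without consuming it ("" at end of input)
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def take(self):
        # Consume and return the next character
        ch = self.peek()
        self.pos += 1
        return ch

    def expr(self):
        # expr := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.term())
        return node

    def term(self):
        # term := atom (("*" | "/") atom)*
        node = self.atom()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.atom())
        return node

    def atom(self):
        # atom := number | "(" expr ")"
        if self.peek() == "(":
            self.take()
            node = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        digits = ""
        while self.peek().isdigit():
            digits += self.take()
        if not digits:
            raise SyntaxError(f"unexpected character at position {self.pos}")
        return int(digits)


def parse(text):
    parser = Parser(text)
    tree = parser.expr()
    if parser.peek():
        raise SyntaxError("trailing input")
    return tree


print(parse("2*(3+4)-5"))
```

The nested tuples make the recovered structure explicit: the operator comes first, followed by its left and right operands.
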
¬Die Bibel als Stilkompass (2019) 0.01
```
0.008376329 = product of:
  0.033505317 = sum of:
    0.033505317 = product of:
      0.06701063 = sum of:
        0.06701063 = weight(_text_:intelligenz in 5331) [ClassicSimilarity], result of:
          0.06701063 = score(doc=5331,freq=2.0), product of:
            0.21362439 = queryWeight, product of:
              5.678294 = idf(docFreq=410, maxDocs=44218)
              0.037621226 = queryNorm
            0.31368437 = fieldWeight in 5331, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.678294 = idf(docFreq=410, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5331)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Content

"The Holy Scriptures exist not only in several hundred languages but often also in several variants within a single language area. British readers can choose, among others, between the deliberately very simply written Bible in Basic English and the linguistically complex King James Version from the 17th century. The versions differ in sentence length, word choice and formality, and so appeal to people from different cultures and with different levels of education. A team led by Keith Carlson of Dartmouth College now wants to use the 34 English-language versions of the Bible to teach computers different styles. So far, such programs do translate foreign languages, in some cases with impressive accuracy. But they often fail when asked to change a text's style in a targeted way, especially when more than a single feature, such as complexity, is involved. With its roughly 31,000 verses, the Bible is better suited than any other work for training translation programs, Carlson's team argues. After all, every version has been translated very conscientiously by humans and, moreover, numbered verse by verse. This makes alignment easier for a machine and is not necessarily the case for other extensive written sources such as the works of William Shakespeare or Wikipedia. As a first demonstration, the researchers trained two algorithms, one of which was based on neural networks, on eight Bible versions freely available on the internet. They then tested how well the two programs transferred verses of the source texts into a desired style without the software having access to the targeted version of the Bible. Overall, the automatic translators already came fairly close to the goal, the researchers report. However, they see their work only as a starting point in developing an artificial intelligence that can switch confidently between different language styles."
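
The point about verse numbering can be made concrete with a small sketch: when two versions share the same (book, chapter, verse) keys, building aligned training pairs reduces to a key lookup. The data layout and the placeholder strings below are assumptions for illustration; they are not real Bible text and not the researchers' actual pipeline.

```python
def align_verses(source_version, target_version):
    """Pair up verse texts that share the same (book, chapter, verse) key."""
    shared_refs = sorted(source_version.keys() & target_version.keys())
    return [(ref, source_version[ref], target_version[ref]) for ref in shared_refs]


# Placeholder strings stand in for the verse texts of two hypothetical versions
plain_version = {
    ("Gen", 1, 1): "plain-style wording of verse 1",
    ("Gen", 1, 2): "plain-style wording of verse 2",
    ("Gen", 1, 3): "plain-style wording of verse 3",
}
formal_version = {
    ("Gen", 1, 1): "formal-style wording of verse 1",
    ("Gen", 1, 2): "formal-style wording of verse 2",
}

pairs = align_verses(plain_version, formal_version)
for ref, source_text, target_text in pairs:
    print(ref, "|", source_text, "->", target_text)
```

Verses present in only one version are skipped, so every pair is a usable source/target example for a style-transfer model.
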

Kocijan, K.: Visualizing natural language resources (2015) 0.00

```
0.0044140695 = product of:
  0.017656278 = sum of:
    0.017656278 = product of:
      0.05296883 = sum of:
        0.05296883 = weight(_text_:k in 2995) [ClassicSimilarity], result of:
          0.05296883 = score(doc=2995,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.39440846 = fieldWeight in 2995, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.078125 = fieldNorm(doc=2995)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Lawrie, D.; Mayfield, J.; McNamee, P.; Oard, P.W.: Cross-language person-entity linking from 20 languages (2015) 0.00
```
0.0038228673 = product of:
  0.015291469 = sum of:
    0.015291469 = product of:
      0.030582938 = sum of:
        0.030582938 = weight(_text_:22 in 1848) [ClassicSimilarity], result of:
          0.030582938 = score(doc=1848,freq=2.0), product of:
            0.13174312 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037621226 = queryNorm
            0.23214069 = fieldWeight in 1848, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1848)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Abstract

The goal of entity linking is to associate references to an entity that is found in unstructured natural language content to an authoritative inventory of known entities. This article describes the construction of 6 test collections for cross-language person-entity linking that together span 22 languages. Fully automated components were used together with 2 crowdsourced validation stages to affordably generate ground-truth annotations with an accuracy comparable to that of a completely manual process. The resulting test collections each contain between 642 (Arabic) and 2,361 (Romanian) person references in non-English texts for which the correct resolution in English Wikipedia is known, plus a similar number of references for which no correct resolution into English Wikipedia is believed to exist. Fully automated cross-language person-name linking experiments with 20 non-English languages yielded a resolution accuracy of between 0.84 (Serbian) and 0.98 (Romanian), which compares favorably with previously reported cross-language entity linking results for Spanish.

Baierer, K.; Zumstein, P.: Verbesserung der OCR in digitalen Sammlungen von Bibliotheken (2016) 0.00

```
0.0035312555 = product of:
  0.014125022 = sum of:
    0.014125022 = product of:
      0.042375065 = sum of:
        0.042375065 = weight(_text_:k in 2818) [ClassicSimilarity], result of:
          0.042375065 = score(doc=2818,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.31552678 = fieldWeight in 2818, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0625 = fieldNorm(doc=2818)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Fóris, A.: Network theory and terminology (2013) 0.00

```
0.003185723 = product of:
  0.012742892 = sum of:
    0.012742892 = product of:
      0.025485784 = sum of:
        0.025485784 = weight(_text_:22 in 1365) [ClassicSimilarity], result of:
          0.025485784 = score(doc=1365,freq=2.0), product of:
            0.13174312 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.037621226 = queryNorm
            0.19345059 = fieldWeight in 1365, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1365)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Date

2. 9.2014 21:22:48

Babik, W.: Keywords as linguistic tools in information and knowledge organization (2017) 0.00
```
0.0030898487 = product of:
  0.012359395 = sum of:
    0.012359395 = product of:
      0.037078183 = sum of:
        0.037078183 = weight(_text_:k in 3510) [ClassicSimilarity], result of:
          0.037078183 = score(doc=3510,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.27608594 = fieldWeight in 3510, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3510)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Source

Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber

Al-Shawakfa, E.; Al-Badarneh, A.; Shatnawi, S.; Al-Rabab'ah, K.; Bani-Ismail, B.: ¬A comparison study of some Arabic root finding algorithms (2010) 0.00

```
0.0026484418 = product of:
  0.010593767 = sum of:
    0.010593767 = product of:
      0.0317813 = sum of:
        0.0317813 = weight(_text_:k in 3457) [ClassicSimilarity], result of:
          0.0317813 = score(doc=3457,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.23664509 = fieldWeight in 3457, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.046875 = fieldNorm(doc=3457)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Moohebat, M.; Raj, R.G.; Kareem, S.B.A.; Thorleuchter, D.: Identifying ISI-indexed articles by their lexical usage : a text analysis approach (2015) 0.00
```
0.0026484418 = product of:
  0.010593767 = sum of:
    0.010593767 = product of:
      0.0317813 = sum of:
        0.0317813 = weight(_text_:k in 1664) [ClassicSimilarity], result of:
          0.0317813 = score(doc=1664,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.23664509 = fieldWeight in 1664, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.046875 = fieldNorm(doc=1664)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract

This research creates an architecture for investigating the existence of probable lexical divergences between articles, categorized as Institute for Scientific Information (ISI) and non-ISI, and consequently, if such a difference is discovered, to propose the best available classification method. Based on a collection of ISI- and non-ISI-indexed articles in the areas of business and computer science, three classification models are trained. A sensitivity analysis is applied to demonstrate the impact of words in different syntactical forms on the classification decision. The results demonstrate that the lexical domains of ISI and non-ISI articles are distinguishable by machine learning techniques. Our findings indicate that the support vector machine identifies ISI-indexed articles in both disciplines with higher precision than do the Naïve Bayesian and K-Nearest Neighbors techniques.

Lu, K.; Cai, X.; Ajiferuke, I.; Wolfram, D.: Vocabulary size and its effect on topic representation (2017) 0.00

```
0.0026484418 = product of:
  0.010593767 = sum of:
    0.010593767 = product of:
      0.0317813 = sum of:
        0.0317813 = weight(_text_:k in 3414) [ClassicSimilarity], result of:
          0.0317813 = score(doc=3414,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.23664509 = fieldWeight in 3414, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.046875 = fieldNorm(doc=3414)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.00

```
0.0022070347 = product of:
  0.008828139 = sum of:
    0.008828139 = product of:
      0.026484415 = sum of:
        0.026484415 = weight(_text_:k in 2693) [ClassicSimilarity], result of:
          0.026484415 = score(doc=2693,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.19720423 = fieldWeight in 2693, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2693)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Savoy, J.: Text representation strategies : an example with the State of the union addresses (2016) 0.00
```
0.0022070347 = product of:
  0.008828139 = sum of:
    0.008828139 = product of:
      0.026484415 = sum of:
        0.026484415 = weight(_text_:k in 3042) [ClassicSimilarity], result of:
          0.026484415 = score(doc=3042,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.19720423 = fieldWeight in 3042, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3042)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract

Based on State of the Union addresses from 1790 to 2014 (225 speeches delivered by 42 presidents), this paper describes and evaluates different text representation strategies. To determine the most important words of a given text, the term frequencies (tf) or the tf-idf weighting scheme can be applied. Recently, latent Dirichlet allocation (LDA) has been proposed to define the topics included in a corpus. As another strategy, this study proposes to apply a vocabulary specificity measure (Z-score) to determine the most significantly overused word-types or short sequences of them. Our experiments show that the simple term frequency measure is not able to discriminate between specific terms associated with a document or a set of texts. Using the tf-idf or LDA approach, the selection requires some arbitrary decisions. Based on the term-specific measure (Z-score), the term selection has a clear theoretical basis. Moreover, the most significant sentences for each presidency can be determined. As another facet, we can visualize the dynamic evolution of usage of some terms associated with their specificity measures. Finally, this technique can be employed to define the most important lexical leaders introducing terms overused by the k following presidencies.
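
To give a feel for a specificity score of this kind, here is a minimal sketch of one common binomial formulation: the observed frequency of a term in one part of the corpus is compared with the frequency expected from the whole corpus. The abstract does not spell out the formula, so this formulation, the function name and the example numbers are all assumptions.

```python
import math

def z_score(tf_part, n_part, tf_corpus, n_corpus):
    """Standardized over- or under-use of a term in one part of a corpus."""
    p = tf_corpus / n_corpus               # estimated probability of the term
    expected = n_part * p                  # expected occurrences in the part
    std_dev = math.sqrt(n_part * p * (1 - p))
    return (tf_part - expected) / std_dev

# Hypothetical counts: a term occurs 30 times in a 1,000-token part,
# and 100 times in the whole 10,000-token corpus.
z = z_score(tf_part=30, n_part=1000, tf_corpus=100, n_corpus=10000)
print(round(z, 4))
```

A large positive value marks a term as overused in that part, and a value near zero means the term is used about as often as in the corpus as a whole.
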
Lian, T.; Yu, C.; Wang, W.; Yuan, Q.; Hou, Z.: Doctoral dissertations on tourism in China : a co-word analysis (2016) 0.00
```
0.0022070347 = product of:
  0.008828139 = sum of:
    0.008828139 = product of:
      0.026484415 = sum of:
        0.026484415 = weight(_text_:k in 3178) [ClassicSimilarity], result of:
          0.026484415 = score(doc=3178,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.19720423 = fieldWeight in 3178, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3178)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Abstract

The aim of this paper is to map the foci of research in doctoral dissertations on tourism in China. In the paper, co-word analysis is applied, with keywords coming from six public dissertation databases, i.e. CDFD, Wanfang Data, NLC, CALIS, ISTIC, and NSTL, as well as some university libraries providing doctoral dissertations on tourism. Altogether we have examined 928 doctoral dissertations on tourism written between 1989 and 2013. Doctoral dissertations on tourism in China involve 36 first-level disciplines and 102 second-level disciplines. We collect the top 68 keywords of practical significance in tourism that are mentioned at least four times. These keywords are classified into 12 categories based on co-word analysis, including cluster analysis, strategic diagram analysis, and social network analysis. According to the strategic diagram of the 12 categories, we identify the mature and immature areas in tourism study. From social networks, we can see the social network maps of the original co-occurrence matrix and the k-cores analysis of the binary matrix. The paper provides valuable insight into the study of tourism by analyzing doctoral dissertations on tourism in China.
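
The co-occurrence matrix at the heart of co-word analysis can be sketched in a few lines: for every dissertation, each unordered pair of its keywords is counted once. The keyword lists below are invented for illustration and do not come from the study.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(keyword_lists):
    """Count how often each unordered keyword pair appears in the same document."""
    counts = Counter()
    for keywords in keyword_lists:
        # set() drops duplicates, sorted() gives each pair a canonical order
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts

# Hypothetical keyword lists, one per dissertation
dissertations = [
    ["ecotourism", "rural tourism", "sustainability"],
    ["ecotourism", "sustainability"],
    ["rural tourism", "tourism planning"],
]

counts = cooccurrence_counts(dissertations)
for pair, count in counts.most_common():
    print(pair, count)
```

Clustering or network analysis would then run on these pair counts.
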

Lhadj, L.S.; Boughanem, M.; Amrouche, K.: Enhancing information retrieval through concept-based language modeling and semantic smoothing (2016) 0.00

```
0.0022070347 = product of:
  0.008828139 = sum of:
    0.008828139 = product of:
      0.026484415 = sum of:
        0.026484415 = weight(_text_:k in 3221) [ClassicSimilarity], result of:
          0.026484415 = score(doc=3221,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.19720423 = fieldWeight in 3221, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3221)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.00

```
0.0022070347 = product of:
  0.008828139 = sum of:
    0.008828139 = product of:
      0.026484415 = sum of:
        0.026484415 = weight(_text_:k in 3223) [ClassicSimilarity], result of:
          0.026484415 = score(doc=3223,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.19720423 = fieldWeight in 3223, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3223)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

K., Vani; Gupta, D.: Unmasking text plagiarism using syntactic-semantic based natural language processing techniques : comparisons, analysis and challenges (2018) 0.00

```
0.0022070347 = product of:
  0.008828139 = sum of:
    0.008828139 = product of:
      0.026484415 = sum of:
        0.026484415 = weight(_text_:k in 5084) [ClassicSimilarity], result of:
          0.026484415 = score(doc=5084,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.19720423 = fieldWeight in 5084, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5084)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

RWI/PH: Auf der Suche nach dem entscheidenden Wort : die Häufung bestimmter Wörter innerhalb eines Textes macht diese zu Schlüsselwörtern (2012) 0.00
```
0.0018727309 = product of:
  0.0074909236 = sum of:
    0.0074909236 = product of:
      0.02247277 = sum of:
        0.02247277 = weight(_text_:k in 331) [ClassicSimilarity], result of:
          0.02247277 = score(doc=331,freq=4.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.16733333 = fieldWeight in 331, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.0234375 = fieldNorm(doc=331)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
Content

"The Dresden scientists studied the semantic properties of texts mathematically by encoding ten different English texts in different forms. These include the English edition of Leo Tolstoy's 'War and Peace'. For example, the researchers translated the letters within a text into a binary sequence, replacing all vowels with a one and all consonants with a zero. With the help of further mathematical functions, the scientists examined different levels of the text, that is, individual vowels, letters and whole words, each encoded in different forms. In this way, recurring patterns can be found throughout the entire text. This relationship within the text is called long-range correlation. It indicates whether two letters at arbitrarily distant points in the text are related to each other: for example, if we find the letter 'K' at one point, there is a measurably higher probability of finding the letter 'K' again a few pages later. 'It is to be expected that if a book deals with war at one point, there is a high probability of finding the word war a few pages later as well. What is surprising is that we also find this high probability at the level of letters,' says Altmann."
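
The vowel/consonant encoding described in this excerpt is easy to sketch. The snippet below maps letters to a binary sequence and then computes a plain autocorrelation at a given lag as a rough stand-in for a correlation measure. The autocorrelation estimator, the function names and the treatment of only a, e, i, o, u as vowels are assumptions; the excerpt does not say which mathematical functions the researchers used.

```python
VOWELS = set("aeiou")  # assumption: 'y' and accented letters count as consonants

def encode_vowels(text):
    """Map every letter to 1 (vowel) or 0 (consonant); skip non-letters."""
    return [1 if ch.lower() in VOWELS else 0 for ch in text if ch.isalpha()]

def autocorrelation(seq, lag):
    """Normalized autocorrelation of a sequence at the given lag."""
    n = len(seq)
    mean = sum(seq) / n
    variance = sum((x - mean) ** 2 for x in seq) / n
    covariance = sum(
        (seq[i] - mean) * (seq[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return covariance / variance

sequence = encode_vowels("Krieg und Frieden")
print(sequence)
print(autocorrelation(sequence, 1))
```

Applied to a whole book, values that stay measurably above zero at large lags would indicate the kind of long-range correlation the excerpt describes.
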

Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.00

```
0.0017656278 = product of:
  0.007062511 = sum of:
    0.007062511 = product of:
      0.021187533 = sum of:
        0.021187533 = weight(_text_:k in 3948) [ClassicSimilarity], result of:
          0.021187533 = score(doc=3948,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.15776339 = fieldWeight in 3948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.03125 = fieldNorm(doc=3948)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```

Kajanan, S.; Bao, Y.; Datta, A.; VanderMeer, D.; Dutta, K.: Efficient automatic search query formulation using phrase-level analysis (2014) 0.00

```
0.0017656278 = product of:
  0.007062511 = sum of:
    0.007062511 = product of:
      0.021187533 = sum of:
        0.021187533 = weight(_text_:k in 1264) [ClassicSimilarity], result of:
          0.021187533 = score(doc=1264,freq=2.0), product of:
            0.13429943 = queryWeight, product of:
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.037621226 = queryNorm
            0.15776339 = fieldWeight in 1264, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.569778 = idf(docFreq=3384, maxDocs=44218)
              0.03125 = fieldNorm(doc=1264)
      0.33333334 = coord(1/3)
  0.25 = coord(1/4)
```
