Search (52 results, page 1 of 3)

Lezius, W.; Rapp, R.; Wettler, M.: A morphology-system and part-of-speech tagger for German (1996) 0.02

0.017704215 = product of:
  0.044260535 = sum of:
    0.015961302 = weight(_text_:of in 1693) [ClassicSimilarity], result of:
      0.015961302 = score(doc=1693,freq=4.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.24433708 = fieldWeight in 1693, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=1693)
    0.028299233 = product of:
      0.056598466 = sum of:
        0.056598466 = weight(_text_:22 in 1693) [ClassicSimilarity], result of:
          0.056598466 = score(doc=1693,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.38690117 = fieldWeight in 1693, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1693)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Date: 22. 3.2015 9:37:18
Source: Natural language processing and speech technology: Results of the 3rd KONVENS Conference, Bielefeld, October 1996. Ed.: D. Gibbon
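The score breakdown above follows the TF-IDF arithmetic of Lucene's ClassicSimilarity. As a rough sketch, the 0.017704215 reported for doc 1693 can be recomputed from the printed inputs. The function names are illustrative and not part of the Lucene API, and the idf formula is the one that reproduces the idf values printed in this listing. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

MAX_DOCS = 44218         # maxDocs, as printed in the explain output
QUERY_NORM = 0.04177434  # queryNorm, as printed in the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """score = queryWeight * fieldWeight for a single query term."""
    tf = math.sqrt(freq)                    # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm    # idf * queryNorm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Doc 1693: term "of" (freq=4) plus term "22" (freq=2), fieldNorm 0.078125.
of_score = term_score(freq=4.0, doc_freq=25162, field_norm=0.078125)
score_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.078125) * 0.5  # coord(1/2)

# Outer coord(2/5): two of the five query clauses matched.
total = (of_score + score_22) * 0.4
print(f"{total:.9f}")
```

The same function reproduces the other entries once freq, docFreq, fieldNorm and the coord factors are swapped in from their explain trees.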

Ruge, G.: Sprache und Computer : Wortbedeutung und Termassoziation. Methoden zur automatischen semantischen Klassifikation (1995) 0.01

0.014163372 = product of:
  0.03540843 = sum of:
    0.0127690425 = weight(_text_:of in 1534) [ClassicSimilarity], result of:
      0.0127690425 = score(doc=1534,freq=4.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.19546966 = fieldWeight in 1534, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=1534)
    0.022639386 = product of:
      0.045278773 = sum of:
        0.045278773 = weight(_text_:22 in 1534) [ClassicSimilarity], result of:
          0.045278773 = score(doc=1534,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.30952093 = fieldWeight in 1534, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1534)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Content: Contains the following chapters: (1) Motivation; (2) Language philosophical foundations; (3) Structural comparison of extensions; (4) Earlier approaches towards term association; (5) Experiments; (6) Spreading-activation networks or memory models; (7) Perspective. Appendices: Heads and modifiers of 'car'. Glossary. Index. Language and computer. Word semantics and term association. Methods towards an automatic semantic classification
Footnote: Reviewed in: Knowledge organization 22(1995) no.3/4, pp.182-184 (M.T. Rolland)

Wenzel, F.: Semantische Eingrenzung im Freitext-Retrieval auf der Basis morphologischer Segmentierungen (1980) 0.01

0.013036557 = product of:
  0.03259139 = sum of:
    0.016630089 = product of:
      0.08315045 = sum of:
        0.08315045 = weight(_text_:problem in 2037) [ClassicSimilarity], result of:
          0.08315045 = score(doc=2037,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.46895373 = fieldWeight in 2037, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.078125 = fieldNorm(doc=2037)
      0.2 = coord(1/5)
    0.015961302 = weight(_text_:of in 2037) [ClassicSimilarity], result of:
      0.015961302 = score(doc=2037,freq=4.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.24433708 = fieldWeight in 2037, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=2037)
  0.4 = coord(2/5)

Abstract: The basic problem in freetext retrieval is that the retrieval language is not properly adapted to that of the author. Morphological segmentation, where words with the same root are grouped together in the inverted file, is a good eliminator of noise and information loss, providing high recall but low precision
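The grouping described in this abstract can be illustrated with a small sketch. It assumes a toy suffix-stripping function (toy_root, a hypothetical helper) in place of Wenzel's actual morphological segmentation, which the abstract does not specify, and the sample documents are invented for illustration.

```python
from collections import defaultdict

# Toy suffix list; a stand-in for real morphological segmentation (assumption).
SUFFIXES = ("ing", "ed", "es", "s")


def toy_root(word):
    """Strip one known suffix if at least three characters remain."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def build_index(docs):
    """Inverted file keyed by root, so all forms of a word share one posting set."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[toy_root(token)].add(doc_id)
    return index


docs = {
    1: "retrieval of segmented words",
    2: "segmenting a word",
    3: "word segments",
}
index = build_index(docs)
print(sorted(index["segment"]))
```

Because every inflected form collapses onto one key, a query on the root retrieves every document that contains any form of the word. This is the high-recall side of the trade-off the abstract mentions; precision suffers when unrelated words happen to share a root.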

Kurz, C.: Womit sich Strafverfolger bald befassen müssen : ChatGPT (2023) 0.01

0.00893326 = product of:
  0.022333149 = sum of:
    0.0133040715 = product of:
      0.066520356 = sum of:
        0.066520356 = weight(_text_:problem in 203) [ClassicSimilarity], result of:
          0.066520356 = score(doc=203,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.375163 = fieldWeight in 203, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0625 = fieldNorm(doc=203)
      0.2 = coord(1/5)
    0.009029076 = weight(_text_:of in 203) [ClassicSimilarity], result of:
      0.009029076 = score(doc=203,freq=2.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.13821793 = fieldWeight in 203, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=203)
  0.4 = coord(2/5)

Abstract: A Europol report examines the consequences of ChatGPT when criminals exploit the chatbot's capabilities: it warns of an increase in phishing and even more disinformation. Automatically generated malicious source code could also become a problem for law enforcement.
Content: Cf. the Europol report "ChatGPT: The impact of Large Language Models on Law Enforcement" at: https://www.europol.europa.eu/cms/sites/default/files/documents/Tech%20Watch%20Flash%20-%20The%20Impact%20of%20Large%20Language%20Models%20on%20Law%20Enforcement.pdf.

Der Student aus dem Computer (2023) 0.01

0.007923785 = product of:
  0.039618924 = sum of:
    0.039618924 = product of:
      0.07923785 = sum of:
        0.07923785 = weight(_text_:22 in 1079) [ClassicSimilarity], result of:
          0.07923785 = score(doc=1079,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.5416616 = fieldWeight in 1079, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=1079)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 27. 1.2023 16:22:55

Sienel, J.; Weiss, M.; Laube, M.: Sprachtechnologien für die Informationsgesellschaft des 21. Jahrhunderts (2000) 0.01

0.007917116 = product of:
  0.01979279 = sum of:
    0.005643173 = weight(_text_:of in 5557) [ClassicSimilarity], result of:
      0.005643173 = score(doc=5557,freq=2.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.086386204 = fieldWeight in 5557, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5557)
    0.0141496165 = product of:
      0.028299233 = sum of:
        0.028299233 = weight(_text_:22 in 5557) [ClassicSimilarity], result of:
          0.028299233 = score(doc=5557,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.19345059 = fieldWeight in 5557, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5557)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Date: 26.12.2000 13:22:17
Source: Sprachtechnologie für eine dynamische Wirtschaft im Medienzeitalter - Language technologies for dynamic business in the age of the media - L'ingénierie linguistique au service de la dynamisation économique à l'ère du multimédia: Tagungsakten der XXVI. Jahrestagung der Internationalen Vereinigung Sprache und Wirtschaft e.V., 23.-25.11.2000, Fachhochschule Köln. Ed.: K.-D. Schmitz

Pinker, S.: Wörter und Regeln : Die Natur der Sprache (2000) 0.01
```
0.007917116 = product of:
  0.01979279 = sum of:
    0.005643173 = weight(_text_:of in 734) [ClassicSimilarity], result of:
      0.005643173 = score(doc=734,freq=2.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.086386204 = fieldWeight in 734, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=734)
    0.0141496165 = product of:
      0.028299233 = sum of:
        0.028299233 = weight(_text_:22 in 734) [ClassicSimilarity], result of:
          0.028299233 = score(doc=734,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.19345059 = fieldWeight in 734, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=734)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Abstract: How do children learn to speak? What do their errors during language acquisition in particular reveal about the course of the learning process, true to the motto "Kids say the darnedest things"? And how do computers help, or why have they failed so far, in simulating the neural networks involved in the intricate fabric of human language? In his new book Wörter und Regeln, the well-known American cognitive scientist Steven Pinker (Der Sprachinstinkt) has once again undertaken a tour of the realm of language that is as informative as it is entertaining. What makes it especially engaging and worth reading: the professor at the Massachusetts Institute of Technology confidently illuminates aspects from both the natural sciences and the humanities. On the one hand he conveys linguistic foundations in the footsteps of Ferdinand de Saussure, such as those of a generative grammar, offers an excursion through the history of language, and devotes a separate chapter to the "horrors of the German language". On the other hand he does not leave out the latest imaging techniques, which show what happens in the brain during language processing. Pinker's theory, which emerges from this puzzle of widely varied aspects: language consists at its core of two components, a mental lexicon of memorized words and a mental grammar of various combinatorial rules. Concretely, this means that we memorize familiar items and their graded, intersecting features, but we also generate new mental products by applying rules. It is precisely from this, Pinker concludes, that the richness and the immense expressive power of our language derive.
Date: 19. 7.2002 14:22:31

Monnerjahn, P.: Vorsprung ohne Technik : Übersetzen: Computer und Qualität (2000) 0.01

0.006791815 = product of:
  0.033959076 = sum of:
    0.033959076 = product of:
      0.06791815 = sum of:
        0.06791815 = weight(_text_:22 in 5429) [ClassicSimilarity], result of:
          0.06791815 = score(doc=5429,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.46428138 = fieldWeight in 5429, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5429)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Source: c't. 2000, no.22, pp.230-231

Melzer, C.: Der Maschine anpassen : PC-Spracherkennung - Programme sind mittlerweile alltagsreif (2005) 0.01
```
0.006290105 = product of:
  0.015725262 = sum of:
    0.005820531 = product of:
      0.029102655 = sum of:
        0.029102655 = weight(_text_:problem in 4044) [ClassicSimilarity], result of:
          0.029102655 = score(doc=4044,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.1641338 = fieldWeight in 4044, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4044)
      0.2 = coord(1/5)
    0.009904731 = product of:
      0.019809462 = sum of:
        0.019809462 = weight(_text_:22 in 4044) [ClassicSimilarity], result of:
          0.019809462 = score(doc=4044,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.1354154 = fieldWeight in 4044, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4044)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
Content: A cheaper option is "Via Voice Standard" from IBM. The software costs about 50 euros but has considerable weaknesses in its ability to learn. It still performs better, however, than "Voice Office Premium 10", which costs a good three times as much and was the only one of the six programs tested to receive a mere "satisfactory". "You don't read so much about speech recognition any more because it works", believes Dorothee Wiegand of the computer magazine "c't", published in Hannover. The technology, for example "Dragon Naturally Speaking" from ScanSoft, is mature, she says: "Speech recognition is above all statistics, the evaluation of endless word possibilities. The real problem was rather the hardware", says Wiegand. Since even simple home computers are now fast and powerful, developers have many more options, she adds. But even older computers cope with the systems; they just take a little longer. "Every byte makes speech recognition a little faster, but otherwise it is no less accurate", confirms Kristina Henry of linguatec in Munich. Even for the manufacturer's products, however, "practising and speaking clearly matter more than any hardware". Even voices from dictation machines are recognized clearly, Henry affirms: "We want to go a step further and make dictation on the move possible." The user could then dial a number, speak a text while in the car, for instance, and find it "typed" at home. In principle, speech recognition software now also suits the private computer. It is clear, however, that even the best-spoken text has to be post-edited. Patience is also required of the user: just as the system learns, the person must adapt to the system in pronunciation and speed. Then, however, the results are considerable, and "Sexterminvereinbarung" ("sex appointment") instead of "zwecks Terminvereinbarung" ("in order to arrange an appointment") is a thing of the past.
Date: 3. 5.1997 8:44:22

Kuhlmann, U.; Monnerjahn, P.: Sprache auf Knopfdruck : Sieben automatische Übersetzungsprogramme im Test (2000) 0.01

0.0056598466 = product of:
  0.028299233 = sum of:
    0.028299233 = product of:
      0.056598466 = sum of:
        0.056598466 = weight(_text_:22 in 5428) [ClassicSimilarity], result of:
          0.056598466 = score(doc=5428,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.38690117 = fieldWeight in 5428, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=5428)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Source: c't. 2000, no.22, pp.220-229

Manhart, K.: Digitales Kauderwelsch : Online-Übersetzungsdienste (2004) 0.01
```
0.005583287 = product of:
  0.013958218 = sum of:
    0.008315044 = product of:
      0.041575223 = sum of:
        0.041575223 = weight(_text_:problem in 2077) [ClassicSimilarity], result of:
          0.041575223 = score(doc=2077,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.23447686 = fieldWeight in 2077, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2077)
      0.2 = coord(1/5)
    0.005643173 = weight(_text_:of in 2077) [ClassicSimilarity], result of:
      0.005643173 = score(doc=2077,freq=2.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.086386204 = fieldWeight in 2077, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2077)
  0.4 = coord(2/5)
```
Abstract: Quickly translating an English or French website into German: nothing could be easier. Online translation services promise language transfer at the click of a mouse and free of charge. But how good are they really? Online translation services aim to remove the language barrier on the WWW. The automatic translators promise to make e-mail correspondence intelligible and to let users browse foreign-language websites in German. English, Spanish or even Chinese e-mails and websites can thus be rendered quickly into one's own language at the click of a mouse. Even complicated English operating instructions or Russian news are supposedly no problem for the services. And some homepage owners dream of using the digital translation aids to put their German website online in perfect English, in the hope of international contacts and higher visitor numbers. That sounds good, but the reality looks different. Anyone who has ever consulted such a service usually rubs their eyes in astonishment at the results on offer. Even simple sentences cause problems for many online translators and provide unintentional humour. The CNN headline "Iraq blast injures 31 U.S. troops" becomes the German sentence: "Der Irak Knall verletzt 31 Vereinigte Staaten Truppen." Sites with difficult sentence structure can often be rendered only unintelligibly. The sentence "The Slider is equipped with a brilliant color screen and sports an innovative design that slides open with a push of your thumb" is translated by the best-known online interpreter, Babelfish, into the following gibberish: "Der Schweber wird mit einem leuchtenden Farbe Schirm ausgerüstet und ein erfinderisches Design sports, das geöffnetes mit einem Stoß Ihres Daumens schiebt." All translators inflict such Dadaist texts on their users.

Renker, L.: Exploration von Textkorpora : Topic Models als Grundlage der Interaktion (2015) 0.01
```
0.005583287 = product of:
  0.013958218 = sum of:
    0.008315044 = product of:
      0.041575223 = sum of:
        0.041575223 = weight(_text_:problem in 2380) [ClassicSimilarity], result of:
          0.041575223 = score(doc=2380,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.23447686 = fieldWeight in 2380, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2380)
      0.2 = coord(1/5)
    0.005643173 = weight(_text_:of in 2380) [ClassicSimilarity], result of:
      0.005643173 = score(doc=2380,freq=2.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.086386204 = fieldWeight in 2380, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2380)
  0.4 = coord(2/5)
```
Abstract: The Internet holds a virtually endless amount of information. A central problem today is making it accessible. Fundamental domain knowledge is required to formulate the right queries in a full-text search. Often, however, that knowledge is lacking, so a great deal of time must be spent gaining an overview of the topic at hand. In such situations a user is engaged in an exploratory search process, working step by step towards a topic. Machine learning methods are now used as a matter of course to organize data. In most cases, however, they remain invisible to the user. Using them interactively in exploratory search processes could link human judgement more closely with the machine processing of large amounts of data. Topic models are such methods. They find hidden topics in a text corpus that humans can interpret relatively well, which makes them promising for use in exploratory search processes. Users can thereby be supported in understanding unfamiliar sources. A review of the relevant research showed that topic models are used predominantly to produce static visualizations. Sensemaking is an essential component of exploratory search, yet it is used only to a very limited extent to motivate algorithmic innovations and place them in a broader context. This leads to the conjecture that using models of sensemaking and designing exploratory searches in a user-centred way can yield new functions for interacting with topic models and provide a context for the corresponding research.
Footnote: Master's thesis for the degree of Master of Science (M.Sc.), submitted to the Fachhochschule Köln / Fakultät für Informatik und Ingenieurswissenschaften in the Medieninformatik degree programme.

Rötzer, F.: Computer ergooglen die Bedeutung von Worten (2005) 0.01
```
0.0050240555 = product of:
  0.0125601385 = sum of:
    0.004989027 = product of:
      0.024945134 = sum of:
        0.024945134 = weight(_text_:problem in 3385) [ClassicSimilarity], result of:
          0.024945134 = score(doc=3385,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.14068612 = fieldWeight in 3385, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0234375 = fieldNorm(doc=3385)
      0.2 = coord(1/5)
    0.007571111 = weight(_text_:of in 3385) [ClassicSimilarity], result of:
      0.007571111 = score(doc=3385,freq=10.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.11589926 = fieldWeight in 3385, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0234375 = fieldNorm(doc=3385)
  0.4 = coord(2/5)
```
Content: How could computers learn language and also understand the meaning of words and the relationships between them? This problem of semantics is an enormous task that has so far been mastered only in part, since words and word combinations often have several or even many meanings, which moreover depend on the extra-linguistic context. The two Dutch researchers (cf. "Ein künstliches Bewusstsein aus einfachen Aussagen" (1)) Paul Vitanyi (2) and Rudi Cilibrasi of the National Institute for Mathematics and Computer Science (3) in Amsterdam propose an elegant solution: Google is simply used to look things up on the Internet, the largest database there is. Objects such as a mouse can be designated by their name, "mouse"; the meaning of general terms has to be learned from their context. A semantic web for representing knowledge consists of the possible connections that objects and their names can enter into. Of course, new names, and also new meanings and hence new links, can be created in reality. Language is alive and flexible. To teach an artificial intelligence all word meanings, a huge database of the possible semantic networks would have to be built with the help of human experts or many contributors, and then kept constantly up to date. That need not be necessary, however, because the Web is not only the largest semantic database, and largely free to use, it is also updated constantly by countless Internet users. In addition, there are search engines such as Google which, in practice, quantify the connections between words, and thus their context of meaning, as a probability by reporting the web pages on which they were found.
Using a method previously developed by Paul Vitanyi and others that measures the relatedness of objects (normalized information distance, NID), the closeness between particular objects (images, words, patterns, intervals, genomes, programs, etc.) can be analysed on the basis of all their properties and determined from the dominant shared property. In a similar way, the commonly used, not necessarily "true" meanings of names can be inferred with Google searches. 'At this moment one database stands out as the pinnacle of computer-accessible human knowledge and the most inclusive summary of statistical information: the Google search engine. There can be no doubt that Google has already enabled science to accelerate tremendously and revolutionized the research process. It has dominated the attention of internet users for years, and has recently attracted substantial attention of many Wall Street investors, even reshaping their ideas of company financing.' (Paul Vitanyi and Rudi Cilibrasi) If you enter a word such as "Pferd" (horse), Google returns 4,310,000 indexed pages. For "Reiter" (rider) there are 3,400,000 pages. Combining the two terms still yields 315,000 pages. For the co-occurrence of, say, "Pferd" and "Bart" (beard), an astonishing 67,100 pages are still listed, but it is already apparent that "Pferd" and "Reiter" are more closely related. This gives a certain probability for the co-occurrence of terms. From this frequency, taken relative to the maximum number of indexed pages (5,000,000,000), the two researchers have developed a statistical measure which they call the "normalised Google distance" (NGD) and which normally lies between 0 and 1. The smaller the NGD, the more closely two terms are related. "This is automatic meaning generation", Vitanyi told New Scientist (4).
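The page counts quoted above are enough to work through the NGD for one pair of terms. The article does not spell out the formula; the sketch below uses the definition published by Cilibrasi and Vitanyi, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), and the function name is illustrative. The article gives no page count for "Bart" on its own, so only the "Pferd"/"Reiter" pair can be computed.

```python
import math


def ngd(f_x, f_y, f_xy, n_pages):
    """Normalized Google distance from page counts.

    f_x, f_y -- pages containing each term alone
    f_xy     -- pages containing both terms
    n_pages  -- total number of indexed pages
    """
    log_x, log_y, log_xy = math.log(f_x), math.log(f_y), math.log(f_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(n_pages) - min(log_x, log_y))


# Counts as quoted in the article: "Pferd", "Reiter", both together, index size.
pferd_reiter = ngd(4_310_000, 3_400_000, 315_000, 5_000_000_000)
print(f"NGD(Pferd, Reiter) = {pferd_reiter:.3f}")
```

With these counts the value comes out at roughly 0.36, inside the 0 to 1 range the article mentions.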
"Das könnte gut eine Möglichkeit darstellen, einen Computer Dinge verstehen und halbintelligent handeln zu lassen." Werden solche Suchen immer wieder durchgeführt, lässt sich eine Karte für die Verbindungen von Worten erstellen. Und aus dieser Karte wiederum kann ein Computer, so die Hoffnung, auch die Bedeutung der einzelnen Worte in unterschiedlichen natürlichen Sprachen und Kontexten erfassen. So habe man über einige Suchen realisiert, dass ein Computer zwischen Farben und Zahlen unterscheiden, holländische Maler aus dem 17. Jahrhundert und Notfälle sowie Fast-Notfälle auseinander halten oder elektrische oder religiöse Begriffe verstehen könne. Überdies habe eine einfache automatische Übersetzung Englisch-Spanisch bewerkstelligt werden können. Auf diese Weise ließe sich auch, so hoffen die Wissenschaftler, die Bedeutung von Worten erlernen, könne man Spracherkennung verbessern oder ein semantisches Web erstellen und natürlich endlich eine bessere automatische Übersetzung von einer Sprache in die andere realisieren.
Göpferich, S.: Von der Terminographie zur Textographie : computergestützte Verwaltung textsortenspezifischer Textversatzstücke (1995) 0.00
```
0.0047777384 = product of:
  0.023888692 = sum of:
    0.023888692 = weight(_text_:of in 4567) [ClassicSimilarity], result of:
      0.023888692 = score(doc=4567,freq=14.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.36569026 = fieldWeight in 4567, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=4567)
  0.2 = coord(1/5)
```
Abstract: The paper presents 2 different types of computer-based retrieval systems for text-type specific information ranging from phrases to whole standardized passages. The first part describes the structure of a full-text database for text prototypes, the second part, ways of storing text-type specific phrases and passages in a combined terminological and textographic database. The program used to illustrate this second kind of retrieval system is the terminology system CATS, which the Terminology Centre at the Faculty of Applied Linguistics and Cultural Studies of the University of Mainz in Germersheim uses for its FASTERM database

Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.00

0.0045278776 = product of:
  0.022639386 = sum of:
    0.022639386 = product of:
      0.045278773 = sum of:
        0.045278773 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
          0.045278773 = score(doc=1490,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.30952093 = fieldWeight in 1490, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1490)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 22. 3.2015 9:30:24
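
The nested `coord` factors in this entry explain why several results share the identical score 0.0045278776: the same term score is scaled first by 1/2 and then by 1/5. A minimal sketch of that arithmetic follows; `apply_coord` is a hypothetical helper written for illustration, not a Lucene function, and the term score is copied from the listing.

```python
def apply_coord(matched_scores, total_clauses):
    # Sum the matching clause scores, then scale by matched/total clauses.
    return sum(matched_scores) * (len(matched_scores) / total_clauses)

# weight(_text_:22 ...) as shown in the explain output for doc 1490.
term_score = 0.045278773

# Inner boolean clause: 1 of 2 sub-clauses matched -> coord(1/2).
inner = apply_coord([term_score], 2)
# Outer query: 1 of 5 top-level clauses matched -> coord(1/5).
outer = apply_coord([inner], 5)
print(f"inner: {inner:.9f}, outer: {outer:.10f}")
```

Any document with the same term frequency and the same `fieldNorm` for this term ends up with the same term score and therefore the same final score.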

Bager, J.: ¬Die Text-KI ChatGPT schreibt Fachtexte, Prosa, Gedichte und Programmcode (2023) 0.00

0.0045278776 = product of:
  0.022639386 = sum of:
    0.022639386 = product of:
      0.045278773 = sum of:
        0.045278773 = weight(_text_:22 in 835) [ClassicSimilarity], result of:
          0.045278773 = score(doc=835,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.30952093 = fieldWeight in 835, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=835)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 29.12.2022 18:22:55

Rieger, F.: Lügende Computer (2023) 0.00

0.0045278776 = product of:
  0.022639386 = sum of:
    0.022639386 = product of:
      0.045278773 = sum of:
        0.045278773 = weight(_text_:22 in 912) [ClassicSimilarity], result of:
          0.045278773 = score(doc=912,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.30952093 = fieldWeight in 912, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=912)
      0.5 = coord(1/2)
  0.2 = coord(1/5)

Date: 16. 3.2023 19:22:55

Altmann, E.G.; Cristadoro, G.; Esposti, M.D.: On the origin of long-range correlations in texts (2012) 0.00
```
0.004282867 = product of:
  0.021414334 = sum of:
    0.021414334 = weight(_text_:of in 330) [ClassicSimilarity], result of:
      0.021414334 = score(doc=330,freq=20.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.32781258 = fieldWeight in 330, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=330)
  0.2 = coord(1/5)
```
Abstract

The complexity of human interactions with social and natural phenomena is mirrored in the way we describe our experiences through natural language. To retain and convey such high-dimensional information, the statistical properties of our linguistic output have to be highly correlated in time. An example is the robust observation, still largely not understood, of correlations on arbitrarily long scales in literary texts. In this paper we explain how long-range correlations flow from highly structured linguistic levels down to the building blocks of a text (words, letters, etc.). By combining calculations and data analysis we show that correlations take the form of a bursty sequence of events once we approach the semantically relevant topics of the text. The mechanisms we identify are fairly general and can be equally applied to other hierarchical settings.

Source

Proceedings of the National Academy of Sciences, 2 July 2012. DOI: 10.1073/pnas.1117723109

Witschel, H.F.: Global and local resources for peer-to-peer text retrieval (2008) 0.00
```
0.00410519 = product of:
  0.02052595 = sum of:
    0.02052595 = weight(_text_:of in 127) [ClassicSimilarity], result of:
      0.02052595 = score(doc=127,freq=54.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.3142131 = fieldWeight in 127, product of:
          7.3484693 = tf(freq=54.0), with freq of:
            54.0 = termFreq=54.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=127)
  0.2 = coord(1/5)
```
Abstract

This thesis is organised as follows: Chapter 2 gives a general introduction to the field of information retrieval, covering its most important aspects. Further, the tasks of distributed and peer-to-peer information retrieval (P2PIR) are introduced, motivating their application and characterising the special challenges that they involve, including a review of existing architectures and search protocols in P2PIR. Finally, chapter 2 presents approaches to evaluating the effectiveness of both traditional and peer-to-peer IR systems. Chapter 3 contains a detailed account of state-of-the-art information retrieval models and algorithms. This encompasses models for matching queries against document representations, term weighting algorithms, approaches to feedback and associative retrieval as well as distributed retrieval. It thus defines important terminology for the following chapters. The notion of "multi-level association graphs" (MLAGs) is introduced in chapter 4. An MLAG is a simple, graph-based framework that makes it possible to model most of the theoretical and practical approaches to IR presented in chapter 3. Moreover, it provides an easy-to-grasp way of defining and including new entities into IR modeling, such as paragraphs or peers, dividing them conceptually while at the same time connecting them to each other in a meaningful way. This allows for a unified view on many IR tasks, including that of distributed and peer-to-peer search. Starting from related work and a formal definition of the framework, the possibilities of modeling that it provides are discussed in detail, followed by an experimental section that shows how new insights gained from modeling inside the framework can lead to novel combinations of principles and eventually to improved retrieval effectiveness.
Chapter 5 empirically tackles the first of the two research questions formulated above, namely the question of global collection statistics. More precisely, it studies possibilities of radically simplified results merging. The simplification comes from the attempt - without having knowledge of the complete collection - to equip all peers with the same global statistics, making document scores comparable across peers. What is examined is the question of how we can obtain such global statistics and to what extent their use will lead to a drop in retrieval effectiveness. In chapter 6, the second research question is tackled, namely that of making forwarding decisions for queries, based on profiles of other peers. After a review of related work in that area, the chapter first defines the approaches that will be compared against each other. Then, a novel evaluation framework is introduced, including a new measure for comparing results of a distributed search engine against those of a centralised one. Finally, the actual evaluation is performed using the new framework.

Dietze, J.; Völkel, H.: Verifikation einer Methode der lexikalischen Semantik : zur computergestützten Bestimmung der semantischen Konsistenz und des semantischen Abstands (1992) 0.00

0.004037926 = product of:
  0.02018963 = sum of:
    0.02018963 = weight(_text_:of in 6680) [ClassicSimilarity], result of:
      0.02018963 = score(doc=6680,freq=10.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.3090647 = fieldWeight in 6680, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=6680)
  0.2 = coord(1/5)

Abstract: Uses a semantic field 'linguistic communication' of 735 verbs to verify two numerically based methods working with the semic cooccurrence interval due to the semic micro-structure of a lexeme. The weak point of this procedure is the one-stage classification of the semantic features (semes) of the field.
