Search (321 results, page 1 of 17)

  • Filter: theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.24
    0.2350404 = product of:
      0.4113207 = sum of:
        0.057362027 = product of:
          0.17208607 = sum of:
            0.17208607 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.17208607 = score(doc=562,freq=2.0), product of:
                0.30619314 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.036116153 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.17208607 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.17208607 = score(doc=562,freq=2.0), product of:
            0.30619314 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.036116153 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.17208607 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.17208607 = score(doc=562,freq=2.0), product of:
            0.30619314 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.036116153 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.009786479 = product of:
          0.029359438 = sum of:
            0.029359438 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.029359438 = score(doc=562,freq=2.0), product of:
                0.1264726 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036116153 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.5714286 = coord(4/7)
    
    Content
    Vgl.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
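  The numeric breakdown under each hit is Lucene "explain" output for TF-IDF scoring with ClassicSimilarity. As a minimal sketch, assuming Lucene's standard ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))) and reusing only the constants printed for result 1 above, the leaf weight of the term "3a" can be reproduced in Python:

    import math

    # Constants copied from the explain tree of result 1 (term "3a", doc 562).
    freq       = 2.0          # termFreq within the field
    doc_freq   = 24           # from idf(docFreq=24, maxDocs=44218)
    max_docs   = 44218
    query_norm = 0.036116153  # queryNorm
    field_norm = 0.046875     # fieldNorm(doc=562)

    # Standard ClassicSimilarity formulas (assumed; not printed on this page).
    tf  = math.sqrt(freq)                              # 1.4142135
    idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # 8.478011

    query_weight = idf * query_norm       # 0.30619314 = queryWeight
    field_weight = tf * idf * field_norm  # 0.56201804 = fieldWeight
    print(query_weight * field_weight)    # ~0.17208607, the leaf weight above

  The per-hit total then sums the matching clause weights and scales by the coordination factor, e.g. 0.4113207 * 4/7 ≈ 0.2350404 for result 1.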
  2. Informationslinguistische Texterschließung (1986) 0.22
    0.22374333 = product of:
      0.5220678 = sum of:
        0.044717602 = weight(_text_:retrieval in 186) [ClassicSimilarity], result of:
          0.044717602 = score(doc=186,freq=12.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.40932083 = fieldWeight in 186, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=186)
        0.1798283 = weight(_text_:textverarbeitung in 186) [ClassicSimilarity], result of:
          0.1798283 = score(doc=186,freq=4.0), product of:
            0.28832662 = queryWeight, product of:
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.036116153 = queryNorm
            0.6236965 = fieldWeight in 186, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.0390625 = fieldNorm(doc=186)
        0.29752192 = weight(_text_:aufsatzsammlung in 186) [ClassicSimilarity], result of:
          0.29752192 = score(doc=186,freq=24.0), product of:
            0.23696128 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.036116153 = queryNorm
            1.2555718 = fieldWeight in 186, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.0390625 = fieldNorm(doc=186)
      0.42857143 = coord(3/7)
    
    Classification
    ES 935 Allgemeine und vergleichende Sprach- und Literaturwissenschaft. Indogermanistik. Außereuropäische Sprachen und Literaturen / Spezialbereiche der allgemeinen Sprachwissenschaft / Datenverarbeitung und Sprachwissenschaft. Computerlinguistik / Textverarbeitung
    LCSH
    Information storage and retrieval systems / Linguistics
    RSWK
    Information Retrieval / Aufsatzsammlung (DNB)
    Automatische Sprachanalyse / Morphologie / Aufsatzsammlung (SBB / GBV)
    Automatische Sprachanalyse / Morphologie <Linguistik> / Aufsatzsammlung (DNB)
    Linguistische Datenverarbeitung / Linguistik / Aufsatzsammlung (SWB)
    Linguistik / Information Retrieval / Aufsatzsammlung (SWB / BVB)
    Linguistische Datenverarbeitung / Textanalyse / Aufsatzsammlung (BVB)
    RVK
    ES 935 Allgemeine und vergleichende Sprach- und Literaturwissenschaft. Indogermanistik. Außereuropäische Sprachen und Literaturen / Spezialbereiche der allgemeinen Sprachwissenschaft / Datenverarbeitung und Sprachwissenschaft. Computerlinguistik / Textverarbeitung
    Subject
    Information Retrieval / Aufsatzsammlung (DNB)
    Automatische Sprachanalyse / Morphologie / Aufsatzsammlung (SBB / GBV)
    Automatische Sprachanalyse / Morphologie <Linguistik> / Aufsatzsammlung (DNB)
    Linguistische Datenverarbeitung / Linguistik / Aufsatzsammlung (SWB)
    Linguistik / Information Retrieval / Aufsatzsammlung (SWB / BVB)
    Linguistische Datenverarbeitung / Textanalyse / Aufsatzsammlung (BVB)
    Information storage and retrieval systems / Linguistics
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.21
    0.2147804 = product of:
      0.3758657 = sum of:
        0.021907061 = weight(_text_:retrieval in 563) [ClassicSimilarity], result of:
          0.021907061 = score(doc=563,freq=2.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.20052543 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.17208607 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.17208607 = score(doc=563,freq=2.0), product of:
            0.30619314 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.036116153 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.17208607 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.17208607 = score(doc=563,freq=2.0), product of:
            0.30619314 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.036116153 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.009786479 = product of:
          0.029359438 = sum of:
            0.029359438 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.029359438 = score(doc=563,freq=2.0), product of:
                0.1264726 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036116153 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.33333334 = coord(1/3)
      0.5714286 = coord(4/7)
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
    Content
    A Thesis presented to The University of Guelph In partial fulfilment of requirements for the degree of Master of Science in Computer Science. Vgl. unter: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
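  The thesis abstract above describes association-measure-based multi-word term extraction, but neither its three measures nor the LocalMaxs variant are spelled out in this entry. Purely as an illustration of the general idea, the sketch below scores adjacent word pairs with the Dice coefficient (an assumed stand-in measure) over a made-up sample text and keeps the strongest candidates:

    from collections import Counter

    def candidate_terms(text, top=5):
        """Score adjacent word pairs with the Dice coefficient."""
        tokens = text.lower().split()
        unigrams = Counter(tokens)
        bigrams = Counter(zip(tokens, tokens[1:]))
        # Keep only pairs seen at least twice; score = 2*f(xy) / (f(x) + f(y)).
        scored = {" ".join(p): 2.0 * f / (unigrams[p[0]] + unigrams[p[1]])
                  for p, f in bigrams.items() if f >= 2}
        return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:top]

    sample = ("information retrieval systems support information retrieval "
              "tasks such as web page summarization and web page classification")
    print(candidate_terms(sample))
    # [('information retrieval', 1.0), ('web page', 1.0)]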
  4. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.17
    0.17208609 = product of:
      0.4015342 = sum of:
        0.057362027 = product of:
          0.17208607 = sum of:
            0.17208607 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.17208607 = score(doc=862,freq=2.0), product of:
                0.30619314 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.036116153 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
        0.17208607 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.17208607 = score(doc=862,freq=2.0), product of:
            0.30619314 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.036116153 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
        0.17208607 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.17208607 = score(doc=862,freq=2.0), product of:
            0.30619314 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.036116153 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.42857143 = coord(3/7)
    
    Source
    https://arxiv.org/abs/2212.06721
  5. Tartakovski, O.; Shramko, M.: Implementierung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten (2006) 0.06
    0.06119021 = product of:
      0.21416573 = sum of:
        0.036144804 = weight(_text_:retrieval in 5978) [ClassicSimilarity], result of:
          0.036144804 = score(doc=5978,freq=4.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.33085006 = fieldWeight in 5978, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5978)
        0.17802092 = weight(_text_:textverarbeitung in 5978) [ClassicSimilarity], result of:
          0.17802092 = score(doc=5978,freq=2.0), product of:
            0.28832662 = queryWeight, product of:
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.036116153 = queryNorm
            0.617428 = fieldWeight in 5978, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5978)
      0.2857143 = coord(2/7)
    
    Abstract
    Identifying the language or languages in text documents is one of the most important steps of machine text processing for information retrieval. This article presents LangIdent, a system for language identification in monolingual and multilingual electronic text documents. The system offers both a selection of established algorithms for the language identification of monolingual text documents and a new algorithm for the language identification of multilingual text documents.
    Source
    Effektive Information Retrieval Verfahren in Theorie und Praxis: ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005. Hrsg.: T. Mandl u. C. Womser-Hacker
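  The entry above does not name the established algorithms LangIdent offers. A widely used baseline for this task is character n-gram profiling in the style of Cavnar and Trenkle; the sketch below, with tiny hypothetical training snippets in place of real corpora, illustrates that approach rather than the system itself:

    from collections import Counter

    def profile(text, n=3, top=300):
        """Rank-ordered list of the most frequent character n-grams."""
        text = " ".join(text.lower().split())
        grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
        return [g for g, _ in grams.most_common(top)]

    def out_of_place(doc_prof, lang_prof):
        """Out-of-place distance between two rank-ordered profiles."""
        rank = {g: r for r, g in enumerate(lang_prof)}
        penalty = len(lang_prof)
        return sum(abs(r - rank[g]) if g in rank else penalty
                   for r, g in enumerate(doc_prof))

    # Hypothetical toy training snippets; a real identifier needs large corpora.
    training = {
        "de": "ein kurzer text wird anhand der häufigsten zeichenfolgen seiner sprache erkannt",
        "en": "the language of a document is determined from its most frequent character sequences",
    }
    profiles = {lang: profile(text) for lang, text in training.items()}

    def identify(text):
        doc = profile(text)
        return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))

    print(identify("ein kurzer deutscher beispielsatz"))  # expected: de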
  6. Semantik, Lexikographie und Computeranwendungen : Workshop ... (Bonn) : 1995.01.27-28 (1996) 0.04
    0.037033778 = product of:
      0.12961821 = sum of:
        0.12146281 = weight(_text_:aufsatzsammlung in 190) [ClassicSimilarity], result of:
          0.12146281 = score(doc=190,freq=4.0), product of:
            0.23696128 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.036116153 = queryNorm
            0.51258504 = fieldWeight in 190, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.0390625 = fieldNorm(doc=190)
        0.0081554 = product of:
          0.0244662 = sum of:
            0.0244662 = weight(_text_:22 in 190) [ClassicSimilarity], result of:
              0.0244662 = score(doc=190,freq=2.0), product of:
                0.1264726 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036116153 = queryNorm
                0.19345059 = fieldWeight in 190, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=190)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Date
    14. 4.2007 10:04:22
    RSWK
    Computer / Anwendung / Computerunterstützte Lexikographie / Aufsatzsammlung
    Subject
    Computer / Anwendung / Computerunterstützte Lexikographie / Aufsatzsammlung
  7. Winograd, T.: Software für Sprachverarbeitung (1984) 0.04
    0.036330804 = product of:
      0.2543156 = sum of:
        0.2543156 = weight(_text_:textverarbeitung in 1687) [ClassicSimilarity], result of:
          0.2543156 = score(doc=1687,freq=2.0), product of:
            0.28832662 = queryWeight, product of:
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.036116153 = queryNorm
            0.88204 = fieldWeight in 1687, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.078125 = fieldNorm(doc=1687)
      0.14285715 = coord(1/7)
    
    Abstract
    Computers handle linguistic symbols reliably and quickly, as word-processing programs show. Attempts to have them operate on meanings as well, however, have failed. Will the computer ever master the greatest problem of language processing, the ambiguity of natural languages?
  8. Information und Sprache : Beiträge zu Informationswissenschaft, Computerlinguistik, Bibliothekswesen und verwandten Fächern. Festschrift für Harald H. Zimmermann. Herausgegeben von Ilse Harms, Heinz-Dirk Luckhardt und Hans W. Giessen (2006) 0.03
    0.03287351 = product of:
      0.11505728 = sum of:
        0.017887041 = weight(_text_:retrieval in 91) [ClassicSimilarity], result of:
          0.017887041 = score(doc=91,freq=12.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.16372833 = fieldWeight in 91, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.015625 = fieldNorm(doc=91)
        0.09717024 = weight(_text_:aufsatzsammlung in 91) [ClassicSimilarity], result of:
          0.09717024 = score(doc=91,freq=16.0), product of:
            0.23696128 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.036116153 = queryNorm
            0.41006804 = fieldWeight in 91, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.015625 = fieldNorm(doc=91)
      0.2857143 = coord(2/7)
    
    Content
    Inhalt: Information und Sprache und mehr - eine Einleitung - Information und Kommunikation Wolf Rauch: Auch Information ist eine Tochter der Zeit Winfried Lenders: Information und kulturelles Gedächtnis Rainer Hammwöhner: Anmerkungen zur Grundlegung der Informationsethik Hans W. Giessen: Ehrwürdig stille Informationen Gernot Wersig: Vereinheitlichte Medientheorie und ihre Sicht auf das Internet Johann Haller, Anja Rütten: Informationswissenschaft und Translationswissenschaft: Spielarten oder Schwestern? Rainer Kuhlen: In Richtung Summarizing für Diskurse in K3 Werner Schweibenz: Sprache, Information und Bedeutung im Museum. Narrative Vermittlung durch Storytelling - Sprache und Computer, insbesondere Information Retrieval und Automatische Indexierung Manfred Thiel: Bedingt wahrscheinliche Syntaxbäume Jürgen Krause: Shell Model, Semantic Web and Web Information Retrieval Elisabeth Niggemann: Wer suchet, der findet? Verbesserung der inhaltlichen Suchmöglichkeiten im Informationssystem Der Deutschen Bibliothek Christa Womser-Hacker: Zur Rolle von Eigennamen im Cross-Language Information Retrieval Klaus-Dirk Schmitz: Wörterbuch, Thesaurus, Terminologie, Ontologie. Was tragen Terminologiewissenschaft und Informationswissenschaft zur Wissensordnung bei?
    Footnote
    Rez. in Mitt. VÖB 59(2006) Nr.3, S.75-78 (O. Oberhauser): "The book at hand is the Festschrift for the 65th birthday of Harald H. Zimmermann, professor of information science, who retired at the end of the 2006 summer semester; born in Völklingen in 1941, this computational linguist co-founded information science as an academic discipline in Germany and has represented it at Saarland University since 1980. According to the table of contents, the 26 contributions of this handsomely produced Saur volume, prepared by Professor Zimmermann's staff, fall into four thematic areas: Information and communication; Language and computers, in particular information retrieval and automatic indexing; Analyses and developments; Personal notes. The articles themselves vary, as is usual or unavoidable in Festschriften, in length, style, thematic detail and level of ambition. Alongside scholarly contributions one also finds reminiscences and literary pieces. The following selection shows what interested me personally in this book:
    RSWK
    Informations- und Dokumentationswissenschaft / Aufsatzsammlung
    Information Retrieval / Aufsatzsammlung
    Automatische Indexierung / Aufsatzsammlung
    Linguistische Datenverarbeitung / Aufsatzsammlung
    Subject
    Informations- und Dokumentationswissenschaft / Aufsatzsammlung
    Information Retrieval / Aufsatzsammlung
    Automatische Indexierung / Aufsatzsammlung
    Linguistische Datenverarbeitung / Aufsatzsammlung
  9. Artemenko, O.; Shramko, M.: Entwicklung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten (2005) 0.03
    0.029082738 = product of:
      0.10178958 = sum of:
        0.012779119 = weight(_text_:retrieval in 572) [ClassicSimilarity], result of:
          0.012779119 = score(doc=572,freq=2.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.11697317 = fieldWeight in 572, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02734375 = fieldNorm(doc=572)
        0.08901046 = weight(_text_:textverarbeitung in 572) [ClassicSimilarity], result of:
          0.08901046 = score(doc=572,freq=2.0), product of:
            0.28832662 = queryWeight, product of:
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.036116153 = queryNorm
            0.308714 = fieldWeight in 572, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.02734375 = fieldNorm(doc=572)
      0.2857143 = coord(2/7)
    
    Abstract
    Identifying the language or languages of electronic text documents is one of the most important steps in many processes of machine text processing. This thesis presents LangIdent, a system for language identification in monolingual and multilingual electronic text documents. The system offers both a selection of established algorithms for the language identification of monolingual text documents and a new algorithm for the language identification of multilingual text documents.
    With the spread of the Internet, the number of documents available on the World Wide Web keeps growing. Guaranteeing Internet users efficient access to the information they want is becoming a major challenge for the modern information society. A large number of tools are already in use to help users find their way through the growing flood of information. However, the enormous amount of unstructured and distributed information is not the only difficulty to be overcome when developing tools of this kind. The increasing multilinguality of web content creates a need for language identification software that identifies the language(s) of electronic documents for targeted further processing. Such language identifiers can, for example, be used effectively in multilingual information retrieval, since automatic indexing processes such as stemming, stop-word extraction etc. build on the language identification results. This thesis presents the new system "LangIdent" for language identification in electronic text documents, which is intended primarily for teaching and research at the University of Hildesheim. "LangIdent" contains a selection of established algorithms for monolingual language identification that the user can select and configure interactively. In addition, a new algorithm was implemented in the system that makes it possible to identify the languages in which a multilingual document is written. The identification is not limited to listing the languages found; rather, the text is split into monolingual sections, each labelled with the identified language.
  10. Wenzel, F.: Semantische Eingrenzung im Freitext-Retrieval auf der Basis morphologischer Segmentierungen (1980) 0.02
    0.022771174 = product of:
      0.07969911 = sum of:
        0.063240245 = weight(_text_:retrieval in 2037) [ClassicSimilarity], result of:
          0.063240245 = score(doc=2037,freq=6.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.5788671 = fieldWeight in 2037, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=2037)
        0.01645886 = product of:
          0.049376577 = sum of:
            0.049376577 = weight(_text_:29 in 2037) [ClassicSimilarity], result of:
              0.049376577 = score(doc=2037,freq=2.0), product of:
                0.12704533 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.036116153 = queryNorm
                0.38865322 = fieldWeight in 2037, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2037)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Abstract
    The basic problem in freetext retrieval is that the retrieval language is not properly adapted to that of the author. Morphological segmentation, where words with the same root are grouped together in the inverted file, is a good eliminator of noise and information loss, providing high recall but low precision
    Source
    Nachrichten für Dokumentation. 31(1980) H.1, S.29-35
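  Wenzel's abstract above describes grouping words with the same root under one key in the inverted file. The segmentation rules themselves are not given in this entry; the sketch below uses a crude, hypothetical suffix-stripping rule purely to illustrate how such grouping raises recall:

    from collections import defaultdict

    # Crude stand-in for morphological segmentation: strip a few common
    # German suffixes so that inflected forms share one index key.
    SUFFIXES = ("ungen", "ung", "en", "er", "es", "e", "s")

    def root(word):
        word = word.lower()
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) - len(suffix) >= 4:
                return word[: -len(suffix)]
        return word

    def build_index(docs):
        """Inverted file keyed by (approximate) word root."""
        index = defaultdict(set)
        for doc_id, text in docs.items():
            for token in text.split():
                index[root(token)].add(doc_id)
        return index

    docs = {1: "Segmentierung von Texten",            # hypothetical mini-corpus
            2: "morphologische Segmentierungen"}
    index = build_index(docs)
    # A query for "Segmentierung" now also retrieves doc 2 ("Segmentierungen"):
    print(index[root("Segmentierung")])  # {1, 2}

  As the abstract notes, collapsing distinct forms onto one root raises recall but can lower precision, since unrelated words may end up sharing a truncated key.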
  11. Zimmermann, H.H.: Maschinelle und Computergestützte Übersetzung (2004) 0.02
    0.021798484 = product of:
      0.15258938 = sum of:
        0.15258938 = weight(_text_:textverarbeitung in 2943) [ClassicSimilarity], result of:
          0.15258938 = score(doc=2943,freq=2.0), product of:
            0.28832662 = queryWeight, product of:
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.036116153 = queryNorm
            0.52922404 = fieldWeight in 2943, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.983315 = idf(docFreq=40, maxDocs=44218)
              0.046875 = fieldNorm(doc=2943)
      0.14285715 = coord(1/7)
    
    Abstract
    In what follows, machine translation (MT) is understood as the fully automatic translation of a text in a natural language into another natural language. Human translation (HT) is understood as the intellectual translation of a text, with or without machine lexical aids and with or without word processing. Computer-aided or computer-assisted translation (CAT) means, on the one hand, an intellectual translation that builds on a machine pre-translation/raw translation (MT) which is subsequently revised intellectually (post-editing); on the other hand, it means an intellectual translation in which a translation memory and/or a terminology bank is used before or during the intellectual translation process. ICAT denotes a special variant of CAT in which a user without (sufficient) knowledge of the target language is supported in translating from his or her native language in such a way that the target-language equivalent is relatively error-free.
  12. Rindflesch, T.C.; Aronson, A.R.: Semantic processing in information retrieval (1993) 0.02
    0.019620333 = product of:
      0.06867116 = sum of:
        0.05714996 = weight(_text_:retrieval in 4121) [ClassicSimilarity], result of:
          0.05714996 = score(doc=4121,freq=10.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.5231199 = fieldWeight in 4121, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4121)
        0.011521201 = product of:
          0.0345636 = sum of:
            0.0345636 = weight(_text_:29 in 4121) [ClassicSimilarity], result of:
              0.0345636 = score(doc=4121,freq=2.0), product of:
                0.12704533 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.036116153 = queryNorm
                0.27205724 = fieldWeight in 4121, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4121)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Abstract
    Intuition suggests that one way to enhance the information retrieval process would be the use of phrases to characterize the contents of text. A number of researchers, however, have noted that phrases alone do not improve retrieval effectiveness. In this paper we briefly review the use of phrases in information retrieval and then suggest extensions to this paradigm using semantic information. We claim that semantic processing, which can be viewed as expressing relations between the concepts represented by phrases, will in fact enhance retrieval effectiveness. The availability of the UMLS® domain model, which we exploit extensively, significantly contributes to the feasibility of this processing.
    Date
    29. 6.2015 14:51:28
  13. Rau, L.F.: Conceptual information extraction and retrieval from natural language input (198) 0.02
    0.019455515 = product of:
      0.0680943 = sum of:
        0.051635437 = weight(_text_:retrieval in 1955) [ClassicSimilarity], result of:
          0.051635437 = score(doc=1955,freq=4.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.47264296 = fieldWeight in 1955, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=1955)
        0.01645886 = product of:
          0.049376577 = sum of:
            0.049376577 = weight(_text_:29 in 1955) [ClassicSimilarity], result of:
              0.049376577 = score(doc=1955,freq=2.0), product of:
                0.12704533 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.036116153 = queryNorm
                0.38865322 = fieldWeight in 1955, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1955)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Date
    16. 8.1998 13:29:20
    Footnote
    Wiederabgedruckt in: Readings in information retrieval. Ed.: K. Sparck Jones u. P. Willett. San Francisco: Morgan Kaufmann 1997. S.527-533
  14. Liu, S.; Liu, F.; Yu, C.; Meng, W.: ¬An effective approach to document retrieval via utilizing WordNet and recognizing phrases (2004) 0.02
    0.019455515 = product of:
      0.0680943 = sum of:
        0.051635437 = weight(_text_:retrieval in 4078) [ClassicSimilarity], result of:
          0.051635437 = score(doc=4078,freq=4.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.47264296 = fieldWeight in 4078, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=4078)
        0.01645886 = product of:
          0.049376577 = sum of:
            0.049376577 = weight(_text_:29 in 4078) [ClassicSimilarity], result of:
              0.049376577 = score(doc=4078,freq=2.0), product of:
                0.12704533 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.036116153 = queryNorm
                0.38865322 = fieldWeight in 4078, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4078)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Date
    10.10.2005 10:29:08
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, u.a.
  15. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.02
    0.018669581 = product of:
      0.06534353 = sum of:
        0.025817718 = weight(_text_:retrieval in 2541) [ClassicSimilarity], result of:
          0.025817718 = score(doc=2541,freq=4.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.23632148 = fieldWeight in 2541, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.039525814 = product of:
          0.059288718 = sum of:
            0.024688289 = weight(_text_:29 in 2541) [ClassicSimilarity], result of:
              0.024688289 = score(doc=2541,freq=2.0), product of:
                0.12704533 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.036116153 = queryNorm
                0.19432661 = fieldWeight in 2541, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
            0.03460043 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
              0.03460043 = score(doc=2541,freq=4.0), product of:
                0.1264726 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036116153 = queryNorm
                0.27358043 = fieldWeight in 2541, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
          0.6666667 = coord(2/3)
      0.2857143 = coord(2/7)
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language Systems (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
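  The abstract of entry 15 names AZdict's components and ChemSpell's word-similarity step without giving the algorithms. As a generic, assumed illustration of such a spelling-suggestion step (not NLM's actual implementation), the sketch below ranks a hypothetical mini-vocabulary by Levenshtein edit distance to a misspelled query:

    def edit_distance(a, b):
        """Levenshtein distance via the standard dynamic-programming table."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                # deletion
                               cur[j - 1] + 1,             # insertion
                               prev[j - 1] + (ca != cb)))  # substitution / match
            prev = cur
        return prev[-1]

    # Hypothetical mini-vocabulary; the real AZdict list is derived from NLM resources.
    vocabulary = ["toxicology", "toxin", "chemical", "benzene"]

    def suggest(word, k=3):
        return sorted(vocabulary, key=lambda v: edit_distance(word.lower(), v))[:k]

    print(suggest("toxicolgy"))  # 'toxicology' ranks first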
  16. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    0.018110596 = product of:
      0.06338708 = sum of:
        0.043814123 = weight(_text_:retrieval in 4483) [ClassicSimilarity], result of:
          0.043814123 = score(doc=4483,freq=2.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.40105087 = fieldWeight in 4483, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=4483)
        0.019572958 = product of:
          0.058718875 = sum of:
            0.058718875 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
              0.058718875 = score(doc=4483,freq=2.0), product of:
                0.1264726 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036116153 = queryNorm
                0.46428138 = fieldWeight in 4483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4483)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Date
    15. 3.2000 10:22:37
  17. Chen, K.-H.: Evaluating Chinese text retrieval with multilingual queries (2002) 0.02
    0.01789648 = product of:
      0.06263768 = sum of:
        0.051116478 = weight(_text_:retrieval in 1851) [ClassicSimilarity], result of:
          0.051116478 = score(doc=1851,freq=8.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.46789268 = fieldWeight in 1851, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1851)
        0.011521201 = product of:
          0.0345636 = sum of:
            0.0345636 = weight(_text_:29 in 1851) [ClassicSimilarity], result of:
              0.0345636 = score(doc=1851,freq=2.0), product of:
                0.12704533 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.036116153 = queryNorm
                0.27205724 = fieldWeight in 1851, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1851)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper reports the design of a Chinese test collection with multilingual queries and the application of this test collection to evaluate information retrieval systems. The effective indexing units, IR models, translation techniques, and query expansion for Chinese text retrieval are identified. The collaboration of East Asian countries for construction of test collections for cross-language multilingual text retrieval is also discussed in this paper. As well, a tool is designed to help assessors judge relevance and gather the events of relevance judgment. The log file created by this tool will be used to analyze the behaviors of assessors in the future.
    Source
    Knowledge organization. 29(2002) nos.3/4, S.156-170
  18. Multi-source, multilingual information extraction and summarization (2013) 0.02
    0.01735183 = product of:
      0.12146281 = sum of:
        0.12146281 = weight(_text_:aufsatzsammlung in 978) [ClassicSimilarity], result of:
          0.12146281 = score(doc=978,freq=4.0), product of:
            0.23696128 = queryWeight, product of:
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.036116153 = queryNorm
            0.51258504 = fieldWeight in 978, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.5610886 = idf(docFreq=169, maxDocs=44218)
              0.0390625 = fieldNorm(doc=978)
      0.14285715 = coord(1/7)
    
    RSWK
    Natürlichsprachiges System / Information Extraction / Automatische Inhaltsanalyse / Zusammenfassung / Aufsatzsammlung
    Subject
    Natürlichsprachiges System / Information Extraction / Automatische Inhaltsanalyse / Zusammenfassung / Aufsatzsammlung
  19. Czejdo, B.D.; Tucci, R.P.: ¬A dataflow graphical language for database applications (1994) 0.02
    0.015134467 = product of:
      0.052970633 = sum of:
        0.03651177 = weight(_text_:retrieval in 559) [ClassicSimilarity], result of:
          0.03651177 = score(doc=559,freq=2.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.33420905 = fieldWeight in 559, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=559)
        0.01645886 = product of:
          0.049376577 = sum of:
            0.049376577 = weight(_text_:29 in 559) [ClassicSimilarity], result of:
              0.049376577 = score(doc=559,freq=2.0), product of:
                0.12704533 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.036116153 = queryNorm
                0.38865322 = fieldWeight in 559, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=559)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Abstract
    Discusses a graphical language for information retrieval and processing. A lot of recent activity has occurred in the area of improving access to database systems. However, current results are restricted to simple interfacing of database systems. Proposes a graphical language for specifying complex applications
    Date
    20.10.2000 13:29:46
  20. Sheremet'eva, S.O.: Teoreticheskie i metodologicheskie problemy inzhenernoi lingvistiki (1998) 0.02
    0.015134467 = product of:
      0.052970633 = sum of:
        0.03651177 = weight(_text_:retrieval in 6316) [ClassicSimilarity], result of:
          0.03651177 = score(doc=6316,freq=2.0), product of:
            0.109248295 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.036116153 = queryNorm
            0.33420905 = fieldWeight in 6316, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=6316)
        0.01645886 = product of:
          0.049376577 = sum of:
            0.049376577 = weight(_text_:29 in 6316) [ClassicSimilarity], result of:
              0.049376577 = score(doc=6316,freq=2.0), product of:
                0.12704533 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.036116153 = queryNorm
                0.38865322 = fieldWeight in 6316, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=6316)
          0.33333334 = coord(1/3)
      0.2857143 = coord(2/7)
    
    Abstract
    Examines the major topical issues in the area of linguistic engineering: machine translation, text synthesis and information retrieval
    Date
    6. 3.1999 13:56:29

Types

  • a 272
  • m 26
  • el 20
  • s 15
  • x 9
  • p 2
  • b 1
  • d 1
  • r 1
