Search (105 results, page 1 of 6)

  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.27
    0.26777846 = product of:
      0.46861225 = sum of:
        0.064583495 = product of:
          0.19375047 = sum of:
            0.19375047 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.19375047 = score(doc=562,freq=2.0), product of:
                0.34474066 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04066292 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.19375047 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.19375047 = score(doc=562,freq=2.0), product of:
            0.34474066 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04066292 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.19375047 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.19375047 = score(doc=562,freq=2.0), product of:
            0.34474066 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04066292 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.016527792 = product of:
          0.033055585 = sum of:
            0.033055585 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.033055585 = score(doc=562,freq=2.0), product of:
                0.14239462 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04066292 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.5714286 = coord(4/7)
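    The indented breakdown above (repeated for the entries below) is Lucene "explain" output for ClassicSimilarity scoring: each term weight is queryWeight × fieldWeight, with queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, tf = sqrt(freq), and idf = 1 + ln(maxDocs / (docFreq + 1)). A minimal Python sketch of that arithmetic, using the numbers reported for result 1 (it reproduces the printed values; it is not Lucene API code):

      import math

      def classic_idf(doc_freq: int, max_docs: int) -> float:
          # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_weight(freq: float, doc_freq: int, max_docs: int,
                      query_norm: float, field_norm: float) -> float:
          idf = classic_idf(doc_freq, max_docs)              # 8.478011 above
          query_weight = idf * query_norm                    # 0.34474066 above
          field_weight = math.sqrt(freq) * idf * field_norm  # 0.56201804 above
          return query_weight * field_weight

      # The weight of "_text_:2f" in doc 562:
      print(term_weight(freq=2.0, doc_freq=24, max_docs=44218,
                        query_norm=0.04066292, field_norm=0.046875))  # ~0.19375047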
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  2. Noever, D.; Ciolino, M.: The Turing deception (2022) 0.19
    0.19375049 = product of:
      0.45208445 = sum of:
        0.064583495 = product of:
          0.19375047 = sum of:
            0.19375047 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.19375047 = score(doc=862,freq=2.0), product of:
                0.34474066 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04066292 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
        0.19375047 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.19375047 = score(doc=862,freq=2.0), product of:
            0.34474066 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04066292 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
        0.19375047 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.19375047 = score(doc=862,freq=2.0), product of:
            0.34474066 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04066292 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.42857143 = coord(3/7)
    
    Source
    https://arxiv.org/abs/2212.06721
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.17
    0.17315517 = product of:
      0.40402874 = sum of:
        0.19375047 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.19375047 = score(doc=563,freq=2.0), product of:
            0.34474066 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04066292 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.19375047 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.19375047 = score(doc=563,freq=2.0), product of:
            0.34474066 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.04066292 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.016527792 = product of:
          0.033055585 = sum of:
            0.033055585 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.033055585 = score(doc=563,freq=2.0), product of:
                0.14239462 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04066292 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Content
    A thesis presented to the University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  4. Manhart, K.: Digitales Kauderwelsch : Online-Übersetzungsdienste (2004) 0.02
    0.02395505 = product of:
      0.08384267 = sum of:
        0.061388128 = weight(_text_:sites in 2077) [ClassicSimilarity], result of:
          0.061388128 = score(doc=2077,freq=2.0), product of:
            0.21257097 = queryWeight, product of:
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.04066292 = queryNorm
            0.28878886 = fieldWeight in 2077, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2077)
        0.022454543 = product of:
          0.044909086 = sum of:
            0.044909086 = weight(_text_:design in 2077) [ClassicSimilarity], result of:
              0.044909086 = score(doc=2077,freq=4.0), product of:
                0.15288728 = queryWeight, product of:
                  3.7598698 = idf(docFreq=2798, maxDocs=44218)
                  0.04066292 = queryNorm
                0.29373983 = fieldWeight in 2077, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.7598698 = idf(docFreq=2798, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2077)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Quickly translating an English or French website into German - nothing could be easier. Online translation services promise language transfer at the click of a mouse, free of charge. But how good are they really? Online translation services want to remove the language barrier on the WWW. The automatic translators promise to make e-mail correspondence intelligible and to let German speakers surf foreign-language web offerings in their own language. English, Spanish, or even Chinese e-mails and websites can thus be transferred into one's own language with a mouse click. Even complicated English user manuals or Russian news reports are supposedly no problem for these services. And many a homepage owner dreams of using the digital translation helpers to put his German website online in perfect English - hoping for international contacts and higher visitor numbers. That sounds nice - but the reality is different. Anyone who has ever consulted such a service usually rubs their eyes in astonishment at the results on offer. Even simple sentences give many online translators trouble - and unintentionally provide comic relief. The CNN headline "Iraq blast injures 31 U.S. troops" becomes, in German, the sentence: "Der Irak Knall verletzt 31 Vereinigte Staaten Truppen." Sites with difficult sentence structure often come out unintelligible. The sentence "The Slider is equipped with a brilliant color screen and sports an innovative design that slides open with a push of your thumb" is rendered by the best-known online translator, Babelfish, as the following gibberish: "Der Schweber wird mit einem leuchtenden Farbe Schirm ausgerüstet und ein erfinderisches Design sports, das geöffnetes mit einem Stoß Ihres Daumens schiebt." All of the translators subject their users to such dadaist texts.
  5. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.02
    0.019789103 = product of:
      0.06926186 = sum of:
        0.056559652 = weight(_text_:united in 3948) [ClassicSimilarity], result of:
          0.056559652 = score(doc=3948,freq=2.0), product of:
            0.22812355 = queryWeight, product of:
              5.6101127 = idf(docFreq=439, maxDocs=44218)
              0.04066292 = queryNorm
            0.2479343 = fieldWeight in 3948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6101127 = idf(docFreq=439, maxDocs=44218)
              0.03125 = fieldNorm(doc=3948)
        0.012702207 = product of:
          0.025404414 = sum of:
            0.025404414 = weight(_text_:design in 3948) [ClassicSimilarity], result of:
              0.025404414 = score(doc=3948,freq=2.0), product of:
                0.15288728 = queryWeight, product of:
                  3.7598698 = idf(docFreq=2798, maxDocs=44218)
                  0.04066292 = queryNorm
                0.16616434 = fieldWeight in 3948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7598698 = idf(docFreq=2798, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3948)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.
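    As a rough illustration of the rule-based annotation step described above (the real system uses GATE with JAPE rules; the gazetteer, matching rule, and concept label below are invented for this sketch):

      import re

      # Toy gazetteer standing in for the domain thesauri; terms and the
      # CRM-EH concept label are invented for this sketch.
      TIME_APPELLATIONS = {"roman", "iron age", "medieval", "bronze age"}

      def annotate_time_appellations(text: str) -> list:
          """Gazetteer lookup plus pattern match, emitting (span, text, concept)
          annotations in the spirit of a GATE/JAPE rule for E49.Time Appellation."""
          annotations = []
          for term in TIME_APPELLATIONS:
              for m in re.finditer(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE):
                  annotations.append((m.span(), m.group(0), "crmeh:E49.Time_Appellation"))
          return annotations

      print(annotate_time_appellations("A medieval ditch cut the Iron Age pit fill."))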
    Footnote
    Contribution in a special issue: Content architecture: exploiting and managing diverse resources: proceedings of the first national conference of the United Kingdom chapter of the International Society for Knowledge Organization (ISKO)
  6. Azpiazu, I.M.; Soledad Pera, M.: Is cross-lingual readability assessment possible? (2020) 0.02
    0.017660774 = product of:
      0.06181271 = sum of:
        0.049110502 = weight(_text_:sites in 5868) [ClassicSimilarity], result of:
          0.049110502 = score(doc=5868,freq=2.0), product of:
            0.21257097 = queryWeight, product of:
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.04066292 = queryNorm
            0.23103109 = fieldWeight in 5868, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.03125 = fieldNorm(doc=5868)
        0.012702207 = product of:
          0.025404414 = sum of:
            0.025404414 = weight(_text_:design in 5868) [ClassicSimilarity], result of:
              0.025404414 = score(doc=5868,freq=2.0), product of:
                0.15288728 = queryWeight, product of:
                  3.7598698 = idf(docFreq=2798, maxDocs=44218)
                  0.04066292 = queryNorm
                0.16616434 = fieldWeight in 5868, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7598698 = idf(docFreq=2798, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5868)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Most research efforts related to automatic readability assessment focus on the design of strategies that apply to a specific language. These state-of-the-art strategies are highly dependent on linguistic features that best suit the language for which they were intended, constraining their adaptability and making it difficult to determine whether they would remain effective if they were applied to estimate the level of difficulty of texts in other languages. In this article, we present the results of a study designed to determine the feasibility of a cross-lingual readability assessment strategy. To do so, we first analyzed the most common features used for readability assessment and determined their influence on the readability prediction process for 6 different languages: English, Spanish, Basque, Italian, French, and Catalan. In addition, we developed a cross-lingual readability assessment strategy that serves as a means to empirically explore the potential advantages of employing a single strategy (and set of features) for readability assessment in different languages, including interlanguage prediction agreement and prediction accuracy improvement for low-resource languages.
    Friend request acceptance and information disclosure constitute two important privacy decisions for users to control the flow of their personal information in social network sites (SNSs). These decisions are greatly influenced by contextual characteristics of the request. However, the contextual influence may not be uniform among users with different levels of privacy concerns. In this study, we hypothesize that users with higher privacy concerns may consider contextual factors differently from those with lower privacy concerns. By conducting a scenario-based survey study and structural equation modeling, we verify the interaction effects between privacy concerns and contextual factors. We additionally find that users' perceived risk towards the requester mediates the effect of context and privacy concerns. These results extend our understanding about the cognitive process behind privacy decision making in SNSs. The interaction effects suggest strategies for SNS providers to predict users' friend request acceptance and to customize context-aware privacy decision support based on users' different privacy attitudes.
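    As a hedged illustration of the kind of language-independent surface features such a cross-lingual strategy can build on (the feature selection below is illustrative, not the set evaluated in the article):

      import re

      def readability_features(text: str) -> dict:
          """Language-agnostic surface features of the kind used in
          cross-lingual readability work (illustrative selection only)."""
          sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
          words = re.findall(r"\w+", text.lower())
          return {
              "avg_sentence_len": len(words) / max(len(sentences), 1),
              "avg_word_len": sum(map(len, words)) / max(len(words), 1),
              "type_token_ratio": len(set(words)) / max(len(words), 1),
          }

      print(readability_features("Esta frase es corta. This sentence is short too."))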
  7. Kurz, C.: Womit sich Strafverfolger bald befassen müssen : ChatGPT (2023) 0.01
    0.014031572 = product of:
      0.098221004 = sum of:
        0.098221004 = weight(_text_:sites in 203) [ClassicSimilarity], result of:
          0.098221004 = score(doc=203,freq=2.0), product of:
            0.21257097 = queryWeight, product of:
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.04066292 = queryNorm
            0.46206218 = fieldWeight in 203, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.0625 = fieldNorm(doc=203)
      0.14285715 = coord(1/7)
    
    Content
    Cf. the Europol report "ChatGPT: The impact of Large Language Models on Law Enforcement" at: https://www.europol.europa.eu/cms/sites/default/files/documents/Tech%20Watch%20Flash%20-%20The%20Impact%20of%20Large%20Language%20Models%20on%20Law%20Enforcement.pdf.
  8. Panicheva, P.; Cardiff, J.; Rosso, P.: Identifying subjective statements in news titles using a personal sense annotation framework (2013) 0.01
    0.011676681 = product of:
      0.081736766 = sum of:
        0.081736766 = weight(_text_:states in 968) [ClassicSimilarity], result of:
          0.081736766 = score(doc=968,freq=2.0), product of:
            0.22391328 = queryWeight, product of:
              5.506572 = idf(docFreq=487, maxDocs=44218)
              0.04066292 = queryNorm
            0.3650376 = fieldWeight in 968, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.506572 = idf(docFreq=487, maxDocs=44218)
              0.046875 = fieldNorm(doc=968)
      0.14285715 = coord(1/7)
    
    Abstract
    Subjective language contains information about private states. The goal of subjective language identification is to determine that a private state is expressed, without considering its polarity or specific emotion. A component of word meaning, "Personal Sense," has clear potential in the field of subjective language identification, as it reflects a meaning of words in terms of unique personal experience and carries personal characteristics. In this paper we investigate how Personal Sense can be harnessed for the purpose of identifying subjectivity in news titles. In the process, we develop a new Personal Sense annotation framework for annotating and classifying subjectivity, polarity, and emotion. The Personal Sense framework yields high performance in a fine-grained subsentence subjectivity classification. Our experiments demonstrate lexico-syntactic features to be useful for the identification of subjectivity indicators and the targets that receive the subjective Personal Sense.
  9. Zadeh, B.Q.; Handschuh, S.: The ACL RD-TEC : a dataset for benchmarking terminology extraction and classification in computational linguistics (2014) 0.01
    0.01052368 = product of:
      0.07366575 = sum of:
        0.07366575 = weight(_text_:sites in 2803) [ClassicSimilarity], result of:
          0.07366575 = score(doc=2803,freq=2.0), product of:
            0.21257097 = queryWeight, product of:
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.04066292 = queryNorm
            0.34654665 = fieldWeight in 2803, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.046875 = fieldNorm(doc=2803)
      0.14285715 = coord(1/7)
    
    Source
    Proceedings of the 4th International Workshop on Computational Terminology, COLING 2014, Dublin, Ireland, 23 August 2014. Eds.: Patrick Drouin et al. [https://www.deri.ie/sites/default/files/publications/the-acl-rd-tec.pdf]
  10. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.01
    0.01016603 = product of:
      0.07116221 = sum of:
        0.07116221 = sum of:
          0.03810662 = weight(_text_:design in 4436) [ClassicSimilarity], result of:
            0.03810662 = score(doc=4436,freq=2.0), product of:
              0.15288728 = queryWeight, product of:
                3.7598698 = idf(docFreq=2798, maxDocs=44218)
                0.04066292 = queryNorm
              0.24924651 = fieldWeight in 4436, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7598698 = idf(docFreq=2798, maxDocs=44218)
                0.046875 = fieldNorm(doc=4436)
          0.033055585 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
            0.033055585 = score(doc=4436,freq=2.0), product of:
              0.14239462 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04066292 = queryNorm
              0.23214069 = fieldWeight in 4436, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=4436)
      0.14285715 = coord(1/7)
    
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between speed performance and translation performance, and what form the translated result is presented in. About 100,000 Web pages translated in the last 4 months of 1997 are used for a quantitative study of online and real-time Web page translation.
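    A toy sketch of the first stage described above, dictionary-based query translation with corpus-frequency disambiguation (the dictionary, frequencies, and selection rule are invented placeholders, not MTIR's actual resources):

      # Toy bilingual dictionary and target-language corpus frequencies.
      BILINGUAL_DICT = {
          "資訊": ["information", "data"],
          "檢索": ["retrieval", "search"],
      }
      CORPUS_FREQ = {"information": 120, "data": 80, "retrieval": 90, "search": 60}

      def translate_query(terms: list) -> list:
          out = []
          for t in terms:
              candidates = BILINGUAL_DICT.get(t, [t])  # pass unknown terms through
              # Pick the candidate most frequent in the target-language corpus;
              # the paper's corpus-based selection is richer than unigram frequency.
              out.append(max(candidates, key=lambda c: CORPUS_FREQ.get(c, 0)))
          return out

      print(translate_query(["資訊", "檢索"]))  # -> ['information', 'retrieval']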
    Date
    16. 2.2000 14:22:39
  11. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.01
    0.0101017 = product of:
      0.070711896 = sum of:
        0.070711896 = sum of:
          0.03175552 = weight(_text_:design in 2541) [ClassicSimilarity], result of:
            0.03175552 = score(doc=2541,freq=2.0), product of:
              0.15288728 = queryWeight, product of:
                3.7598698 = idf(docFreq=2798, maxDocs=44218)
                0.04066292 = queryNorm
              0.20770542 = fieldWeight in 2541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7598698 = idf(docFreq=2798, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2541)
          0.03895638 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
            0.03895638 = score(doc=2541,freq=4.0), product of:
              0.14239462 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04066292 = queryNorm
              0.27358043 = fieldWeight in 2541, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2541)
      0.14285715 = coord(1/7)
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon, and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes the development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion, without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
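    For illustration, a minimal spelling-suggestion loop over a candidate vocabulary (difflib's similarity ratio stands in for ChemSpell's own measure, and the vocabulary is a toy substitute for the AZdict word list):

      from difflib import get_close_matches

      # Toy vocabulary standing in for the AZdict word list.
      VOCABULARY = ["toxicology", "benzene", "toluene", "acetaminophen", "arsenic"]

      def suggest(word: str, n: int = 3) -> list:
          """Return up to n spelling suggestions for a possibly misspelled term."""
          return get_close_matches(word.lower(), VOCABULARY, n=n, cutoff=0.6)

      print(suggest("toxicolgy"))  # -> ['toxicology']
      print(suggest("tolune"))     # -> ['toluene']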
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  12. Sprachtechnologie für eine dynamische Wirtschaft im Medienzeitalter - Language technologies for dynamic business in the age of the media - L'ingénierie linguistique au service de la dynamisation économique à l'ère du multimédia : Tagungsakten der XXVI. Jahrestagung der Internationalen Vereinigung Sprache und Wirtschaft e.V., 23.-25.11.2000 Fachhochschule Köln (2000) 0.01
    0.008769733 = product of:
      0.061388128 = sum of:
        0.061388128 = weight(_text_:sites in 5527) [ClassicSimilarity], result of:
          0.061388128 = score(doc=5527,freq=2.0), product of:
            0.21257097 = queryWeight, product of:
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.04066292 = queryNorm
            0.28878886 = fieldWeight in 5527, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5527)
      0.14285715 = coord(1/7)
    
    Content
    Includes the following contributions: WRIGHT, S.E.: Leveraging terminology resources across application boundaries: accessing resources in future integrated environments; PALME, K.: E-Commerce: Verhindert Sprache Business-to-business?; RÜEGGER, R.: Die Qualität der virtuellen Information als Wettbewerbsvorteil: Information im Internet ist Sprache - noch; SCHIRMER, K. u. J. HALLER: Zugang zu mehrsprachigen Nachrichten im Internet; WEISS, A. u. W. WIEDEN: Die Herstellung mehrsprachiger Informations- und Wissensressourcen in Unternehmen; FULFORD, H.: Monolingual or multilingual web sites? An exploratory study of UK SMEs; SCHMIDTKE-NIKELLA, M.: Effiziente Hypermediaentwicklung: Die Autorenentlastung durch eine Engine; SCHMIDT, R.: Maschinelle Text-Ton-Synchronisation in Wissenschaft und Wirtschaft; HELBIG, H. u.a.: Natürlichsprachlicher Zugang zu Informationsanbietern im Internet und zu lokalen Datenbanken; SIENEL, J. u.a.: Sprachtechnologien für die Informationsgesellschaft des 21. Jahrhunderts; ERBACH, G.: Sprachdialogsysteme für Telefondienste: Stand der Technik und zukünftige Entwicklungen; SUSEN, A.: Spracherkennung: Aktuelle Einsatzmöglichkeiten im Bereich der Telekommunikation; BENZMÜLLER, R.: Logox WebSpeech: die neue Technologie für sprechende Internetseiten; JAARANEN, K. u.a.: Webtran tools for in-company language support; SCHMITZ, K.-D.: Projektforschung und Infrastrukturen im Bereich der Terminologie: Wie kann die Wirtschaft davon profitieren?; SCHRÖTER, F. u. U. MEYER: Entwicklung sprachlicher Handlungskompetenz in Englisch mit Hilfe eines Multimedia-Sprachlernsystems; KLEIN, A.: Der Einsatz von Sprachverarbeitungstools beim Sprachenlernen im Intranet; HAUER, M.: Knowledge Management braucht Terminologie Management; HEYER, G. u.a.: Texttechnologische Anwendungen am Beispiel Text Mining
  13. Anizi, M.; Dichy, J.: Improving information retrieval in Arabic through a multi-agent approach and a rich lexical resource (2011) 0.01
    0.008769733 = product of:
      0.061388128 = sum of:
        0.061388128 = weight(_text_:sites in 4738) [ClassicSimilarity], result of:
          0.061388128 = score(doc=4738,freq=2.0), product of:
            0.21257097 = queryWeight, product of:
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.04066292 = queryNorm
            0.28878886 = fieldWeight in 4738, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.227637 = idf(docFreq=644, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4738)
      0.14285715 = coord(1/7)
    
    Abstract
    This paper addresses the optimization of information retrieval in Arabic. The results derived from the expanding development of Arabic-language sites are often spectacular. Nevertheless, several observations indicate that the responses remain disappointing, particularly when comparing users' requests with the quality of responses. One of the problems encountered by users is the time lost navigating between different URLs to find adequate responses. This is, in many cases, due to the absence of forms morphologically related to the search keyword. Such problems can be approached through a morphological analyzer drawing on the DIINAR.1 morpho-lexical resource. A second problem concerns the formulation of the query, which may prove ambiguous, as in everyday language. We then focus on contextual disambiguation based on a rich lexical resource that includes collocations and set expressions. The overall scheme of such a resource will only be hinted at here. Our approach leads to the elaboration of a multi-agent system, motivated by the need to solve problems encountered when using conventional methods of analysis and to improve the results of queries through better collaboration between different levels of analysis. We suggest resorting to four agents: morphological, morpho-lexical, contextualization, and an interface agent. These agents 'negotiate' and 'cooperate' throughout the analysis process, starting from the submission of the initial query and going on until an adequate query is obtained.
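    A skeletal sketch of the four-agent flow just described (every function body is a placeholder showing control flow only, not the authors' analysis logic):

      def morphological_agent(query: str) -> dict:
          return {"query": query, "forms": query.split()}  # would do real morphology

      def morpho_lexical_agent(state: dict) -> dict:
          state["expanded"] = state["forms"]  # would add morphologically related forms
          return state

      def contextualization_agent(state: dict) -> dict:
          state["disambiguated"] = state["expanded"]  # would use collocations
          return state

      def interface_agent(query: str) -> list:
          # In the paper the agents 'negotiate' and iterate; here they simply
          # cooperate in sequence until an adequate query is produced.
          state = morphological_agent(query)
          for agent in (morpho_lexical_agent, contextualization_agent):
              state = agent(state)
          return state["disambiguated"]

      print(interface_agent("improving arabic retrieval"))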
  14. Warner, A.J.: Natural language processing (1987) 0.01
    0.006296302 = product of:
      0.044074114 = sum of:
        0.044074114 = product of:
          0.08814823 = sum of:
            0.08814823 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
              0.08814823 = score(doc=337,freq=2.0), product of:
                0.14239462 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04066292 = queryNorm
                0.61904186 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  15. Hsinchun, C.: Knowledge-based document retrieval framework and design (1992) 0.01
    0.0062859626 = product of:
      0.044001736 = sum of:
        0.044001736 = product of:
          0.08800347 = sum of:
            0.08800347 = weight(_text_:design in 6686) [ClassicSimilarity], result of:
              0.08800347 = score(doc=6686,freq=6.0), product of:
                0.15288728 = queryWeight, product of:
                  3.7598698 = idf(docFreq=2798, maxDocs=44218)
                  0.04066292 = queryNorm
                0.57561016 = fieldWeight in 6686, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.7598698 = idf(docFreq=2798, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6686)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Abstract
    Presents research on the design of knowledge-based document retrieval systems in which a semantic network was adopted to represent subject knowledge and classification scheme knowledge, while experts' search strategies and user modelling capability were modelled as procedural knowledge. These functionalities were incorporated into a prototype knowledge-based retrieval system, Metacat. Describes the system's design, which was based on the blackboard architecture and was able to create a user profile, identify task requirements, suggest heuristics-based search strategies, perform semantic-based search assistance, and assist online query refinement.
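    A toy sketch of semantic-network-driven search assistance of the kind described (the network content and the expansion rule are invented for illustration):

      # Tiny semantic network; expanding a user term to related concepts
      # sketches the "semantic-based search assistance" idea.
      SEMANTIC_NET = {
          "information retrieval": {"indexing", "query refinement", "ranking"},
          "indexing": {"thesauri", "classification scheme"},
      }

      def suggest_terms(term: str, depth: int = 2) -> set:
          frontier, seen = {term}, set()
          for _ in range(depth):
              frontier = {n for t in frontier for n in SEMANTIC_NET.get(t, set())} - seen
              seen |= frontier
          return seen

      print(suggest_terms("information retrieval"))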
  16. Deventer, J.P. van; Kruger, C.J.; Johnson, R.D.: Delineating knowledge management through lexical analysis : a retrospective (2015) 0.01
    0.005930184 = product of:
      0.041511286 = sum of:
        0.041511286 = sum of:
          0.022228861 = weight(_text_:design in 3807) [ClassicSimilarity], result of:
            0.022228861 = score(doc=3807,freq=2.0), product of:
              0.15288728 = queryWeight, product of:
                3.7598698 = idf(docFreq=2798, maxDocs=44218)
                0.04066292 = queryNorm
              0.14539379 = fieldWeight in 3807, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7598698 = idf(docFreq=2798, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3807)
          0.019282425 = weight(_text_:22 in 3807) [ClassicSimilarity], result of:
            0.019282425 = score(doc=3807,freq=2.0), product of:
              0.14239462 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04066292 = queryNorm
              0.1354154 = fieldWeight in 3807, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.02734375 = fieldNorm(doc=3807)
      0.14285715 = coord(1/7)
    
    Abstract
    Purpose - Academic authors tend to define terms that meet their own needs. Knowledge Management (KM) is a term that comes to mind and is examined in this study. Lexicographical research identified KM terms used by authors from 1996 to 2006 in academic outlets to define KM. Data were collected based on strict criteria, which included that definitions should be unique instances. From 2006 onwards, no new unique definition instances could be identified, only repeated usage of existing ones. Analysis revealed that KM is directly defined by People (Person and Organisation), Processes (Codify, Share, Leverage, and Process) and Contextualised Content (Information). The paper aims to discuss these issues. Design/methodology/approach - The aim of this paper is to add to the body of knowledge in the KM discipline and supply KM practitioners and scholars with insight into what is commonly regarded to be KM, so as to reignite the debate on what one could consider as KM. The lexicon used by KM scholars was evaluated through the application of lexicographical research methods, as extended through Knowledge Discovery and Text Analysis methods. Findings - By simplifying term relationships through the application of lexicographical research methods, as extended through Knowledge Discovery and Text Analysis methods, it was found that KM is directly defined by People (Person and Organisation), Processes (Codify, Share, Leverage, Process) and Contextualised Content (Information). One would therefore be able to indicate that KM, from an academic point of view, refers to people processing contextualised content.
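    As a hedged sketch of the lexical-analysis idea, counting the words that recur across definition instances (the mini-corpus below is invented; the study's 1996-2006 corpus of unique definitions is far larger):

      from collections import Counter
      import re

      # Invented mini-corpus of KM definitions for illustration.
      DEFINITIONS = [
          "KM is the process by which an organisation shares and leverages information",
          "KM means people codifying information as knowledge",
      ]

      counts = Counter(w for d in DEFINITIONS for w in re.findall(r"[a-z]+", d.lower()))
      # Frequent content words point at the defining categories the study reports:
      # People (person, organisation), Processes (codify, share, leverage, process),
      # and Contextualised Content (information).
      print(counts.most_common(5))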
    Date
    20. 1.2015 18:30:22
  17. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.01
    0.0055092643 = product of:
      0.03856485 = sum of:
        0.03856485 = product of:
          0.0771297 = sum of:
            0.0771297 = weight(_text_:22 in 3164) [ClassicSimilarity], result of:
              0.0771297 = score(doc=3164,freq=2.0), product of:
                0.14239462 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04066292 = queryNorm
                0.5416616 = fieldWeight in 3164, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3164)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Source
    Computational linguistics. 22(1996) no.2, S.217-248
  18. Ruge, G.: A spreading activation network for automatic generation of thesaurus relationships (1991) 0.01
    0.0055092643 = product of:
      0.03856485 = sum of:
        0.03856485 = product of:
          0.0771297 = sum of:
            0.0771297 = weight(_text_:22 in 4506) [ClassicSimilarity], result of:
              0.0771297 = score(doc=4506,freq=2.0), product of:
                0.14239462 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04066292 = queryNorm
                0.5416616 = fieldWeight in 4506, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4506)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    8.10.2000 11:52:22
  19. Somers, H.: Example-based machine translation : Review article (1999) 0.01
    0.0055092643 = product of:
      0.03856485 = sum of:
        0.03856485 = product of:
          0.0771297 = sum of:
            0.0771297 = weight(_text_:22 in 6672) [ClassicSimilarity], result of:
              0.0771297 = score(doc=6672,freq=2.0), product of:
                0.14239462 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04066292 = queryNorm
                0.5416616 = fieldWeight in 6672, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6672)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    31. 7.1996 9:22:19
  20. New tools for human translators (1997) 0.01
    0.0055092643 = product of:
      0.03856485 = sum of:
        0.03856485 = product of:
          0.0771297 = sum of:
            0.0771297 = weight(_text_:22 in 1179) [ClassicSimilarity], result of:
              0.0771297 = score(doc=1179,freq=2.0), product of:
                0.14239462 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04066292 = queryNorm
                0.5416616 = fieldWeight in 1179, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1179)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    31. 7.1996 9:22:19

Languages

  • e 82
  • d 20
  • ru 2
  • m 1

Types

  • a 85
  • el 11
  • m 10
  • s 6
  • p 3
  • x 2
  • d 1