Search (500 results, page 1 of 25)

  • theme_ss:"Computerlinguistik"
  1. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.38
    0.3754033 = product of:
      0.6569558 = sum of:
        0.043041058 = weight(_text_:web in 563) [ClassicSimilarity], result of:
          0.043041058 = score(doc=563,freq=8.0), product of:
            0.09947448 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.030480823 = queryNorm
            0.43268442 = fieldWeight in 563, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.14523487 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.14523487 = score(doc=563,freq=2.0), product of:
            0.25841674 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.030480823 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.14523487 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.14523487 = score(doc=563,freq=2.0), product of:
            0.25841674 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.030480823 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.006226926 = weight(_text_:information in 563) [ClassicSimilarity], result of:
          0.006226926 = score(doc=563,freq=2.0), product of:
            0.053508412 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.030480823 = queryNorm
            0.116372846 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.018488824 = weight(_text_:retrieval in 563) [ClassicSimilarity], result of:
          0.018488824 = score(doc=563,freq=2.0), product of:
            0.092201896 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.030480823 = queryNorm
            0.20052543 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.14523487 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.14523487 = score(doc=563,freq=2.0), product of:
            0.25841674 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.030480823 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.14523487 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.14523487 = score(doc=563,freq=2.0), product of:
            0.25841674 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.030480823 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.008259461 = product of:
          0.024778383 = sum of:
            0.024778383 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.024778383 = score(doc=563,freq=2.0), product of:
                0.10673865 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.030480823 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.33333334 = coord(1/3)
      0.5714286 = coord(8/14)
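The explain tree above can be re-derived by hand. The sketch below (plain Python, not Lucene code) plugs the tree's tf, idf, and norm values into ClassicSimilarity's TF-IDF formula: each leaf weight is queryWeight (idf × queryNorm) times fieldWeight (√tf × idf × fieldNorm), the leaves are summed, and the sum is scaled by the coord factor (matched clauses / total clauses):

```python
# Re-derive the explain tree for doc 563 under Lucene ClassicSimilarity.
#   weight      = queryWeight * fieldWeight
#   queryWeight = idf * queryNorm
#   fieldWeight = sqrt(tf) * idf * fieldNorm
#   score       = sum(weights) * coord(matched / total clauses)
import math

QUERY_NORM = 0.030480823
FIELD_NORM = 0.046875  # fieldNorm(doc=563)

def weight(freq, idf):
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

# (freq, idf) pairs of the matching clauses shown in the tree
leaves = [
    (8.0, 3.2635105),   # _text_:web
    (2.0, 8.478011),    # _text_:2f (four identical clauses)
    (2.0, 8.478011),
    (2.0, 8.478011),
    (2.0, 8.478011),
    (2.0, 1.7554779),   # _text_:information
    (2.0, 3.024915),    # _text_:retrieval
]
total = sum(weight(f, i) for f, i in leaves)
total += weight(2.0, 3.5018296) / 3  # _text_:22, scaled by its inner coord(1/3)
score = total * 8 / 14               # outer coord(8/14): 8 of 14 clauses matched
print(score)  # ~0.37540, matching the 0.38 shown for the first hit
```

Up to float32 rounding in Lucene's own output, this reproduces every intermediate value in the tree (e.g. 0.043041058 for the web clause and 0.3754033 for the final score).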
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
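The pipeline the abstract describes — score adjacent word pairs with a word association measure, then keep the most cohesive candidates as multi-word terms — can be sketched roughly as follows. SCP (symmetric conditional probability, the measure LocalMaxs is commonly paired with) stands in for the thesis's three new measures, and the toy corpus is invented for illustration:

```python
# Illustrative association-based multi-word term extraction (not the thesis's
# actual measures): rank bigrams by SCP, a cohesion score.
from collections import Counter

tokens = ("information retrieval is fun . "
          "information retrieval is hard . "
          "information retrieval works . "
          "information overload is real .").split()

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
n = len(tokens)

def scp(x, y):
    """Symmetric conditional probability: P(x,y)^2 / (P(x) * P(y))."""
    p_xy = bigrams[(x, y)] / (n - 1)
    return p_xy ** 2 / ((unigrams[x] / n) * (unigrams[y] / n))

ranked = sorted(bigrams, key=lambda b: scp(*b), reverse=True)
print(ranked[0])  # ('information', 'retrieval') — the most cohesive pair
```

LocalMaxs would additionally compare each n-gram's score against its sub- and super-grams and keep only local maxima; the ranking step above is the shared core.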
    Content
A thesis presented to the University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  2. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.33
    
    Content
Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  3. Noever, D.; Ciolino, M.: The Turing deception (2022) 0.22
    
    Source
https://arxiv.org/abs/2212.06721
  4. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.03
    
    Abstract
A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations, inferred when two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
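The two association types the abstract distinguishes can be illustrated with a toy corpus (invented here, not from the article): syntagmatic neighbours are terms that co-occur with a term (first-order association), while paradigmatic neighbours are terms that occur in similar contexts, even if they never co-occur with it (second-order association):

```python
# Toy illustration of syntagmatic vs. paradigmatic association.
from collections import Counter, defaultdict

sentences = [
    "doctors treat patients in hospitals",
    "nurses treat patients in clinics",
    "doctors prescribe medicine",
    "nurses prescribe medicine",
]

# Co-occurrence profile: term -> Counter of same-sentence neighbours.
profile = defaultdict(Counter)
for s in sentences:
    words = s.split()
    for w in words:
        for v in words:
            if v != w:
                profile[w][v] += 1

def syntagmatic(term, k=3):
    # Terms that co-occur with `term` (first-order association).
    return [w for w, _ in profile[term].most_common(k)]

def paradigmatic(term):
    # Term whose co-occurrence profile most overlaps `term`'s (second-order).
    overlap = lambda a, b: sum(min(a[x], b[x]) for x in a)
    others = [w for w in profile if w != term]
    return max(others, key=lambda w: overlap(profile[term], profile[w]))

print(paradigmatic("doctors"))  # 'nurses': never co-occurs with 'doctors', but shares its contexts
```

The article's contribution is a formal corpus-based model that combines both signal types during expansion; the sketch only separates the two notions.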
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.8, S.1577-1596
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  5. Yang, C.C.; Luk, J.: Automatic generation of English/Chinese thesaurus based on a parallel corpus in laws (2003) 0.03
    
    Abstract
The information available in languages other than English on the World Wide Web is increasing significantly. According to a report from Computer Economics in 1999, 54% of Internet users are English speakers ("English Will Dominate Web for Only Three More Years," Computer Economics, July 9, 1999, http://www.computereconomics.com/new4/pr/pr990610.html). However, it is predicted that there will be only a 60% increase in Internet users among English speakers versus a 150% growth among non-English speakers over the next five years. By 2005, 57% of Internet users will be non-English speakers. A report by CNN.com in 2000 showed that the number of Internet users in China had increased from 8.9 million to 16.9 million between January and June 2000 ("Report: China Internet users double to 17 million," CNN.com, July, 2000, http://cnn.org/2000/TECH/computing/07/27/china.internet.reut/index.html). According to Nielsen/NetRatings, there was a dramatic leap from 22.5 million to 56.6 million Internet users from 2001 to 2002. China had become the second largest at-home Internet population in the world in 2002 (the US's Internet population was 166 million) (Robyn Greenspan, "China Pulls Ahead of Japan," Internet.com, April 22, 2002, http://cyberatlas.internet.com/big-picture/geographics/article/0,,5911_1013841,00.html). All of this evidence reveals the importance of cross-lingual research to satisfy these needs in the near future. Digital library research has in the past focused on structural and semantic interoperability. Searching and retrieving objects across variations in protocols, formats and disciplines have been widely explored (Schatz, B., & Chen, H. (1999). Digital libraries: technological advances and social impacts. IEEE Computer, Special Issue on Digital Libraries, February, 32(2), 45-50; Chen, H., Yen, J., & Yang, C.C. (1999). International activities: development of Asian digital libraries. IEEE Computer, Special Issue on Digital Libraries, 32(2), 48-49.).
However, research on crossing language boundaries, especially between European and Oriental languages, is still at an initial stage. In this proposal, we focus on cross-lingual semantic interoperability by developing automatic generation of a cross-lingual thesaurus based on an English/Chinese parallel corpus. When searchers encounter retrieval problems, professional librarians usually consult a thesaurus to identify other relevant vocabulary. For the problem of searching across language boundaries, a cross-lingual thesaurus, generated by co-occurrence analysis and a Hopfield network, can be used to generate additional semantically relevant terms that cannot be obtained from a dictionary. In particular, the automatically generated cross-lingual thesaurus is able to capture unknown words that do not exist in a dictionary, such as names of persons, organizations, and events. Due to Hong Kong's unique historical background, both English and Chinese are used as official languages in all legal documents. Therefore, English/Chinese cross-lingual information retrieval is critical for applications in the courts and the government. In this paper, we develop an automatic thesaurus using a Hopfield network, based on a parallel corpus collected from the Web site of the Department of Justice of the Hong Kong Special Administrative Region (HKSAR) Government. Experiments are conducted to measure the precision and recall of the automatically generated English/Chinese thesaurus. The results show that such a thesaurus is a promising tool for retrieving relevant terms, especially in a language other than that of the input term. The direct translation of the input term can also be retrieved in most cases.
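The co-occurrence-plus-Hopfield-network idea in the abstract above can be loosely sketched as spreading activation over a term co-occurrence matrix: activate the query term, propagate activation along co-occurrence links, and read off the most strongly activated terms, which may be in the other language if the matrix was built from an aligned corpus. The terms and weights below are invented for illustration and are not from the paper's corpus:

```python
# Loose sketch of spreading activation over a bilingual co-occurrence matrix
# (illustrative weights, not Yang & Luk's actual model or data).
terms = ["court", "judge", "法院", "法官", "contract"]
W = [  # symmetric co-occurrence weights from a hypothetical aligned corpus
    [0.0, 0.6, 0.8, 0.3, 0.2],  # court
    [0.6, 0.0, 0.3, 0.8, 0.1],  # judge
    [0.8, 0.3, 0.0, 0.6, 0.1],  # 法院 ("court")
    [0.3, 0.8, 0.6, 0.0, 0.1],  # 法官 ("judge")
    [0.2, 0.1, 0.1, 0.1, 0.0],  # contract
]

def related(term, steps=2, decay=0.5):
    """Spread activation from `term` over co-occurrence links, Hopfield-style."""
    a = [1.0 if t == term else 0.0 for t in terms]
    for _ in range(steps):
        spread = [sum(W[i][j] * a[j] for j in range(len(terms)))
                  for i in range(len(terms))]
        a = [a[i] + decay * spread[i] for i in range(len(terms))]
    return sorted((t for t in terms if t != term),
                  key=lambda t: -a[terms.index(t)])

print(related("court")[0])  # the Chinese equivalent 法院 surfaces first
```

A real Hopfield network would iterate a thresholded update to convergence; the linear decay update here keeps the sketch short while showing how indirect (two-hop) associations reinforce the cross-lingual equivalent.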
    Footnote
    Teil eines Themenheftes: "Web retrieval and mining: A machine learning perspective"
    Source
    Journal of the American Society for Information Science and technology. 54(2003) no.7, S.671-682
  6. Chowdhury, G.G.: Natural language processing (2002) 0.03
    
    Abstract
Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform desired tasks. The foundations of NLP lie in a number of disciplines, namely computer and information sciences, linguistics, mathematics, electrical and electronic engineering, artificial intelligence and robotics, and psychology. Applications of NLP include a number of fields of study, such as machine translation, natural language text processing and summarization, user interfaces, multilingual and cross-language information retrieval (CLIR), speech recognition, artificial intelligence, and expert systems. One important application area that is relatively new and has not been covered in previous ARIST chapters on NLP relates to the proliferation of the World Wide Web and digital libraries.
    Source
    Annual review of information science and technology. 37(2003), S.51-90
  7. Semantik, Lexikographie und Computeranwendungen : Workshop ... (Bonn) : 1995.01.27-28 (1996) 0.02
    
    Date
    14. 4.2007 10:04:22
    RSWK
    Lexikographie / Semantik / Kongress / Bonn <1995>
    Series
    Sprache und Information ; 33
    Subject
    Lexikographie / Semantik / Kongress / Bonn <1995>
  8. Egger, W.: Helferlein für jedermann : Elektronische Wörterbücher (2004) 0.02
    
    Abstract
Countless online dictionaries and a number of electronic dictionaries, some of them excellent, are deliberately not covered here, since their advantages are partly offset by the following drawbacks: they require an Internet connection or a CD-ROM, and invoking the dictionaries or switching the translation direction is time-consuming.
    Object
    PC-Bibliothek
  9. Mustafa el Hadi, W.; Jouis, C.: Natural language processing-based systems for terminological construction and their contribution to information retrieval (1996) 0.02
    
    Abstract
    This paper surveys the capacity of natural language processing (NLP) systems to identify terms or concept names related to a specific field of knowledge (construction of a reference terminology) and the logico-semantic relations that hold between them. The scope of our study is limited to French-language NLP systems whose purpose is automatic term identification, with terms grounded in textual areas providing access keys to information
    Source
    TKE'96: Terminology and knowledge engineering. Proceedings 4th International Congress on Terminology and Knowledge Engineering, 26.-28.8.1996, Wien. Ed.: C. Galinski u. K.-D. Schmitz
  10. Linguistik und neue Medien (1998) 0.02
    
    Footnote
    Publication from a congress held in Leipzig in 1997
    RSWK
    Lexikographie / Neue Medien / Kongress / Leipzig <1997> (2134)
    Syntaktische Analyse / Neue Medien / Kongress / Leipzig <1997> (2134)
    Subject
    Lexikographie / Neue Medien / Kongress / Leipzig <1997> (2134)
    Syntaktische Analyse / Neue Medien / Kongress / Leipzig <1997> (2134)
  11. Kreymer, O.: ¬An evaluation of help mechanisms in natural language information retrieval systems (2002) 0.02
    
    Abstract
    The field of natural language processing (NLP) is driving rapid changes in the design of information retrieval systems and human-computer interaction. While natural language is regarded as the most effective tool for information retrieval in a contemporary information environment, systems that use it are only beginning to emerge. This study attempts to evaluate the current state of NLP information retrieval systems from the user's point of view: what techniques do these systems use to guide their users through the search process? The analysis focused on the structure and components of the systems' help mechanisms. The results demonstrated that systems which claimed to use natural language searching in fact employed a wide range of information retrieval techniques, from real natural language processing to Boolean searching. As a result, the user-assistance mechanisms of these systems also varied. While pseudo-NLP systems would suit a more traditional method of instruction, real NLP systems primarily utilised the methods of explanation and user-system dialogue.
    Source
    Online information review. 26(2002) no.1, S.30-39
  12. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.02
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
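    The spelling-"suggestion" idea described in this abstract can be illustrated with a minimal sketch: rank a vocabulary list by string similarity to a possibly misspelled query term. This is a hedged toy example, not AZdict/ChemSpell internals; the vocabulary, threshold, and function name `suggest` are invented for illustration.

    ```python
    from difflib import SequenceMatcher

    def suggest(word, vocabulary, max_results=3):
        """Rank vocabulary entries by string similarity to a (possibly
        misspelled) query term and return the closest matches."""
        scored = sorted(
            vocabulary,
            key=lambda entry: SequenceMatcher(None, word.lower(), entry.lower()).ratio(),
            reverse=True,
        )
        return scored[:max_results]

    # A misspelled TOXNET-style query term and a tiny invented vocabulary:
    vocab = ["toxicology", "toxin", "taxonomy", "toxicity"]
    print(suggest("toxicolgy", vocab, 1))
    ```

    A production system like the one described would additionally use morphological word attributes and chemical-name-aware similarity rather than plain character-level matching.
    
    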
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  13. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.02
    
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual-dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper-name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between speed and translation quality, and in what form the translated result is presented. About 100,000 Web pages translated in the last 4 months of 1997 are used for a quantitative study of online and real-time Web page translation.
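    The dictionary-based query-translation step can be sketched as a term-by-term lexicon lookup. The toy bilingual dictionary and romanised Chinese entries below are invented for illustration and stand in for MTIR's actual resources, which also rank candidate translations using monolingual corpus statistics.

    ```python
    # Hypothetical bilingual lexicon (English -> romanised Chinese).
    # A real system would hold multiple candidates per term and rank
    # them with target-language corpus statistics.
    bilingual_dict = {
        "computer": ["diannao"],
        "network": ["wangluo"],
        "translation": ["fanyi"],
    }

    def translate_query(query):
        """Translate a source-language query term by term, passing
        untranslatable terms (e.g. proper names) through unchanged."""
        translated = []
        for term in query.lower().split():
            translated.extend(bilingual_dict.get(term, [term]))
        return " ".join(translated)

    print(translate_query("computer network"))  # -> "diannao wangluo"
    ```

    Terms missing from the lexicon are where the paper's machine transliteration algorithm would take over for proper names.
    
    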
    Date
    16. 2.2000 14:22:39
    Source
    Journal of the American Society for Information Science. 51(2000) no.3, S.281-296
  14. Yang, C.C.; Li, K.W.: Automatic construction of English/Chinese parallel corpora (2003) 0.02
    
    Abstract
    As the demand for global information increases significantly, multilingual corpora have become a valuable linguistic resource for applications in cross-lingual information retrieval and natural language processing. In order to cross the boundaries that exist between different languages, dictionaries are the most typical tools. However, the general-purpose dictionary is less sensitive to both genre and domain. It is also impractical to manually construct tailored bilingual dictionaries or sophisticated multilingual thesauri for large applications. Corpus-based approaches, which do not have the limitations of dictionaries, provide a statistical translation model with which to cross the language boundary. There are many domain-specific parallel or comparable corpora that are employed in machine translation and cross-lingual information retrieval. Most of these are corpora between Indo-European languages, such as English/French and English/Spanish. The Asian/Indo-European corpus, especially the English/Chinese corpus, is relatively sparse. The objective of the present research is to construct an English/Chinese parallel corpus automatically from the World Wide Web. In this paper, an alignment method is presented which is based on dynamic programming to identify the one-to-one Chinese and English title pairs. The method includes alignment at title level, word level and character level. The longest common subsequence (LCS) is applied to find the most reliable Chinese translation of an English word. As one word in one language may translate into two or more repeated words in the other, the edit operation deletion is used to resolve the redundancy. A score function is then proposed to determine the optimal title pairs. Experiments have been conducted to investigate the performance of the proposed method using the daily press release articles by the Hong Kong SAR government as the test bed. The precision of the result is 0.998 while the recall is 0.806. The release articles and speech articles published by the Hongkong & Shanghai Banking Corporation Limited are also used to test our method; the precision is 1.00 and the recall is 0.948.
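    The longest-common-subsequence step of the alignment method is the standard dynamic program; a minimal sketch follows. The normalised similarity used here to compare title pairs is a simplification for illustration, not the paper's exact score function.

    ```python
    def lcs_length(a, b):
        """Length of the longest common subsequence of two sequences,
        via dynamic programming in O(len(a) * len(b))."""
        m, n = len(a), len(b)
        table = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if a[i - 1] == b[j - 1]:
                    table[i][j] = table[i - 1][j - 1] + 1
                else:
                    table[i][j] = max(table[i - 1][j], table[i][j - 1])
        return table[m][n]

    def similarity(a, b):
        """Normalised LCS similarity in [0, 1]; candidate title pairs
        with the highest score would be kept as alignments."""
        return 2 * lcs_length(a, b) / (len(a) + len(b)) if (a or b) else 0.0

    print(similarity("press release", "press releases"))
    ```

    In the paper the same idea is applied at title, word, and character level, with deletion edits resolving repeated translations.
    
    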
    Source
    Journal of the American Society for Information Science and technology. 54(2003) no.8, S.730-742
  15. Li, Q.; Chen, Y.P.; Myaeng, S.-H.; Jin, Y.; Kang, B.-Y.: Concept unification of terms in different languages via web mining for Information Retrieval (2009) 0.02
    
    Abstract
    For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalents in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalents in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries which cannot be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.
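    Treating an extracted English term and its native-language equivalent as a single index unit can be sketched as a normalisation table applied at indexing time. The mapping, romanised forms, and concept IDs below are invented for illustration; the paper mines this equivalence from the Web rather than hand-coding it.

    ```python
    # Hypothetical unification table: surface forms -> canonical concept ID.
    unification = {
        "world wide web": "CONCEPT:WWW",
        "wanwei wang": "CONCEPT:WWW",   # invented romanised native equivalent
    }

    def index_units(phrases):
        """Map each phrase to its unified concept where one is known, so
        equivalent terms in different languages share one index entry."""
        return [unification.get(p.lower(), p.lower()) for p in phrases]

    print(index_units(["World Wide Web", "browser"]))
    ```

    With both surface forms collapsing to one entry, a query in either language retrieves documents that use the other.
    
    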
    Source
    Information processing and management. 45(2009) no.2, S.246-262
  16. Rahmstorf, G.: Rückkehr von Ordnung in die Informationstechnik? (2000) 0.02
    
    Series
    Gemeinsamer Kongress der Bundesvereinigung Deutscher Bibliotheksverbände e.V. (BDB) und der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. (DGI); Bd.1)(Tagungen der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V.; Bd.3
    Source
    Information und Öffentlichkeit: 1. Gemeinsamer Kongress der Bundesvereinigung Deutscher Bibliotheksverbände e.V. (BDB) und der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. (DGI), Leipzig, 20.-23.3.2000. Zugleich 90. Deutscher Bibliothekartag, 52. Jahrestagung der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. (DGI). Hrsg.: G. Ruppelt u. H. Neißer
  17. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word (2006) 0.02
    
    Abstract
    Word usage is of interest to linguists for its own sake as well as to social scientists and others who seek to track the spread of ideas, for example, in public debates over political decisions. The historical evolution of language can be analyzed with the tools of corpus linguistics through evolving corpora and the Web. But word usage statistics can only be gathered for known words. In this article, techniques are described and tested for identifying new words from the Web, focusing on the case when the words are related to a topic and have a hybrid form with a common sequence of letters. The results highlight the need to employ a combination of search techniques and show the wide potential of hybrid word family investigations in linguistics and social science.
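    Finding members of a hybrid word family that share a common sequence of letters can be sketched as a pattern match over harvested Web text. The corpus snippet and the stem "blog" below are invented examples, not the article's data; the article's full procedure combines several search techniques beyond this single pattern.

    ```python
    import re

    def hybrid_words(text, stem):
        """Find candidate new words embedding a known stem: the stem with
        extra letters before and/or after it, within one word."""
        pattern = re.compile(
            rf"\b\w+{re.escape(stem)}\w*\b|\b{re.escape(stem)}\w+\b",
            re.IGNORECASE,
        )
        return sorted({m.group(0).lower() for m in pattern.finditer(text)})

    sample = "Bloggers and the blogosphere: warblogs and moblogging spread fast."
    print(hybrid_words(sample, "blog"))
    ```

    The pattern deliberately excludes the bare stem itself, since only derived forms signal an emergent word family.
    
    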
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.10, S.1326-1337
  18. Airio, E.: Who benefits from CLIR in web retrieval? (2008) 0.02
    
    Abstract
    Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target-language queries. Thus, the author asked participants to formulate a source-language query and a target-language query for each task. The source-language queries were translated into the target language utilizing a dictionary-based system. For English-German, machine translation was also utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that dictionary coverage had an effect on the results. On average, the results of query translation were better than in traditional laboratory tests. Originality/value - This research shows that query translation on the web is beneficial especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.
  19. Sprachtechnologie für eine dynamische Wirtschaft im Medienzeitalter - Language technologies for dynamic business in the age of the media - L'ingénierie linguistique au service de la dynamisation économique à l'ère du multimédia : Tagungsakten der XXVI. Jahrestagung der Internationalen Vereinigung Sprache und Wirtschaft e.V., 23.-25.11.2000 Fachhochschule Köln (2000) 0.02
    
    Content
    Contains the contributions: WRIGHT, S.E.: Leveraging terminology resources across application boundaries: accessing resources in future integrated environments; PALME, K.: E-Commerce: Verhindert Sprache Business-to-business?; RÜEGGER, R.: Die Qualität der virtuellen Information als Wettbewerbsvorteil: Information im Internet ist Sprache - noch; SCHIRMER, K. u. J. HALLER: Zugang zu mehrsprachigen Nachrichten im Internet; WEISS, A. u. W. WIEDEN: Die Herstellung mehrsprachiger Informations- und Wissensressourcen in Unternehmen; FULFORD, H.: Monolingual or multilingual web sites? An exploratory study of UK SMEs; SCHMIDTKE-NIKELLA, M.: Effiziente Hypermediaentwicklung: Die Autorenentlastung durch eine Engine; SCHMIDT, R.: Maschinelle Text-Ton-Synchronisation in Wissenschaft und Wirtschaft; HELBIG, H. u.a.: Natürlichsprachlicher Zugang zu Informationsanbietern im Internet und zu lokalen Datenbanken; SIENEL, J. u.a.: Sprachtechnologien für die Informationsgesellschaft des 21. Jahrhunderts; ERBACH, G.: Sprachdialogsysteme für Telefondienste: Stand der Technik und zukünftige Entwicklungen; SUSEN, A.: Spracherkennung: Aktuelle Einsatzmöglichkeiten im Bereich der Telekommunikation; BENZMÜLLER, R.: Logox WebSpeech: die neue Technologie für sprechende Internetseiten; JAARANEN, K. u.a.: Webtran tools for in-company language support; SCHMITZ, K.-D.: Projektforschung und Infrastrukturen im Bereich der Terminologie: Wie kann die Wirtschaft davon profitieren?; SCHRÖTER, F. u. U. MEYER: Entwicklung sprachlicher Handlungskompetenz in Englisch mit Hilfe eines Multimedia-Sprachlernsystems; KLEIN, A.: Der Einsatz von Sprachverarbeitungstools beim Sprachenlernen im Intranet; HAUER, M.: Knowledge Management braucht Terminologie Management; HEYER, G. u.a.: Texttechnologische Anwendungen am Beispiel Text Mining
    Imprint
    Wien : Termnet
    Theme
    Information Resources Management
  20. Wright, S.E.: Leveraging terminology resources across application boundaries : accessing resources in future integrated environments (2000) 0.02
    Abstract
    The title for this conference, stated in English, is Language Technology for a Dynamic Economy in the Media Age. The question arises as to what media we are dealing with and to what extent we are moving away from the reality of different media to a world in which all sub-categories flow together into a unified stream of information that is constantly rescaled to appear in different hardware configurations. A few years ago, people who were interested in sharing data or getting different electronic "boxes" to talk to each other were focused on two major aspects: 1) developing data conversion technology, and 2) convincing potential users that sharing information was an even remotely interesting option. Although some content "owners" are still reticent about releasing their data, it has become dramatically apparent in the Web environment that a broad range of users does indeed want this technology. Even as researchers struggle with the remaining technical, legal, and ethical impediments that stand in the way of unlimited information access to existing multi-platform resources, the future view of the world will no longer be as obsessed with conversion capability as it will be with creating content, with an eye to morphing technologies that will enable the delivery of that content from an open-standards-based format such as XML (eXtensible Markup Language), MPEG (Moving Picture Experts Group), or WAP (Wireless Application Protocol) to a rich variety of display options
    Imprint
    Wien : Termnet

Types

  • a 427
  • m 41
  • el 30
  • s 21
  • x 11
  • p 3
  • d 2
  • b 1
  • r 1
