Search (1 result, page 1 of 1)

Yu, L.-C.; Wu, C.-H.; Chang, R.-Y.; Liu, C.-H.; Hovy, E.H.: Annotation and verification of sense pools in OntoNotes (2010) 0.01

0.0110593 = product of:
  0.030413074 = sum of:
    0.0064098886 = product of:
      0.012819777 = sum of:
        0.012819777 = weight(_text_:h in 4236) [ClassicSimilarity], result of:
          0.012819777 = score(doc=4236,freq=4.0), product of:
            0.0660481 = queryWeight, product of:
              2.4844491 = idf(docFreq=10020, maxDocs=44218)
              0.026584605 = queryNorm
            0.1940976 = fieldWeight in 4236, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.4844491 = idf(docFreq=10020, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4236)
      0.5 = coord(1/2)
    0.0061744633 = weight(_text_:a in 4236) [ClassicSimilarity], result of:
      0.0061744633 = score(doc=4236,freq=20.0), product of:
        0.030653298 = queryWeight, product of:
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.026584605 = queryNorm
        0.20142901 = fieldWeight in 4236, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.153047 = idf(docFreq=37942, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4236)
    0.016092705 = weight(_text_:r in 4236) [ClassicSimilarity], result of:
      0.016092705 = score(doc=4236,freq=2.0), product of:
        0.088001914 = queryWeight, product of:
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.026584605 = queryNorm
        0.18286766 = fieldWeight in 4236, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3102584 = idf(docFreq=4387, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4236)
    0.0017360178 = weight(_text_:s in 4236) [ClassicSimilarity], result of:
      0.0017360178 = score(doc=4236,freq=2.0), product of:
        0.028903782 = queryWeight, product of:
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.026584605 = queryNorm
        0.060061958 = fieldWeight in 4236, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.0872376 = idf(docFreq=40523, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4236)
  0.36363637 = coord(4/11)
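
The score breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that match the numbers shown here: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The values of queryNorm and fieldNorm are copied from the output rather than derived, and the function names are illustrative, not part of any library.

```python
import math

# Constants copied from the explain output above (not derived here).
MAX_DOCS = 44218
QUERY_NORM = 0.026584605
FIELD_NORM = 0.0390625


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency; assumed form that matches the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq):
    """Score of one term: queryWeight * fieldWeight, as in the tree above."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf * term_idf * FIELD_NORM
    return query_weight * field_weight


# (freq, docFreq) pairs for the four matching terms in doc 4236.
score_h = term_score(4.0, 10020) * 0.5  # nested clause with coord(1/2)
score_a = term_score(20.0, 37942)
score_r = term_score(2.0, 4387)
score_s = term_score(2.0, 40523)

# Outer coord(4/11): 4 of the 11 query clauses matched.
total = (score_h + score_a + score_r + score_s) * (4 / 11)
print(round(total, 7))
```

The rare term "r" contributes the most despite a frequency of only 2, because its idf is the highest of the four; the very common term "s" contributes almost nothing.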

Abstract: The paper describes OntoNotes, a multilingual (English, Chinese and Arabic) corpus with large-scale semantic annotations, including predicate-argument structure, word senses, ontology linking, and coreference. The underlying semantic model of OntoNotes involves word senses that are grouped into so-called sense pools, i.e., sets of near-synonymous senses of words. Such information is useful for many applications, including query expansion for information retrieval (IR) systems, (near-)duplicate detection for text summarization systems, and alternative word selection for writing support systems. Although a sense pool provides a set of near-synonymous senses of words, it does not indicate whether two words in a pool are interchangeable in practical use. Therefore, this paper devises an unsupervised algorithm that combines Google n-grams with a statistical test to determine whether a word in a pool can be substituted by other words in the same pool. The n-gram features are used to measure the degree of context mismatch for a substitution. The statistical test is then applied to determine whether the substitution is adequate based on the degree of mismatch. The proposed method is compared with a supervised method, namely Linear Discriminant Analysis (LDA). Experimental results show that the proposed unsupervised method achieves performance comparable to that of the supervised method.
Source: Information processing and management. 46(2010) no.4, pp.436-447
Type: a