Search (2 results, page 1 of 1)

  • author_ss:"Song, D."
  • theme_ss:"Computerlinguistik"
  1. Hoenkamp, E.; Bruza, P.D.; Song, D.; Huang, Q.: An effective approach to verbose queries using a limited dependencies language model (2009) 0.00
    0.0036906586 = product of:
      0.011071975 = sum of:
        0.011071975 = weight(_text_:in in 2122) [ClassicSimilarity], result of:
          0.011071975 = score(doc=2122,freq=14.0), product of:
            0.069613084 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.051176514 = queryNorm
            0.15905021 = fieldWeight in 2122, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.03125 = fieldNorm(doc=2122)
      0.33333334 = coord(1/3)
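
    The explain tree above is nested TF-IDF arithmetic. Below is a minimal sketch in Python (not part of the original record) that reproduces the final score of this hit from the constants shown, assuming Lucene's ClassicSimilarity conventions: tf = sqrt(freq), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and final score = coord * sum of clause scores.

      import math

      # Constants copied from the explain tree for the term "in" in doc 2122.
      idf = 1.3602545           # idf(docFreq=30841, maxDocs=44218), roughly 1 + ln(44218 / 30842)
      query_norm = 0.051176514  # queryNorm
      freq = 14.0               # term frequency of "in" in the document
      field_norm = 0.03125      # fieldNorm(doc=2122)
      coord = 1.0 / 3.0         # coord(1/3): one of three query clauses matched

      query_weight = idf * query_norm                    # ~ 0.069613084
      field_weight = math.sqrt(freq) * idf * field_norm  # ~ 0.15905021
      score = coord * query_weight * field_weight        # ~ 0.0036906586
      print(score)

    The same arithmetic with freq = 6.0 and fieldNorm = 0.046875 yields the 0.0036241547 score of the second hit below.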
    
    Abstract
    Intuitively, any 'bag of words' approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies into more useful statistics. This is done in three steps. First, the term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation than just the document's initial distribution. A secondary contribution is to investigate the practical application of this representation as the queries become increasingly verbose. In the experiments (based on Lemur's search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par with, or better than, more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.
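
    The abstract's core step, replacing a query's raw term distribution with the stationary distribution of its term co-occurrence Markov chain, can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the toy co-occurrence matrix, the smoothing constant alpha, and the function name are assumptions made for the example.

      import numpy as np

      def stationary_distribution(cooc, alpha=0.01, tol=1e-10, max_iter=1000):
          """Stationary distribution of a term co-occurrence Markov chain.
          cooc[i, j] counts how often term i co-occurs with term j. The small
          uniform smoothing alpha keeps the chain ergodic (irreducible and
          aperiodic), so the limit is unique and independent of the starting
          state, which is what the paper's ergodicity argument guarantees."""
          P = cooc + alpha                              # smooth the raw counts
          P = P / P.sum(axis=1, keepdims=True)          # row-normalise into transition probabilities
          pi = np.full(cooc.shape[0], 1.0 / cooc.shape[0])  # arbitrary initial distribution
          for _ in range(max_iter):                     # power iteration to the fixed point
              nxt = pi @ P
              if np.abs(nxt - pi).sum() < tol:
                  break
              pi = nxt
          return pi

      # Toy example: three query terms with asymmetric co-occurrence counts.
      cooc = np.array([[0., 4., 1.],
                       [4., 0., 2.],
                       [1., 2., 0.]])
      print(stationary_distribution(cooc))  # used in place of the raw query term distribution

    Ranking then proceeds in the customary language-modeling way, scoring each document by how well its (likewise stationary) model explains the query distribution.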
    Series
    Lecture notes in computer science : advances in information retrieval theory; 5766
  2. Clark, M.; Kim, Y.; Kruschwitz, U.; Song, D.; Albakour, D.; Dignum, S.; Beresi, U.C.; Fasli, M.; De Roeck, A.: Automatically structuring domain knowledge from text : an overview of current research (2012) 0.00
    0.0036241547 = product of:
      0.010872464 = sum of:
        0.010872464 = weight(_text_:in in 2738) [ClassicSimilarity], result of:
          0.010872464 = score(doc=2738,freq=6.0), product of:
            0.069613084 = queryWeight, product of:
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.051176514 = queryNorm
            0.1561842 = fieldWeight in 2738, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.3602545 = idf(docFreq=30841, maxDocs=44218)
              0.046875 = fieldNorm(doc=2738)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper presents an overview of automatic methods for building domain knowledge structures (domain models) from text collections. Applications of domain models have a long history within knowledge engineering and artificial intelligence. In the last couple of decades they have surfaced noticeably as a useful tool within natural language processing, information retrieval and semantic web technology. Inspired by the ubiquitous spread of domain model structures emerging across several research disciplines, we give an overview of the current research landscape and of some of the techniques and approaches involved. We will also discuss trade-offs between different approaches and point to some recent trends.
    Content
    Contribution to a special issue "Soft Approaches to IA on the Web". See: doi:10.1016/j.ipm.2011.07.002.