Search (2 results, page 1 of 1)

  • author_ss:"Xu, D."
  1. Xu, D.; Cheng, G.; Qu, Y.: Preferences in Wikipedia abstracts : empirical findings and implications for automatic entity summarization (2014) 0.01
    0.0063353376 = product of:
      0.02534135 = sum of:
        0.02534135 = weight(_text_:data in 2700) [ClassicSimilarity], result of:
          0.02534135 = score(doc=2700,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.2096163 = fieldWeight in 2700, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2700)
      0.25 = coord(1/4)
    
    Abstract
    The volume of entity-centric structured data grows rapidly on the Web. The description of an entity, composed of property-value pairs (a.k.a. features), has become very large in many applications. To avoid information overload, efforts have been made to automatically select a limited number of features to be shown to the user based on certain criteria, which is called automatic entity summarization. However, to the best of our knowledge, there is a lack of extensive studies on how humans rank and select features in practice, which can provide empirical support and inspire future research. In this article, we present a large-scale statistical analysis of the descriptions of entities provided by DBpedia and the abstracts of their corresponding Wikipedia articles, to empirically study, along several different dimensions, which kinds of features are preferable when humans summarize. Implications for automatic entity summarization are drawn from the findings.
  2. Liu, B.; Yuan, Q.; Cong, G.; Xu, D.: Where your photo is taken : geolocation prediction for social images (2014) 0.01
    0.0052794483 = product of:
      0.021117793 = sum of:
        0.021117793 = weight(_text_:data in 1290) [ClassicSimilarity], result of:
          0.021117793 = score(doc=1290,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.17468026 = fieldWeight in 1290, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1290)
      0.25 = coord(1/4)
    
    Abstract
    Social image-sharing websites have attracted a large number of users. These systems allow users to associate geolocation information with their images, which is essential for many interesting applications. However, only a small fraction of social images have geolocation information. Thus, an automated tool for suggesting geolocation is essential to help users geotag their images. In this article, we use a large data set consisting of 221 million Flickr images uploaded by 2.2 million users. For the first time, we analyze user uploading patterns, user geotagging behaviors, and the relationship between the taken-time gap and the geographical distance between two images from the same user. Based on the findings, we represent a user profile by historical tags for the user and build a multinomial model on the user profile for geotagging. We further propose a unified framework to suggest geolocations for images, which combines the information from both image tags and the user profile. Experimental results show that for images uploaded by users who have never done geotagging, our method outperforms the state-of-the-art method by 10.6 to 34.2%, depending on the granularity of the prediction. For images from users who have done geotagging, a simple method is able to achieve very high accuracy.
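
The two score explanations above follow the same arithmetic, so they can be re-derived from the listed factors. The following is a minimal sketch, assuming the usual TF-IDF form reported by ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and are not part of Lucene or of this catalog. The queryNorm, fieldNorm, and coord values are copied from the explanations rather than recomputed.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency in the form shown in the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  query_norm: float, field_norm: float, coord: float) -> float:
    """Recombine the factors of one explain tree into the final score."""
    tf = math.sqrt(freq)                      # 1.4142135 for freq=2.0
    idf = classic_idf(doc_freq, max_docs)     # about 3.1620505 here
    query_weight = idf * query_norm           # "queryWeight" node
    field_weight = tf * idf * field_norm      # "fieldWeight" node
    return query_weight * field_weight * coord


# Factors copied from the two explanations (term "data" in _text_).
QUERY_NORM = 0.03823278
COORD = 1 / 4  # one of four query clauses matched

score_2700 = classic_score(2.0, 5088, 44218, QUERY_NORM, 0.046875, COORD)
score_1290 = classic_score(2.0, 5088, 44218, QUERY_NORM, 0.0390625, COORD)
print(score_2700, score_1290)
```

Under these assumptions the results agree with the listed 0.0063353376 and 0.0052794483 to within rounding; Lucene works in single-precision floats, so the trailing digits may differ. Since every other factor is identical, the ranking of the two records comes down to fieldNorm alone (0.046875 versus 0.0390625), a length-based normalization that favors the shorter field.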