Search (9 results, page 1 of 1)

  • author_ss:"Yang, J."
  1. Wan, X.; Yang, J.; Xiao, J.: Towards a unified approach to document similarity search using manifold-ranking of blocks (2008) 0.01
    0.0127842445 = product of:
      0.05965981 = sum of:
        0.02465703 = weight(_text_:web in 2081) [ClassicSimilarity], result of:
          0.02465703 = score(doc=2081,freq=4.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.25496176 = fieldWeight in 2081, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2081)
        0.0050448296 = weight(_text_:information in 2081) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=2081,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 2081, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2081)
        0.029957948 = weight(_text_:retrieval in 2081) [ClassicSimilarity], result of:
          0.029957948 = score(doc=2081,freq=8.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.33420905 = fieldWeight in 2081, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2081)
      0.21428572 = coord(3/14)
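
The explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions that reproduce the printed values: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and are not part of any library. The constants queryNorm, fieldNorm and coord are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it reproduces e.g. 3.2635105 for docFreq=4597, maxDocs=44218.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    # Assumed tf form; it reproduces 2.0 for freq=4.0 and 1.4142135 for freq=2.0.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # "queryWeight" in the listing
    field_weight = classic_tf(freq) * idf * field_norm  # "fieldWeight" in the listing
    return query_weight * field_weight

# Constants copied from the explain output for doc 2081.
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.0390625

# term -> (termFreq, docFreq), as printed in the listing
terms = {
    "web": (4.0, 4597),
    "information": (2.0, 20772),
    "retrieval": (8.0, 5836),
}

scores = {
    term: term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for term, (freq, doc_freq) in terms.items()
}

# Three of the fourteen query clauses matched, hence coord(3/14).
total = sum(scores.values()) * (3 / 14)

for term, value in scores.items():
    print(f"{term}: {value:.8f}")
print(f"total: {total:.8f}")
```

The results agree with the listing only to about six significant digits, because the engine computes in single precision while this sketch uses Python doubles.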
    
    Abstract
    Document similarity search (i.e. query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches to similarity search first compute the pairwise similarity score between each document and the query using a retrieval function or similarity measure (e.g. Cosine), and then rank the documents by the similarity scores. In this paper, we propose a novel retrieval approach based on manifold-ranking of document blocks (i.e. blocks of coherent text about a subtopic) to re-rank a small set of documents initially retrieved by some existing retrieval function. The proposed approach can make full use of the intrinsic global manifold structure of the document blocks by propagating the ranking scores between the blocks on a weighted graph. First, the TextTiling algorithm and the VIPS algorithm are employed to segment text documents and web pages, respectively, into blocks. Then, each block is assigned a ranking score by the manifold-ranking algorithm. Lastly, a document gets its final ranking score by fusing the scores of its blocks. Experimental results on the TDT data and the ODP data demonstrate that the proposed approach can significantly improve retrieval performance over baseline approaches. The document block is validated to be a better unit than the whole document in the manifold-ranking process.
    Source
    Information processing and management. 44(2008) no.3, S.1032-1048
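
The score propagation described in the abstract of result 1 can be illustrated with the standard manifold-ranking iteration f ← α·S·f + (1 − α)·y on a symmetrically normalised graph. This is a sketch only: the abstract does not give the paper's edge weights, the value of α, or the fusion rule, so the toy chain graph, α = 0.5, and fusion by mean block score are all assumptions made here for illustration.

```python
import math

def manifold_rank(weights, query_scores, alpha=0.5, iterations=100):
    """Iterate f <- alpha * S f + (1 - alpha) * y, with S = D^(-1/2) W D^(-1/2)."""
    n = len(weights)
    degree = [sum(row) for row in weights]
    s = [
        [
            weights[i][j] / math.sqrt(degree[i] * degree[j]) if weights[i][j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    f = list(query_scores)
    for _ in range(iterations):
        f = [
            alpha * sum(s[i][j] * f[j] for j in range(n))
            + (1 - alpha) * query_scores[i]
            for i in range(n)
        ]
    return f

# Hypothetical block graph: block 0 is the query block, and blocks 0-1-2-3
# form a chain with unit edge weights (symmetric, no self-loops).
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
y = [1.0, 0.0, 0.0, 0.0]  # only the query block starts with a score

block_scores = manifold_rank(W, y)

# Assumed fusion rule: a document's score is the mean of its block scores.
blocks_of_doc = {"A": [1, 2], "B": [3]}
doc_scores = {
    doc: sum(block_scores[b] for b in blocks) / len(blocks)
    for doc, blocks in blocks_of_doc.items()
}

print(block_scores)
print(doc_scores)
```

In this toy graph the scores decay with distance from the query block, so the hypothetical document A, whose blocks sit closer to the query, ranks above document B.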
  2. Gachot, D.A.; Lange, E.; Yang, J.: ¬The SYSTRAN NLP browser : an application of machine translation technology in cross-language information retrieval (1998) 0.01
    0.011891057 = product of:
      0.083237395 = sum of:
        0.020970963 = weight(_text_:information in 6213) [ClassicSimilarity], result of:
          0.020970963 = score(doc=6213,freq=6.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.40312737 = fieldWeight in 6213, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6213)
        0.06226643 = weight(_text_:retrieval in 6213) [ClassicSimilarity], result of:
          0.06226643 = score(doc=6213,freq=6.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.6946405 = fieldWeight in 6213, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=6213)
      0.14285715 = coord(2/14)
    
    Series
    The Kluwer International series on information retrieval
    Source
    Cross-language information retrieval. Ed.: G. Grefenstette
  3. Wang, J.; Clements, M.; Yang, J.; Vries, A.P. de; Reinders, M.J.T.: Personalization of tagging systems (2010) 0.01
    0.009632302 = product of:
      0.044950746 = sum of:
        0.020922182 = weight(_text_:web in 4229) [ClassicSimilarity], result of:
          0.020922182 = score(doc=4229,freq=2.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.21634221 = fieldWeight in 4229, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4229)
        0.0060537956 = weight(_text_:information in 4229) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=4229,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 4229, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4229)
        0.01797477 = weight(_text_:retrieval in 4229) [ClassicSimilarity], result of:
          0.01797477 = score(doc=4229,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.20052543 = fieldWeight in 4229, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4229)
      0.21428572 = coord(3/14)
    
    Abstract
    Social media systems have encouraged end-user participation in the Internet, for the purpose of storing and distributing Internet content, sharing opinions and maintaining relationships. Collaborative tagging allows users to annotate the resulting user-generated content, and enables effective retrieval of otherwise uncategorised data. However, compared to professional web content production, collaborative tagging systems face the challenge that end-users assign tags in an uncontrolled manner, resulting in unsystematic and inconsistent metadata. This paper introduces a framework for the personalization of social media systems. We pinpoint three tasks that would benefit from personalization: collaborative tagging, collaborative browsing and collaborative search. We propose a ranking model for each task that integrates the individual user's tagging history in the recommendation of tags and content, to align its suggestions with the individual user's preferences. We demonstrate on two real data sets that for all three tasks, the personalized ranking should take into account both the user's own preference and the opinion of others.
    Source
    Information processing and management. 46(2010) no.1, S.58-70
  4. Filo, D.; Yang, J.: Yahoo! unplugged : Your discovery guide to the Web (1995) 0.00
    0.0043140817 = product of:
      0.06039714 = sum of:
        0.06039714 = weight(_text_:web in 6618) [ClassicSimilarity], result of:
          0.06039714 = score(doc=6618,freq=6.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.6245262 = fieldWeight in 6618, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.078125 = fieldNorm(doc=6618)
      0.071428575 = coord(1/14)
    
    LCSH
    Web search engines
    Subject
    Web search engines
  5. Zhang, L.; Lu, W.; Yang, J.: LAGOS-AND : a large gold standard dataset for scholarly author name disambiguation (2023) 0.00
    0.001676621 = product of:
      0.011736346 = sum of:
        0.0050448296 = weight(_text_:information in 883) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=883,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 883, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=883)
        0.0066915164 = product of:
          0.020074548 = sum of:
            0.020074548 = weight(_text_:22 in 883) [ClassicSimilarity], result of:
              0.020074548 = score(doc=883,freq=2.0), product of:
                0.103770934 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.029633347 = queryNorm
                0.19345059 = fieldWeight in 883, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=883)
          0.33333334 = coord(1/3)
      0.14285715 = coord(2/14)
    
    Date
    22. 1.2023 18:40:36
    Source
    Journal of the Association for Information Science and Technology. 74(2023) no.2, S.168-185
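
Result 5 shows a nested explain tree: the term `22` sits inside a sub-query scaled by coord(1/3), and that sub-query is itself one of two matching clauses scaled by coord(2/14). A minimal sketch of how the levels combine, taking the leaf scores from the listing as given (the helper name `combine` is illustrative, not a library function):

```python
import math

def combine(clause_scores, matched, total_clauses):
    """Sum the matching clause scores, then scale by coord(matched/total_clauses)."""
    return sum(clause_scores) * (matched / total_clauses)

# Leaf scores copied from the explain output for doc 883.
information_score = 0.0050448296  # weight(_text_:information in 883)
term_22_score = 0.020074548       # weight(_text_:22 in 883)

# Inner sub-query: one of its three clauses matched.
inner = combine([term_22_score], matched=1, total_clauses=3)

# Outer query: two of its fourteen clauses matched (the term and the sub-query).
outer = combine([information_score, inner], matched=2, total_clauses=14)

print(f"inner: {inner:.10f}")
print(f"outer: {outer:.10f}")
```

The same pattern accounts for results 6 and 7, where the only matching clause is the nested `22` sub-query and the outer factor is coord(1/14).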
  6. Wan, X.; Yang, J.; Xiao, J.: Incorporating cross-document relationships between sentences for single document summarizations (2006) 0.00
    5.735585E-4 = product of:
      0.008029819 = sum of:
        0.008029819 = product of:
          0.024089456 = sum of:
            0.024089456 = weight(_text_:22 in 2421) [ClassicSimilarity], result of:
              0.024089456 = score(doc=2421,freq=2.0), product of:
                0.103770934 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.029633347 = queryNorm
                0.23214069 = fieldWeight in 2421, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2421)
          0.33333334 = coord(1/3)
      0.071428575 = coord(1/14)
    
    Source
    Research and advanced technology for digital libraries : 10th European conference, proceedings / ECDL 2006, Alicante, Spain, September 17 - 22, 2006
  7. Tang, X.-B.; Liu, G.-C.; Yang, J.; Wei, W.: Knowledge-based financial statement fraud detection system : based on an ontology and a decision tree (2018) 0.00
    5.735585E-4 = product of:
      0.008029819 = sum of:
        0.008029819 = product of:
          0.024089456 = sum of:
            0.024089456 = weight(_text_:22 in 4306) [ClassicSimilarity], result of:
              0.024089456 = score(doc=4306,freq=2.0), product of:
                0.103770934 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.029633347 = queryNorm
                0.23214069 = fieldWeight in 4306, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4306)
          0.33333334 = coord(1/3)
      0.071428575 = coord(1/14)
    
    Date
    21. 6.2018 10:22:43
  8. Wang, F.; Yang, J.; Wu, Y.: Non-synchronism in theoretical research of information science (2021) 0.00
    5.0960475E-4 = product of:
      0.0071344664 = sum of:
        0.0071344664 = weight(_text_:information in 602) [ClassicSimilarity], result of:
          0.0071344664 = score(doc=602,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13714671 = fieldWeight in 602, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=602)
      0.071428575 = coord(1/14)
    
    Abstract
    Purpose: This paper aims to reveal the global non-synchronism that exists in the theoretical research of information science (IS) by analyzing and comparing the distribution of theory use, creation and borrowing in four representative journals from the USA, the UK and China. Design/methodology/approach: Quantitative content analysis is adopted as the research method. First, an analytical framework for non-synchronism in theoretical research of IS is constructed. Second, theories mentioned in the full texts of the research papers of four journals are extracted according to a previously compiled theory dictionary. Third, the non-synchronism in the theoretical research of IS is analyzed. Findings: Non-synchronism exists in many aspects of the theoretical research of IS between journals, subject areas and countries/regions. Theoretical underdevelopment still exists in some subject areas of IS. IS presents obvious interdisciplinary characteristics. The theoretical distance from IS to social sciences is shorter than that to natural sciences. Research limitations/implications: This study investigates the theoretical research of IS from the perspective of non-synchronism theory, reveals the theoretical distance from IS to other sciences, deepens the communication between different subject and regional sub-communities of IS and provides new evidence for the necessity of developing domestic theories and theorists of IS. Originality/value: This study introduces the theory of non-synchronism to IS research for the first time, investigates the new advances in theoretical research of IS and provides new quantitative evidence for the understanding of the interdisciplinary characteristics of IS and the necessity of better communication between sub-communities of IS.
  9. Huang, S.; Qian, J.; Huang, Y.; Lu, W.; Bu, Y.; Yang, J.; Cheng, Q.: Disclosing the relationship between citation structure and future impact of a publication (2022) 0.00
    3.6034497E-4 = product of:
      0.0050448296 = sum of:
        0.0050448296 = weight(_text_:information in 621) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=621,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 621, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=621)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 73(2022) no.7, S.1025-1042