Search (6 results, page 1 of 1)

  • author_ss:"Lu, W."
  • language_ss:"e"
  1. Lu, W.; MacFarlane, A.; Venuti, F.: Okapi-based XML indexing (2009) 0.02
    0.018958557 = product of:
      0.05687567 = sum of:
        0.05687567 = product of:
          0.11375134 = sum of:
            0.11375134 = weight(_text_:indexing in 3629) [ClassicSimilarity], result of:
              0.11375134 = score(doc=3629,freq=16.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.59810436 = fieldWeight in 3629, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3629)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - Being an important data exchange and information storage standard, XML has generated a great deal of interest and particular attention has been paid to the issue of XML indexing. Clear use cases for structured search in XML have been established. However, most of the research in the area is either based on relational database systems or specialized semi-structured data management systems. This paper aims to propose a method for XML indexing based on the information retrieval (IR) system Okapi. Design/methodology/approach - First, the paper reviews the structure of inverted files and gives an overview of the issues of why this indexing mechanism cannot properly support XML retrieval, using the underlying data structures of Okapi as an example. Then the paper explores a revised method implemented on Okapi using path indexing structures. The paper evaluates these index structures through the metrics of indexing run time, path search run time and space costs using the INEX and Reuters RCV1 collections. Findings - Initial results on the INEX collections show that there is a substantial overhead in space costs for the method, but this increase does not affect run time adversely. Indexing results on differing sized Reuters RCV1 sub-collections show that the increase in space costs with increasing the size of a collection is significant, but in terms of run time the increase is linear. Path search results show sub-millisecond run times, demonstrating minimal overhead for XML search. Practical implications - Overall, the results show the method implemented to support XML search in a traditional IR system such as Okapi is viable. Originality/value - The paper provides useful information on a method for XML indexing based on the IR system Okapi.
  2. Lu, W.; Li, X.; Liu, Z.; Cheng, Q.: How do author-selected keywords function semantically in scientific manuscripts? (2019) 0.01
    0.0067028617 = product of:
      0.020108584 = sum of:
        0.020108584 = product of:
          0.04021717 = sum of:
            0.04021717 = weight(_text_:indexing in 5453) [ClassicSimilarity], result of:
              0.04021717 = score(doc=5453,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.21146181 = fieldWeight in 5453, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5453)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Author-selected keywords have been widely utilized for indexing, information retrieval, bibliometrics and knowledge organization in previous studies. However, few studies exist concerning how author-selected keywords function semantically in scientific manuscripts. In this paper, we investigated this problem from the perspective of term function (TF) by devising indicators of the diversity and symmetry of keyword term functions in papers, as well as the intensity of individual term functions in papers. The data obtained from the whole Journal of Informetrics (JOI) were manually processed by an annotation scheme of keyword term functions, including "research topic," "research method," "research object," "research area," "data" and "others," based on empirical work in content analysis. The results show, quantitatively, that the diversity of keyword term function decreases, and the irregularity increases with the number of author-selected keywords in a paper. Moreover, the distribution of the intensity of individual keyword term function indicated that no significant difference exists between the ranking of the five term functions with the increase of the number of author-selected keywords (i.e., "research topic" > "research method" > "research object" > "research area" > "data"). The findings indicate that precise keyword related research must take into account the distinct types of author-selected keywords.
  3. Huang, S.; Qian, J.; Huang, Y.; Lu, W.; Bu, Y.; Yang, J.; Cheng, Q.: Disclosing the relationship between citation structure and future impact of a publication (2022) 0.01
    0.0067028617 = product of:
      0.020108584 = sum of:
        0.020108584 = product of:
          0.04021717 = sum of:
            0.04021717 = weight(_text_:indexing in 621) [ClassicSimilarity], result of:
              0.04021717 = score(doc=621,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.21146181 = fieldWeight in 621, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=621)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Theme
    Citation indexing
  4. Lu, W.; Ding, H.; Jiang, J.: ¬A document expansion framework for tag-based image retrieval (2018) 0.01
    0.005609659 = product of:
      0.016828977 = sum of:
        0.016828977 = product of:
          0.033657953 = sum of:
            0.033657953 = weight(_text_:22 in 4630) [ClassicSimilarity], result of:
              0.033657953 = score(doc=4630,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.19345059 = fieldWeight in 4630, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4630)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    20. 1.2015 18:30:22
  5. Zhang, L.; Lu, W.; Yang, J.: LAGOS-AND : a large gold standard dataset for scholarly author name disambiguation (2023) 0.01
    0.005609659 = product of:
      0.016828977 = sum of:
        0.016828977 = product of:
          0.033657953 = sum of:
            0.033657953 = weight(_text_:22 in 883) [ClassicSimilarity], result of:
              0.033657953 = score(doc=883,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.19345059 = fieldWeight in 883, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=883)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 1.2023 18:40:36
  6. Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023) 0.01
    0.005609659 = product of:
      0.016828977 = sum of:
        0.016828977 = product of:
          0.033657953 = sum of:
            0.033657953 = weight(_text_:22 in 1012) [ClassicSimilarity], result of:
              0.033657953 = score(doc=1012,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.19345059 = fieldWeight in 1012, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1012)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 6.2023 14:55:20
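
The score breakdowns listed under each hit can be re-derived by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the tf and idf values printed above. The function names (classic_idf, classic_tf, explain_score) are hypothetical and are not part of Lucene or of this catalogue; the input numbers are copied from the listings for hits 1 and 4.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as printed in the 'idf(docFreq=..., maxDocs=...)' lines."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor as printed in the 'tf(freq=...)' lines."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recompute one hit's score from the leaves of its explain tree.

    queryWeight = idf * queryNorm
    fieldWeight = tf * idf * fieldNorm
    score       = queryWeight * fieldWeight * (product of the coord factors)
    """
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    score = query_weight * field_weight
    for coord in coords:
        score *= coord
    return score


if __name__ == "__main__":
    # Hit 1: term "indexing" in doc 3629 (freq=16, docFreq=2614).
    hit1 = explain_score(16.0, 2614, 44218, 0.049684696, 0.0390625, (1 / 2, 1 / 3))
    # Hit 4: term "22" in doc 4630 (freq=2, docFreq=3622).
    hit4 = explain_score(2.0, 3622, 44218, 0.049684696, 0.0390625, (1 / 2, 1 / 3))
    print(f"hit 1: {hit1:.9f}")
    print(f"hit 4: {hit4:.9f}")
```

The results should agree with the listed top-level scores (0.018958557 and 0.005609659) up to small rounding differences, since the listings appear to use single-precision arithmetic. The sketch also makes visible why hits 4 to 6 rank at all: they match only the token "22", which occurs in their Date field timestamps, not in any subject-bearing text.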