Search (5 results, page 1 of 1)

  • author_ss:"Si, L."
  1. Cetintas, S.; Si, L.: Effective query generation and postprocessing strategies for prior art patent search (2012) 0.01
    0.006491486 = product of:
      0.06491486 = sum of:
        0.06491486 = weight(_text_:log in 71) [ClassicSimilarity], result of:
          0.06491486 = score(doc=71,freq=2.0), product of:
            0.18335998 = queryWeight, product of:
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.028611459 = queryNorm
            0.3540296 = fieldWeight in 71, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.4086204 = idf(docFreq=197, maxDocs=44218)
              0.0390625 = fieldNorm(doc=71)
      0.1 = coord(1/10)
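The score tree above can be re-derived by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library, and the numeric inputs are copied from the tree.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as ClassicSimilarity is assumed to define it."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                      # 1.4142135 for freq=2.0
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm           # query side of the product
    field_weight = tf * idf * field_norm      # document side of the product
    return query_weight * field_weight


# Values reported for the term "log" in doc 71.
term_score = classic_term_score(freq=2.0, doc_freq=197, max_docs=44218,
                                query_norm=0.028611459, field_norm=0.0390625)

# coord(1/10): one of ten query clauses matched this document.
total_score = term_score * (1 / 10)

print(f"{term_score:.8f} {total_score:.9f}")
```

Small differences in the last digits are expected, since Lucene computes these values in single precision while this sketch uses Python doubles.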
    
    Abstract
    Rapid increase in global competition demands increased protection of intellectual property rights and underlines the importance of patents as major intellectual property documents. Prior art patent search is the task of identifying related patents for a given patent file, and is an essential step in judging the validity of a patent application. This article proposes an automated query generation and postprocessing method for prior art patent search. The proposed approach first constructs structured queries by combining terms extracted from different fields of a query patent and then reranks the retrieved patents by utilizing the International Patent Classification (IPC) code similarities between the query patent and the retrieved patents along with the retrieval score. An extensive set of experiments carried out on a large-scale, real-world dataset shows that utilizing 20 or 30 query terms extracted from all fields of an original query patent according to their log(tf)idf values helps form a representative search query out of the query patent and is more effective than using any number of query terms from any single field. It is shown that combining terms extracted from different fields of the query patent by giving higher importance to terms extracted from the abstract, claims, and description fields than to terms extracted from the title field is more effective than treating all extracted terms equally while forming the search query. Finally, utilizing the similarities between the IPC codes of the query patent and retrieved patents is shown to improve the effectiveness of the prior art search.
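The term selection step described in this abstract can be illustrated roughly as follows. This is only a sketch under stated assumptions: the abstract does not give the exact log(tf)idf formula, so log(1 + tf) and a smoothed ln(N / (df + 1)) are used here as stand-ins, and the function name, field names, and toy statistics are all hypothetical. Field weighting and IPC-based reranking are not shown.

```python
import math
from collections import Counter


def top_query_terms(field_tokens: dict, doc_freq: dict, num_docs: int, k: int = 20) -> list:
    """Rank terms pooled from all fields by an assumed log(tf) * idf weight."""
    # Pool term counts over every field of the query patent.
    tf = Counter(tok for tokens in field_tokens.values() for tok in tokens)
    scores = {}
    for term, freq in tf.items():
        df = doc_freq.get(term, 0)
        idf = math.log(num_docs / (df + 1))        # assumed smoothing
        scores[term] = math.log(1 + freq) * idf    # assumed log(tf) form
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy query patent and collection statistics (invented for illustration).
query_patent = {
    "title": ["patent", "search"],
    "abstract": ["patent", "retrieval", "retrieval", "query"],
}
collection_df = {"patent": 900, "search": 500, "retrieval": 50, "query": 100}

selected = top_query_terms(query_patent, collection_df, num_docs=1000, k=2)
print(selected)
```

In this toy collection, frequent but common terms such as "patent" rank below rarer terms, which is the intended effect of the idf factor.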
  2. Avrahami, T.T.; Yau, L.; Si, L.; Callan, J.P.: The FedLemur project : Federated search in the real world (2006) 0.01
    0.005590722 = product of:
      0.02795361 = sum of:
        0.020200694 = weight(_text_:web in 5271) [ClassicSimilarity], result of:
          0.020200694 = score(doc=5271,freq=2.0), product of:
            0.0933738 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.028611459 = queryNorm
            0.21634221 = fieldWeight in 5271, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5271)
        0.0077529154 = product of:
          0.023258746 = sum of:
            0.023258746 = weight(_text_:22 in 5271) [ClassicSimilarity], result of:
              0.023258746 = score(doc=5271,freq=2.0), product of:
                0.10019246 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.028611459 = queryNorm
                0.23214069 = fieldWeight in 5271, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5271)
          0.33333334 = coord(1/3)
      0.2 = coord(2/10)
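This tree nests one sum inside another, each with its own coord factor. The following is a minimal sketch of how the reported leaf weights combine, assuming coord(overlap, maxOverlap) is simply their ratio; the helper name `combine` is illustrative, and the leaf values are copied from the tree rather than recomputed.

```python
def combine(clause_scores: list, matched: int, total: int) -> float:
    """Sum the matching clause scores, then scale by coord(matched/total)."""
    return sum(clause_scores) * (matched / total)


# Leaf weights reported for doc 5271.
web_weight = 0.020200694      # weight(_text_:web)
date_weight = 0.023258746     # weight(_text_:22)

# Inner group: one of three clauses matched, so coord(1/3).
inner = combine([date_weight], matched=1, total=3)

# Outer query: two of ten clauses matched, so coord(2/10).
doc_score = combine([web_weight, inner], matched=2, total=10)

print(f"{inner:.10f} {doc_score:.9f}")
```

The same pattern applies to the other records: each "product of" node multiplies a sum of matching clauses by the fraction of clauses that matched.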
    
    Abstract
    Federated search and distributed information retrieval systems provide a single user interface for searching multiple full-text search engines. They have been an active area of research for more than a decade, but in spite of their success as a research topic, they are still rare in operational environments. This article discusses a prototype federated search system developed for the U.S. government's FedStats Web portal, and the issues addressed in adapting research solutions to this operational environment. A series of experiments explores how well prior research results, parameter settings, and heuristics apply in the FedStats environment. The article concludes with a set of lessons learned from this technology transfer effort, including observations about search engine quality in the real world.
    Date
    22. 7.2006 16:02:07
  3. Si, L.: The status quo and future development of cataloging and classification education in China (2005) 0.00
    0.0041536554 = product of:
      0.041536555 = sum of:
        0.041536555 = product of:
          0.06230483 = sum of:
            0.031293165 = weight(_text_:29 in 3544) [ClassicSimilarity], result of:
              0.031293165 = score(doc=3544,freq=2.0), product of:
                0.10064617 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.028611459 = queryNorm
                0.31092256 = fieldWeight in 3544, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3544)
            0.031011663 = weight(_text_:22 in 3544) [ClassicSimilarity], result of:
              0.031011663 = score(doc=3544,freq=2.0), product of:
                0.10019246 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.028611459 = queryNorm
                0.30952093 = fieldWeight in 3544, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3544)
          0.6666667 = coord(2/3)
      0.1 = coord(1/10)
    
    Date
    29. 9.2008 19:01:22
  4. Ren, P.; Chen, Z.; Ma, J.; Zhang, Z.; Si, L.; Wang, S.: Detecting temporal patterns of user queries (2017) 0.00
    0.0020200694 = product of:
      0.020200694 = sum of:
        0.020200694 = weight(_text_:web in 3315) [ClassicSimilarity], result of:
          0.020200694 = score(doc=3315,freq=2.0), product of:
            0.0933738 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.028611459 = queryNorm
            0.21634221 = fieldWeight in 3315, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3315)
      0.1 = coord(1/10)
    
    Abstract
    Query classification is an important part of exploring the characteristics of web queries. Existing studies are mainly based on Broder's classification scheme and classify user queries into navigational, informational, and transactional categories according to users' information needs. In this article, we present a novel classification scheme from the perspective of queries' temporal patterns. Queries' temporal patterns are inherent time series patterns of the search volumes of queries that reflect the evolution of the popularity of a query over time. By analyzing the temporal patterns of queries, search engines can more deeply understand the users' search intents and thus improve performance. Furthermore, we extract three groups of features based on the queries' search volume time series and use a support vector machine (SVM) to automatically detect the temporal patterns of user queries. Extensive experiments on the Million Query Track data sets of the Text REtrieval Conference (TREC) demonstrate the effectiveness of our approach.
  5. Si, L.: Encoding formats and consideration of requirements for mapping (2007) 0.00
    9.045068E-4 = product of:
      0.009045068 = sum of:
        0.009045068 = product of:
          0.027135205 = sum of:
            0.027135205 = weight(_text_:22 in 540) [ClassicSimilarity], result of:
              0.027135205 = score(doc=540,freq=2.0), product of:
                0.10019246 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.028611459 = queryNorm
                0.2708308 = fieldWeight in 540, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=540)
          0.33333334 = coord(1/3)
      0.1 = coord(1/10)
    
    Date
    26.12.2011 13:22:27