Search (3 results, page 1 of 1)

  • author_ss:"Fan, W."
  • year_i:[2000 TO 2010}
  1. Radev, D.R.; Libner, K.; Fan, W.: Getting answers to natural language questions on the Web (2002) 0.02
    0.020299202 = product of:
      0.040598404 = sum of:
        0.040598404 = product of:
          0.08119681 = sum of:
            0.08119681 = weight(_text_:90 in 5204) [ClassicSimilarity], result of:
              0.08119681 = score(doc=5204,freq=2.0), product of:
                0.2733978 = queryWeight, product of:
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.050854117 = queryNorm
                0.29699144 = fieldWeight in 5204, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.376119 = idf(docFreq=555, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5204)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
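
The arithmetic in the score explanation above can be re-derived by hand. The sketch below is a minimal Python reconstruction, not the search engine's own code: the function names are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumptions chosen because they reproduce the figures listed in the explanation. The engine works in single precision, so results agree only to about six significant digits.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf form; it reproduces idf(docFreq=555, maxDocs=44218) = 5.376119.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    # Assumed tf form; it reproduces tf(freq=2.0) = 1.4142135.
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
    """Recombine the factors in the order shown in the explanation tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # queryWeight = idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    score = query_weight * field_weight
    for coord in coords:                                # the two coord(1/2) factors
        score *= coord
    return score


# Values copied from the explanation for doc 5204.
score_5204 = explain_score(
    freq=2.0, doc_freq=555, max_docs=44218,
    query_norm=0.050854117, field_norm=0.0390625,
)
print(round(score_5204, 6))
```

The same function applies to the other two results when their freq, docFreq, and fieldNorm values are substituted.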
    
    Abstract
    Seven hundred natural language questions from TREC-8 and TREC-9 were sent by Radev, Libner, and Fan to each of nine web search engines. The top 40 sites returned by each system were stored and evaluated for how many correct answers they yielded. Each question per engine was scored as the sum of the reciprocal ranks of identified correct answers. The large number of zero scores gave a positive skew violating the normality assumption for ANOVA, so values were transformed to zero for no hit and one for one or more hits. The non-zero values were then square-root transformed to remove the remaining positive skew. Interactions were observed between search engine and answer type (name, place, date, et cetera), search engine and number of proper nouns in the query, search engine and the need for time limitation, and search engine and total query words. All effects were significant. The shortest queries had the highest mean scores. The presence of one or more proper nouns provided a significant advantage. Non-time-dependent queries had an advantage. Place, name, person, and text description had mean scores between .85 and .9, with date at .81 and number at .59. There were significant differences in score by search engine. Search engines found at least one correct answer in between 75.45% and 87.7% of the cases. Google and Northern Light were just short of a 90% hit rate. No evidence indicated that a particular engine was better at answering any particular sort of question.
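
The per-question scoring described in the abstract can be illustrated with a short sketch. This is only an illustration of "sum of the reciprocal ranks" and of the zero/one hit transform, not the authors' evaluation code; the function names and the sample ranks are hypothetical.

```python
def reciprocal_rank_sum(correct_ranks):
    # Sum of 1/rank over the ranks (1-based) at which a correct answer was found.
    return sum(1.0 / rank for rank in correct_ranks)


def hit_indicator(score):
    # Zero for no hit, one for one or more hits.
    return 1 if score > 0 else 0


# Hypothetical question: correct answers found at ranks 1 and 4.
example_score = reciprocal_rank_sum([1, 4])
print(example_score, hit_indicator(example_score))
```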
  2. Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: The effects of fitness functions on genetic programming-based ranking discovery for Web search (2004) 0.01
    0.010335046 = product of:
      0.020670092 = sum of:
        0.020670092 = product of:
          0.041340183 = sum of:
            0.041340183 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
              0.041340183 = score(doc=2239,freq=2.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.23214069 = fieldWeight in 2239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2239)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 5.2004 19:22:06
  3. Zeng, M.L.; Fan, W.; Lin, X.: SKOS for an integrated vocabulary structure (2008) 0.01
    0.0097439755 = product of:
      0.019487951 = sum of:
        0.019487951 = product of:
          0.038975902 = sum of:
            0.038975902 = weight(_text_:22 in 2654) [ClassicSimilarity], result of:
              0.038975902 = score(doc=2654,freq=4.0), product of:
                0.17808245 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050854117 = queryNorm
                0.21886435 = fieldWeight in 2654, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2654)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In order to transfer the Chinese Classified Thesaurus (CCT) into a machine-processable format and provide CCT-based Web services, a pilot study has been conducted in which a variety of selected CCT classes and mapped thesaurus entries are encoded with SKOS. OWL and RDFS are also used to encode the same contents for the purposes of feasibility and cost-benefit comparison. CCT is a collective effort led by the National Library of China. It is an integration of the national standards Chinese Library Classification (CLC) 4th edition and Chinese Thesaurus (CT). As a manually created mapping product, CCT provides for each of the classes the corresponding thesaurus terms, and vice versa. The coverage of CCT includes four major clusters: philosophy, social sciences and humanities, natural sciences and technologies, and general works. There are 22 main-classes, 52,992 sub-classes and divisions, 110,837 preferred thesaurus terms, 35,690 entry terms (non-preferred terms), and 59,738 pre-coordinated headings (Chinese Classified Thesaurus, 2005). Major challenges in encoding this large vocabulary come from its integrated structure. CCT is the result of combining two structures (illustrated in Figure 1): a thesaurus that uses the ISO-2788 standardized structure and a classification scheme that is basically enumerative but provides some flexibility for several kinds of synthetic mechanisms. Other challenges include the complex relationships caused by differences in granularity between the two original schemes and their presentation with various levels of SKOS elements, as well as the diverse coordination of entries due to the use of auxiliary tables and pre-coordinated headings derived from combining classes, subdivisions, and thesaurus terms, which do not correspond to existing unique identifiers. The poster reports the progress, shares the sample SKOS entries, and summarizes problems identified during the SKOS encoding process. Although OWL Lite and OWL Full provide richer expressiveness, the cost-benefit issues and the final purposes of encoding CCT raise questions about using such approaches.
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas