Document (#33346)

Author
Kuo, J.-S.
Li, H.
Yang, Y.-K.
Title
Active learning for constructing transliteration lexicons from the Web
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.1, pp.126-135
Year
2008
Abstract
This article presents an adaptive learning framework for Phonetic Similarity Modeling (PSM) that supports the automatic construction of transliteration lexicons. The learning algorithm starts with minimum prior knowledge about machine transliteration and acquires knowledge iteratively from the Web. We study the unsupervised learning and the active learning strategies that minimize human supervision in terms of data labeling. The learning process refines the PSM and constructs a transliteration lexicon at the same time. We evaluate the proposed PSM and its learning algorithm through a series of systematic experiments, which show that the proposed framework is reliably effective on two independent databases.
Theme
Computational linguistics

Similar documents (author)

  1. Yang, S.C.: An interpretive and situated approach to an evaluation of Perseus digital libraries (2001) 4.50
    4.4981737 = sum of:
      4.4981737 = weight(author_txt:yang in 6933) [ClassicSimilarity], result of:
        4.4981737 = fieldWeight in 6933, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.625 = fieldNorm(doc=6933)
    
  2. Yang, K.: Information retrieval on the Web (2004) 4.50
    4.4981737 = sum of:
      4.4981737 = weight(author_txt:yang in 4278) [ClassicSimilarity], result of:
        4.4981737 = fieldWeight in 4278, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.625 = fieldNorm(doc=4278)
    
  3. Yang, C.C.: Content-based image retrieval : a comparison between query by example and image browsing map approaches (2005) 4.50
    4.4981737 = sum of:
      4.4981737 = weight(author_txt:yang in 4649) [ClassicSimilarity], result of:
        4.4981737 = fieldWeight in 4649, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.625 = fieldNorm(doc=4649)
    
  4. Salton, G.; Yang, C.S.: On the specification of term values in automatic indexing (1973) 3.60
    3.5985389 = sum of:
      3.5985389 = weight(author_txt:yang in 5476) [ClassicSimilarity], result of:
        3.5985389 = fieldWeight in 5476, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.5 = fieldNorm(doc=5476)
    
  5. Yang, Y.; Chute, C.G.A.: A schematic analysis of the Unified Medical Language System (1992) 3.60
    3.5985389 = sum of:
      3.5985389 = weight(author_txt:yang in 6445) [ClassicSimilarity], result of:
        3.5985389 = fieldWeight in 6445, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.5 = fieldNorm(doc=6445)
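
The author scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. The printed numbers are consistent with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), the form commonly associated with Lucene's ClassicSimilarity. The sketch below recomputes them under that assumption. The function names are hypothetical, and the fieldNorm values are copied from the listing rather than derived.

```python
import math

def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it matches the values printed in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the listing: author term "yang", docFreq=89, maxDocs=44218.
idf_yang = classic_idf(89, 44218)
weight_single_author = field_weight(1.0, 89, 44218, 0.625)  # entries 1 to 3
weight_two_authors = field_weight(1.0, 89, 44218, 0.5)      # entries 4 and 5

print(f"idf={idf_yang:.4f}")
print(f"fieldNorm 0.625 -> {weight_single_author:.4f}")
print(f"fieldNorm 0.5   -> {weight_two_authors:.4f}")
```

The only factor that differs between entries is fieldNorm, which is smaller for the records that list two authors. That accounts for the drop from about 4.50 to about 3.60.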
    

Similar documents (content)

  1. Malo, P.; Sinha, A.; Korhonen, P.; Wallenius, J.; Takala, P.: Good debt or bad debt : detecting semantic orientations in economic texts (2014) 0.16
    0.15617187 = sum of:
      0.15617187 = product of:
        0.55775666 = sum of:
          0.0133694 = weight(abstract_txt:that in 1226) [ClassicSimilarity], result of:
            0.0133694 = score(doc=1226,freq=6.0), product of:
              0.036855653 = queryWeight, product of:
                1.1373208 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013676312 = queryNorm
              0.36275032 = fieldWeight in 1226, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=1226)
          0.063084565 = weight(abstract_txt:lexicon in 1226) [ClassicSimilarity], result of:
            0.063084565 = score(doc=1226,freq=1.0), product of:
              0.13063361 = queryWeight, product of:
                1.2362256 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.013676312 = queryNorm
              0.4829122 = fieldWeight in 1226, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.0625 = fieldNorm(doc=1226)
          0.025779903 = weight(abstract_txt:framework in 1226) [ClassicSimilarity], result of:
            0.025779903 = score(doc=1226,freq=1.0), product of:
              0.09063662 = queryWeight, product of:
                1.456254 = boost
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.013676312 = queryNorm
              0.28443143 = fieldWeight in 1226, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.0625 = fieldNorm(doc=1226)
          0.03788021 = weight(abstract_txt:proposed in 1226) [ClassicSimilarity], result of:
            0.03788021 = score(doc=1226,freq=2.0), product of:
              0.09297819 = queryWeight, product of:
                1.474945 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.013676312 = queryNorm
              0.4074096 = fieldWeight in 1226, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=1226)
          0.050798647 = weight(abstract_txt:algorithm in 1226) [ClassicSimilarity], result of:
            0.050798647 = score(doc=1226,freq=1.0), product of:
              0.14245716 = queryWeight, product of:
                1.8256916 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.013676312 = queryNorm
              0.35658893 = fieldWeight in 1226, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=1226)
          0.26418972 = weight(abstract_txt:lexicons in 1226) [ClassicSimilarity], result of:
            0.26418972 = score(doc=1226,freq=2.0), product of:
              0.33940387 = queryWeight, product of:
                2.8180175 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.013676312 = queryNorm
              0.7783933 = fieldWeight in 1226, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0625 = fieldNorm(doc=1226)
          0.10265424 = weight(abstract_txt:learning in 1226) [ClassicSimilarity], result of:
            0.10265424 = score(doc=1226,freq=1.0), product of:
              0.34571916 = queryWeight, product of:
                5.3208504 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.013676312 = queryNorm
              0.29692957 = fieldWeight in 1226, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=1226)
        0.28 = coord(7/25)
    
  2. Li, M.; Li, H.; Zhou, Z.-H.: Semi-supervised document retrieval (2009) 0.15
    0.1484035 = sum of:
      0.1484035 = product of:
        0.5300125 = sum of:
          0.04720591 = weight(abstract_txt:constructing in 4218) [ClassicSimilarity], result of:
            0.04720591 = score(doc=4218,freq=1.0), product of:
              0.107672244 = queryWeight, product of:
                1.1223341 = boost
                7.014756 = idf(docFreq=107, maxDocs=44218)
                0.013676312 = queryNorm
              0.43842226 = fieldWeight in 4218, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.014756 = idf(docFreq=107, maxDocs=44218)
                0.0625 = fieldNorm(doc=4218)
          0.0054580346 = weight(abstract_txt:that in 4218) [ClassicSimilarity], result of:
            0.0054580346 = score(doc=4218,freq=1.0), product of:
              0.036855653 = queryWeight, product of:
                1.1373208 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013676312 = queryNorm
              0.1480922 = fieldWeight in 4218, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=4218)
          0.060494006 = weight(abstract_txt:unsupervised in 4218) [ClassicSimilarity], result of:
            0.060494006 = score(doc=4218,freq=1.0), product of:
              0.12703237 = queryWeight, product of:
                1.2190667 = boost
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.013676312 = queryNorm
              0.47620937 = fieldWeight in 4218, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.0625 = fieldNorm(doc=4218)
          0.0926917 = weight(abstract_txt:labeling in 4218) [ClassicSimilarity], result of:
            0.0926917 = score(doc=4218,freq=2.0), product of:
              0.13400574 = queryWeight, product of:
                1.2520797 = boost
                7.825686 = idf(docFreq=47, maxDocs=44218)
                0.013676312 = queryNorm
              0.69169945 = fieldWeight in 4218, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.825686 = idf(docFreq=47, maxDocs=44218)
                0.0625 = fieldNorm(doc=4218)
          0.025779903 = weight(abstract_txt:framework in 4218) [ClassicSimilarity], result of:
            0.025779903 = score(doc=4218,freq=1.0), product of:
              0.09063662 = queryWeight, product of:
                1.456254 = boost
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.013676312 = queryNorm
              0.28443143 = fieldWeight in 4218, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.0625 = fieldNorm(doc=4218)
          0.026785351 = weight(abstract_txt:proposed in 4218) [ClassicSimilarity], result of:
            0.026785351 = score(doc=4218,freq=1.0), product of:
              0.09297819 = queryWeight, product of:
                1.474945 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.013676312 = queryNorm
              0.2880821 = fieldWeight in 4218, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=4218)
          0.2715976 = weight(abstract_txt:learning in 4218) [ClassicSimilarity], result of:
            0.2715976 = score(doc=4218,freq=7.0), product of:
              0.34571916 = queryWeight, product of:
                5.3208504 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.013676312 = queryNorm
              0.7856018 = fieldWeight in 4218, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=4218)
        0.28 = coord(7/25)
    
  3. Xing, F.Z.; Pallucchini, F.; Cambria, E.: Cognitive-inspired domain adaptation of sentiment lexicons (2019) 0.14
    0.14322601 = sum of:
      0.14322601 = product of:
        0.71613 = sum of:
          0.010916069 = weight(abstract_txt:that in 5104) [ClassicSimilarity], result of:
            0.010916069 = score(doc=5104,freq=4.0), product of:
              0.036855653 = queryWeight, product of:
                1.1373208 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013676312 = queryNorm
              0.2961844 = fieldWeight in 5104, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
          0.08921504 = weight(abstract_txt:lexicon in 5104) [ClassicSimilarity], result of:
            0.08921504 = score(doc=5104,freq=2.0), product of:
              0.13063361 = queryWeight, product of:
                1.2362256 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.013676312 = queryNorm
              0.68294096 = fieldWeight in 5104, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
          0.09720325 = weight(abstract_txt:supervision in 5104) [ClassicSimilarity], result of:
            0.09720325 = score(doc=5104,freq=1.0), product of:
              0.17427163 = queryWeight, product of:
                1.4278535 = boost
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.013676312 = queryNorm
              0.55776864 = fieldWeight in 5104, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.924298 = idf(docFreq=15, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
          0.3736207 = weight(abstract_txt:lexicons in 5104) [ClassicSimilarity], result of:
            0.3736207 = score(doc=5104,freq=4.0), product of:
              0.33940387 = queryWeight, product of:
                2.8180175 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.013676312 = queryNorm
              1.1008145 = fieldWeight in 5104, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
          0.14517501 = weight(abstract_txt:learning in 5104) [ClassicSimilarity], result of:
            0.14517501 = score(doc=5104,freq=2.0), product of:
              0.34571916 = queryWeight, product of:
                5.3208504 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.013676312 = queryNorm
              0.41992182 = fieldWeight in 5104, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
        0.2 = coord(5/25)
    
  4. Silva, R.M.; Gonçalves, M.A.; Veloso, A.: A Two-stage active learning method for learning to rank (2014) 0.14
    0.136443 = sum of:
      0.136443 = product of:
        0.5685125 = sum of:
          0.010916069 = weight(abstract_txt:that in 1184) [ClassicSimilarity], result of:
            0.010916069 = score(doc=1184,freq=4.0), product of:
              0.036855653 = queryWeight, product of:
                1.1373208 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013676312 = queryNorm
              0.2961844 = fieldWeight in 1184, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=1184)
          0.0926917 = weight(abstract_txt:labeling in 1184) [ClassicSimilarity], result of:
            0.0926917 = score(doc=1184,freq=2.0), product of:
              0.13400574 = queryWeight, product of:
                1.2520797 = boost
                7.825686 = idf(docFreq=47, maxDocs=44218)
                0.013676312 = queryNorm
              0.69169945 = fieldWeight in 1184, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.825686 = idf(docFreq=47, maxDocs=44218)
                0.0625 = fieldNorm(doc=1184)
          0.08858558 = weight(abstract_txt:iteratively in 1184) [ClassicSimilarity], result of:
            0.08858558 = score(doc=1184,freq=1.0), product of:
              0.16381294 = queryWeight, product of:
                1.3843452 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.013676312 = queryNorm
              0.5407728 = fieldWeight in 1184, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.0625 = fieldNorm(doc=1184)
          0.050798647 = weight(abstract_txt:algorithm in 1184) [ClassicSimilarity], result of:
            0.050798647 = score(doc=1184,freq=1.0), product of:
              0.14245716 = queryWeight, product of:
                1.8256916 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.013676312 = queryNorm
              0.35658893 = fieldWeight in 1184, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=1184)
          0.12021202 = weight(abstract_txt:active in 1184) [ClassicSimilarity], result of:
            0.12021202 = score(doc=1184,freq=3.0), product of:
              0.17540462 = queryWeight, product of:
                2.0258431 = boost
                6.330911 = idf(docFreq=213, maxDocs=44218)
                0.013676312 = queryNorm
              0.68534124 = fieldWeight in 1184, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.330911 = idf(docFreq=213, maxDocs=44218)
                0.0625 = fieldNorm(doc=1184)
          0.20530848 = weight(abstract_txt:learning in 1184) [ClassicSimilarity], result of:
            0.20530848 = score(doc=1184,freq=4.0), product of:
              0.34571916 = queryWeight, product of:
                5.3208504 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.013676312 = queryNorm
              0.59385914 = fieldWeight in 1184, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=1184)
        0.24 = coord(6/25)
    
  5. Engerer, V.: Control and syntagmatization : vocabulary requirements in information retrieval thesauri and natural language lexicons (2017) 0.13
    0.12778638 = sum of:
      0.12778638 = product of:
        0.5324433 = sum of:
          0.015332278 = weight(abstract_txt:knowledge in 3678) [ClassicSimilarity], result of:
            0.015332278 = score(doc=3678,freq=1.0), product of:
              0.05523919 = queryWeight, product of:
                1.1368651 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.013676312 = queryNorm
              0.2775616 = fieldWeight in 3678, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.078125 = fieldNorm(doc=3678)
          0.009648533 = weight(abstract_txt:that in 3678) [ClassicSimilarity], result of:
            0.009648533 = score(doc=3678,freq=2.0), product of:
              0.036855653 = queryWeight, product of:
                1.1373208 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013676312 = queryNorm
              0.26179248 = fieldWeight in 3678, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=3678)
          0.1115188 = weight(abstract_txt:lexicon in 3678) [ClassicSimilarity], result of:
            0.1115188 = score(doc=3678,freq=2.0), product of:
              0.13063361 = queryWeight, product of:
                1.2362256 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.013676312 = queryNorm
              0.8536762 = fieldWeight in 3678, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.078125 = fieldNorm(doc=3678)
          0.03222488 = weight(abstract_txt:framework in 3678) [ClassicSimilarity], result of:
            0.03222488 = score(doc=3678,freq=1.0), product of:
              0.09063662 = queryWeight, product of:
                1.456254 = boost
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.013676312 = queryNorm
              0.3555393 = fieldWeight in 3678, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.550903 = idf(docFreq=1268, maxDocs=44218)
                0.078125 = fieldNorm(doc=3678)
          0.033481687 = weight(abstract_txt:proposed in 3678) [ClassicSimilarity], result of:
            0.033481687 = score(doc=3678,freq=1.0), product of:
              0.09297819 = queryWeight, product of:
                1.474945 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.013676312 = queryNorm
              0.36010262 = fieldWeight in 3678, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.078125 = fieldNorm(doc=3678)
          0.33023712 = weight(abstract_txt:lexicons in 3678) [ClassicSimilarity], result of:
            0.33023712 = score(doc=3678,freq=2.0), product of:
              0.33940387 = queryWeight, product of:
                2.8180175 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.013676312 = queryNorm
              0.97299165 = fieldWeight in 3678, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.078125 = fieldNorm(doc=3678)
        0.24 = coord(6/25)
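
Each content-similarity score above is built in two steps. Every matching abstract term contributes queryWeight times fieldWeight, where queryWeight = boost * idf * queryNorm. The sum of these contributions is then scaled by coord, the fraction of query terms that matched. As a rough check, the sketch below recomputes the score of the first entry (doc 1226) from the boosts, document frequencies, term frequencies and norms printed in its explain tree. The function and variable names are hypothetical, and the idf form is an assumption that matches the printed values.

```python
import math

QUERY_NORM = 0.013676312  # copied from the listing
MAX_DOCS = 44218          # copied from the listing

# (term, boost, docFreq, freq, fieldNorm) for doc 1226, copied from entry 1.
TERMS_DOC_1226 = [
    ("that",      1.1373208, 11241, 6.0, 0.0625),
    ("lexicon",   1.2362256,    52, 1.0, 0.0625),
    ("framework", 1.456254,   1268, 1.0, 0.0625),
    ("proposed",  1.474945,   1196, 2.0, 0.0625),
    ("algorithm", 1.8256916,   399, 1.0, 0.0625),
    ("lexicons",  2.8180175,    17, 2.0, 0.0625),
    ("learning",  5.3208504,  1038, 1.0, 0.0625),
]

def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    # Assumed idf form, consistent with the idf values in the listing.
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def document_score(terms, matched: int, total: int) -> float:
    # coord rewards documents that match a larger share of the query terms.
    coord = matched / total
    return coord * sum(term_score(b, df, f, n) for _, b, df, f, n in terms)

score_1226 = document_score(TERMS_DOC_1226, matched=7, total=25)
print(f"{score_1226:.4f}")
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "lexicons" (docFreq=17) dominate the sum, while a frequent word such as "that" contributes little even at freq=6.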