Document (#33352)

Author
Lin, M.
Zhang, Z.
Title
Question-driven segmentation of lecture speech text : towards intelligent e-learning systems
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.2, p.186-200
Year
2008
Abstract
Recently, lecture videos have been widely used in e-learning systems. Envisioning intelligent e-learning systems, this article addresses the challenge of information seeking in lecture videos by retrieving relevant video segments based on user queries, through dynamic segmentation of lecture speech text. In the proposed approach, shallow parsing techniques such as part-of-speech tagging and noun phrase chunking are used to parse both questions and automated speech recognition (ASR) transcripts. A sliding-window algorithm is proposed to identify the start and end boundaries of returned segments. Phonetic and partial matching are used to correct errors from automated speech recognition and noun phrase chunking. Furthermore, extra knowledge such as lecture slides is used to facilitate ASR transcript error correction. The approach also uses proximity to approximate the deep parsing and structure matching between questions and sentences in ASR transcripts. The experimental results showed that both phonetic and partial matching improved segmentation performance, slides-based ASR transcript correction improved information coverage, and proximity was also effective in improving overall performance.
Theme
Computer Based Training

Similar documents (author)

  1. Zhang, M.; Zhang, Y.: Professional organizations in Twittersphere : an empirical study of U.S. library and information science professional organizations-related Tweets (2020) 4.54
    4.5423746 = sum of:
      4.5423746 = weight(author_txt:zhang in 5775) [ClassicSimilarity], result of:
        4.5423746 = fieldWeight in 5775, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.5 = fieldNorm(doc=5775)
    
  2. Zhang, Y.; Zhang, C.: Enhancing keyphrase extraction from microblogs using human reading time (2021) 4.54
    4.5423746 = sum of:
      4.5423746 = weight(author_txt:zhang in 237) [ClassicSimilarity], result of:
        4.5423746 = fieldWeight in 237, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.5 = fieldNorm(doc=237)
    
  3. Zhang, J.: TOFIR: A tool of facilitating information retrieval : introduce a visual retrieval model (2001) 4.01
    4.01493 = sum of:
      4.01493 = weight(author_txt:zhang in 7711) [ClassicSimilarity], result of:
        4.01493 = fieldWeight in 7711, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.625 = fieldNorm(doc=7711)
    
  4. Zhang, A.: Multimedia file formats on the Internet : a beginner's guide for PC users (1995) 4.01
    4.01493 = sum of:
      4.01493 = weight(author_txt:zhang in 3212) [ClassicSimilarity], result of:
        4.01493 = fieldWeight in 3212, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.625 = fieldNorm(doc=3212)
    
  5. Zhang, J.: A representational analysis of relational information displays (1996) 4.01
    4.01493 = sum of:
      4.01493 = weight(author_txt:zhang in 6403) [ClassicSimilarity], result of:
        4.01493 = fieldWeight in 6403, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.625 = fieldNorm(doc=6403)
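
The author-match scores listed above are each a single fieldWeight, i.e. the product tf x idf x fieldNorm. The following is a minimal sketch that recomputes them, assuming tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is the idf variant that reproduces the values in this listing. The function names are illustrative and not part of any library; the numeric inputs are copied from the explain output.

```python
import math

def classic_tf(freq):
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency (variant assumed to match this listing).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as shown in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the entries above (term "zhang" in author_txt).
two_zhang_authors = field_weight(2.0, 194, 44218, 0.5)    # entries 1 and 2
one_zhang_author = field_weight(1.0, 194, 44218, 0.625)   # entries 3 to 5
print(round(two_zhang_authors, 4), round(one_zhang_author, 4))
```

Records with two matching author names rank higher because tf grows with the square root of the frequency, even though their fieldNorm (0.5) is lower than that of the single-author records (0.625).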
    

Similar documents (content)

  1. Brill, E.: An overview of empirical natural language processing (1997) 0.13
    0.13142858 = sum of:
      0.13142858 = product of:
        0.6571429 = sum of:
          0.02270156 = weight(abstract_txt:systems in 3249) [ClassicSimilarity], result of:
            0.02270156 = score(doc=3249,freq=1.0), product of:
              0.05322947 = queryWeight, product of:
                1.1052598 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.014115434 = queryNorm
              0.4264848 = fieldWeight in 3249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.125 = fieldNorm(doc=3249)
          0.08738519 = weight(abstract_txt:recognition in 3249) [ClassicSimilarity], result of:
            0.08738519 = score(doc=3249,freq=1.0), product of:
              0.1142115 = queryWeight, product of:
                1.3218969 = boost
                6.1209383 = idf(docFreq=263, maxDocs=44218)
                0.014115434 = queryNorm
              0.7651173 = fieldWeight in 3249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1209383 = idf(docFreq=263, maxDocs=44218)
                0.125 = fieldNorm(doc=3249)
          0.06129081 = weight(abstract_txt:learning in 3249) [ClassicSimilarity], result of:
            0.06129081 = score(doc=3249,freq=1.0), product of:
              0.10320766 = queryWeight, product of:
                1.5390201 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.014115434 = queryNorm
              0.59385914 = fieldWeight in 3249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.125 = fieldNorm(doc=3249)
          0.17707464 = weight(abstract_txt:parsing in 3249) [ClassicSimilarity], result of:
            0.17707464 = score(doc=3249,freq=1.0), product of:
              0.18288954 = queryWeight, product of:
                1.6727734 = boost
                7.7456436 = idf(docFreq=51, maxDocs=44218)
                0.014115434 = queryNorm
              0.96820545 = fieldWeight in 3249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7456436 = idf(docFreq=51, maxDocs=44218)
                0.125 = fieldNorm(doc=3249)
          0.3086907 = weight(abstract_txt:speech in 3249) [ClassicSimilarity], result of:
            0.3086907 = score(doc=3249,freq=1.0), product of:
              0.3595398 = queryWeight, product of:
                3.7083967 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.014115434 = queryNorm
              0.8585717 = fieldWeight in 3249, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.125 = fieldNorm(doc=3249)
        0.2 = coord(5/25)
    
  2. Stolcke, A.: Linguistic knowledge and empirical methods in speech recognition (1997) 0.13
    0.12947476 = sum of:
      0.12947476 = product of:
        0.6473738 = sum of:
          0.047288448 = weight(abstract_txt:performance in 2660) [ClassicSimilarity], result of:
            0.047288448 = score(doc=2660,freq=1.0), product of:
              0.06536039 = queryWeight, product of:
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.014115434 = queryNorm
              0.7235032 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.15625 = fieldNorm(doc=2660)
          0.028376948 = weight(abstract_txt:systems in 2660) [ClassicSimilarity], result of:
            0.028376948 = score(doc=2660,freq=1.0), product of:
              0.05322947 = queryWeight, product of:
                1.1052598 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.014115434 = queryNorm
              0.53310597 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.15625 = fieldNorm(doc=2660)
          0.10923149 = weight(abstract_txt:recognition in 2660) [ClassicSimilarity], result of:
            0.10923149 = score(doc=2660,freq=1.0), product of:
              0.1142115 = queryWeight, product of:
                1.3218969 = boost
                6.1209383 = idf(docFreq=263, maxDocs=44218)
                0.014115434 = queryNorm
              0.9563966 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1209383 = idf(docFreq=263, maxDocs=44218)
                0.15625 = fieldNorm(doc=2660)
          0.076613516 = weight(abstract_txt:learning in 2660) [ClassicSimilarity], result of:
            0.076613516 = score(doc=2660,freq=1.0), product of:
              0.10320766 = queryWeight, product of:
                1.5390201 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.014115434 = queryNorm
              0.74232394 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.15625 = fieldNorm(doc=2660)
          0.3858634 = weight(abstract_txt:speech in 2660) [ClassicSimilarity], result of:
            0.3858634 = score(doc=2660,freq=1.0), product of:
              0.3595398 = queryWeight, product of:
                3.7083967 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.014115434 = queryNorm
              1.0732147 = fieldWeight in 2660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.15625 = fieldNorm(doc=2660)
        0.2 = coord(5/25)
    
  3. Benitez, A.B.; Zhong, D.; Chang, S.-F.: Enabling MPEG-7 structural and semantic descriptions in retrieval applications (2007) 0.11
    0.10994077 = sum of:
      0.10994077 = product of:
        0.45808655 = sum of:
          0.013542573 = weight(abstract_txt:used in 518) [ClassicSimilarity], result of:
            0.013542573 = score(doc=518,freq=1.0), product of:
              0.051601518 = queryWeight, product of:
                1.0882272 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.014115434 = queryNorm
              0.26244524 = fieldWeight in 518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.078125 = fieldNorm(doc=518)
          0.014188474 = weight(abstract_txt:systems in 518) [ClassicSimilarity], result of:
            0.014188474 = score(doc=518,freq=1.0), product of:
              0.05322947 = queryWeight, product of:
                1.1052598 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.014115434 = queryNorm
              0.26655298 = fieldWeight in 518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.078125 = fieldNorm(doc=518)
          0.05826869 = weight(abstract_txt:intelligent in 518) [ClassicSimilarity], result of:
            0.05826869 = score(doc=518,freq=1.0), product of:
              0.119249 = queryWeight, product of:
                1.3507347 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.014115434 = queryNorm
              0.4886304 = fieldWeight in 518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.078125 = fieldNorm(doc=518)
          0.0803194 = weight(abstract_txt:videos in 518) [ClassicSimilarity], result of:
            0.0803194 = score(doc=518,freq=1.0), product of:
              0.14769922 = queryWeight, product of:
                1.503252 = boost
                6.9606886 = idf(docFreq=113, maxDocs=44218)
                0.014115434 = queryNorm
              0.5438038 = fieldWeight in 518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9606886 = idf(docFreq=113, maxDocs=44218)
                0.078125 = fieldNorm(doc=518)
          0.11323842 = weight(abstract_txt:segments in 518) [ClassicSimilarity], result of:
            0.11323842 = score(doc=518,freq=1.0), product of:
              0.1857065 = queryWeight, product of:
                1.6856066 = boost
                7.805067 = idf(docFreq=48, maxDocs=44218)
                0.014115434 = queryNorm
              0.6097709 = fieldWeight in 518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.805067 = idf(docFreq=48, maxDocs=44218)
                0.078125 = fieldNorm(doc=518)
          0.17852898 = weight(abstract_txt:segmentation in 518) [ClassicSimilarity], result of:
            0.17852898 = score(doc=518,freq=1.0), product of:
              0.2879613 = queryWeight, product of:
                2.5707235 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014115434 = queryNorm
              0.61997557 = fieldWeight in 518, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.078125 = fieldNorm(doc=518)
        0.24 = coord(6/25)
    
  4. Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.11
    0.1088875 = sum of:
      0.1088875 = product of:
        0.45369792 = sum of:
          0.016550956 = weight(abstract_txt:performance in 6068) [ClassicSimilarity], result of:
            0.016550956 = score(doc=6068,freq=1.0), product of:
              0.06536039 = queryWeight, product of:
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.014115434 = queryNorm
              0.2532261 = fieldWeight in 6068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.013406463 = weight(abstract_txt:used in 6068) [ClassicSimilarity], result of:
            0.013406463 = score(doc=6068,freq=2.0), product of:
              0.051601518 = queryWeight, product of:
                1.0882272 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.014115434 = queryNorm
              0.25980753 = fieldWeight in 6068, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.014045873 = weight(abstract_txt:systems in 6068) [ClassicSimilarity], result of:
            0.014045873 = score(doc=6068,freq=2.0), product of:
              0.05322947 = queryWeight, product of:
                1.1052598 = boost
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.014115434 = queryNorm
              0.263874 = fieldWeight in 6068, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4118783 = idf(docFreq=3963, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.06621807 = weight(abstract_txt:recognition in 6068) [ClassicSimilarity], result of:
            0.06621807 = score(doc=6068,freq=3.0), product of:
              0.1142115 = queryWeight, product of:
                1.3218969 = boost
                6.1209383 = idf(docFreq=263, maxDocs=44218)
                0.014115434 = queryNorm
              0.57978463 = fieldWeight in 6068, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1209383 = idf(docFreq=263, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.10955934 = weight(abstract_txt:parsing in 6068) [ClassicSimilarity], result of:
            0.10955934 = score(doc=6068,freq=2.0), product of:
              0.18288954 = queryWeight, product of:
                1.6727734 = boost
                7.7456436 = idf(docFreq=51, maxDocs=44218)
                0.014115434 = queryNorm
              0.5990465 = fieldWeight in 6068, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7456436 = idf(docFreq=51, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
          0.23391722 = weight(abstract_txt:speech in 6068) [ClassicSimilarity], result of:
            0.23391722 = score(doc=6068,freq=3.0), product of:
              0.3595398 = queryWeight, product of:
                3.7083967 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.014115434 = queryNorm
              0.65060174 = fieldWeight in 6068, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6068)
        0.24 = coord(6/25)
    
  5. Çelebi, A.; Özgür, A.: Segmenting hashtags and analyzing their grammatical structure (2018) 0.10
    0.101607226 = sum of:
      0.101607226 = product of:
        0.50803614 = sum of:
          0.015321672 = weight(abstract_txt:used in 4221) [ClassicSimilarity], result of:
            0.015321672 = score(doc=4221,freq=2.0), product of:
              0.051601518 = queryWeight, product of:
                1.0882272 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.014115434 = queryNorm
              0.2969229 = fieldWeight in 4221, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=4221)
          0.030645406 = weight(abstract_txt:learning in 4221) [ClassicSimilarity], result of:
            0.030645406 = score(doc=4221,freq=1.0), product of:
              0.10320766 = queryWeight, product of:
                1.5390201 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.014115434 = queryNorm
              0.29692957 = fieldWeight in 4221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=4221)
          0.08853732 = weight(abstract_txt:parsing in 4221) [ClassicSimilarity], result of:
            0.08853732 = score(doc=4221,freq=1.0), product of:
              0.18288954 = queryWeight, product of:
                1.6727734 = boost
                7.7456436 = idf(docFreq=51, maxDocs=44218)
                0.014115434 = queryNorm
              0.48410273 = fieldWeight in 4221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7456436 = idf(docFreq=51, maxDocs=44218)
                0.0625 = fieldNorm(doc=4221)
          0.12615472 = weight(abstract_txt:noun in 4221) [ClassicSimilarity], result of:
            0.12615472 = score(doc=4221,freq=2.0), product of:
              0.18380767 = queryWeight, product of:
                1.6769669 = boost
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.014115434 = queryNorm
              0.6863409 = fieldWeight in 4221, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.0625 = fieldNorm(doc=4221)
          0.247377 = weight(abstract_txt:segmentation in 4221) [ClassicSimilarity], result of:
            0.247377 = score(doc=4221,freq=3.0), product of:
              0.2879613 = queryWeight, product of:
                2.5707235 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014115434 = queryNorm
              0.8590633 = fieldWeight in 4221, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=4221)
        0.2 = coord(5/25)
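
The content-based scores above combine several matching terms. Each term contributes queryWeight x fieldWeight, where queryWeight = boost x idf x queryNorm, and the sum is scaled by the coord factor (matched terms divided by query terms). The following is a minimal sketch that recomputes the score of the first entry (Brill, doc 3249) under those assumptions; the function and variable names are illustrative, and the boost, docFreq, fieldNorm and queryNorm values are copied from the explain output.

```python
import math

MAX_DOCS = 44218          # maxDocs from the explain output
QUERY_NORM = 0.014115434  # queryNorm from the explain output

def idf(doc_freq, max_docs=MAX_DOCS):
    # idf variant assumed to match the values in this listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(boost, freq, doc_freq, field_norm):
    # score = queryWeight * fieldWeight for one matching term.
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

def document_score(terms, field_norm, total_query_terms):
    # Sum the per-term scores, then scale by coord = matched / total.
    raw = sum(term_score(b, f, df, field_norm) for b, f, df in terms)
    coord = len(terms) / total_query_terms
    return coord * raw

# (boost, freq, docFreq) for the five terms matched in doc 3249.
brill_terms = [
    (1.1052598, 1.0, 3963),  # systems
    (1.3218969, 1.0, 263),   # recognition
    (1.5390201, 1.0, 1038),  # learning
    (1.6727734, 1.0, 51),    # parsing
    (3.7083967, 1.0, 124),   # speech
]
brill_score = document_score(brill_terms, field_norm=0.125, total_query_terms=25)
print(round(brill_score, 4))
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "parsing" and "speech" dominate the sum, while the coord factor of 5/25 reduces the total to reflect that only five of the 25 query terms matched.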