Document (#38151)

Author
Bedathur, S.
Narang, A.
Title
Mind your language : effects of spoken query formulation on retrieval effectiveness
Source
http://arxiv.org/abs/1312.4036
Year
2013
Abstract
Voice search is becoming a popular mode for interacting with search engines. As a result, research has gone into building better voice transcription engines, interfaces, and search engines that better handle the inherent verbosity of queries. However, when one considers its use by non-native speakers of English, another aspect that becomes important is the formulation of the query by users. In this paper, we present the results of a preliminary study that we conducted with non-native English speakers who formulate queries for given retrieval tasks. Our results show that current search engines are sensitive in their rankings to the query formulation, and thus highlight the need for developing more robust ranking methods.
Theme
Computerlinguistik

Similar documents (content)

  1. Crestani, F.; Du, H.: Written versus spoken queries : a qualitative and quantitative comparative analysis (2006) 0.31
    0.31332946 = sum of:
      0.31332946 = product of:
        0.9791546 = sum of:
          0.03658567 = weight(abstract_txt:retrieval in 5047) [ClassicSimilarity], result of:
            0.03658567 = score(doc=5047,freq=4.0), product of:
              0.067378104 = queryWeight, product of:
                1.0384492 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01867073 = queryNorm
              0.5429905 = fieldWeight in 5047, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=5047)
          0.018408086 = weight(abstract_txt:results in 5047) [ClassicSimilarity], result of:
            0.018408086 = score(doc=5047,freq=1.0), product of:
              0.06766081 = queryWeight, product of:
                1.0406255 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.01867073 = queryNorm
              0.27206424 = fieldWeight in 5047, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.078125 = fieldNorm(doc=5047)
          0.084866844 = weight(abstract_txt:formulate in 5047) [ClassicSimilarity], result of:
            0.084866844 = score(doc=5047,freq=1.0), product of:
              0.14875793 = queryWeight, product of:
                1.0910658 = boost
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.01867073 = queryNorm
              0.570503 = fieldWeight in 5047, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3024383 = idf(docFreq=80, maxDocs=44218)
                0.078125 = fieldNorm(doc=5047)
          0.2961126 = weight(abstract_txt:spoken in 5047) [ClassicSimilarity], result of:
            0.2961126 = score(doc=5047,freq=7.0), product of:
              0.178893 = queryWeight, product of:
                1.1964858 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.01867073 = queryNorm
              1.6552497 = fieldWeight in 5047, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.078125 = fieldNorm(doc=5047)
          0.15176642 = weight(abstract_txt:transcription in 5047) [ClassicSimilarity], result of:
            0.15176642 = score(doc=5047,freq=1.0), product of:
              0.21916534 = queryWeight, product of:
                1.3243318 = boost
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.01867073 = queryNorm
              0.69247454 = fieldWeight in 5047, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.078125 = fieldNorm(doc=5047)
          0.016400829 = weight(abstract_txt:that in 5047) [ClassicSimilarity], result of:
            0.016400829 = score(doc=5047,freq=2.0), product of:
              0.06264821 = queryWeight, product of:
                1.4161041 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.01867073 = queryNorm
              0.26179248 = fieldWeight in 5047, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=5047)
          0.1835508 = weight(abstract_txt:queries in 5047) [ClassicSimilarity], result of:
            0.1835508 = score(doc=5047,freq=10.0), product of:
              0.14549083 = queryWeight, product of:
                1.5259618 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.01867073 = queryNorm
              1.2615972 = fieldWeight in 5047, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.078125 = fieldNorm(doc=5047)
          0.19146337 = weight(abstract_txt:formulation in 5047) [ClassicSimilarity], result of:
            0.19146337 = score(doc=5047,freq=1.0), product of:
              0.36905038 = queryWeight, product of:
                2.9765577 = boost
                6.640641 = idf(docFreq=156, maxDocs=44218)
                0.01867073 = queryNorm
              0.5188001 = fieldWeight in 5047, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.640641 = idf(docFreq=156, maxDocs=44218)
                0.078125 = fieldNorm(doc=5047)
        0.32 = coord(8/25)
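
The score breakdown above can be recomputed from its own factors. The following is a minimal sketch, assuming the classic TF-IDF formulas that the listed values are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final coord factor of matched terms over total query terms. The function names are hypothetical and chosen for illustration; boost, queryNorm, and fieldNorm are taken from the listing as given, not derived.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency in the form the listing is consistent with:
    # 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, idf_value, boost, query_norm, field_norm):
    # queryWeight = boost * idf * queryNorm
    query_weight = boost * idf_value * query_norm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    field_weight = math.sqrt(freq) * idf_value * field_norm
    return query_weight * field_weight


def document_score(term_scores, total_query_terms):
    # coord = matched terms / total query terms (8/25 for doc 5047)
    coord = len(term_scores) / total_query_terms
    return coord * sum(term_scores)


# Factors copied from the "retrieval" entry for doc 5047.
retrieval = term_score(
    freq=4.0,
    idf_value=3.4751394,
    boost=1.0384492,
    query_norm=0.01867073,
    field_norm=0.078125,
)

# The eight per-term scores listed for doc 5047.
doc_5047_terms = [
    0.03658567, 0.018408086, 0.084866844, 0.2961126,
    0.15176642, 0.016400829, 0.1835508, 0.19146337,
]
total = document_score(doc_5047_terms, total_query_terms=25)

print(f"idf(retrieval) = {idf(3720, 44218):.7f}")
print(f"score(retrieval) = {retrieval:.8f}")
print(f"score(doc 5047) = {total:.8f}")
```

The recomputed values agree with the listing up to floating-point rounding, since the listing appears to use single-precision arithmetic while Python uses double precision. The same functions apply to the other entries in this list by substituting their freq, idf, boost, and fieldNorm values.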
    
  2. Teixeira Lopes, C.; Ribeiro, C.: Measuring the value of health query translation : An analysis by user language proficiency (2013) 0.27
    0.2691725 = sum of:
      0.2691725 = product of:
        0.74770135 = sum of:
          0.014726468 = weight(abstract_txt:results in 739) [ClassicSimilarity], result of:
            0.014726468 = score(doc=739,freq=1.0), product of:
              0.06766081 = queryWeight, product of:
                1.0406255 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.01867073 = queryNorm
              0.21765138 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
          0.08953603 = weight(abstract_txt:spoken in 739) [ClassicSimilarity], result of:
            0.08953603 = score(doc=739,freq=1.0), product of:
              0.178893 = queryWeight, product of:
                1.1964858 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.01867073 = queryNorm
              0.5005005 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
          0.013120663 = weight(abstract_txt:that in 739) [ClassicSimilarity], result of:
            0.013120663 = score(doc=739,freq=2.0), product of:
              0.06264821 = queryWeight, product of:
                1.4161041 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.01867073 = queryNorm
              0.20943399 = fieldWeight in 739, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
          0.037666608 = weight(abstract_txt:better in 739) [ClassicSimilarity], result of:
            0.037666608 = score(doc=739,freq=1.0), product of:
              0.12654425 = queryWeight, product of:
                1.4231381 = boost
                4.76249 = idf(docFreq=1026, maxDocs=44218)
                0.01867073 = queryNorm
              0.2976556 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.76249 = idf(docFreq=1026, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
          0.08042793 = weight(abstract_txt:queries in 739) [ClassicSimilarity], result of:
            0.08042793 = score(doc=739,freq=3.0), product of:
              0.14549083 = queryWeight, product of:
                1.5259618 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.01867073 = queryNorm
              0.5528041 = fieldWeight in 739, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
          0.17084111 = weight(abstract_txt:english in 739) [ClassicSimilarity], result of:
            0.17084111 = score(doc=739,freq=8.0), product of:
              0.1733683 = queryWeight, product of:
                1.6657534 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01867073 = queryNorm
              0.98542297 = fieldWeight in 739, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
          0.09732384 = weight(abstract_txt:query in 739) [ClassicSimilarity], result of:
            0.09732384 = score(doc=739,freq=3.0), product of:
              0.18912151 = queryWeight, product of:
                2.1307964 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01867073 = queryNorm
              0.5146101 = fieldWeight in 739, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
          0.03413769 = weight(abstract_txt:search in 739) [ClassicSimilarity], result of:
            0.03413769 = score(doc=739,freq=1.0), product of:
              0.14931525 = queryWeight, product of:
                2.1862154 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.01867073 = queryNorm
              0.22862828 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
          0.20992102 = weight(abstract_txt:native in 739) [ClassicSimilarity], result of:
            0.20992102 = score(doc=739,freq=2.0), product of:
              0.31571755 = queryWeight, product of:
                2.247891 = boost
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.01867073 = queryNorm
              0.6649013 = fieldWeight in 739, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.0625 = fieldNorm(doc=739)
        0.36 = coord(9/25)
    
  3. Bast, H.; Bäurle, F.; Buchhold, B.; Haussmann, E.: Broccoli: semantic full-text search at your fingertips (2012) 0.22
    0.2159129 = sum of:
      0.2159129 = product of:
        0.77111745 = sum of:
          0.29289722 = weight(title_txt:your in 704) [ClassicSimilarity], result of:
            0.29289722 = score(doc=704,freq=1.0), product of:
              0.13482122 = queryWeight, product of:
                1.0386996 = boost
                6.9519553 = idf(docFreq=114, maxDocs=44218)
                0.01867073 = queryNorm
              2.172486 = fieldWeight in 704, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9519553 = idf(docFreq=114, maxDocs=44218)
                0.3125 = fieldNorm(doc=704)
          0.016069464 = weight(abstract_txt:that in 704) [ClassicSimilarity], result of:
            0.016069464 = score(doc=704,freq=3.0), product of:
              0.06264821 = queryWeight, product of:
                1.4161041 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.01867073 = queryNorm
              0.2565032 = fieldWeight in 704, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=704)
          0.065669134 = weight(abstract_txt:queries in 704) [ClassicSimilarity], result of:
            0.065669134 = score(doc=704,freq=2.0), product of:
              0.14549083 = queryWeight, product of:
                1.5259618 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.01867073 = queryNorm
              0.4513627 = fieldWeight in 704, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.0625 = fieldNorm(doc=704)
          0.060401455 = weight(abstract_txt:english in 704) [ClassicSimilarity], result of:
            0.060401455 = score(doc=704,freq=1.0), product of:
              0.1733683 = queryWeight, product of:
                1.6657534 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01867073 = queryNorm
              0.34839964 = fieldWeight in 704, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.0625 = fieldNorm(doc=704)
          0.09732384 = weight(abstract_txt:query in 704) [ClassicSimilarity], result of:
            0.09732384 = score(doc=704,freq=3.0), product of:
              0.18912151 = queryWeight, product of:
                2.1307964 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01867073 = queryNorm
              0.5146101 = fieldWeight in 704, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=704)
          0.090319835 = weight(abstract_txt:search in 704) [ClassicSimilarity], result of:
            0.090319835 = score(doc=704,freq=7.0), product of:
              0.14931525 = queryWeight, product of:
                2.1862154 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.01867073 = queryNorm
              0.60489357 = fieldWeight in 704, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=704)
          0.14843658 = weight(abstract_txt:native in 704) [ClassicSimilarity], result of:
            0.14843658 = score(doc=704,freq=1.0), product of:
              0.31571755 = queryWeight, product of:
                2.247891 = boost
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.01867073 = queryNorm
              0.47015625 = fieldWeight in 704, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.0625 = fieldNorm(doc=704)
        0.28 = coord(7/25)
    
  4. Chau, M.; Fang, X.; Rittman, C.C.: Web searching in Chinese : a study of a search engine in Hong Kong (2007) 0.21
    0.211265 = sum of:
      0.211265 = product of:
        0.75451785 = sum of:
          0.018555421 = weight(abstract_txt:that in 336) [ClassicSimilarity], result of:
            0.018555421 = score(doc=336,freq=4.0), product of:
              0.06264821 = queryWeight, product of:
                1.4161041 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.01867073 = queryNorm
              0.2961844 = fieldWeight in 336, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=336)
          0.08042793 = weight(abstract_txt:queries in 336) [ClassicSimilarity], result of:
            0.08042793 = score(doc=336,freq=3.0), product of:
              0.14549083 = queryWeight, product of:
                1.5259618 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.01867073 = queryNorm
              0.5528041 = fieldWeight in 336, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.0625 = fieldNorm(doc=336)
          0.13506176 = weight(abstract_txt:english in 336) [ClassicSimilarity], result of:
            0.13506176 = score(doc=336,freq=5.0), product of:
              0.1733683 = queryWeight, product of:
                1.6657534 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01867073 = queryNorm
              0.7790453 = fieldWeight in 336, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.0625 = fieldNorm(doc=336)
          0.11237989 = weight(abstract_txt:query in 336) [ClassicSimilarity], result of:
            0.11237989 = score(doc=336,freq=4.0), product of:
              0.18912151 = queryWeight, product of:
                2.1307964 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01867073 = queryNorm
              0.5942206 = fieldWeight in 336, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=336)
          0.107952856 = weight(abstract_txt:search in 336) [ClassicSimilarity], result of:
            0.107952856 = score(doc=336,freq=10.0), product of:
              0.14931525 = queryWeight, product of:
                2.1862154 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.01867073 = queryNorm
              0.7229861 = fieldWeight in 336, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=336)
          0.1531707 = weight(abstract_txt:formulation in 336) [ClassicSimilarity], result of:
            0.1531707 = score(doc=336,freq=1.0), product of:
              0.36905038 = queryWeight, product of:
                2.9765577 = boost
                6.640641 = idf(docFreq=156, maxDocs=44218)
                0.01867073 = queryNorm
              0.41504008 = fieldWeight in 336, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.640641 = idf(docFreq=156, maxDocs=44218)
                0.0625 = fieldNorm(doc=336)
          0.14696927 = weight(abstract_txt:engines in 336) [ClassicSimilarity], result of:
            0.14696927 = score(doc=336,freq=2.0), product of:
              0.3136335 = queryWeight, product of:
                3.168488 = boost
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.01867073 = queryNorm
              0.46860194 = fieldWeight in 336, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.3016257 = idf(docFreq=598, maxDocs=44218)
                0.0625 = fieldNorm(doc=336)
        0.28 = coord(7/25)
    
  5. DiMartino, D.; Ferns, W.J.; Swacker, S.: ¬A study of CD-ROM search techniques by English-as-a-second-language (ESL) students (1993) 0.21
    0.2068712 = sum of:
      0.2068712 = product of:
        0.7388257 = sum of:
          0.018292835 = weight(abstract_txt:retrieval in 547) [ClassicSimilarity], result of:
            0.018292835 = score(doc=547,freq=1.0), product of:
              0.067378104 = queryWeight, product of:
                1.0384492 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01867073 = queryNorm
              0.27149525 = fieldWeight in 547, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=547)
          0.018408086 = weight(abstract_txt:results in 547) [ClassicSimilarity], result of:
            0.018408086 = score(doc=547,freq=1.0), product of:
              0.06766081 = queryWeight, product of:
                1.0406255 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.01867073 = queryNorm
              0.27206424 = fieldWeight in 547, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.078125 = fieldNorm(doc=547)
          0.016400829 = weight(abstract_txt:that in 547) [ClassicSimilarity], result of:
            0.016400829 = score(doc=547,freq=2.0), product of:
              0.06264821 = queryWeight, product of:
                1.4161041 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.01867073 = queryNorm
              0.26179248 = fieldWeight in 547, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=547)
          0.10677569 = weight(abstract_txt:english in 547) [ClassicSimilarity], result of:
            0.10677569 = score(doc=547,freq=2.0), product of:
              0.1733683 = queryWeight, product of:
                1.6657534 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01867073 = queryNorm
              0.6158894 = fieldWeight in 547, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.078125 = fieldNorm(doc=547)
          0.085344225 = weight(abstract_txt:search in 547) [ClassicSimilarity], result of:
            0.085344225 = score(doc=547,freq=4.0), product of:
              0.14931525 = queryWeight, product of:
                2.1862154 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.01867073 = queryNorm
              0.5715707 = fieldWeight in 547, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.078125 = fieldNorm(doc=547)
          0.18554571 = weight(abstract_txt:native in 547) [ClassicSimilarity], result of:
            0.18554571 = score(doc=547,freq=1.0), product of:
              0.31571755 = queryWeight, product of:
                2.247891 = boost
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.01867073 = queryNorm
              0.5876953 = fieldWeight in 547, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.078125 = fieldNorm(doc=547)
          0.3080583 = weight(abstract_txt:speakers in 547) [ClassicSimilarity], result of:
            0.3080583 = score(doc=547,freq=2.0), product of:
              0.35135275 = queryWeight, product of:
                2.3713603 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.01867073 = queryNorm
              0.8767778 = fieldWeight in 547, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.078125 = fieldNorm(doc=547)
        0.28 = coord(7/25)