Document (#33621)

Author
Efron, M.
Title
Shannon meets Shortz : a probabilistic model of crossword puzzle difficulty
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.6, p.875-886
Year
2008
Abstract
This article is concerned with the difficulty of crossword puzzles. A model is proposed that quantifies the difficulty of a puzzle P with respect to its clues. Given a clue-answer pair (c,a), we model the difficulty of guessing a based on c using the conditional probability P(a|c); easier mappings should enjoy a higher conditional probability. The model is tested by two experiments, each of which involves estimating the difficulty of puzzles taken from The New York Times. Additionally, we discuss how the notion of information implicit in our model relates to more easily quantifiable types of information that figure into crossword puzzles.
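
The idea in the abstract, that an easier clue-answer mapping has a higher P(a|c), can be illustrated with a minimal sketch. It is not the article's estimator: the corpus of clue-answer pairs, the additive smoothing (`alpha`, `vocab_size`), the use of surprisal in bits, and the averaging over a puzzle's clues are all assumptions made here for illustration.

```python
import math
from collections import Counter, defaultdict


def fit_clue_model(pairs):
    """Count how often each answer appears for each clue in a (hypothetical) corpus."""
    counts = defaultdict(Counter)
    for clue, answer in pairs:
        counts[clue][answer] += 1
    return counts


def conditional_probability(counts, clue, answer, alpha=1.0, vocab_size=1000):
    """Additively smoothed estimate of P(answer | clue).

    alpha and vocab_size are illustrative assumptions, not values from the article.
    """
    seen = counts.get(clue, Counter())
    return (seen[answer] + alpha) / (sum(seen.values()) + alpha * vocab_size)


def surprisal_bits(counts, clue, answer, **kwargs):
    """Assumed difficulty of one clue: -log2 P(answer | clue); lower means easier."""
    return -math.log2(conditional_probability(counts, clue, answer, **kwargs))


def puzzle_difficulty(counts, puzzle, **kwargs):
    """Assumed puzzle-level score: mean surprisal over the puzzle's (clue, answer) pairs."""
    return sum(surprisal_bits(counts, c, a, **kwargs) for c, a in puzzle) / len(puzzle)


# Toy corpus: the same clue resolved to "OREO" nine times and "HYDROX" once.
corpus = [("Sandwich cookie", "OREO")] * 9 + [("Sandwich cookie", "HYDROX")]
model = fit_clue_model(corpus)
easy = surprisal_bits(model, "Sandwich cookie", "OREO", vocab_size=2)
hard = surprisal_bits(model, "Sandwich cookie", "HYDROX", vocab_size=2)
```

In this toy setting the frequent answer receives the higher conditional probability and therefore the lower surprisal, which matches the ordering the abstract describes.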

Similar documents (author)

  1. Efron, M.: Eigenvalue-based model selection during Latent Semantic Indexing (2005) 6.09
    6.094361 = sum of:
      6.094361 = weight(author_txt:efron in 3685) [ClassicSimilarity], result of:
        6.094361 = fieldWeight in 3685, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.625 = fieldNorm(doc=3685)
    
  2. Efron, M.: Query expansion and dimensionality reduction : Notions of optimality in Rocchio relevance feedback and latent semantic indexing (2008) 6.09
    6.094361 = sum of:
      6.094361 = weight(author_txt:efron in 2020) [ClassicSimilarity], result of:
        6.094361 = fieldWeight in 2020, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.625 = fieldNorm(doc=2020)
    
  3. Efron, M.: Linear time series models for term weighting in information retrieval (2010) 6.09
    6.094361 = sum of:
      6.094361 = weight(author_txt:efron in 3688) [ClassicSimilarity], result of:
        6.094361 = fieldWeight in 3688, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.625 = fieldNorm(doc=3688)
    
  4. Efron, M.: Information search and retrieval in microblogs (2011) 6.09
    6.094361 = sum of:
      6.094361 = weight(author_txt:efron in 4455) [ClassicSimilarity], result of:
        6.094361 = fieldWeight in 4455, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.625 = fieldNorm(doc=4455)
    
  5. Efron, M.; Winget, M.: Query polyrepresentation for ranking retrieval systems without relevance judgments (2010) 4.88
    4.8754888 = sum of:
      4.8754888 = weight(author_txt:efron in 3469) [ClassicSimilarity], result of:
        4.8754888 = fieldWeight in 3469, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.5 = fieldNorm(doc=3469)
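
The author-match scores above are each a product of tf, idf, and fieldNorm. The sketch below recomputes them, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are hypothetical, and fieldNorm is taken from the listing as given.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    """Term-frequency factor as assumed for ClassicSimilarity."""
    return math.sqrt(freq)


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm, mirroring the explanation tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Inputs copied from the listing: freq=1.0, docFreq=6, maxDocs=44218.
single_author = field_weight(1.0, 6, 44218, 0.625)  # entries 1 to 4
two_authors = field_weight(1.0, 6, 44218, 0.5)      # entry 5
print(single_author, two_authors)
```

Under these assumptions the two values agree with the listed 6.094361 and 4.8754888 to within floating-point rounding. The lower fieldNorm on the fifth entry is consistent with a longer author field (two names instead of one).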
    

Similar documents (content)

  1. Bodoff, D.; Robertson, S.: A new unified probabilistic model (2004) 0.09
    0.08883041 = sum of:
      0.08883041 = product of:
        0.5551901 = sum of:
          0.084030434 = weight(abstract_txt:probabilistic in 2129) [ClassicSimilarity], result of:
            0.084030434 = score(doc=2129,freq=3.0), product of:
              0.091633536 = queryWeight, product of:
                1.1198832 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.012073974 = queryNorm
              0.91702706 = fieldWeight in 2129, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.078125 = fieldNorm(doc=2129)
          0.10173423 = weight(abstract_txt:probability in 2129) [ClassicSimilarity], result of:
            0.10173423 = score(doc=2129,freq=1.0), product of:
              0.18914369 = queryWeight, product of:
                2.275393 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.012073974 = queryNorm
              0.5378674 = fieldWeight in 2129, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.078125 = fieldNorm(doc=2129)
          0.14810206 = weight(abstract_txt:model in 2129) [ClassicSimilarity], result of:
            0.14810206 = score(doc=2129,freq=9.0), product of:
              0.15852109 = queryWeight, product of:
                3.2936242 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.012073974 = queryNorm
              0.9342736 = fieldWeight in 2129, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.078125 = fieldNorm(doc=2129)
          0.22132336 = weight(abstract_txt:difficulty in 2129) [ClassicSimilarity], result of:
            0.22132336 = score(doc=2129,freq=1.0), product of:
              0.43100137 = queryWeight, product of:
                5.4308753 = boost
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.012073974 = queryNorm
              0.51350963 = fieldWeight in 2129, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.078125 = fieldNorm(doc=2129)
        0.16 = coord(4/25)
    
  2. Bruza, P.D.; Huibers, T.W.C.: A study of aboutness in information retrieval (1996) 0.09
    0.08657765 = sum of:
      0.08657765 = product of:
        0.4328882 = sum of:
          0.012120434 = weight(abstract_txt:based in 7705) [ClassicSimilarity], result of:
            0.012120434 = score(doc=7705,freq=1.0), product of:
              0.04055444 = queryWeight, product of:
                1.0536096 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.012073974 = queryNorm
              0.29886824 = fieldWeight in 7705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.09375 = fieldNorm(doc=7705)
          0.05821799 = weight(abstract_txt:probabilistic in 7705) [ClassicSimilarity], result of:
            0.05821799 = score(doc=7705,freq=1.0), product of:
              0.091633536 = queryWeight, product of:
                1.1198832 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.012073974 = queryNorm
              0.63533497 = fieldWeight in 7705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.09375 = fieldNorm(doc=7705)
          0.07247284 = weight(abstract_txt:relates in 7705) [ClassicSimilarity], result of:
            0.07247284 = score(doc=7705,freq=1.0), product of:
              0.10603922 = queryWeight, product of:
                1.2046996 = boost
                7.290168 = idf(docFreq=81, maxDocs=44218)
                0.012073974 = queryNorm
              0.6834532 = fieldWeight in 7705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.290168 = idf(docFreq=81, maxDocs=44218)
                0.09375 = fieldNorm(doc=7705)
          0.2062978 = weight(abstract_txt:conditional in 7705) [ClassicSimilarity], result of:
            0.2062978 = score(doc=7705,freq=1.0), product of:
              0.26834244 = queryWeight, product of:
                2.710224 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.012073974 = queryNorm
              0.7687856 = fieldWeight in 7705, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.09375 = fieldNorm(doc=7705)
          0.08377917 = weight(abstract_txt:model in 7705) [ClassicSimilarity], result of:
            0.08377917 = score(doc=7705,freq=2.0), product of:
              0.15852109 = queryWeight, product of:
                3.2936242 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.012073974 = queryNorm
              0.5285049 = fieldWeight in 7705, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.09375 = fieldNorm(doc=7705)
        0.2 = coord(5/25)
    
  3. Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.08
    0.08425722 = sum of:
      0.08425722 = product of:
        0.35107177 = sum of:
          0.026696293 = weight(abstract_txt:involves in 4277) [ClassicSimilarity], result of:
            0.026696293 = score(doc=4277,freq=1.0), product of:
              0.07804991 = queryWeight, product of:
                1.0335505 = boost
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.012073974 = queryNorm
              0.34204128 = fieldWeight in 4277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2544694 = idf(docFreq=230, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4277)
          0.033960495 = weight(abstract_txt:probabilistic in 4277) [ClassicSimilarity], result of:
            0.033960495 = score(doc=4277,freq=1.0), product of:
              0.091633536 = queryWeight, product of:
                1.1198832 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.012073974 = queryNorm
              0.37061208 = fieldWeight in 4277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4277)
          0.08241345 = weight(abstract_txt:estimating in 4277) [ClassicSimilarity], result of:
            0.08241345 = score(doc=4277,freq=2.0), product of:
              0.13133904 = queryWeight, product of:
                1.3407334 = boost
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.012073974 = queryNorm
              0.6274863 = fieldWeight in 4277, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4277)
          0.059515446 = weight(abstract_txt:shannon in 4277) [ClassicSimilarity], result of:
            0.059515446 = score(doc=4277,freq=1.0), product of:
              0.13319612 = queryWeight, product of:
                1.3501788 = boost
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.012073974 = queryNorm
              0.44682568 = fieldWeight in 4277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4277)
          0.07121395 = weight(abstract_txt:probability in 4277) [ClassicSimilarity], result of:
            0.07121395 = score(doc=4277,freq=1.0), product of:
              0.18914369 = queryWeight, product of:
                2.275393 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.012073974 = queryNorm
              0.37650716 = fieldWeight in 4277, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4277)
          0.07727213 = weight(abstract_txt:model in 4277) [ClassicSimilarity], result of:
            0.07727213 = score(doc=4277,freq=5.0), product of:
              0.15852109 = queryWeight, product of:
                3.2936242 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.012073974 = queryNorm
              0.4874565 = fieldWeight in 4277, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4277)
        0.24 = coord(6/25)
    
  4. Dominich, S.; Góth, J.; Kiezer, T.; Szlávik, Z.: An entropy-based interpretation of retrieval status value-based retrieval, and its application to the computation of term and query discrimination value (2004) 0.07
    0.0736023 = sum of:
      0.0736023 = product of:
        0.30667624 = sum of:
          0.017318513 = weight(abstract_txt:based in 2237) [ClassicSimilarity], result of:
            0.017318513 = score(doc=2237,freq=6.0), product of:
              0.04055444 = queryWeight, product of:
                1.0536096 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.012073974 = queryNorm
              0.42704356 = fieldWeight in 2237, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2237)
          0.030901412 = weight(abstract_txt:easier in 2237) [ClassicSimilarity], result of:
            0.030901412 = score(doc=2237,freq=1.0), product of:
              0.08604468 = queryWeight, product of:
                1.0851943 = boost
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.012073974 = queryNorm
              0.35913217 = fieldWeight in 2237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2237)
          0.033960495 = weight(abstract_txt:probabilistic in 2237) [ClassicSimilarity], result of:
            0.033960495 = score(doc=2237,freq=1.0), product of:
              0.091633536 = queryWeight, product of:
                1.1198832 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.012073974 = queryNorm
              0.37061208 = fieldWeight in 2237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2237)
          0.08416755 = weight(abstract_txt:shannon in 2237) [ClassicSimilarity], result of:
            0.08416755 = score(doc=2237,freq=2.0), product of:
              0.13319612 = queryWeight, product of:
                1.3501788 = boost
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.012073974 = queryNorm
              0.6319069 = fieldWeight in 2237, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2237)
          0.07121395 = weight(abstract_txt:probability in 2237) [ClassicSimilarity], result of:
            0.07121395 = score(doc=2237,freq=1.0), product of:
              0.18914369 = queryWeight, product of:
                2.275393 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.012073974 = queryNorm
              0.37650716 = fieldWeight in 2237, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2237)
          0.0691143 = weight(abstract_txt:model in 2237) [ClassicSimilarity], result of:
            0.0691143 = score(doc=2237,freq=4.0), product of:
              0.15852109 = queryWeight, product of:
                3.2936242 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.012073974 = queryNorm
              0.43599433 = fieldWeight in 2237, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2237)
        0.24 = coord(6/25)
    
  5. Torvik, V.I.; Weeber, M.; Swanson, D.R.; Smalheiser, N.R.: A probabilistic similarity metric for Medline records : a model for author name disambiguation (2005) 0.07
    0.06756141 = sum of:
      0.06756141 = product of:
        0.33780703 = sum of:
          0.00808029 = weight(abstract_txt:based in 3308) [ClassicSimilarity], result of:
            0.00808029 = score(doc=3308,freq=1.0), product of:
              0.04055444 = queryWeight, product of:
                1.0536096 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.012073974 = queryNorm
              0.19924548 = fieldWeight in 3308, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=3308)
          0.07962206 = weight(abstract_txt:pair in 3308) [ClassicSimilarity], result of:
            0.07962206 = score(doc=3308,freq=2.0), product of:
              0.11742379 = queryWeight, product of:
                1.2677206 = boost
                7.6715355 = idf(docFreq=55, maxDocs=44218)
                0.012073974 = queryNorm
              0.67807436 = fieldWeight in 3308, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.6715355 = idf(docFreq=55, maxDocs=44218)
                0.0625 = fieldNorm(doc=3308)
          0.06660012 = weight(abstract_txt:estimating in 3308) [ClassicSimilarity], result of:
            0.06660012 = score(doc=3308,freq=1.0), product of:
              0.13133904 = queryWeight, product of:
                1.3407334 = boost
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.012073974 = queryNorm
              0.5070855 = fieldWeight in 3308, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.113368 = idf(docFreq=35, maxDocs=44218)
                0.0625 = fieldNorm(doc=3308)
          0.11509913 = weight(abstract_txt:probability in 3308) [ClassicSimilarity], result of:
            0.11509913 = score(doc=3308,freq=2.0), product of:
              0.18914369 = queryWeight, product of:
                2.275393 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.012073974 = queryNorm
              0.6085275 = fieldWeight in 3308, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.0625 = fieldNorm(doc=3308)
          0.06840541 = weight(abstract_txt:model in 3308) [ClassicSimilarity], result of:
            0.06840541 = score(doc=3308,freq=3.0), product of:
              0.15852109 = queryWeight, product of:
                3.2936242 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.012073974 = queryNorm
              0.4315225 = fieldWeight in 3308, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=3308)
        0.2 = coord(5/25)
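
The content-match scores combine per-term products of queryWeight and fieldWeight, then scale the sum by the coord factor. As a hedged sketch, the code below recomputes the score of the last entry (doc 3308) from the boosts, document frequencies, and term frequencies shown in its tree. It assumes the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); `term_score` is a hypothetical helper name.

```python
import math

QUERY_NORM = 0.012073974  # queryNorm as listed in the tree
MAX_DOCS = 44218          # maxDocs as listed in the tree


def idf(doc_freq):
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    """One term's contribution: queryWeight * fieldWeight."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


# (boost, docFreq, freq) copied from the tree for doc 3308.
terms = [
    (1.0536096, 4958, 1.0),  # based
    (1.2677206, 55, 2.0),    # pair
    (1.3407334, 35, 1.0),    # estimating
    (2.275393, 122, 2.0),    # probability
    (3.2936242, 2231, 3.0),  # model
]

coord = 5 / 25  # 5 of the 25 query terms matched
score = coord * sum(term_score(b, df, f, 0.0625) for b, df, f in terms)
print(score)
```

Under these assumptions the result agrees with the listed 0.06756141 to within rounding. The coord factor explains why documents matching only four to six of the 25 query terms end up with scores below 0.1.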