Document (#38653)

Author
Líska, M.
Sojka, P.
Title
MIaS 1.5
Source
http://www.muni.cz/research/publications/1173732
Year
2014
Abstract
A math-aware search engine, based on full-text indexing, that enables users to search for mathematical formulae inside documents. The search engine is unique in that it can index and search structural information such as the representation of mathematical formulae. There is no other software or IR system that is able to store three billion formulae in its index and search it with a response time below one second. MIaS processes documents containing mathematical notation in MathML format. The system is built as an extension to any full-text indexing engine and has been verified on the state-of-the-art Lucene core. It is scalable: it was verified to index almost the whole of arxiv.org (440,000 papers), containing more than 160,000,000 formulae. The software is used in EuDML (eudml.org) and other digital libraries. For more details, see the papers in peer-reviewed conferences: [1] Sojka, Petr; Líska, Martin. In Matthew R. B. Hardy, Frank Wm. Tompa. Proceedings of the 2011 ACM Symposium on Document Engineering. Mountain View, CA, USA: ACM, 2011, pp. 57-60. [2] Sojka, Petr; Líska, Martin. In J. H. Davenport, W. M. Farmer, J. Urban, F. Rabe. Intelligent Computer Mathematics, LNCS 6824. Springer, 2011, pp. 228-243.
Content
Cf.: https://mir.fi.muni.cz/mias/.
Field
Mathematics

Similar documents (author)

  1. Sojka, P.: Exploiting semantic annotations in math information retrieval (2012) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:sojka in 32) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 32, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=32)
    
  2. Rehurek, R.; Sojka, P.: Software framework for topic modelling with large corpora (2010) 4.95
    4.952564 = sum of:
      4.952564 = weight(author_txt:sojka in 1058) [ClassicSimilarity], result of:
        4.952564 = fieldWeight in 1058, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.5 = fieldNorm(doc=1058)
    
  3. Sojka, P.; Liska, M.: ¬The art of mathematics retrieval (2011) 4.95
    4.952564 = sum of:
      4.952564 = weight(author_txt:sojka in 3450) [ClassicSimilarity], result of:
        4.952564 = fieldWeight in 3450, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.5 = fieldNorm(doc=3450)
    
  4. Sojka, P.; Lee, M.; Rehurek, R.; Hatlapatka, R.; Kucbel, M.; Bouche, T.; Goutorbe, C.; Anghelache, R.; Wojciechowski, K.: Toolset for entity and semantic associations : Final Release (2013) 2.17
    2.1667466 = sum of:
      2.1667466 = weight(author_txt:sojka in 1057) [ClassicSimilarity], result of:
        2.1667466 = fieldWeight in 1057, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.21875 = fieldNorm(doc=1057)
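
The author-match scores above appear consistent with the classic Lucene TF-IDF weighting, in which fieldWeight = tf × idf × fieldNorm, tf = sqrt(freq), and idf = 1 + ln(maxDocs / (docFreq + 1)). The following is a minimal Python sketch that recomputes the first entry from the figures shown in its tree. The function names are illustrative assumptions, not Lucene's API, and the idf formula is inferred from the listed values rather than taken from the catalog's configuration.

```python
import math


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw term frequency."""
    return math.sqrt(freq)


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, as inferred from the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, mirroring the explain tree."""
    return tf(freq) * idf(doc_freq, max_docs) * field_norm


# Figures copied from the first author match (doc=32).
score = field_weight(freq=1.0, doc_freq=5, max_docs=44218, field_norm=0.625)
print(round(score, 4))
```

Because tf and idf are the same for all four author matches, the ranking in this list is driven entirely by fieldNorm, which is smaller for records with more entries in the author field.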
    

Similar documents (content)

  1. Sojka, P.; Liska, M.: ¬The art of mathematics retrieval (2011) 0.39
    0.39497694 = sum of:
      0.39497694 = product of:
        1.4106319 = sum of:
          0.032408237 = weight(abstract_txt:documents in 3450) [ClassicSimilarity], result of:
            0.032408237 = score(doc=3450,freq=1.0), product of:
              0.07189569 = queryWeight, product of:
                1.0587988 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.016476119 = queryNorm
              0.45076746 = fieldWeight in 3450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.109375 = fieldNorm(doc=3450)
          0.23465566 = weight(abstract_txt:math in 3450) [ClassicSimilarity], result of:
            0.23465566 = score(doc=3450,freq=3.0), product of:
              0.1480822 = queryWeight, product of:
                1.0744803 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.016476119 = queryNorm
              1.5846312 = fieldWeight in 3450, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.109375 = fieldNorm(doc=3450)
          0.26323557 = weight(abstract_txt:lucene in 3450) [ClassicSimilarity], result of:
            0.26323557 = score(doc=3450,freq=2.0), product of:
              0.18301034 = queryWeight, product of:
                1.1944964 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.016476119 = queryNorm
              1.4383645 = fieldWeight in 3450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.109375 = fieldNorm(doc=3450)
          0.11726725 = weight(abstract_txt:engine in 3450) [ClassicSimilarity], result of:
            0.11726725 = score(doc=3450,freq=1.0), product of:
              0.19397576 = queryWeight, product of:
                2.1300087 = boost
                5.5272765 = idf(docFreq=477, maxDocs=44218)
                0.016476119 = queryNorm
              0.6045459 = fieldWeight in 3450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5272765 = idf(docFreq=477, maxDocs=44218)
                0.109375 = fieldNorm(doc=3450)
          0.05665545 = weight(abstract_txt:search in 3450) [ClassicSimilarity], result of:
            0.05665545 = score(doc=3450,freq=1.0), product of:
              0.14160341 = queryWeight, product of:
                2.3494644 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.016476119 = queryNorm
              0.4000995 = fieldWeight in 3450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.109375 = fieldNorm(doc=3450)
          0.17779483 = weight(abstract_txt:mathematical in 3450) [ClassicSimilarity], result of:
            0.17779483 = score(doc=3450,freq=1.0), product of:
              0.25600144 = queryWeight, product of:
                2.4469712 = boost
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.016476119 = queryNorm
              0.6945071 = fieldWeight in 3450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.109375 = fieldNorm(doc=3450)
          0.5286149 = weight(abstract_txt:formulae in 3450) [ClassicSimilarity], result of:
            0.5286149 = score(doc=3450,freq=1.0), product of:
              0.5825978 = queryWeight, product of:
                4.2624707 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.016476119 = queryNorm
              0.90734106 = fieldWeight in 3450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.109375 = fieldNorm(doc=3450)
        0.28 = coord(7/25)
    
  2. Sojka, P.: Exploiting semantic annotations in math information retrieval (2012) 0.29
    0.2869843 = sum of:
      0.2869843 = product of:
        0.8968259 = sum of:
          0.024741119 = weight(abstract_txt:text in 32) [ClassicSimilarity], result of:
            0.024741119 = score(doc=32,freq=2.0), product of:
              0.06921934 = queryWeight, product of:
                1.0389048 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016476119 = queryNorm
              0.3574307 = fieldWeight in 32, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=32)
          0.026189812 = weight(abstract_txt:documents in 32) [ClassicSimilarity], result of:
            0.026189812 = score(doc=32,freq=2.0), product of:
              0.07189569 = queryWeight, product of:
                1.0587988 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.016476119 = queryNorm
              0.36427513 = fieldWeight in 32, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=32)
          0.13408895 = weight(abstract_txt:math in 32) [ClassicSimilarity], result of:
            0.13408895 = score(doc=32,freq=3.0), product of:
              0.1480822 = queryWeight, product of:
                1.0744803 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.016476119 = queryNorm
              0.9055035 = fieldWeight in 32, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.0625 = fieldNorm(doc=32)
          0.037706714 = weight(abstract_txt:indexing in 32) [ClassicSimilarity], result of:
            0.037706714 = score(doc=32,freq=3.0), product of:
              0.08008109 = queryWeight, product of:
                1.1174471 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.016476119 = queryNorm
              0.47085664 = fieldWeight in 32, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=32)
          0.10636323 = weight(abstract_txt:lucene in 32) [ClassicSimilarity], result of:
            0.10636323 = score(doc=32,freq=1.0), product of:
              0.18301034 = queryWeight, product of:
                1.1944964 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.016476119 = queryNorm
              0.581187 = fieldWeight in 32, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.0625 = fieldNorm(doc=32)
          0.094766244 = weight(abstract_txt:engine in 32) [ClassicSimilarity], result of:
            0.094766244 = score(doc=32,freq=2.0), product of:
              0.19397576 = queryWeight, product of:
                2.1300087 = boost
                5.5272765 = idf(docFreq=477, maxDocs=44218)
                0.016476119 = queryNorm
              0.48854682 = fieldWeight in 32, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5272765 = idf(docFreq=477, maxDocs=44218)
                0.0625 = fieldNorm(doc=32)
          0.045784518 = weight(abstract_txt:search in 32) [ClassicSimilarity], result of:
            0.045784518 = score(doc=32,freq=2.0), product of:
              0.14160341 = queryWeight, product of:
                2.3494644 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.016476119 = queryNorm
              0.3233292 = fieldWeight in 32, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=32)
          0.42718533 = weight(abstract_txt:formulae in 32) [ClassicSimilarity], result of:
            0.42718533 = score(doc=32,freq=2.0), product of:
              0.5825978 = queryWeight, product of:
                4.2624707 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.016476119 = queryNorm
              0.7332423 = fieldWeight in 32, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.0625 = fieldNorm(doc=32)
        0.32 = coord(8/25)
    
  3. Fife, E.D.; Husch, L.: ¬The Mathematics Archives : making mathematics easy to find on the Web (1999) 0.15
    0.15164961 = sum of:
      0.15164961 = product of:
        0.5416058 = sum of:
          0.017494611 = weight(abstract_txt:text in 1239) [ClassicSimilarity], result of:
            0.017494611 = score(doc=1239,freq=1.0), product of:
              0.06921934 = queryWeight, product of:
                1.0389048 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016476119 = queryNorm
              0.25274166 = fieldWeight in 1239, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=1239)
          0.07741629 = weight(abstract_txt:math in 1239) [ClassicSimilarity], result of:
            0.07741629 = score(doc=1239,freq=1.0), product of:
              0.1480822 = queryWeight, product of:
                1.0744803 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.016476119 = queryNorm
              0.5227927 = fieldWeight in 1239, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.0625 = fieldNorm(doc=1239)
          0.03787507 = weight(abstract_txt:software in 1239) [ClassicSimilarity], result of:
            0.03787507 = score(doc=1239,freq=3.0), product of:
              0.080319285 = queryWeight, product of:
                1.1191078 = boost
                4.3560514 = idf(docFreq=1541, maxDocs=44218)
                0.016476119 = queryNorm
              0.4715564 = fieldWeight in 1239, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.3560514 = idf(docFreq=1541, maxDocs=44218)
                0.0625 = fieldNorm(doc=1239)
          0.03155835 = weight(abstract_txt:full in 1239) [ClassicSimilarity], result of:
            0.03155835 = score(doc=1239,freq=1.0), product of:
              0.10257325 = queryWeight, product of:
                1.2646754 = boost
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.016476119 = queryNorm
              0.30766645 = fieldWeight in 1239, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.0625 = fieldNorm(doc=1239)
          0.094766244 = weight(abstract_txt:engine in 1239) [ClassicSimilarity], result of:
            0.094766244 = score(doc=1239,freq=2.0), product of:
              0.19397576 = queryWeight, product of:
                2.1300087 = boost
                5.5272765 = idf(docFreq=477, maxDocs=44218)
                0.016476119 = queryNorm
              0.48854682 = fieldWeight in 1239, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5272765 = idf(docFreq=477, maxDocs=44218)
                0.0625 = fieldNorm(doc=1239)
          0.07930112 = weight(abstract_txt:search in 1239) [ClassicSimilarity], result of:
            0.07930112 = score(doc=1239,freq=6.0), product of:
              0.14160341 = queryWeight, product of:
                2.3494644 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.016476119 = queryNorm
              0.56002265 = fieldWeight in 1239, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.0625 = fieldNorm(doc=1239)
          0.2031941 = weight(abstract_txt:mathematical in 1239) [ClassicSimilarity], result of:
            0.2031941 = score(doc=1239,freq=4.0), product of:
              0.25600144 = queryWeight, product of:
                2.4469712 = boost
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.016476119 = queryNorm
              0.79372245 = fieldWeight in 1239, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.0625 = fieldNorm(doc=1239)
        0.28 = coord(7/25)
    
  4. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.12
    0.120583475 = sum of:
      0.120583475 = product of:
        0.75364673 = sum of:
          0.07741629 = weight(abstract_txt:math in 5499) [ClassicSimilarity], result of:
            0.07741629 = score(doc=5499,freq=1.0), product of:
              0.1480822 = queryWeight, product of:
                1.0744803 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.016476119 = queryNorm
              0.5227927 = fieldWeight in 5499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.0625 = fieldNorm(doc=5499)
          0.021867184 = weight(abstract_txt:software in 5499) [ClassicSimilarity], result of:
            0.021867184 = score(doc=5499,freq=1.0), product of:
              0.080319285 = queryWeight, product of:
                1.1191078 = boost
                4.3560514 = idf(docFreq=1541, maxDocs=44218)
                0.016476119 = queryNorm
              0.27225322 = fieldWeight in 5499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3560514 = idf(docFreq=1541, maxDocs=44218)
                0.0625 = fieldNorm(doc=5499)
          0.2271779 = weight(abstract_txt:mathematical in 5499) [ClassicSimilarity], result of:
            0.2271779 = score(doc=5499,freq=5.0), product of:
              0.25600144 = queryWeight, product of:
                2.4469712 = boost
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.016476119 = queryNorm
              0.8874087 = fieldWeight in 5499, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.0625 = fieldNorm(doc=5499)
          0.42718533 = weight(abstract_txt:formulae in 5499) [ClassicSimilarity], result of:
            0.42718533 = score(doc=5499,freq=2.0), product of:
              0.5825978 = queryWeight, product of:
                4.2624707 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.016476119 = queryNorm
              0.7332423 = fieldWeight in 5499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.0625 = fieldNorm(doc=5499)
        0.16 = coord(4/25)
    
  5. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.12
    0.1204823 = sum of:
      0.1204823 = product of:
        0.43029395 = sum of:
          0.030926397 = weight(abstract_txt:text in 7) [ClassicSimilarity], result of:
            0.030926397 = score(doc=7,freq=2.0), product of:
              0.06921934 = queryWeight, product of:
                1.0389048 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016476119 = queryNorm
              0.44678837 = fieldWeight in 7, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=7)
          0.027212476 = weight(abstract_txt:indexing in 7) [ClassicSimilarity], result of:
            0.027212476 = score(doc=7,freq=1.0), product of:
              0.08008109 = queryWeight, product of:
                1.1174471 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.016476119 = queryNorm
              0.3398115 = fieldWeight in 7, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.078125 = fieldNorm(doc=7)
          0.02733398 = weight(abstract_txt:software in 7) [ClassicSimilarity], result of:
            0.02733398 = score(doc=7,freq=1.0), product of:
              0.080319285 = queryWeight, product of:
                1.1191078 = boost
                4.3560514 = idf(docFreq=1541, maxDocs=44218)
                0.016476119 = queryNorm
              0.34031653 = fieldWeight in 7, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3560514 = idf(docFreq=1541, maxDocs=44218)
                0.078125 = fieldNorm(doc=7)
          0.053126097 = weight(abstract_txt:index in 7) [ClassicSimilarity], result of:
            0.053126097 = score(doc=7,freq=1.0), product of:
              0.1431925 = queryWeight, product of:
                1.8300704 = boost
                4.74895 = idf(docFreq=1040, maxDocs=44218)
                0.016476119 = queryNorm
              0.37101173 = fieldWeight in 7, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.74895 = idf(docFreq=1040, maxDocs=44218)
                0.078125 = fieldNorm(doc=7)
          0.08376232 = weight(abstract_txt:engine in 7) [ClassicSimilarity], result of:
            0.08376232 = score(doc=7,freq=1.0), product of:
              0.19397576 = queryWeight, product of:
                2.1300087 = boost
                5.5272765 = idf(docFreq=477, maxDocs=44218)
                0.016476119 = queryNorm
              0.4318185 = fieldWeight in 7, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5272765 = idf(docFreq=477, maxDocs=44218)
                0.078125 = fieldNorm(doc=7)
          0.08093636 = weight(abstract_txt:search in 7) [ClassicSimilarity], result of:
            0.08093636 = score(doc=7,freq=4.0), product of:
              0.14160341 = queryWeight, product of:
                2.3494644 = boost
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.016476119 = queryNorm
              0.5715707 = fieldWeight in 7, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6580524 = idf(docFreq=3098, maxDocs=44218)
                0.078125 = fieldNorm(doc=7)
          0.12699631 = weight(abstract_txt:mathematical in 7) [ClassicSimilarity], result of:
            0.12699631 = score(doc=7,freq=1.0), product of:
              0.25600144 = queryWeight, product of:
                2.4469712 = boost
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.016476119 = queryNorm
              0.49607652 = fieldWeight in 7, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.078125 = fieldNorm(doc=7)
        0.28 = coord(7/25)
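
The content-match trees above combine several per-term scores. Each term contributes queryWeight × fieldWeight, where queryWeight = boost × idf × queryNorm, and the sum is then scaled by coord, the fraction of query terms that matched. The sketch below, in Python, recomputes the first content match (doc=3450) from the boost, docFreq, and freq values listed in its tree. The helper names are hypothetical, and the idf formula is again inferred from the figures shown.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.016476119


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, as inferred from the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


# (term, boost, docFreq, freq) copied from the tree for doc=3450.
matched_terms = [
    ("documents", 1.0587988, 1949, 1.0),
    ("math", 1.0744803, 27, 3.0),
    ("lucene", 1.1944964, 10, 2.0),
    ("engine", 2.1300087, 477, 1.0),
    ("search", 2.3494644, 3098, 1.0),
    ("mathematical", 2.4469712, 209, 1.0),
    ("formulae", 4.2624707, 29, 1.0),
]
FIELD_NORM = 0.109375
QUERY_TERM_COUNT = 25  # denominator of coord(7/25)

raw_sum = sum(term_score(b, df, f, FIELD_NORM) for _, b, df, f in matched_terms)
coord = len(matched_terms) / QUERY_TERM_COUNT
total = raw_sum * coord
print(round(total, 4))
```

The result should land close to the 0.39497694 reported for this entry. Small differences in the last digits are expected, since Lucene computes with single-precision floats.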