Search (3 results, page 1 of 1)

  • × author_ss:"Sojka, P."
  • × type_ss:"el"
  1. Sojka, P.; Liska, M.: The art of mathematics retrieval (2011) 0.01
    0.005082734 = product of:
      0.035579138 = sum of:
        0.024592843 = weight(_text_:retrieval in 3450) [ClassicSimilarity], result of:
          0.024592843 = score(doc=3450,freq=4.0), product of:
            0.07433229 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.024573348 = queryNorm
            0.33085006 = fieldWeight in 3450, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3450)
        0.010986294 = product of:
          0.03295888 = sum of:
            0.03295888 = weight(_text_:22 in 3450) [ClassicSimilarity], result of:
              0.03295888 = score(doc=3450,freq=4.0), product of:
                0.08605168 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.024573348 = queryNorm
                0.38301262 = fieldWeight in 3450, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3450)
          0.33333334 = coord(1/3)
      0.14285715 = coord(2/14)
    
    Abstract
    The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, is presented, and design decisions are discussed. We argue for an approach based on Presentation MathML using a similarity of math subformulae. The system was implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene. Scalability issues were checked against more than 400,000 arXiv documents with 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene.
    Content
    Cf.: DocEng2011, September 19-22, 2011, Mountain View, California, USA. Copyright 2011 ACM 978-1-4503-0863-2/11/09
    Date
    22. 2.2017 13:00:42
  2. Rehurek, R.; Sojka, P.: Software framework for topic modelling with large corpora (2010) 0.00
    0.0010867883 = product of:
      0.015215035 = sum of:
        0.015215035 = product of:
          0.045645103 = sum of:
            0.045645103 = weight(_text_:2010 in 1058) [ClassicSimilarity], result of:
              0.045645103 = score(doc=1058,freq=3.0), product of:
                0.117538005 = queryWeight, product of:
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.024573348 = queryNorm
                0.38834336 = fieldWeight in 1058, product of:
                  1.7320508 = tf(freq=3.0), with freq of:
                    3.0 = termFreq=3.0
                  4.7831497 = idf(docFreq=1005, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1058)
          0.33333334 = coord(1/3)
      0.071428575 = coord(1/14)
    
    Year
    2010
  3. Líska, M.; Sojka, P.: MIaS 1.5 (2014) 0.00
    3.5857776E-4 = product of:
      0.0050200885 = sum of:
        0.0050200885 = weight(_text_:information in 1652) [ClassicSimilarity], result of:
          0.0050200885 = score(doc=1652,freq=2.0), product of:
            0.04313797 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.024573348 = queryNorm
            0.116372846 = fieldWeight in 1652, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1652)
      0.071428575 = coord(1/14)
    
    Abstract
    MIaS is a math-aware search engine, based on full-text indexing, that enables users to search for mathematical formulae inside documents. The search engine is unique in that it can index and search structural information such as the representation of mathematical formulae. There is no other software or IR system that is able to store three billion formulae in its index and search it with a response time below a second. MIaS processes documents containing mathematical notation in MathML format. The system is built as an extension to any full-text indexing engine and has been verified on the state-of-the-art Lucene core. It is scalable: it was verified to index almost the whole of arxiv.org (440,000 papers), containing more than 160,000,000 formulae. The software is used in EuDML (eudml.org) and other digital libraries. For more details, see the papers in peer-reviewed conferences: [1] Sojka, Petr; Líska, Martin. In Matthew R. B. Hardy, Frank Wm. Tompa. Proceedings of the 2011 ACM Symposium on Document Engineering. Mountain View, CA, USA: ACM, 2011. pp. 57-60. [2] Sojka, Petr; Líska, Martin. In J. H. Davenport, W. M. Farmer, J. Urban, F. Rabe. Intelligent Computer Mathematics, LNCS 6824. Springer, 2011, pp. 228-243.
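
The score explanations in the listing above follow the TF-IDF pattern labelled [ClassicSimilarity]: each term score is queryWeight times fieldWeight, and partial sums are scaled by coord factors. The sketch below recomputes the three displayed scores from the numbers shown in the listing. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the tf and idf values printed above; the function and variable names are illustrative only and are not Lucene API calls. queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

# Values copied from the explain output in the listing (not derived here).
MAX_DOCS = 44218
QUERY_NORM = 0.024573348


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM               # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Result 1 (doc 3450): terms "retrieval" and "22".
# The "22" clause sits in a nested group scaled by coord(1/3);
# the outer sum is scaled by coord(2/14).
retrieval = term_score(freq=4.0, doc_freq=5836, field_norm=0.0546875)
term_22 = term_score(freq=4.0, doc_freq=3622, field_norm=0.0546875)
result_1 = (retrieval + term_22 * (1 / 3)) * (2 / 14)

# Result 2 (doc 1058): term "2010", nested coord(1/3), outer coord(1/14).
term_2010 = term_score(freq=3.0, doc_freq=1005, field_norm=0.046875)
result_2 = term_2010 * (1 / 3) * (1 / 14)

# Result 3 (doc 1652): term "information", outer coord(1/14).
information = term_score(freq=2.0, doc_freq=20772, field_norm=0.046875)
result_3 = information * (1 / 14)

for label, value in [("result 1", result_1), ("result 2", result_2), ("result 3", result_3)]:
    print(f"{label}: {value:.10f}")
```

The recomputed values should agree with the listed scores up to small rounding differences, since the listing appears to use single-precision arithmetic while Python uses double precision.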