Search (2 results, page 1 of 1)

  • author_ss:"Szlávik, Z."
  1. Szlávik, Z.; Tombros, A.; Lalmas, M.: Summarisation of the logical structure of XML documents (2012) 0.00
    0.0044597755 = product of:
      0.017839102 = sum of:
        0.017839102 = weight(_text_:information in 2731) [ClassicSimilarity], result of:
          0.017839102 = score(doc=2731,freq=6.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.20156369 = fieldWeight in 2731, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2731)
      0.25 = coord(1/4)
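
The score breakdown above can be recomputed by hand. The following is a minimal sketch, not the search engine's own code: it assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the tf and idf values printed in the listing. The function names `classic_idf` and `classic_score` are hypothetical, and small differences in the last digits are expected because the listing appears to use single-precision floats.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed form: 1 + ln(maxDocs / (docFreq + 1)))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  query_norm: float, field_norm: float, coord: float) -> float:
    """Recompute the final score from the factors shown in the explain tree."""
    tf = math.sqrt(freq)                   # tf(freq=6.0) = 2.4494898 in the listing
    idf = classic_idf(doc_freq, max_docs)  # idf(docFreq=20772, maxDocs=44218)
    query_weight = idf * query_norm        # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight * coord


# Values copied from the explain output for doc 2731.
score_2731 = classic_score(freq=6.0, doc_freq=20772, max_docs=44218,
                           query_norm=0.050415643, field_norm=0.046875,
                           coord=0.25)
print(f"doc 2731: {score_2731:.10f}")
```

Only `freq` and `field_norm` differ between the two results; the query-side factors (idf, queryNorm, coord) are the same for both documents.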
    
    Abstract
    Summarisation is traditionally used to produce summaries of the textual contents of documents. In this paper, it is argued that summarisation methods can also be applied to the logical structure of XML documents. Structure summarisation selects the most important elements of the logical structure and ensures that the user's attention is focused on sections, subsections, etc. that are believed to be of particular interest. Structure summaries are shown to users as hierarchical tables of contents. This paper discusses methods for structure summarisation that use various features of XML elements in order to select the document portions on which a user's attention should be focused. An evaluation methodology for structure summarisation is also introduced, and summarisation results using various summariser versions are presented and compared to one another. We show that data sets used in information retrieval evaluation can be used effectively in order to produce high-quality (query-independent) structure summaries. We also discuss the choice and effectiveness of particular summariser features with respect to several evaluation measures.
    Content
    Contribution to a special issue "Large-Scale and Distributed Systems for Information Retrieval". Cf.: doi:10.1016/j.ipm.2011.11.002.
    Source
    Information processing and management. 48(2012) no.5, pp.956-968
  2. Dominich, S.; Góth, J.; Kiezer, T.; Szlávik, Z.: An entropy-based interpretation of retrieval status value-based retrieval, and its application to the computation of term and query discrimination value (2004) 0.00
    0.0042914203 = product of:
      0.017165681 = sum of:
        0.017165681 = weight(_text_:information in 2237) [ClassicSimilarity], result of:
          0.017165681 = score(doc=2237,freq=8.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.19395474 = fieldWeight in 2237, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2237)
      0.25 = coord(1/4)
    
    Abstract
    The concepts of Shannon information and entropy have been applied to a number of information retrieval tasks such as to formalize the probabilistic model, to design practical retrieval systems, to cluster documents, and to model texture in image retrieval. In this report, the concept of entropy is used for a different purpose. It is shown that any positive Retrieval Status Value (RSV)-based retrieval system may be conceived as a special probability space in which the amount of the associated Shannon information is being reduced; in this view, the retrieval system is referred to as Uncertainty Decreasing Operation (UDO). The concept of UDO is then proposed as a theoretical background for term and query discrimination power, and it is applied to the computation of term and query discrimination values in the vector space retrieval model. Experimental evidence is given as regards such computation; the results obtained compare well to those obtained using vector-based calculation of term discrimination values. The UDO-based computation, however, presents advantages over the vector-based calculation: It is faster, easier to assess and handle in practice, and its application is not restricted to the vector space model. Based on the ADI test collection, it is shown that the UDO-based Term Discrimination Value (TDV) weighting scheme yields better retrieval effectiveness than using the vector-based TDV weighting scheme. Also, experimental evidence is given to the intuition that the choice of an appropriate weighting scheme and similarity measure depends on collection properties, and thus the UDO approach may be used as a theoretical basis for this intuition.
    Source
    Journal of the American Society for Information Science and Technology. 55(2004) no.7, pp.613-626