Document (#16321)

Author
Kuikka, E.
Salminen, A.
Title
Two-dimensional filters for structured text
Source
Information processing and management. 33(1997) no.1, pp.37-54
Year
1997
Abstract
Introduces a method for defining filters for structured text. The text structure is defined by a grammar consisting of a set of productions. To describe the information interests, a two-dimensional template is first created interactively from the grammar to show the structure of a set of textual elements at a chosen level of detail. The template depicts the hierarchical structure of the elements and also indicates optionality, alternatives and iteration in the structure. The template is filled in with constraints and annotations. The constraints allow conditions to be placed on the content of parts, on the position of parts in an ordered set of parts, and on the number of parts satisfying a specified property. In a compound filter, several templates are connected by annotations. The method is intended to be used as a theoretical framework for developing flexible and powerful graphical interfaces for filtering structured text. Describes a prototype implementation.

Similar documents (author)

  1. Salminen, A.: Modeling documents in their context (2009) 6.09
    6.094361 = sum of:
      6.094361 = weight(author_txt:salminen in 3847) [ClassicSimilarity], result of:
        6.094361 = fieldWeight in 3847, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.625 = fieldNorm(doc=3847)
    
  2. Salminen, A.: Markup languages (2009) 6.09
    6.094361 = sum of:
      6.094361 = weight(author_txt:salminen in 3849) [ClassicSimilarity], result of:
        6.094361 = fieldWeight in 3849, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.625 = fieldNorm(doc=3849)
    
  3. Salminen, A.; Kauppinen, K.; Lehtovaara, M.: Towards a methodology for document analysis (1997) 3.66
    3.6566167 = sum of:
      3.6566167 = weight(author_txt:salminen in 3643) [ClassicSimilarity], result of:
        3.6566167 = fieldWeight in 3643, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.375 = fieldNorm(doc=3643)
    
  4. Salminen, A.; Tague-Sutcliffe, J.; McClellan, C.: From text to hypertext by indexing (1995) 3.66
    3.6566167 = sum of:
      3.6566167 = weight(author_txt:salminen in 1863) [ClassicSimilarity], result of:
        3.6566167 = fieldWeight in 1863, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.375 = fieldNorm(doc=1863)
    
  5. Salminen, A.; Jauhiainen, E.; Nurmeksela, R.: A life cycle model of XML documents (2014) 3.66
    3.6566167 = sum of:
      3.6566167 = weight(author_txt:salminen in 1553) [ClassicSimilarity], result of:
        3.6566167 = fieldWeight in 1553, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.375 = fieldNorm(doc=1553)
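
The author scores above can be reproduced by hand. The following is a minimal Python sketch, assuming the usual ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and are not part of Lucene. The fieldNorm values are taken from the listing as given, not recomputed.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain trees."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# author_txt:salminen occurs once per record, in 6 of 44218 documents.
single_author = field_weight(1.0, 6, 44218, 0.625)   # records with one author
three_authors = field_weight(1.0, 6, 44218, 0.375)   # records with three authors
print(round(single_author, 4), round(three_authors, 4))
```

The only factor that differs between the two groups is fieldNorm, which is smaller for the longer author fields, so the single-author records rank first.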
    

Similar documents (content)

  1. Darányi, S.; Wittek, P.: Demonstrating conceptual dynamics in an evolving text collection (2013) 0.10
    0.0984166 = sum of:
      0.0984166 = product of:
        0.41006917 = sum of:
          0.077745065 = weight(abstract_txt:filter in 1137) [ClassicSimilarity], result of:
            0.077745065 = score(doc=1137,freq=2.0), product of:
              0.12085455 = queryWeight, product of:
                1.1132202 = boost
                7.2780466 = idf(docFreq=82, maxDocs=44218)
                0.014916505 = queryNorm
              0.6432945 = fieldWeight in 1137, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2780466 = idf(docFreq=82, maxDocs=44218)
                0.0625 = fieldNorm(doc=1137)
          0.026004942 = weight(abstract_txt:method in 1137) [ClassicSimilarity], result of:
            0.026004942 = score(doc=1137,freq=1.0), product of:
              0.092442505 = queryWeight, product of:
                1.3768938 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.014916505 = queryNorm
              0.28130937 = fieldWeight in 1137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=1137)
          0.041522466 = weight(abstract_txt:elements in 1137) [ClassicSimilarity], result of:
            0.041522466 = score(doc=1137,freq=1.0), product of:
              0.12628639 = queryWeight, product of:
                1.6093216 = boost
                5.260737 = idf(docFreq=623, maxDocs=44218)
                0.014916505 = queryNorm
              0.32879606 = fieldWeight in 1137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.260737 = idf(docFreq=623, maxDocs=44218)
                0.0625 = fieldNorm(doc=1137)
          0.17986728 = weight(abstract_txt:dimensional in 1137) [ClassicSimilarity], result of:
            0.17986728 = score(doc=1137,freq=4.0), product of:
              0.2114053 = queryWeight, product of:
                2.0822005 = boost
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.014916505 = queryNorm
              0.85081726 = fieldWeight in 1137, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.0625 = fieldNorm(doc=1137)
          0.037719317 = weight(abstract_txt:text in 1137) [ClassicSimilarity], result of:
            0.037719317 = score(doc=1137,freq=1.0), product of:
              0.1492406 = queryWeight, product of:
                2.4741333 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014916505 = queryNorm
              0.25274166 = fieldWeight in 1137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=1137)
          0.04721009 = weight(abstract_txt:structure in 1137) [ClassicSimilarity], result of:
            0.04721009 = score(doc=1137,freq=1.0), product of:
              0.17332757 = queryWeight, product of:
                2.666327 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.014916505 = queryNorm
              0.27237496 = fieldWeight in 1137, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=1137)
        0.24 = coord(6/25)
    
  2. Taniguchi, S.: A system for analyzing cataloguing rules : a feasibility study (1996) 0.10
    0.098239996 = sum of:
      0.098239996 = product of:
        0.61399996 = sum of:
          0.081408694 = weight(abstract_txt:templates in 4198) [ClassicSimilarity], result of:
            0.081408694 = score(doc=4198,freq=1.0), product of:
              0.15701397 = queryWeight, product of:
                1.2688748 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.014916505 = queryNorm
              0.5184806 = fieldWeight in 4198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.0625 = fieldNorm(doc=4198)
          0.066765144 = weight(abstract_txt:structure in 4198) [ClassicSimilarity], result of:
            0.066765144 = score(doc=4198,freq=2.0), product of:
              0.17332757 = queryWeight, product of:
                2.666327 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.014916505 = queryNorm
              0.38519636 = fieldWeight in 4198, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=4198)
          0.12843677 = weight(abstract_txt:parts in 4198) [ClassicSimilarity], result of:
            0.12843677 = score(doc=4198,freq=1.0), product of:
              0.33778265 = queryWeight, product of:
                3.7221878 = boost
                6.0837593 = idf(docFreq=273, maxDocs=44218)
                0.014916505 = queryNorm
              0.38023496 = fieldWeight in 4198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0837593 = idf(docFreq=273, maxDocs=44218)
                0.0625 = fieldNorm(doc=4198)
          0.33738935 = weight(abstract_txt:template in 4198) [ClassicSimilarity], result of:
            0.33738935 = score(doc=4198,freq=2.0), product of:
              0.4637413 = queryWeight, product of:
                3.7770097 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.014916505 = queryNorm
              0.7275379 = fieldWeight in 4198, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.0625 = fieldNorm(doc=4198)
        0.16 = coord(4/25)
    
  3. Shuldberg, H.K.; Macpherson, M.; Humphrey, P.: Distilling information from text : the EDS TemplateFiller system (1993) 0.08
    0.08263418 = sum of:
      0.08263418 = product of:
        0.51646364 = sum of:
          0.04981092 = weight(abstract_txt:filtering in 5642) [ClassicSimilarity], result of:
            0.04981092 = score(doc=5642,freq=1.0), product of:
              0.097521596 = queryWeight, product of:
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.014916505 = queryNorm
              0.5107681 = fieldWeight in 5642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.078125 = fieldNorm(doc=5642)
          0.101760864 = weight(abstract_txt:templates in 5642) [ClassicSimilarity], result of:
            0.101760864 = score(doc=5642,freq=1.0), product of:
              0.15701397 = queryWeight, product of:
                1.2688748 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.014916505 = queryNorm
              0.64810073 = fieldWeight in 5642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.078125 = fieldNorm(doc=5642)
          0.06667896 = weight(abstract_txt:text in 5642) [ClassicSimilarity], result of:
            0.06667896 = score(doc=5642,freq=2.0), product of:
              0.1492406 = queryWeight, product of:
                2.4741333 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014916505 = queryNorm
              0.44678837 = fieldWeight in 5642, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=5642)
          0.2982129 = weight(abstract_txt:template in 5642) [ClassicSimilarity], result of:
            0.2982129 = score(doc=5642,freq=1.0), product of:
              0.4637413 = queryWeight, product of:
                3.7770097 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.014916505 = queryNorm
              0.6430587 = fieldWeight in 5642, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.078125 = fieldNorm(doc=5642)
        0.16 = coord(4/25)
    
  4. Smith, D.A.; Shadbolt, N.R.: FacetOntology : expressive descriptions of facets in the Semantic Web (2012) 0.08
    0.07782717 = sum of:
      0.07782717 = product of:
        0.38913584 = sum of:
          0.049816463 = weight(abstract_txt:specified in 2208) [ClassicSimilarity], result of:
            0.049816463 = score(doc=2208,freq=1.0), product of:
              0.11317218 = queryWeight, product of:
                1.0772573 = boost
                7.042927 = idf(docFreq=104, maxDocs=44218)
                0.014916505 = queryNorm
              0.44018292 = fieldWeight in 2208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.042927 = idf(docFreq=104, maxDocs=44218)
                0.0625 = fieldNorm(doc=2208)
          0.054974064 = weight(abstract_txt:filter in 2208) [ClassicSimilarity], result of:
            0.054974064 = score(doc=2208,freq=1.0), product of:
              0.12085455 = queryWeight, product of:
                1.1132202 = boost
                7.2780466 = idf(docFreq=82, maxDocs=44218)
                0.014916505 = queryNorm
              0.4548779 = fieldWeight in 2208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2780466 = idf(docFreq=82, maxDocs=44218)
                0.0625 = fieldNorm(doc=2208)
          0.026004942 = weight(abstract_txt:method in 2208) [ClassicSimilarity], result of:
            0.026004942 = score(doc=2208,freq=1.0), product of:
              0.092442505 = queryWeight, product of:
                1.3768938 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.014916505 = queryNorm
              0.28130937 = fieldWeight in 2208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=2208)
          0.21113029 = weight(abstract_txt:filters in 2208) [ClassicSimilarity], result of:
            0.21113029 = score(doc=2208,freq=2.0), product of:
              0.2963863 = queryWeight, product of:
                2.4654355 = boost
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.014916505 = queryNorm
              0.71234834 = fieldWeight in 2208, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.059301 = idf(docFreq=37, maxDocs=44218)
                0.0625 = fieldNorm(doc=2208)
          0.04721009 = weight(abstract_txt:structure in 2208) [ClassicSimilarity], result of:
            0.04721009 = score(doc=2208,freq=1.0), product of:
              0.17332757 = queryWeight, product of:
                2.666327 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.014916505 = queryNorm
              0.27237496 = fieldWeight in 2208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=2208)
        0.2 = coord(5/25)
    
  5. Crestani, F.; Vegas, J.; Fuente, P. de la: A graphical user interface for the retrieval of hierarchically structured documents (2004) 0.08
    0.07769201 = sum of:
      0.07769201 = product of:
        0.48557508 = sum of:
          0.07339771 = weight(abstract_txt:graphical in 2555) [ClassicSimilarity], result of:
            0.07339771 = score(doc=2555,freq=2.0), product of:
              0.10022963 = queryWeight, product of:
                1.0137892 = boost
                6.627983 = idf(docFreq=158, maxDocs=44218)
                0.014916505 = queryNorm
              0.7322956 = fieldWeight in 2555, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.627983 = idf(docFreq=158, maxDocs=44218)
                0.078125 = fieldNorm(doc=2555)
          0.19261879 = weight(abstract_txt:structured in 2555) [ClassicSimilarity], result of:
            0.19261879 = score(doc=2555,freq=5.0), product of:
              0.2026441 = queryWeight, product of:
                2.4967623 = boost
                5.4411373 = idf(docFreq=520, maxDocs=44218)
                0.014916505 = queryNorm
              0.95052755 = fieldWeight in 2555, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.4411373 = idf(docFreq=520, maxDocs=44218)
                0.078125 = fieldNorm(doc=2555)
          0.05901261 = weight(abstract_txt:structure in 2555) [ClassicSimilarity], result of:
            0.05901261 = score(doc=2555,freq=1.0), product of:
              0.17332757 = queryWeight, product of:
                2.666327 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.014916505 = queryNorm
              0.3404687 = fieldWeight in 2555, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.078125 = fieldNorm(doc=2555)
          0.16054596 = weight(abstract_txt:parts in 2555) [ClassicSimilarity], result of:
            0.16054596 = score(doc=2555,freq=1.0), product of:
              0.33778265 = queryWeight, product of:
                3.7221878 = boost
                6.0837593 = idf(docFreq=273, maxDocs=44218)
                0.014916505 = queryNorm
              0.4752937 = fieldWeight in 2555, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0837593 = idf(docFreq=273, maxDocs=44218)
                0.078125 = fieldNorm(doc=2555)
        0.16 = coord(4/25)
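
The content scores combine per-term scores with a coordination factor. The sketch below recomputes the score of the second content match (doc=4198) from the figures in its tree, assuming the structure shown there: each term score is queryWeight (boost * idf * queryNorm) times fieldWeight (tf * idf * fieldNorm), and the sum is multiplied by coord (matched terms / query terms). The helper names are illustrative only; boosts, queryNorm and fieldNorm are copied from the listing.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.014916505   # queryNorm, as listed
FIELD_NORM = 0.0625        # fieldNorm(doc=4198), as listed


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


# (term, boost, docFreq, termFreq) copied from the tree for doc=4198.
matched_terms = [
    ("templates", 1.2688748, 29, 1.0),
    ("structure", 2.666327, 1538, 2.0),
    ("parts", 3.7221878, 273, 1.0),
    ("template", 3.7770097, 31, 2.0),
]

term_sum = sum(term_score(boost, df, freq) for _, boost, df, freq in matched_terms)
coord = len(matched_terms) / 25   # 4 of the 25 query terms matched
score = term_sum * coord
print(round(score, 4))
```

Because coord scales with the number of matched query terms, a record matching six terms weakly (entry 1, coord 6/25) can edge out one matching four terms strongly (entry 2, coord 4/25).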