Document (#16322)

Author
Kuikka, E.
Salminen, A.
Title
Two-dimensional filters for structured text
Source
Information processing and management. 33(1997) no.1, p.37-54
Year
1997
Abstract
Introduces a method for defining filters for structured text. The text structure is defined by a grammar consisting of a set of productions. To describe the information interests, a two-dimensional template is first created interactively from the grammar to show the structure of a set of textual elements at a chosen level of detail. The template depicts the hierarchical structure of the elements and also indicates optionality, alternatives and iteration in the structure. The template is filled with constraints and annotations. The constraints allow conditions to be placed on the content of parts, on the position of parts in an ordered set of parts, and on the number of parts satisfying a specified property. In a compound filter, several templates are connected by annotations. The method is intended to be used as a theoretical framework for developing flexible and powerful graphical interfaces for filtering structured text. Describes a prototype implementation.

Similar documents (author)

  1. Salminen, A.: Modeling documents in their context (2009) 6.08
    6.0805845 = sum of:
      6.0805845 = weight(author_txt:salminen in 312) [ClassicSimilarity], result of:
        6.0805845 = fieldWeight in 312, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.728935 = idf(docFreq=6, maxDocs=43254)
          0.625 = fieldNorm(doc=312)
    
  2. Salminen, A.: Markup languages (2009) 6.08
    6.0805845 = sum of:
      6.0805845 = weight(author_txt:salminen in 314) [ClassicSimilarity], result of:
        6.0805845 = fieldWeight in 314, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.728935 = idf(docFreq=6, maxDocs=43254)
          0.625 = fieldNorm(doc=314)
    
  3. Salminen, A.; Kauppinen, K.; Lehtovaara, M.: Towards a methodology for document analysis (1997) 3.65
    3.6483507 = sum of:
      3.6483507 = weight(author_txt:salminen in 3643) [ClassicSimilarity], result of:
        3.6483507 = fieldWeight in 3643, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.728935 = idf(docFreq=6, maxDocs=43254)
          0.375 = fieldNorm(doc=3643)
    
  4. Salminen, A.; Tague-Sutcliffe, J.; McClellan, C.: From text to hypertext by indexing (1995) 3.65
    3.6483507 = sum of:
      3.6483507 = weight(author_txt:salminen in 2932) [ClassicSimilarity], result of:
        3.6483507 = fieldWeight in 2932, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.728935 = idf(docFreq=6, maxDocs=43254)
          0.375 = fieldNorm(doc=2932)
    
  5. Salminen, A.; Jauhiainen, E.; Nurmeksela, R.: A life cycle model of XML documents (2014) 3.65
    3.6483507 = sum of:
      3.6483507 = weight(author_txt:salminen in 3018) [ClassicSimilarity], result of:
        3.6483507 = fieldWeight in 3018, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.728935 = idf(docFreq=6, maxDocs=43254)
          0.375 = fieldNorm(doc=3018)
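
The author scores above follow the classic Lucene TF-IDF pattern, where fieldWeight = tf * idf * fieldNorm, tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The sketch below recomputes two of the listed values from the numbers shown in the explain output. The function names are hypothetical, and the fieldNorm values are copied from the output rather than derived from field lengths.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as reported in the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)."""
    tf = math.sqrt(freq)
    return tf * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the explain trees for author_txt:salminen.
idf_salminen = classic_idf(doc_freq=6, max_docs=43254)
score_312 = field_weight(1.0, 6, 43254, field_norm=0.625)    # single-author record
score_3643 = field_weight(1.0, 6, 43254, field_norm=0.375)   # three-author record

print(round(idf_salminen, 6), round(score_312, 4), round(score_3643, 4))
```

The only difference between the 6.08 and 3.65 results is fieldNorm: records with more author names in the field receive a smaller norm, so the same single match weighs less.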
    

Similar documents (content)

  1. Taniguchi, S.: A system for analyzing cataloguing rules : a feasibility study (1996) 0.10
    0.09874906 = sum of:
      0.09874906 = product of:
        0.6171816 = sum of:
          0.08275469 = weight(abstract_txt:templates in 5267) [ClassicSimilarity], result of:
            0.08275469 = score(doc=5267,freq=1.0), product of:
              0.15871173 = queryWeight, product of:
                1.2758191 = boost
                8.342641 = idf(docFreq=27, maxDocs=43254)
                0.014911329 = queryNorm
              0.52141505 = fieldWeight in 5267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.342641 = idf(docFreq=27, maxDocs=43254)
                0.0625 = fieldNorm(doc=5267)
          0.06717354 = weight(abstract_txt:structure in 5267) [ClassicSimilarity], result of:
            0.06717354 = score(doc=5267,freq=2.0), product of:
              0.17400275 = queryWeight, product of:
                2.6717305 = boost
                4.367643 = idf(docFreq=1490, maxDocs=43254)
                0.014911329 = queryNorm
              0.38604873 = fieldWeight in 5267, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.367643 = idf(docFreq=1490, maxDocs=43254)
                0.0625 = fieldNorm(doc=5267)
          0.12884936 = weight(abstract_txt:parts in 5267) [ClassicSimilarity], result of:
            0.12884936 = score(doc=5267,freq=1.0), product of:
              0.33844554 = queryWeight, product of:
                3.7261386 = boost
                6.0913486 = idf(docFreq=265, maxDocs=43254)
                0.014911329 = queryNorm
              0.3807093 = fieldWeight in 5267, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0913486 = idf(docFreq=265, maxDocs=43254)
                0.0625 = fieldNorm(doc=5267)
          0.338404 = weight(abstract_txt:template in 5267) [ClassicSimilarity], result of:
            0.338404 = score(doc=5267,freq=2.0), product of:
              0.46458808 = queryWeight, product of:
                3.7807612 = boost
                8.240858 = idf(docFreq=30, maxDocs=43254)
                0.014911329 = queryNorm
              0.7283958 = fieldWeight in 5267, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.240858 = idf(docFreq=30, maxDocs=43254)
                0.0625 = fieldNorm(doc=5267)
        0.16 = coord(4/25)
    
  2. Darányi, S.; Wittek, P.: Demonstrating conceptual dynamics in an evolving text collection (2013) 0.10
    0.09863113 = sum of:
      0.09863113 = product of:
        0.41096306 = sum of:
          0.07899479 = weight(abstract_txt:filter in 2602) [ClassicSimilarity], result of:
            0.07899479 = score(doc=2602,freq=2.0), product of:
              0.12212454 = queryWeight, product of:
                1.1191442 = boost
                7.318136 = idf(docFreq=77, maxDocs=43254)
                0.014911329 = queryNorm
              0.64683795 = fieldWeight in 2602, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.318136 = idf(docFreq=77, maxDocs=43254)
                0.0625 = fieldNorm(doc=2602)
          0.026190776 = weight(abstract_txt:method in 2602) [ClassicSimilarity], result of:
            0.026190776 = score(doc=2602,freq=1.0), product of:
              0.092865884 = queryWeight, product of:
                1.3801545 = boost
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.014911329 = queryNorm
              0.28202796 = fieldWeight in 2602, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.0625 = fieldNorm(doc=2602)
          0.04178952 = weight(abstract_txt:elements in 2602) [ClassicSimilarity], result of:
            0.04178952 = score(doc=2602,freq=1.0), product of:
              0.12680475 = queryWeight, product of:
                1.6127512 = boost
                5.2729278 = idf(docFreq=602, maxDocs=43254)
                0.014911329 = queryNorm
              0.32955799 = fieldWeight in 2602, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2729278 = idf(docFreq=602, maxDocs=43254)
                0.0625 = fieldNorm(doc=2602)
          0.1786254 = weight(abstract_txt:dimensional in 2602) [ClassicSimilarity], result of:
            0.1786254 = score(doc=2602,freq=4.0), product of:
              0.21039373 = queryWeight, product of:
                2.0773802 = boost
                6.792043 = idf(docFreq=131, maxDocs=43254)
                0.014911329 = queryNorm
              0.8490054 = fieldWeight in 2602, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.792043 = idf(docFreq=131, maxDocs=43254)
                0.0625 = fieldNorm(doc=2602)
          0.037863668 = weight(abstract_txt:text in 2602) [ClassicSimilarity], result of:
            0.037863668 = score(doc=2602,freq=1.0), product of:
              0.14959455 = queryWeight, product of:
                2.4772651 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.014911329 = queryNorm
              0.25310862 = fieldWeight in 2602, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.0625 = fieldNorm(doc=2602)
          0.047498867 = weight(abstract_txt:structure in 2602) [ClassicSimilarity], result of:
            0.047498867 = score(doc=2602,freq=1.0), product of:
              0.17400275 = queryWeight, product of:
                2.6717305 = boost
                4.367643 = idf(docFreq=1490, maxDocs=43254)
                0.014911329 = queryNorm
              0.27297768 = fieldWeight in 2602, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.367643 = idf(docFreq=1490, maxDocs=43254)
                0.0625 = fieldNorm(doc=2602)
        0.24 = coord(6/25)
    
  3. Crestani, F.; Vegas, J.; Fuente, P. de la: A graphical user interface for the retrieval of hierarchically structured documents (2004) 0.08
    0.07781485 = sum of:
      0.07781485 = product of:
        0.4863428 = sum of:
          0.07262915 = weight(abstract_txt:graphical in 4556) [ClassicSimilarity], result of:
            0.07262915 = score(doc=4556,freq=2.0), product of:
              0.09951104 = queryWeight, product of:
                1.01023 = boost
                6.605941 = idf(docFreq=158, maxDocs=43254)
                0.014911329 = queryNorm
              0.7298602 = fieldWeight in 4556, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.605941 = idf(docFreq=158, maxDocs=43254)
                0.078125 = fieldNorm(doc=4556)
          0.19327836 = weight(abstract_txt:structured in 4556) [ClassicSimilarity], result of:
            0.19327836 = score(doc=4556,freq=5.0), product of:
              0.20307034 = queryWeight, product of:
                2.4995883 = boost
                5.4483085 = idf(docFreq=505, maxDocs=43254)
                0.014911329 = queryNorm
              0.9517803 = fieldWeight in 4556, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.4483085 = idf(docFreq=505, maxDocs=43254)
                0.078125 = fieldNorm(doc=4556)
          0.059373587 = weight(abstract_txt:structure in 4556) [ClassicSimilarity], result of:
            0.059373587 = score(doc=4556,freq=1.0), product of:
              0.17400275 = queryWeight, product of:
                2.6717305 = boost
                4.367643 = idf(docFreq=1490, maxDocs=43254)
                0.014911329 = queryNorm
              0.3412221 = fieldWeight in 4556, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.367643 = idf(docFreq=1490, maxDocs=43254)
                0.078125 = fieldNorm(doc=4556)
          0.1610617 = weight(abstract_txt:parts in 4556) [ClassicSimilarity], result of:
            0.1610617 = score(doc=4556,freq=1.0), product of:
              0.33844554 = queryWeight, product of:
                3.7261386 = boost
                6.0913486 = idf(docFreq=265, maxDocs=43254)
                0.014911329 = queryNorm
              0.4758866 = fieldWeight in 4556, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0913486 = idf(docFreq=265, maxDocs=43254)
                0.078125 = fieldNorm(doc=4556)
        0.16 = coord(4/25)
    
  4. Smith, D.A.; Shadbolt, N.R.: FacetOntology : expressive descriptions of facets in the Semantic Web (2012) 0.08
    0.07779711 = sum of:
      0.07779711 = product of:
        0.3889855 = sum of:
          0.050147027 = weight(abstract_txt:specified in 3673) [ClassicSimilarity], result of:
            0.050147027 = score(doc=3673,freq=1.0), product of:
              0.11365209 = queryWeight, product of:
                1.079626 = boost
                7.0597243 = idf(docFreq=100, maxDocs=43254)
                0.014911329 = queryNorm
              0.44123277 = fieldWeight in 3673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0597243 = idf(docFreq=100, maxDocs=43254)
                0.0625 = fieldNorm(doc=3673)
          0.05585775 = weight(abstract_txt:filter in 3673) [ClassicSimilarity], result of:
            0.05585775 = score(doc=3673,freq=1.0), product of:
              0.12212454 = queryWeight, product of:
                1.1191442 = boost
                7.318136 = idf(docFreq=77, maxDocs=43254)
                0.014911329 = queryNorm
              0.4573835 = fieldWeight in 3673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.318136 = idf(docFreq=77, maxDocs=43254)
                0.0625 = fieldNorm(doc=3673)
          0.026190776 = weight(abstract_txt:method in 3673) [ClassicSimilarity], result of:
            0.026190776 = score(doc=3673,freq=1.0), product of:
              0.092865884 = queryWeight, product of:
                1.3801545 = boost
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.014911329 = queryNorm
              0.28202796 = fieldWeight in 3673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.0625 = fieldNorm(doc=3673)
          0.20929112 = weight(abstract_txt:filters in 3673) [ClassicSimilarity], result of:
            0.20929112 = score(doc=3673,freq=2.0), product of:
              0.29461023 = queryWeight, product of:
                2.4582357 = boost
                8.037259 = idf(docFreq=37, maxDocs=43254)
                0.014911329 = queryNorm
              0.71040004 = fieldWeight in 3673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.037259 = idf(docFreq=37, maxDocs=43254)
                0.0625 = fieldNorm(doc=3673)
          0.047498867 = weight(abstract_txt:structure in 3673) [ClassicSimilarity], result of:
            0.047498867 = score(doc=3673,freq=1.0), product of:
              0.17400275 = queryWeight, product of:
                2.6717305 = boost
                4.367643 = idf(docFreq=1490, maxDocs=43254)
                0.014911329 = queryNorm
              0.27297768 = fieldWeight in 3673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.367643 = idf(docFreq=1490, maxDocs=43254)
                0.0625 = fieldNorm(doc=3673)
        0.2 = coord(5/25)
    
  5. Lawson, M.: Automatic extraction of citations from the text of English-language patents : an example of template mining (1996) 0.07
    0.06875287 = sum of:
      0.06875287 = product of:
        0.5729406 = sum of:
          0.08275469 = weight(abstract_txt:templates in 4655) [ClassicSimilarity], result of:
            0.08275469 = score(doc=4655,freq=1.0), product of:
              0.15871173 = queryWeight, product of:
                1.2758191 = boost
                8.342641 = idf(docFreq=27, maxDocs=43254)
                0.014911329 = queryNorm
              0.52141505 = fieldWeight in 4655, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.342641 = idf(docFreq=27, maxDocs=43254)
                0.0625 = fieldNorm(doc=4655)
          0.075727336 = weight(abstract_txt:text in 4655) [ClassicSimilarity], result of:
            0.075727336 = score(doc=4655,freq=4.0), product of:
              0.14959455 = queryWeight, product of:
                2.4772651 = boost
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.014911329 = queryNorm
              0.50621724 = fieldWeight in 4655, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.049738 = idf(docFreq=2048, maxDocs=43254)
                0.0625 = fieldNorm(doc=4655)
          0.41445857 = weight(abstract_txt:template in 4655) [ClassicSimilarity], result of:
            0.41445857 = score(doc=4655,freq=3.0), product of:
              0.46458808 = queryWeight, product of:
                3.7807612 = boost
                8.240858 = idf(docFreq=30, maxDocs=43254)
                0.014911329 = queryNorm
              0.892099 = fieldWeight in 4655, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.240858 = idf(docFreq=30, maxDocs=43254)
                0.0625 = fieldNorm(doc=4655)
        0.12 = coord(3/25)
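
The content scores above combine several term matches: each term contributes queryWeight * fieldWeight, where queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm, and the sum is then scaled by coord (matched terms / query terms). As a rough check, the sketch below recomputes the first result (doc 5267) from the figures printed in its explain tree. The variable and function names are hypothetical, and boost, queryNorm and fieldNorm are copied from the output rather than derived.

```python
import math

MAX_DOCS = 43254
QUERY_NORM = 0.014911329   # copied from the explain output
FIELD_NORM = 0.0625        # fieldNorm(doc=5267), copied from the explain output
QUERY_TERMS = 25           # denominator of coord(4/25)

# (term, boost, docFreq, termFreq) for the four terms matched in doc 5267.
matched = [
    ("templates", 1.2758191, 27, 1.0),
    ("structure", 2.6717305, 1490, 2.0),
    ("parts", 3.7261386, 265, 1.0),
    ("template", 3.7807612, 30, 2.0),
]


def idf(doc_freq: int) -> float:
    """idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float) -> float:
    """Per-term score = queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


term_scores = {term: term_score(b, df, f) for term, b, df, f in matched}
coord = len(matched) / QUERY_TERMS
total = coord * sum(term_scores.values())

print(round(total, 4))
```

Because Lucene computes in single precision, the recomputed total should agree with the listed 0.09874906 only to within a small rounding difference.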