Search (4 results, page 1 of 1)

  • × theme_ss:"Automatisches Indexieren"
  • × theme_ss:"Metadaten"
  • × type_ss:"a"
  1. Wolfekuhler, M.R.; Punch, W.F.: Finding salient features for personal Web pages categories (1997) 0.02
    0.01796158 = product of:
      0.03592316 = sum of:
        0.012015978 = weight(_text_:information in 2673) [ClassicSimilarity], result of:
          0.012015978 = score(doc=2673,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.13576832 = fieldWeight in 2673, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2673)
        0.023907183 = product of:
          0.047814365 = sum of:
            0.047814365 = weight(_text_:22 in 2673) [ClassicSimilarity], result of:
              0.047814365 = score(doc=2673,freq=2.0), product of:
                0.17654699 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050415643 = queryNorm
                0.2708308 = fieldWeight in 2673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2673)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Examines techniques that discover features in sets of pre-categorized documents, such that similar documents can be found on the WWW. Examines techniques which will classifiy training examples with high accuracy, then explains why this is not necessarily useful. Describes a method for extracting word clusters from the raw document features. Results show that the clustering technique is successful in discovering word groups in personal Web pages which can be used to find similar information on the WWW
    Date
    1. 8.1996 22:08:06
  2. Husevag, A.-S.R.: Named entities in indexing : a case study of TV subtitles and metadata records (2016) 0.00
    0.004425438 = product of:
      0.017701752 = sum of:
        0.017701752 = product of:
          0.035403505 = sum of:
            0.035403505 = weight(_text_:organization in 3105) [ClassicSimilarity], result of:
              0.035403505 = score(doc=3105,freq=2.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.19695997 = fieldWeight in 3105, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3105)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Proceedings of the 15th European Networked Knowledge Organization Systems Workshop (NKOS 2016) co-located with the 20th International Conference on Theory and Practice of Digital Libraries 2016 (TPDL 2016), Hannover, Germany, September 9, 2016. Edi. by Philipp Mayr et al. [http://ceur-ws.org/Vol-1676/=urn:nbn:de:0074-1676-5]
  3. Yang, T.-H.; Hsieh, Y.-L.; Liu, S.-H.; Chang, Y.-C.; Hsu, W.-L.: ¬A flexible template generation and matching method with applications for publication reference metadata extraction (2021) 0.00
    0.0030344925 = product of:
      0.01213797 = sum of:
        0.01213797 = weight(_text_:information in 63) [ClassicSimilarity], result of:
          0.01213797 = score(doc=63,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.13714671 = fieldWeight in 63, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=63)
      0.25 = coord(1/4)
    
    Abstract
    Conventional rule-based approaches use exact template matching to capture linguistic information and necessarily need to enumerate all variations. We propose a novel flexible template generation and matching scheme called the principle-based approach (PBA) based on sequence alignment, and employ it for reference metadata extraction (RME) to demonstrate its effectiveness. The main contributions of this research are threefold. First, we propose an automatic template generation that can capture prominent patterns using the dominating set algorithm. Second, we devise an alignment-based template-matching technique that uses a logistic regression model, which makes it more general and flexible than pure rule-based approaches. Last, we apply PBA to RME on extensive cross-domain corpora and demonstrate its robustness and generality. Experiments reveal that the same set of templates produced by the PBA framework not only deliver consistent performance on various unseen domains, but also surpass hand-crafted knowledge (templates). We use four independent journal style test sets and one conference style test set in the experiments. When compared to renowned machine learning methods, such as conditional random fields (CRF), as well as recent deep learning methods (i.e., bi-directional long short-term memory with a CRF layer, Bi-LSTM-CRF), PBA has the best performance for all datasets.
    Source
    Journal of the Association for Information Science and Technology. 72(2021) no.1, S.32-45
  4. Strobel, S.; Marín-Arraiza, P.: Metadata for scientific audiovisual media : current practices and perspectives of the TIB / AV-portal (2015) 0.00
    0.0021457102 = product of:
      0.008582841 = sum of:
        0.008582841 = weight(_text_:information in 3667) [ClassicSimilarity], result of:
          0.008582841 = score(doc=3667,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.09697737 = fieldWeight in 3667, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3667)
      0.25 = coord(1/4)
    
    Series
    Communications in computer and information science; 544