Search (52 results, page 1 of 3)

  • theme_ss:"Automatisches Indexieren"
  1. Ward, M.L.: ¬The future of the human indexer (1996) 0.12
    0.11610073 = product of:
      0.1741511 = sum of:
        0.055512875 = weight(_text_:reference in 7244) [ClassicSimilarity], result of:
          0.055512875 = score(doc=7244,freq=2.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.2696973 = fieldWeight in 7244, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.046875 = fieldNorm(doc=7244)
        0.11863822 = sum of:
          0.0775097 = weight(_text_:database in 7244) [ClassicSimilarity], result of:
            0.0775097 = score(doc=7244,freq=4.0), product of:
              0.20452234 = queryWeight, product of:
                4.042444 = idf(docFreq=2109, maxDocs=44218)
                0.050593734 = queryNorm
              0.37897915 = fieldWeight in 7244, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.042444 = idf(docFreq=2109, maxDocs=44218)
                0.046875 = fieldNorm(doc=7244)
          0.041128512 = weight(_text_:22 in 7244) [ClassicSimilarity], result of:
            0.041128512 = score(doc=7244,freq=2.0), product of:
              0.17717063 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050593734 = queryNorm
              0.23214069 = fieldWeight in 7244, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=7244)
      0.6666667 = coord(2/3)
    
    Abstract
    Considers the principles of indexing and the intellectual skills involved in order to determine what automatic indexing systems would be required in order to supplant or complement the human indexer. Good indexing requires: considerable prior knowledge of the literature; judgement as to what to index and to what depth; reading skills; abstracting skills; and classification skills. Illustrates these features with a detailed description of the abstracting and indexing processes involved in generating entries for the mechanical engineering database POWERLINK. Briefly assesses the possibility of replacing human indexers with specialist indexing software, with particular reference to the Object Analyzer from the InTEXT automatic indexing system and using the criteria described for human indexers. At present, it is unlikely that the automatic indexer will replace the human indexer, but when more primary texts are available in electronic form, it may be a useful productivity tool for dealing with large quantities of low-grade texts (should they be wanted in the database).
    Date
    9. 2.1997 18:44:22
  2. Li, W.; Wong, K.-F.; Yuan, C.: Toward automatic Chinese temporal information extraction (2001) 0.05
    0.046064835 = product of:
      0.06909725 = sum of:
        0.046260733 = weight(_text_:reference in 6029) [ClassicSimilarity], result of:
          0.046260733 = score(doc=6029,freq=2.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.22474778 = fieldWeight in 6029, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6029)
        0.022836514 = product of:
          0.045673028 = sum of:
            0.045673028 = weight(_text_:database in 6029) [ClassicSimilarity], result of:
              0.045673028 = score(doc=6029,freq=2.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.2233156 = fieldWeight in 6029, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6029)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Over the past few years, temporal information processing and temporal database management have increasingly become hot topics. Nevertheless, only a few researchers have investigated these areas in the Chinese language. This lays down the objective of our research: to exploit Chinese language processing techniques for temporal information extraction and concept reasoning. In this article, we first study the mechanism for expressing time in Chinese. On the basis of the study, we then design a general frame structure for maintaining the extracted temporal concepts and propose a system for extracting time-dependent information from Hong Kong financial news. In the system, temporal knowledge is represented by different types of temporal concepts (TTC) and different temporal relations, including absolute and relative relations, which are used to correlate between action times and reference times. In analyzing a sentence, the algorithm first determines the situation related to the verb. This in turn will identify the type of temporal concept associated with the verb. After that, the relevant temporal information is extracted and the temporal relations are derived. These relations link relevant concept frames together in chronological order, which in turn provide the knowledge to fulfill users' queries, e.g., for question-answering (i.e., Q&A) applications
  3. Salton, G.; Wong, A.: Generation and search of clustered files (1978) 0.02
    0.024358949 = product of:
      0.073076844 = sum of:
        0.073076844 = product of:
          0.14615369 = sum of:
            0.14615369 = weight(_text_:database in 2411) [ClassicSimilarity], result of:
              0.14615369 = score(doc=2411,freq=2.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.7146099 = fieldWeight in 2411, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.125 = fieldNorm(doc=2411)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    ACM transactions on database systems. 3(1978) no.4, S.321-346
  4. Yang, T.-H.; Hsieh, Y.-L.; Liu, S.-H.; Chang, Y.-C.; Hsu, W.-L.: ¬A flexible template generation and matching method with applications for publication reference metadata extraction (2021) 0.02
    0.02180752 = product of:
      0.06542256 = sum of:
        0.06542256 = weight(_text_:reference in 63) [ClassicSimilarity], result of:
          0.06542256 = score(doc=63,freq=4.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.31784135 = fieldWeight in 63, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.0390625 = fieldNorm(doc=63)
      0.33333334 = coord(1/3)
    
    Abstract
    Conventional rule-based approaches use exact template matching to capture linguistic information and necessarily need to enumerate all variations. We propose a novel flexible template generation and matching scheme called the principle-based approach (PBA) based on sequence alignment, and employ it for reference metadata extraction (RME) to demonstrate its effectiveness. The main contributions of this research are threefold. First, we propose an automatic template generation that can capture prominent patterns using the dominating set algorithm. Second, we devise an alignment-based template-matching technique that uses a logistic regression model, which makes it more general and flexible than pure rule-based approaches. Last, we apply PBA to RME on extensive cross-domain corpora and demonstrate its robustness and generality. Experiments reveal that the same set of templates produced by the PBA framework not only delivers consistent performance on various unseen domains, but also surpasses hand-crafted knowledge (templates). We use four independent journal style test sets and one conference style test set in the experiments. When compared to renowned machine learning methods, such as conditional random fields (CRF), as well as recent deep learning methods (i.e., bi-directional long short-term memory with a CRF layer, Bi-LSTM-CRF), PBA has the best performance for all datasets.
  5. Bonzi, S.: Representation of concepts in text : a comparison of within-document frequency, anaphora, and synonymy (1991) 0.02
    0.02158834 = product of:
      0.06476502 = sum of:
        0.06476502 = weight(_text_:reference in 4933) [ClassicSimilarity], result of:
          0.06476502 = score(doc=4933,freq=2.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.31464687 = fieldWeight in 4933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4933)
      0.33333334 = coord(1/3)
    
    Abstract
    Investigates the 3 major ways by which a concept may be represented in text: within-document frequency, anaphoric reference, and synonyms, in order to determine which provides the optimal means of representation. Analyses a sample of 60 abstracts, drawn at random from the abstracting journals of 4 disciplines. Results show that, in general, initial within-document frequency is higher for keyword terms. Additionally, the frequency of keyword terms referenced anaphorically or with intellectually related terms is higher than that of other keyword terms. It appears that initial document length influences both the number and impact of both anaphoric resolutions and intellectually related terms.
  6. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02
    0.01827934 = product of:
      0.05483802 = sum of:
        0.05483802 = product of:
          0.10967604 = sum of:
            0.10967604 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.10967604 = score(doc=402,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  7. Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.02
    0.015994422 = product of:
      0.047983266 = sum of:
        0.047983266 = product of:
          0.09596653 = sum of:
            0.09596653 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
              0.09596653 = score(doc=262,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.5416616 = fieldWeight in 262, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=262)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    20.10.2000 12:22:23
  8. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.02
    0.015994422 = product of:
      0.047983266 = sum of:
        0.047983266 = product of:
          0.09596653 = sum of:
            0.09596653 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.09596653 = score(doc=6265,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Information outlook. 9(2005) no.8, S.22-23
  9. Vlachidis, A.; Tudhope, D.: ¬A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.02
    0.015420245 = product of:
      0.046260733 = sum of:
        0.046260733 = weight(_text_:reference in 2895) [ClassicSimilarity], result of:
          0.046260733 = score(doc=2895,freq=2.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.22474778 = fieldWeight in 2895, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2895)
      0.33333334 = coord(1/3)
    
    Abstract
    The article presents a method for automatic semantic indexing of archaeological grey-literature reports using empirical (rule-based) Information Extraction techniques in combination with domain-specific knowledge organization systems. The semantic annotation system (OPTIMA) performs the tasks of Named Entity Recognition, Relation Extraction, Negation Detection, and Word-Sense Disambiguation using hand-crafted rules and terminological resources for associating contextual abstractions with classes of the standard ontology CIDOC Conceptual Reference Model (CRM) for cultural heritage and its archaeological extension, CRM-EH. Relation Extraction (RE) performance benefits from a syntactic-based definition of RE patterns derived from domain oriented corpus analysis. The evaluation also shows clear benefit in the use of assistive natural language processing (NLP) modules relating to Word-Sense Disambiguation, Negation Detection, and Noun Phrase Validation, together with controlled thesaurus expansion. The semantic indexing results demonstrate the capacity of rule-based Information Extraction techniques to deliver interoperable semantic abstractions (semantic annotations) with respect to the CIDOC CRM and archaeological thesauri. Major contributions include recognition of relevant entities using shallow parsing NLP techniques driven by a complementary use of ontological and terminological domain resources and empirical derivation of context-driven RE rules for the recognition of semantic relationships from phrases of unstructured text.
  10. Bredack, J.: Automatische Extraktion fachterminologischer Mehrwortbegriffe : ein Verfahrensvergleich (2016) 0.02
    0.015420245 = product of:
      0.046260733 = sum of:
        0.046260733 = weight(_text_:reference in 3194) [ClassicSimilarity], result of:
          0.046260733 = score(doc=3194,freq=2.0), product of:
            0.205834 = queryWeight, product of:
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.050593734 = queryNorm
            0.22474778 = fieldWeight in 3194, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.0683694 = idf(docFreq=2055, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3194)
      0.33333334 = coord(1/3)
    
    Abstract
    In this study, two systems were used to automatically extract multi-word terms (MWT) from a document collection with a domain-specific focus (full texts of the ACL Anthology Reference Corpus). The thematic range covered all areas of natural language processing, in particular computational linguistics (CL) as an interdisciplinary science. The aim was to extract MWT that can be used as potential index terms in information retrieval (IR). These terms were meant to point to, or name, concepts, methods, procedures, and algorithms in CL and adjacent fields such as linguistics and computer science.
  11. Hodge, G.M.: Computer-assisted database indexing : the state-of-the-art (1994) 0.02
    0.015224343 = product of:
      0.045673028 = sum of:
        0.045673028 = product of:
          0.091346055 = sum of:
            0.091346055 = weight(_text_:database in 7936) [ClassicSimilarity], result of:
              0.091346055 = score(doc=7936,freq=2.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.4466312 = fieldWeight in 7936, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.078125 = fieldNorm(doc=7936)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
  12. Hersh, W.R.; Hickam, D.H.: ¬A comparison of two methods for indexing and retrieval from a full-text medical database (1992) 0.02
    0.0150713315 = product of:
      0.045213994 = sum of:
        0.045213994 = product of:
          0.09042799 = sum of:
            0.09042799 = weight(_text_:database in 4526) [ClassicSimilarity], result of:
              0.09042799 = score(doc=4526,freq=4.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.44214234 = fieldWeight in 4526, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4526)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Reports results of a study of 2 information retrieval systems on a 2,000-document full-text medical database. The first system, SAPHIRE, features concept-based automatic indexing and statistical retrieval techniques, while the second system, SWORD, features traditional word-based Boolean techniques. Sixteen medical students at Oregon Health Sciences Univ. each performed 10 searches, and their results, recorded in terms of recall and precision, showed nearly equal performance for both systems. SAPHIRE was also compared with a version of SWORD modified to use automatic indexing and ranked retrieval. Using batch input of queries, the latter method performed slightly better.
  13. Lepsky, K.; Siepmann, J.; Zimmermann, A.: Automatische Indexierung für Online-Kataloge : Ergebnisse eines Retrievaltests (1996) 0.02
    0.0150713315 = product of:
      0.045213994 = sum of:
        0.045213994 = product of:
          0.09042799 = sum of:
            0.09042799 = weight(_text_:database in 3251) [ClassicSimilarity], result of:
              0.09042799 = score(doc=3251,freq=4.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.44214234 = fieldWeight in 3251, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3251)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Examines the effectiveness of automated indexing and presents the results of a study of information retrieval from a segment (40,000 items) of the ULB Düsseldorf database. The segment was selected randomly and all the documents included were indexed automatically. The search topics included 50 subject areas ranging from economic growth to alternative energy sources. While there were 876 relevant documents in the database segment across the 50 search topics, the recall ranged from 1 to 244 references, with the average being 17.52 documents per topic. Therefore it seems that, in the immediate future, automatic indexing should be used in combination with intellectual indexing.
  14. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.01
    0.013709504 = product of:
      0.041128512 = sum of:
        0.041128512 = product of:
          0.082257025 = sum of:
            0.082257025 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.082257025 = score(doc=58,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    14. 6.2015 22:12:44
  15. Hauer, M.: Automatische Indexierung (2000) 0.01
    0.013709504 = product of:
      0.041128512 = sum of:
        0.041128512 = product of:
          0.082257025 = sum of:
            0.082257025 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
              0.082257025 = score(doc=5887,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.46428138 = fieldWeight in 5887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5887)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt
  16. Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.01
    0.013709504 = product of:
      0.041128512 = sum of:
        0.041128512 = product of:
          0.082257025 = sum of:
            0.082257025 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
              0.082257025 = score(doc=2051,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.46428138 = fieldWeight in 2051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2051)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    14. 6.2015 22:12:56
  17. Hauer, M.: Tiefenindexierung im Bibliothekskatalog : 17 Jahre intelligentCAPTURE (2019) 0.01
    0.013709504 = product of:
      0.041128512 = sum of:
        0.041128512 = product of:
          0.082257025 = sum of:
            0.082257025 = weight(_text_:22 in 5629) [ClassicSimilarity], result of:
              0.082257025 = score(doc=5629,freq=2.0), product of:
                0.17717063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050593734 = queryNorm
                0.46428138 = fieldWeight in 5629, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5629)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    B.I.T.online. 22(2019) H.2, S.163-166
  18. Roberts, D.; Souter, C.: ¬The automation of controlled vocabulary subject indexing of medical journal articles (2000) 0.01
    0.012918284 = product of:
      0.03875485 = sum of:
        0.03875485 = product of:
          0.0775097 = sum of:
            0.0775097 = weight(_text_:database in 711) [ClassicSimilarity], result of:
              0.0775097 = score(doc=711,freq=4.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.37897915 = fieldWeight in 711, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.046875 = fieldNorm(doc=711)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This article discusses the possibility of the automation of sophisticated subject indexing of medical journal articles. Approaches to subject descriptor assignment in information retrieval research are usually either based upon the manual descriptors in the database or generation of search parameters from the text of the article. The principles of the Medline indexing system are described, followed by a summary of a pilot project, based upon the Amed database. The results suggest that a more extended study, based upon Medline, should encompass various components: extraction of 'concept strings' from titles and abstracts of records, based upon linguistic features characteristic of medical literature; use of the Unified Medical Language System (UMLS) for identification of controlled vocabulary descriptors; and coordination of descriptors, utilising features of the Medline indexing system. The emphasis should be on system manipulation of data, based upon input, available resources and specifically designed rules.
  19. Koryconski, C.; Newell, A.F.: Natural-language processing and automatic indexing (1990) 0.01
    0.012179474 = product of:
      0.036538422 = sum of:
        0.036538422 = product of:
          0.073076844 = sum of:
            0.073076844 = weight(_text_:database in 2313) [ClassicSimilarity], result of:
              0.073076844 = score(doc=2313,freq=2.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.35730496 = fieldWeight in 2313, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2313)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The task of producing satisfactory indexes by automatic means has been tackled on two fronts: by statistical analysis of text and by attempting content analysis of the text in much the same way as a human indexer does. Though statistical techniques have a lot to offer for free-text database systems, neither method has had much success with back-of-the-book indexing. This review examines some problems associated with the application of natural-language processing techniques to book texts. See also the reply by K.P. Jones.
  20. Cunningham, P.; Veale, T.; Conway, A.: Knowledge acquisition for concept indexing in document retrieval (1992) 0.01
    0.012179474 = product of:
      0.036538422 = sum of:
        0.036538422 = product of:
          0.073076844 = sum of:
            0.073076844 = weight(_text_:database in 5083) [ClassicSimilarity], result of:
              0.073076844 = score(doc=5083,freq=2.0), product of:
                0.20452234 = queryWeight, product of:
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.050593734 = queryNorm
                0.35730496 = fieldWeight in 5083, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.042444 = idf(docFreq=2109, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5083)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Describes TWIG, a system for knowledge acquisition from text for use in an intelligent document database system. Documents are scanned into the system and converted into a hypertext thus providing a richer environment for browsing and retrieval. The knowledge acquisition phase is blackboard based with the text analysis expertise partitioned into agents that communicate through the blackboard
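
The score breakdowns in the listing above follow the usual TF-IDF pattern of Lucene's ClassicSimilarity: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and each term weight is queryWeight × fieldWeight. The following Python sketch recomputes two of the listed scores from the values shown. The function and variable names are hypothetical, the grouping of terms and the coord factors are simply copied from the listing rather than derived from the query, and the arithmetic is done in double precision, so it is an approximation of the engine's single-precision result, not its actual implementation.

```python
import math

MAX_DOCS = 44218          # maxDocs, as reported in the listing
QUERY_NORM = 0.050593734  # queryNorm, as reported in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    """Weight of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # e.g. sqrt(4.0) = 2.0
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Result 4 (doc 63): only "reference" matches, so coord is 1/3.
score_doc_63 = term_score(freq=4.0, doc_freq=2055, field_norm=0.0390625) * (1 / 3)

# Result 1 (doc 7244): "reference" plus the nested sum of "database" and "22",
# scaled by coord(2/3) exactly as the listing groups them.
reference = term_score(freq=2.0, doc_freq=2055, field_norm=0.046875)
database = term_score(freq=4.0, doc_freq=2109, field_norm=0.046875)
term_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.046875)
score_doc_7244 = (reference + (database + term_22)) * (2 / 3)

print(f"doc 63:   {score_doc_63:.8f}")
print(f"doc 7244: {score_doc_7244:.8f}")
```

The two values can be compared against the totals listed for results 4 and 1 (0.02180752 and 0.11610073); small differences in the last digits are expected because of the precision difference noted above.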

Languages

  • e 33
  • d 18
  • ru 1

Types

  • a 47
  • x 3
  • el 1
  • m 1
  • s 1