Search (11 results, page 1 of 1)

  • theme_ss:"Automatisches Indexieren"
  • type_ss:"a"
  • type_ss:"el"
  1. Gödert, W.: Detecting multiword phrases in mathematical text corpora (2012) 0.00
    0.0037893022 = product of:
      0.026525114 = sum of:
        0.006682779 = weight(_text_:information in 466) [ClassicSimilarity], result of:
          0.006682779 = score(doc=466,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.1551638 = fieldWeight in 466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=466)
        0.019842334 = weight(_text_:retrieval in 466) [ClassicSimilarity], result of:
          0.019842334 = score(doc=466,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.26736724 = fieldWeight in 466, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=466)
      0.14285715 = coord(2/14)
    
    Abstract
    We present an approach for detecting multiword phrases in mathematical text corpora. The method is based on characteristic features of mathematical terminology. It uses a software tool named Lingo, which identifies words by means of previously defined dictionaries for specific word classes such as adjectives, personal names, or nouns. The detection of multiword groups is done algorithmically. Possible advantages of the method for indexing and information retrieval are discussed, along with conclusions for applying dictionary-based methods of automatic indexing instead of stemming procedures.
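The score explanation for record 1 can be retraced by hand. The following is a minimal sketch, assuming the classic TF-IDF form that the explain tree suggests: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final coord factor of matched clauses over total clauses. The idf formula is an assumption that happens to reproduce the listed values; the function names are hypothetical and not part of any library.

```python
import math

MAX_DOCS = 44218          # maxDocs, copied from the explain output
QUERY_NORM = 0.02453417   # queryNorm, copied from the explain output


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed idf form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight


def document_score(term_scores, matched: int, total_clauses: int) -> float:
    """Sum of the term scores, scaled by coord(matched/total_clauses)."""
    return sum(term_scores) * (matched / total_clauses)


# Values for doc 466, read off the explain tree of record 1.
information = term_score(freq=2.0, doc_freq=20772, field_norm=0.0625)
retrieval = term_score(freq=2.0, doc_freq=5836, field_norm=0.0625)
score_466 = document_score([information, retrieval], matched=2, total_clauses=14)

print(f"information={information:.9f} retrieval={retrieval:.9f} total={score_466:.10f}")
```

Under these assumptions the result should agree with the listed 0.0037893022 up to floating-point rounding, since the listing appears to use single precision while Python uses double precision.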
  2. Mongin, L.; Fu, Y.Y.; Mostafa, J.: Open Archives data Service prototype and automated subject indexing using D-Lib archive content as a testbed (2003) 0.00
    0.0037270163 = product of:
      0.026089113 = sum of:
        0.011207362 = weight(_text_:information in 1167) [ClassicSimilarity], result of:
          0.011207362 = score(doc=1167,freq=10.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.2602176 = fieldWeight in 1167, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1167)
        0.014881751 = weight(_text_:retrieval in 1167) [ClassicSimilarity], result of:
          0.014881751 = score(doc=1167,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.20052543 = fieldWeight in 1167, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1167)
      0.14285715 = coord(2/14)
    
    Abstract
    The Indiana University School of Library and Information Science opened a new research laboratory in January 2003: the Indiana University School of Library and Information Science Information Processing Laboratory [IU IP Lab]. The purpose of the new laboratory is to facilitate collaboration between scientists in the department in the areas of information retrieval (IR) and information visualization (IV) research. The lab has several areas of focus. These include grid and cluster computing, and a standard Java-based software platform to support plug-and-play research datasets, a selection of standard IR modules, and standard IV algorithms. Future development includes software to enable researchers to contribute datasets, IR algorithms, and visualization algorithms to the standard environment. We decided early on to use OAI-PMH as a resource discovery tool because it is consistent with our mission.
  3. Mielke, B.: Wider einige gängige Ansichten zur juristischen Informationserschließung (2002) 0.00
    0.003317363 = product of:
      0.023221541 = sum of:
        0.016133383 = weight(_text_:system in 2145) [ClassicSimilarity], result of:
          0.016133383 = score(doc=2145,freq=2.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.20878783 = fieldWeight in 2145, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=2145)
        0.0070881573 = weight(_text_:information in 2145) [ClassicSimilarity], result of:
          0.0070881573 = score(doc=2145,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.16457605 = fieldWeight in 2145, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=2145)
      0.14285715 = coord(2/14)
    
    Abstract
    Starting from an examination of assumptions about legal information indexing that are common in legal informatics, the article describes key results of an empirical study of the retrieval effectiveness of searches in legal databases. The focus is on whether intellectual indexing is necessary on the one hand, and on the effectiveness of so-called keyword searching on the other. The results of the study, which also compared an information system based on a Boolean retrieval model with a system based on statistical methods, suggest that analytically derived assumptions in the legal informatics literature, such as the danger of excessively large result sets in keyword searching, cannot be substantiated empirically. Nor does intellectual indexing (subject heading assignment) prove superior to automatic indexing; on the contrary, with an identical document collection, the use of a statistical method leads to higher recall.
    Source
    Information und Mobilität: Optimierung und Vermeidung von Mobilität durch Information. Proceedings des 8. Internationalen Symposiums für Informationswissenschaft (ISI 2002), 7.-10.10.2002, Regensburg. Hrsg.: Rainer Hammwöhner, Christian Wolff, Christa Womser-Hacker
  4. Pielmeier, S.; Voß, V.; Carstensen, H.; Kahl, B.: Online-Workshop "Computerunterstützte Inhaltserschließung" 2020 (2021) 0.00
    0.003020781 = product of:
      0.021145467 = sum of:
        0.016133383 = weight(_text_:system in 4409) [ClassicSimilarity], result of:
          0.016133383 = score(doc=4409,freq=2.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.20878783 = fieldWeight in 4409, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=4409)
        0.0050120843 = weight(_text_:information in 4409) [ClassicSimilarity], result of:
          0.0050120843 = score(doc=4409,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.116372846 = fieldWeight in 4409, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4409)
      0.14285715 = coord(2/14)
    
    Abstract
    For the first time in digital form, and with 230 participants, the 4th workshop "Computerunterstützte Inhaltserschließung" (computer-assisted subject indexing) took place on 11 and 12 November 2020. It was organized by the Deutsche Nationalbibliothek (DNB), the company Eurospider Information Technology, the Staatsbibliothek zu Berlin - Preußischer Kulturbesitz (SBB), the UB Stuttgart, and the Bibliotheksservice-Zentrum Baden-Württemberg (BSZ). The focus was on the "Digitale Assistent DA-3": eleven presentations covered application scenarios and experiences with the system, which is intended to support libraries and other academic and cultural institutions in subject indexing. Frank Scholze (Director General of the DNB) gave the welcome and the introduction to the two workshop days. He sees the DA-3 as a building block for interlocking intellectual and machine indexing.
  5. Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints 0.00
    0.0023683137 = product of:
      0.016578196 = sum of:
        0.004176737 = weight(_text_:information in 4309) [ClassicSimilarity], result of:
          0.004176737 = score(doc=4309,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.09697737 = fieldWeight in 4309, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4309)
        0.012401459 = weight(_text_:retrieval in 4309) [ClassicSimilarity], result of:
          0.012401459 = score(doc=4309,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.16710453 = fieldWeight in 4309, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4309)
      0.14285715 = coord(2/14)
    
    Abstract
    Semantic annotations have to satisfy quality constraints to be useful for digital libraries, which is particularly challenging on large and diverse datasets. Confidence scores of multi-label classification methods typically refer only to the relevance of particular subjects, disregarding indicators of insufficient content representation at the document-level. Therefore, we propose a novel approach that detects documents rather than concepts where quality criteria are met. Our approach uses a deep, multi-layered regression architecture, which comprises a variety of content-based indicators. We evaluated multiple configurations using text collections from law and economics, where the available content is restricted to very short texts. Notably, we demonstrate that the proposed quality estimation technique can determine subsets of the previously unseen data where considerable gains in document-level recall can be achieved, while upholding precision at the same time. Hence, the approach effectively performs a filtering that ensures high data quality standards in operative information retrieval systems.
  6. Mao, J.; Xu, W.; Yang, Y.; Wang, J.; Yuille, A.L.: Explain images with multimodal recurrent neural networks (2014) 0.00
    0.0015032839 = product of:
      0.021045974 = sum of:
        0.021045974 = weight(_text_:retrieval in 1557) [ClassicSimilarity], result of:
          0.021045974 = score(doc=1557,freq=4.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.2835858 = fieldWeight in 1557, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1557)
      0.071428575 = coord(1/14)
    
    Abstract
    In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel sentence descriptions to explain the content of images. It directly models the probability distribution of generating a word given previous words and the image. Image descriptions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. These two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of our model is validated on three benchmark datasets (IAPR TC-12 [8], Flickr 8K [28], and Flickr 30K [13]). Our model outperforms the state-of-the-art generative method. In addition, the m-RNN model can be applied to retrieval tasks for retrieving images or sentences, and achieves significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval.
  7. Gábor, K.; Zargayouna, H.; Tellier, I.; Buscaldi, D.; Charnois, T.: A typology of semantic relations dedicated to scientific literature analysis (2016) 0.00
    0.0012401459 = product of:
      0.017362041 = sum of:
        0.017362041 = weight(_text_:retrieval in 2933) [ClassicSimilarity], result of:
          0.017362041 = score(doc=2933,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.23394634 = fieldWeight in 2933, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2933)
      0.071428575 = coord(1/14)
    
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  8. Karpathy, A.; Fei-Fei, L.: Deep visual-semantic alignments for generating image descriptions (2015) 0.00
    0.0010629823 = product of:
      0.014881751 = sum of:
        0.014881751 = weight(_text_:retrieval in 1868) [ClassicSimilarity], result of:
          0.014881751 = score(doc=1868,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.20052543 = fieldWeight in 1868, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1868)
      0.071428575 = coord(1/14)
    
    Abstract
    We present a model that generates free-form natural language descriptions of image regions. Our model leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between text and visual data. Our approach is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding. We then describe a Recurrent Neural Network architecture that uses the inferred alignments to learn to generate novel descriptions of image regions. We demonstrate the effectiveness of our alignment model with ranking experiments on Flickr8K, Flickr30K and COCO datasets, where we substantially improve on the state of the art. We then show that the sentences created by our generative model outperform retrieval baselines on the three aforementioned datasets and a new dataset of region-level annotations.
  9. Donahue, J.; Hendricks, L.A.; Guadarrama, S.; Rohrbach, M.; Venugopalan, S.; Saenko, K.; Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description (2014) 0.00
    8.858185E-4 = product of:
      0.012401459 = sum of:
        0.012401459 = weight(_text_:retrieval in 1873) [ClassicSimilarity], result of:
          0.012401459 = score(doc=1873,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.16710453 = fieldWeight in 1873, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1873)
      0.071428575 = coord(1/14)
    
    Abstract
    Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent, or "temporally deep", are effective for tasks involving sequences, visual and otherwise. We develop a novel recurrent convolutional architecture suitable for large-scale visual learning which is end-to-end trainable, and demonstrate the value of these models on benchmark video recognition tasks, image description and retrieval problems, and video narration challenges. In contrast to current models which assume a fixed spatio-temporal receptive field or simple temporal averaging for sequential processing, recurrent convolutional models are "doubly deep" in that they can be compositional in spatial and temporal "layers". Such models may have advantages when target concepts are complex and/or training data are limited. Learning long-term dependencies is possible when nonlinearities are incorporated into the network state updates. Long-term RNN models are appealing in that they can directly map variable-length inputs (e.g., video frames) to variable-length outputs (e.g., natural language text) and can model complex temporal dynamics; yet they can be optimized with backpropagation. Our recurrent long-term models are directly connected to modern visual convnet models and can be jointly trained to simultaneously learn temporal dynamics and convolutional perceptual representations. Our results show such models have distinct advantages over state-of-the-art models for recognition or generation which are separately defined and/or optimized.
  10. Junger, U.; Schwens, U.: Die inhaltliche Erschließung des schriftlichen kulturellen Erbes auf dem Weg in die Zukunft : Automatische Vergabe von Schlagwörtern in der Deutschen Nationalbibliothek (2017) 0.00
    5.9357885E-4 = product of:
      0.008310104 = sum of:
        0.008310104 = product of:
          0.016620208 = sum of:
            0.016620208 = weight(_text_:22 in 3780) [ClassicSimilarity], result of:
              0.016620208 = score(doc=3780,freq=2.0), product of:
                0.085914485 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02453417 = queryNorm
                0.19345059 = fieldWeight in 3780, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3780)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    19. 8.2017 9:24:22
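Record 10 differs from the others in that its explain tree applies two coord factors: an inner coord(1/2) on a nested clause and the outer coord(1/14). A minimal sketch of that nesting follows, under the same assumed classic TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The helper name classic_term_score is hypothetical.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """queryWeight * fieldWeight for one term, using the assumed idf form."""
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


# Values for the term "22" in doc 3780, read off the explain tree of record 10.
inner = classic_term_score(
    freq=2.0, doc_freq=3622, max_docs=44218,
    query_norm=0.02453417, field_norm=0.0390625,
)

nested = inner * (1 / 2)         # inner coord(1/2): one of two nested clauses matched
score_3780 = nested * (1 / 14)   # outer coord(1/14): one of fourteen clauses matched

print(f"inner={inner:.9f} nested={nested:.9f} total={score_3780:.10e}")
```

The only matching term is the token "22", which occurs in the record's Date field (9:24:22), so the match says nothing about the record's subject. This would explain why it ranks near the bottom.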
  11. Beckmann, R.; Hinrichs, I.; Janßen, M.; Milmeister, G.; Schäuble, P.: Der Digitale Assistent DA-3 : Eine Plattform für die Inhaltserschließung (2019) 0.00
    3.5800604E-4 = product of:
      0.0050120843 = sum of:
        0.0050120843 = weight(_text_:information in 5408) [ClassicSimilarity], result of:
          0.0050120843 = score(doc=5408,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.116372846 = fieldWeight in 5408, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5408)
      0.071428575 = coord(1/14)
    
    Abstract
    The "Digitale Assistent" DA-3 is a web-based tool that provides machine support for intellectual verbal and classificatory subject indexing. In spring 2016, its predecessor DA-2, initially designed for use in the IBS|BW consortium, was presented to a wider professional audience. The community received the development with great interest against the background of strategic discussions about future-proof subject indexing methods. The tool is now being developed into a central, high-performance service in a three-year cooperation project between the company Eurospider Information Technology, the IBS|BW consortium, the Staatsbibliothek zu Berlin, and the two network centres VZG and BSZ. The first user libraries in SWB and GBV are already using the DA-3 successfully during this project phase; transfer to routine operation is planned for the end of the project. The article describes the current status and benefits of the project in the context of current conditions, presents the functions of the DA-3 in detail, and offers a brief look behind the scenes at the project partners as well as an outlook on upcoming development steps.