Search (43 results, page 1 of 3)

  • theme_ss:"Automatisches Indexieren"
  1. Dolamic, L.; Savoy, J.: Indexing and searching strategies for the Russian language (2009) 0.05
    0.047464628 = product of:
      0.094929256 = sum of:
        0.094929256 = product of:
          0.18985851 = sum of:
            0.18985851 = weight(_text_:light in 3301) [ClassicSimilarity], result of:
              0.18985851 = score(doc=3301,freq=6.0), product of:
                0.34357315 = queryWeight, product of:
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.059490006 = queryNorm
                0.55259997 = fieldWeight in 3301, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3301)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper describes and evaluates various stemming and indexing strategies for the Russian language. We design and evaluate two stemming approaches, a light and a more aggressive one, and compare these stemmers to the Snowball stemmer, to no stemming, and also to a language-independent approach (n-gram). To evaluate the suggested stemming strategies, we apply various probabilistic information retrieval (IR) models, including the Okapi, the Divergence from Randomness (DFR), a statistical language model (LM), as well as two vector-space approaches, namely, the classical tf idf scheme and the dtu-dtn model. We find that the vector-space dtu-dtn and the DFR models tend to result in better retrieval effectiveness than the Okapi, LM, or tf idf models, while only the latter two IR approaches result in statistically significant performance differences. Ignoring stemming generally reduces the MAP by more than 50%, and these differences are always significant. With an n-gram approach, performance is usually lower than with an approach involving stemming. Finally, our light stemmer tends to perform best, although performance differences between the light, aggressive, and Snowball stemmers are not statistically significant.
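The nested score breakdowns shown under each hit are Lucene "explain" output for the classic TF-IDF similarity: queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, and their product is then scaled by the coord factors. A minimal sketch reproducing the arithmetic for hit 1 (values copied from the tree above; the formulas are Lucene's ClassicSimilarity, the variable names are ours):

```python
import math

# Values copied from the explain tree for hit 1 (term "light" in doc 3301).
freq = 6.0            # termFreq
doc_freq = 372        # documents containing "light"
max_docs = 44218      # documents in the index
query_norm = 0.059490006
field_norm = 0.0390625

tf = math.sqrt(freq)                               # 2.4494898
idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # 5.7753086
query_weight = idf * query_norm                    # 0.34357315
field_weight = tf * idf * field_norm               # 0.55259997
raw = query_weight * field_weight                  # 0.18985851

# Two nested coord(1/2) factors: at each level only 1 of 2 query clauses matched.
score = raw * 0.5 * 0.5
print(score)  # ~0.047464628, matching the displayed score up to rounding
```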
  2. Chou, C.; Chu, T.: An analysis of BERT (NLP) for assisted subject indexing for Project Gutenberg (2022) 0.04
    0.038365204 = product of:
      0.07673041 = sum of:
        0.07673041 = product of:
          0.15346082 = sum of:
            0.15346082 = weight(_text_:light in 1139) [ClassicSimilarity], result of:
              0.15346082 = score(doc=1139,freq=2.0), product of:
                0.34357315 = queryWeight, product of:
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.059490006 = queryNorm
                0.44666123 = fieldWeight in 1139, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1139)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In light of AI (artificial intelligence) and NLP (natural language processing) technologies, this article examines the feasibility of using AI/NLP models to enhance the subject indexing of digital resources. While BERT (Bidirectional Encoder Representations from Transformers) models are widely used in scholarly communities, the authors assess whether BERT models can be used for machine-assisted indexing of the Project Gutenberg collection, by suggesting Library of Congress subject headings filtered by certain Library of Congress Classification subclass labels. The findings of this study are informative for further research on BERT models to assist with automatic subject indexing for digital library collections.
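The abstract does not spell out the authors' pipeline. As one hedged illustration of machine-assisted subject suggestion, a zero-shot classifier scoring a pre-filtered list of subject headings might look like the following; the model choice and the candidate headings are assumptions for illustration, not taken from the paper:

```python
from transformers import pipeline

# Hypothetical candidate pool: LCSH terms pre-filtered to a few LCC subclasses,
# as the abstract describes; these example headings are illustrative only.
candidate_headings = [
    "Science fiction",
    "Detective and mystery stories",
    "Natural history",
    "Sea stories",
]

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")  # assumed model choice

text = "An excerpt or description of a Project Gutenberg title goes here."
result = classifier(text, candidate_labels=candidate_headings, multi_label=True)

# Suggest headings above a (tunable) confidence threshold for human review.
suggestions = [lbl for lbl, s in zip(result["labels"], result["scores"]) if s > 0.5]
print(suggestions)
```

In practice the candidate pool would come from LCSH filtered by LCC subclass, as the abstract describes, and the suggestions would go to a human indexer rather than directly into records.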
  3. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.03
    0.032240298 = product of:
      0.064480595 = sum of:
        0.064480595 = product of:
          0.12896119 = sum of:
            0.12896119 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.12896119 = score(doc=402,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  4. Fuhr, N.; Niewelt, B.: Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.03
    0.02821026 = product of:
      0.05642052 = sum of:
        0.05642052 = product of:
          0.11284104 = sum of:
            0.11284104 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
              0.11284104 = score(doc=262,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.5416616 = fieldWeight in 262, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=262)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20.10.2000 12:22:23
  5. Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.03
    0.02821026 = product of:
      0.05642052 = sum of:
        0.05642052 = product of:
          0.11284104 = sum of:
            0.11284104 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
              0.11284104 = score(doc=6265,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.5416616 = fieldWeight in 6265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6265)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information outlook. 9(2005) no.8, S.22-23
  6. Kanan, T.; Fox, E.A.: Automated Arabic text classification with P-Stemmer, machine learning, and a tailored news article taxonomy (2016) 0.03
    0.027403714 = product of:
      0.05480743 = sum of:
        0.05480743 = product of:
          0.10961486 = sum of:
            0.10961486 = weight(_text_:light in 3151) [ClassicSimilarity], result of:
              0.10961486 = score(doc=3151,freq=2.0), product of:
                0.34357315 = queryWeight, product of:
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.059490006 = queryNorm
                0.31904373 = fieldWeight in 3151, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3151)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Arabic news articles in electronic collections are difficult to study. Browsing by category is rarely supported. Although helpful machine-learning methods have been applied successfully to similar situations for English news articles, limited research has been completed to yield suitable solutions for Arabic news. In connection with a Qatar National Research Fund (QNRF)-funded project to build digital library community and infrastructure in Qatar, we developed software for browsing a collection of about 237,000 Arabic news articles, which should be applicable to other Arabic news collections. We designed a simple taxonomy for Arabic news stories that is suitable for the needs of Qatar and other nations, is compatible with the subject codes of the International Press Telecommunications Council, and was enhanced with the aid of a librarian expert as well as five Arabic-speaking volunteers. We developed tailored stemming (i.e., a new Arabic light stemmer called P-Stemmer) and automatic classification methods (the best being binary Support Vector Machines classifiers) to work with the taxonomy. Using evaluation techniques commonly used in the information retrieval community, including 10-fold cross-validation and the Wilcoxon signed-rank test, we showed that our approach to stemming and classification is superior to state-of-the-art techniques.
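The evaluation setup described (binary SVM classifiers over stemmed text, 10-fold cross-validation) maps directly onto standard tooling. A minimal scikit-learn sketch with toy stand-in data, assuming the texts have already been stemmed (the paper's P-Stemmer is not reimplemented here):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-ins for stemmed Arabic news articles and taxonomy labels.
docs = ["doc one text", "doc two text", "doc three text", "doc four text"] * 25
labels = ["sports", "politics", "sports", "economy"] * 25

# LinearSVC trains one binary SVM per class (one-vs-rest), in the spirit of
# the paper's best-performing binary SVM classifiers.
clf = make_pipeline(TfidfVectorizer(), LinearSVC())

# 10-fold cross-validation, as in the paper's evaluation setup.
scores = cross_val_score(clf, docs, labels, cv=10)
print(scores.mean())
```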
  7. Martins, E.F.; Belém, F.M.; Almeida, J.M.; Gonçalves, M.A.: On cold start for associative tag recommendation (2016) 0.03
    0.026800727 = product of:
      0.053601455 = sum of:
        0.053601455 = product of:
          0.16080436 = sum of:
            0.16080436 = weight(_text_:objects in 2494) [ClassicSimilarity], result of:
              0.16080436 = score(doc=2494,freq=6.0), product of:
                0.3161936 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.059490006 = queryNorm
                0.508563 = fieldWeight in 2494, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2494)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Tag recommendation strategies that exploit term co-occurrence patterns with tags previously assigned to the target object have consistently produced state-of-the-art results. However, such techniques work only for objects with previously assigned tags. Here we focus on tag recommendation for objects with no tags, a variation of the well-known "cold start" problem. We start by evaluating state-of-the-art co-occurrence-based methods in cold start. Our results show that the effectiveness of these methods suffers in this situation. Moreover, we show that employing various automatic filtering strategies to generate an initial tag set that enables the use of co-occurrence patterns produces only marginal improvements. We then propose a new approach that exploits both positive and negative user feedback to iteratively select input tags along with a genetic programming strategy to learn the recommendation function. Our experimental results indicate that extending the methods to include user relevance feedback leads to gains in precision of up to 58% over the best baseline in cold start scenarios and gains of up to 43% over the best baseline in objects that contain some initial tags (i.e., no cold start). We also show that our best relevance-feedback-driven strategy performs well even in scenarios that lack user cooperation (i.e., users may refuse to provide feedback) and user reliability (i.e., users may provide the wrong feedback).
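A co-occurrence-based recommender of the kind the paper takes as its starting point can be sketched in a few lines; note that it returns nothing for an untagged object, which is exactly the cold-start failure the authors address. Data and scoring here are illustrative only:

```python
from collections import Counter
from itertools import combinations

# Toy training data: objects that already carry tags.
tagged_objects = [
    {"python", "programming", "tutorial"},
    {"python", "data", "pandas"},
    {"programming", "tutorial", "beginner"},
]

# Count how often each ordered pair of tags co-occurs on the same object.
cooc = Counter()
for tags in tagged_objects:
    for a, b in combinations(sorted(tags), 2):
        cooc[(a, b)] += 1
        cooc[(b, a)] += 1

def recommend(initial_tags, k=3):
    """Rank candidate tags by total co-occurrence with the tags already present."""
    scores = Counter()
    for t in initial_tags:
        for (a, b), n in cooc.items():
            if a == t and b not in initial_tags:
                scores[b] += n
    return [tag for tag, _ in scores.most_common(k)]

print(recommend({"python"}))  # e.g. ['programming', 'tutorial', ...]
print(recommend(set()))       # [] - the cold-start case: no input tags, no signal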
  8. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.02
    0.024180222 = product of:
      0.048360445 = sum of:
        0.048360445 = product of:
          0.09672089 = sum of:
            0.09672089 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.09672089 = score(doc=58,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    14. 6.2015 22:12:44
  9. Hauer, M.: Automatische Indexierung (2000) 0.02
    0.024180222 = product of:
      0.048360445 = sum of:
        0.048360445 = product of:
          0.09672089 = sum of:
            0.09672089 = weight(_text_:22 in 5887) [ClassicSimilarity], result of:
              0.09672089 = score(doc=5887,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.46428138 = fieldWeight in 5887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5887)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Wissen in Aktion: Wege des Knowledge Managements. 22. Online-Tagung der DGI, Frankfurt am Main, 2.-4.5.2000. Proceedings. Hrsg.: R. Schmidt
  10. Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.02
    0.024180222 = product of:
      0.048360445 = sum of:
        0.048360445 = product of:
          0.09672089 = sum of:
            0.09672089 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
              0.09672089 = score(doc=2051,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.46428138 = fieldWeight in 2051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    14. 6.2015 22:12:56
  11. Hauer, M.: Tiefenindexierung im Bibliothekskatalog : 17 Jahre intelligentCAPTURE (2019) 0.02
    0.024180222 = product of:
      0.048360445 = sum of:
        0.048360445 = product of:
          0.09672089 = sum of:
            0.09672089 = weight(_text_:22 in 5629) [ClassicSimilarity], result of:
              0.09672089 = score(doc=5629,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.46428138 = fieldWeight in 5629, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5629)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    B.I.T.online. 22(2019) H.2, S.163-166
  12. Search Engines and Beyond : Developing efficient knowledge management systems, April 19-20 1999, Boston, Mass (1999) 0.02
    0.021922972 = product of:
      0.043845944 = sum of:
        0.043845944 = product of:
          0.08769189 = sum of:
            0.08769189 = weight(_text_:light in 2596) [ClassicSimilarity], result of:
              0.08769189 = score(doc=2596,freq=2.0), product of:
                0.34357315 = queryWeight, product of:
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.059490006 = queryNorm
                0.255235 = fieldWeight in 2596, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.7753086 = idf(docFreq=372, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2596)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Content
    Ramana Rao (Inxight, Palo Alto, CA): 7 ± 2 Insights on achieving Effective Information Access
    Session One: Updates and a twelve month perspective
      Danny Sullivan (Search Engine Watch, US / England): Portalization and other search trends
      Carol Tenopir (University of Tennessee): Search realities faced by end users and professional searchers
    Session Two: Today's search engines and beyond
      Daniel Hoogterp (Retrieval Technologies, McLean, VA): Effective presentation and utilization of search techniques
      Rick Kenny (Fulcrum Technologies, Ontario, Canada): Beyond document clustering: The knowledge impact statement
      Gary Stock (Ingenius, Kalamazoo, MI): Automated change monitoring
      Gary Culliss (Direct Hit, Wellesley Hills, MA): User popularity ranked search engines
      Byron Dom (IBM, CA): Automatically finding the best pages on the World Wide Web (CLEVER)
      Peter Tomassi (LookSmart, San Francisco, CA): Adding human intellect to search technology
    Session Three: Panel discussion: Human v automated categorization and editing
      Ev Brenner (New York, NY), Chairman; James Callan (University of Massachusetts, MA); Marc Krellenstein (Northern Light Technology, Cambridge, MA); Dan Miller (Ask Jeeves, Berkeley, CA)
    Session Four: Updates and a twelve month perspective
      Steve Arnold (AIT, Harrods Creek, KY): Review: The leading edge in search and retrieval software
      Ellen Voorhees (NIST, Gaithersburg, MD): TREC update
    Session Five: Search engines now and beyond
      Intelligent agents: John Snyder (Muscat, Cambridge, England): Practical issues behind intelligent agents
      Text summarization: Therese Firmin (Dept of Defense, Ft George G. Meade, MD): The TIPSTER/SUMMAC evaluation of automatic text summarization systems
      Cross-language searching: Elizabeth Liddy (TextWise, Syracuse, NY): A conceptual interlingua approach to cross-language retrieval
      Video search and retrieval: Arnon Amir (IBM, Almaden, CA): CueVideo: Modular system for automatic indexing and browsing of video/audio
      Speech recognition: Michael Witbrock (Lycos, Waltham, MA): Retrieval of spoken documents
      Visualization: James A. Wise (Integral Visuals, Richland, WA): Information visualization in the new millennium: Emerging science or passing fashion?
      Text mining: David Evans (Claritech, Pittsburgh, PA): Text mining - towards decision support
  13. Biebricher, N.; Fuhr, N.; Lustig, G.; Schwantner, M.; Knorz, G.: The automatic indexing system AIR/PHYS : from research to application (1988) 0.02
    0.020150186 = product of:
      0.040300373 = sum of:
        0.040300373 = product of:
          0.080600746 = sum of:
            0.080600746 = weight(_text_:22 in 1952) [ClassicSimilarity], result of:
              0.080600746 = score(doc=1952,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.38690117 = fieldWeight in 1952, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1952)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    16. 8.1998 12:51:22
  14. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.02
    0.020150186 = product of:
      0.040300373 = sum of:
        0.040300373 = product of:
          0.080600746 = sum of:
            0.080600746 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.080600746 = score(doc=4157,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
  15. Tsareva, P.V.: Algoritmy dlya raspoznavaniya pozitivnykh i negativnykh vkhozdenii deskriptorov v tekst i protsedura avtomaticheskoi klassifikatsii tekstov (1999) 0.02
    0.020150186 = product of:
      0.040300373 = sum of:
        0.040300373 = product of:
          0.080600746 = sum of:
            0.080600746 = weight(_text_:22 in 374) [ClassicSimilarity], result of:
              0.080600746 = score(doc=374,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.38690117 = fieldWeight in 374, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=374)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 4.2002 10:22:41
  16. Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.02
    0.020150186 = product of:
      0.040300373 = sum of:
        0.040300373 = product of:
          0.080600746 = sum of:
            0.080600746 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
              0.080600746 = score(doc=2759,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.38690117 = fieldWeight in 2759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2759)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 2.2016 18:25:22
  17. Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D.: A picture is worth a thousand (coherent) words : building a natural description of images (2014) 0.02
    0.01876051 = product of:
      0.03752102 = sum of:
        0.03752102 = product of:
          0.11256306 = sum of:
            0.11256306 = weight(_text_:objects in 1874) [ClassicSimilarity], result of:
              0.11256306 = score(doc=1874,freq=6.0), product of:
                0.3161936 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.059490006 = queryNorm
                0.3559941 = fieldWeight in 1874, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1874)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Content
    "People can summarize a complex scene in a few words without thinking twice. It's much more difficult for computers. But we've just gotten a bit closer -- we've developed a machine-learning system that can automatically produce captions (like the three above) to accurately describe images the first time it sees them. This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images. Recent research has greatly improved object detection, classification, and labeling. But accurately describing a complex scene requires a deeper representation of what's going on in the scene, capturing how the various objects relate to one another and translating it all into natural-sounding language. Many efforts to construct computer-generated natural descriptions of images propose combining current state-of-the-art techniques in both computer vision and natural language processing to form a complete image description approach. But what if we instead merged recent computer vision and language models into a single jointly trained system, taking an image and directly producing a human readable sequence of words to describe it? This idea comes from recent advances in machine translation between languages, where a Recurrent Neural Network (RNN) transforms, say, a French sentence into a vector representation, and a second RNN uses that vector representation to generate a target sentence in German. Now, what if we replaced that first RNN and its input words with a deep Convolutional Neural Network (CNN) trained to classify objects in images? Normally, the CNN's last layer is used in a final Softmax among known classes of objects, assigning a probability that each object might be in the image. But if we remove that final layer, we can instead feed the CNN's rich encoding of the image into a RNN designed to produce phrases. We can then train the whole system directly on images and their captions, so it maximizes the likelihood that descriptions it produces best match the training descriptions for each image.
  18. Banerjee, K.; Johnson, M.: Improving access to archival collections with automated entity extraction (2015) 0.02
    0.018568087 = product of:
      0.037136175 = sum of:
        0.037136175 = product of:
          0.111408524 = sum of:
            0.111408524 = weight(_text_:objects in 2144) [ClassicSimilarity], result of:
              0.111408524 = score(doc=2144,freq=2.0), product of:
                0.3161936 = queryWeight, product of:
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.059490006 = queryNorm
                0.35234275 = fieldWeight in 2144, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.315071 = idf(docFreq=590, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2144)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    The complexity and diversity of archival resources make constructing rich metadata records time-consuming and expensive, which in turn limits access to these valuable materials. However, significant automation of the metadata creation process would dramatically reduce the cost of providing access points, improve access to individual resources, and establish connections between resources that would otherwise remain unknown. Using a case study at Oregon Health & Science University as a lens to examine the conceptual and technical challenges associated with automated extraction of access points, we discuss using publicly accessible APIs to extract entities (people, places, concepts, etc.) from digital and digitized objects. We describe why Linked Open Data is not well suited for a use case such as ours. We conclude with recommendations about how this method can be used in archives as well as for other library applications.
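The abstract mentions publicly accessible APIs without naming them. As an illustrative stand-in for any such entity extractor, a local NER pipeline such as spaCy's can pull out the same kinds of candidate access points (the sample text below is invented):

```python
import spacy

# spaCy's small English pipeline, used purely as an illustrative stand-in for
# the unnamed APIs in the abstract; assumes the model has been downloaded
# (python -m spacy download en_core_web_sm).
nlp = spacy.load("en_core_web_sm")

text = ("Correspondence written in Portland, Oregon, during the donor's years "
        "at the University of Oregon Medical School.")

doc = nlp(text)
# Candidate access points: people, places, and organizations found in the text.
for ent in doc.ents:
    if ent.label_ in {"PERSON", "GPE", "ORG"}:
        print(ent.text, ent.label_)
```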
  19. Tsujii, J.-I.: Automatic acquisition of semantic collocation from corpora (1995) 0.02
    0.016120149 = product of:
      0.032240298 = sum of:
        0.032240298 = product of:
          0.064480595 = sum of:
            0.064480595 = weight(_text_:22 in 4709) [ClassicSimilarity], result of:
              0.064480595 = score(doc=4709,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.30952093 = fieldWeight in 4709, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4709)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  20. Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.02
    0.016120149 = product of:
      0.032240298 = sum of:
        0.032240298 = product of:
          0.064480595 = sum of:
            0.064480595 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.064480595 = score(doc=6752,freq=2.0), product of:
                0.20832387 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.059490006 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    6. 3.1997 16:22:15

Languages

  • e (English) 27
  • d (German) 15
  • ru (Russian) 1

Types

  • a 37
  • el 5
  • x 2
  • m 1