Search (48 results, page 1 of 3)

Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.03

```
0.03406783 = product of:
  0.06813566 = sum of:
    0.042235587 = weight(_text_:data in 2759) [ClassicSimilarity], result of:
      0.042235587 = score(doc=2759,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.34936053 = fieldWeight in 2759, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.078125 = fieldNorm(doc=2759)
    0.02590007 = product of:
      0.05180014 = sum of:
        0.05180014 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
          0.05180014 = score(doc=2759,freq=2.0), product of:
            0.13388468 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03823278 = queryNorm
            0.38690117 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```

Date: 1. 2.2016 18:25:22
Source: Semantic keyword-based search on structured data sources: First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015. Revised Selected Papers. Eds.: J. Cardoso et al
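
The score breakdown for this entry can be reproduced by hand. The following is a minimal sketch, assuming the textbook ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and are not Lucene API calls; the numeric inputs are copied from the explain tree for doc 2759.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 2759.
QUERY_NORM = 0.03823278
MAX_DOCS = 44218
FIELD_NORM = 0.078125

data_part = term_score(2.0, 5088, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 clauses matched.
term22_part = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# 2 of the 4 top-level clauses matched, hence the outer coord(2/4).
total = (data_part + term22_part) * 0.5

print(round(total, 6))
```

Lucene computes these values in single precision, so the last digits may differ slightly from the listing. The same function applies to the other entries on this page once their freq, fieldNorm and coord values are substituted.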

Milstead, J.L.: Thesauri in a full-text world (1998) 0.02

```
0.017033914 = product of:
  0.03406783 = sum of:
    0.021117793 = weight(_text_:data in 2337) [ClassicSimilarity], result of:
      0.021117793 = score(doc=2337,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.17468026 = fieldWeight in 2337, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2337)
    0.012950035 = product of:
      0.02590007 = sum of:
        0.02590007 = weight(_text_:22 in 2337) [ClassicSimilarity], result of:
          0.02590007 = score(doc=2337,freq=2.0), product of:
            0.13388468 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03823278 = queryNorm
            0.19345059 = fieldWeight in 2337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2337)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```

Date: 22. 9.1997 19:16:05
Source: Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al

Salton, G.; Allen, J.; Buckley, C.; Singhal, A.: Automatic analysis, theme generation, and summarization of machine-readable data (1994) 0.01

```
0.014782455 = product of:
  0.05912982 = sum of:
    0.05912982 = weight(_text_:data in 1168) [ClassicSimilarity], result of:
      0.05912982 = score(doc=1168,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.48910472 = fieldWeight in 1168, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.109375 = fieldNorm(doc=1168)
  0.25 = coord(1/4)
```

Alexander, M.: Retrieving digital data with fuzzy matching (1997) 0.01
```
0.011946027 = product of:
  0.04778411 = sum of:
    0.04778411 = weight(_text_:data in 151) [ClassicSimilarity], result of:
      0.04778411 = score(doc=151,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.3952563 = fieldWeight in 151, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0625 = fieldNorm(doc=151)
  0.25 = coord(1/4)
```
Abstract

In 1993 the British Library established a programme of activities entitled Initiatives for Access (IFA) to identify and develop computer applications based on the new technologies emerging in the areas of digital and network service. Discusses the problem of the effective retrieval of digital data after its capture, focusing on the product Excalibur EFS, which looks at the way information is sorted at its fundamental level and identifies patterns in numbers. Looks at the benefits of Excalibur and outlines other experiments in progress as part of the IFA programme

Fox, C.: Lexical analysis and stoplists (1992) 0.01

```
0.011946027 = product of:
  0.04778411 = sum of:
    0.04778411 = weight(_text_:data in 3502) [ClassicSimilarity], result of:
      0.04778411 = score(doc=3502,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.3952563 = fieldWeight in 3502, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0625 = fieldNorm(doc=3502)
  0.25 = coord(1/4)
```

Abstract: Lexical analysis is a fundamental operation in both query processing and automatic indexing, and filtering stoplist words is an important step in the automatic indexing process. Presents basic algorithms and data structures for lexical analysis, and shows how stoplist word removal can be efficiently incorporated into lexical analysis
Source: Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
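
As a rough illustration of folding stoplist removal into lexical analysis, the sketch below filters stop words in the same pass that produces tokens, so no separate filtering step runs over the token list. It is a simplified stand-in that uses a regular expression and a hash-set lookup, not the chapter's own algorithm; the stoplist contents and the name `index_terms` are assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny stoplist; real stoplists hold hundreds of words.
STOPLIST = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})

# A token is a run of lowercase letters or digits (assumed token definition).
TOKEN_RE = re.compile(r"[a-z0-9]+")

def index_terms(text, stoplist=STOPLIST):
    """Yield index terms, dropping stop words as each token is recognised."""
    for match in TOKEN_RE.finditer(text.lower()):
        token = match.group()
        if token not in stoplist:
            yield token

terms = list(index_terms("Lexical analysis is a fundamental operation in the indexing process"))
print(terms)
```

Because the function is a generator, the same pass can feed either a query processor or an indexer without building an intermediate list of raw tokens.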

Milstead, J.L.: Methodologies for subject analysis in bibliographic databases (1992) 0.01
```
0.010452774 = product of:
  0.041811097 = sum of:
    0.041811097 = weight(_text_:data in 2311) [ClassicSimilarity], result of:
      0.041811097 = score(doc=2311,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.34584928 = fieldWeight in 2311, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2311)
  0.25 = coord(1/4)
```
Abstract

The goal of the study was to determine the state of the art of subject analysis as applied to large bibliographic data bases. The intent was to gather and evaluate information, casting it in a form that could be applied by management. There was no attempt to determine actual costs or trade-offs among costs and possible benefits. Commercial automatic indexing packages were also reviewed. The overall conclusion was that data base producers should begin working seriously on upgrading their thesauri and codifying their indexing policies as a means of moving toward development of machine aids to indexing, but that fully automatic indexing is not yet ready for wholesale implementation
Gábor, K.; Zargayouna, H.; Tellier, I.; Buscaldi, D.; Charnois, T.: A typology of semantic relations dedicated to scientific literature analysis (2016) 0.01
```
0.010452774 = product of:
  0.041811097 = sum of:
    0.041811097 = weight(_text_:data in 2933) [ClassicSimilarity], result of:
      0.041811097 = score(doc=2933,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.34584928 = fieldWeight in 2933, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2933)
  0.25 = coord(1/4)
```
Abstract

We propose a method for improving access to scientific literature by analyzing the content of research papers beyond citation links and topic tracking. Our model relies on a typology of explicit semantic relations. These relations are instantiated in the abstract/introduction part of the papers and can be identified automatically using textual data and external ontologies. Preliminary results show a promising precision in unsupervised relationship classification.

Content

Presentation, "Semantics, Analytics, Visualisation: Enhancing Scholarly Data Workshop co-located with the 25th International World Wide Web Conference April 11, 2016 - Montreal, Canada", Montreal 2016.

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.01

```
0.010360028 = product of:
  0.04144011 = sum of:
    0.04144011 = product of:
      0.08288022 = sum of:
        0.08288022 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.08288022 = score(doc=402,freq=2.0), product of:
            0.13388468 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03823278 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Source: Information processing and management. 22(1986) no.6, pp.465-476

Strobel, S.; Marín-Arraiza, P.: Metadata for scientific audiovisual media : current practices and perspectives of the TIB / AV-portal (2015) 0.01
```
0.009144273 = product of:
  0.03657709 = sum of:
    0.03657709 = weight(_text_:data in 3667) [ClassicSimilarity], result of:
      0.03657709 = score(doc=3667,freq=6.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.30255508 = fieldWeight in 3667, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3667)
  0.25 = coord(1/4)
```
Abstract

Descriptive metadata play a key role in finding relevant search results in large amounts of unstructured data. However, current scientific audiovisual media are provided with little metadata, which makes them hard to find, let alone individual sequences. In this paper, the TIB / AV-Portal is presented as a use case where methods concerning the automatic generation of metadata, a semantic search and cross-lingual retrieval (German/English) have already been applied. These methods result in a better discoverability of the scientific audiovisual media hosted in the portal. Text, speech, and image content of the video are automatically indexed by specialised GND (Gemeinsame Normdatei) subject headings. A semantic search is established based on properties of the GND ontology. The cross-lingual retrieval uses English 'translations' that were derived by an ontology mapping (DBpedia i. a.). Further ways of increasing the discoverability and reuse of the metadata are publishing them as Linked Open Data and interlinking them with other data sets.

Hlava, M.M.K.: Automatic indexing : comparing rule-based and statistics-based indexing systems (2005) 0.01

```
0.009065025 = product of:
  0.0362601 = sum of:
    0.0362601 = product of:
      0.0725202 = sum of:
        0.0725202 = weight(_text_:22 in 6265) [ClassicSimilarity], result of:
          0.0725202 = score(doc=6265,freq=2.0), product of:
            0.13388468 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03823278 = queryNorm
            0.5416616 = fieldWeight in 6265, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6265)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Source: Information outlook. 9(2005) no.8, pp.22-23

Advances in intelligent retrieval: Proc. of a conference ... Wadham College, Oxford, 16.-17.4.1985 (1986) 0.01
```
0.008959521 = product of:
  0.035838082 = sum of:
    0.035838082 = weight(_text_:data in 1384) [ClassicSimilarity], result of:
      0.035838082 = score(doc=1384,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.29644224 = fieldWeight in 1384, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=1384)
  0.25 = coord(1/4)
```
Content

Contains the contributions: ADDIS, T.: Extended relational analysis: a design approach to knowledge-based systems; PARKINSON, D.: Supercomputers and non-numeric processing; McGREGOR, D.R. and J.R. MALONE: An architectural approach to advances in information retrieval; ALLEN, M.J. and O.S. HARRISON: Word processing and information retrieval: some practical problems; MURTAGH, F.: Clustering and nearest neighborhood searching; ENSER, P.G.B.: Experimenting with the automatic classification of books; TESKEY, N. and Z. RAZAK: An analysis of ranking for free text retrieval systems; ZARRI, G.P.: Interactive information retrieval: an artificial intelligence approach to deal with biographical data; HANCOX, P. and F. SMITH: A case system processor for the PRECIS indexing language; ROUAULT, J.: Linguistic methods in information retrieval systems; ARAGON-RAMIREZ, V. and C.D. PAICE: Design of a system for the online elucidation of natural language search statements; BROOKS, H.M., P.J. DANIELS and N.J. BELKIN: Problem descriptions and user models: developing an intelligent interface for document retrieval systems; BLACK, W.J., P. HARGREAVES and P.B. MAYES: HEADS: a cataloguing advisory system; BELL, D.A.: An architecture for integrating data, knowledge, and information bases

Alexander, M.: Retrieving digital data with fuzzy matching (1996) 0.01

```
0.008447117 = product of:
  0.03378847 = sum of:
    0.03378847 = weight(_text_:data in 6961) [ClassicSimilarity], result of:
      0.03378847 = score(doc=6961,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.2794884 = fieldWeight in 6961, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0625 = fieldNorm(doc=6961)
  0.25 = coord(1/4)
```

Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints 0.01
```
0.0074662673 = product of:
  0.02986507 = sum of:
    0.02986507 = weight(_text_:data in 4309) [ClassicSimilarity], result of:
      0.02986507 = score(doc=4309,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24703519 = fieldWeight in 4309, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4309)
  0.25 = coord(1/4)
```
Abstract

Semantic annotations have to satisfy quality constraints to be useful for digital libraries, which is particularly challenging on large and diverse datasets. Confidence scores of multi-label classification methods typically refer only to the relevance of particular subjects, disregarding indicators of insufficient content representation at the document-level. Therefore, we propose a novel approach that detects documents rather than concepts where quality criteria are met. Our approach uses a deep, multi-layered regression architecture, which comprises a variety of content-based indicators. We evaluated multiple configurations using text collections from law and economics, where the available content is restricted to very short texts. Notably, we demonstrate that the proposed quality estimation technique can determine subsets of the previously unseen data where considerable gains in document-level recall can be achieved, while upholding precision at the same time. Hence, the approach effectively performs a filtering that ensures high data quality standards in operative information retrieval systems.
Wang, S.; Koopman, R.: Embed first, then predict (2019) 0.01
```
0.0074662673 = product of:
  0.02986507 = sum of:
    0.02986507 = weight(_text_:data in 5400) [ClassicSimilarity], result of:
      0.02986507 = score(doc=5400,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24703519 = fieldWeight in 5400, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5400)
  0.25 = coord(1/4)
```
Abstract

Automatic subject prediction is a desirable feature for modern digital library systems, as manual indexing can no longer cope with the rapid growth of digital collections. It is also desirable to be able to identify a small set of entities (e.g., authors, citations, bibliographic records) which are most relevant to a query. This gets more difficult when the amount of data increases dramatically. Data sparsity and model scalability are the major challenges to solving this type of extreme multilabel classification problem automatically. In this paper, we propose to address this problem in two steps: we first embed different types of entities into the same semantic space, where similarity could be computed easily; second, we propose a novel non-parametric method to identify the most relevant entities in addition to direct semantic similarities. We show how effectively this approach predicts even very specialised subjects, which are associated with few documents in the training set and are more problematic for a classifier.
Zhang, Y.; Zhang, C.; Li, J.: Joint modeling of characters, words, and conversation contexts for microblog keyphrase extraction (2020) 0.01
```
0.0074662673 = product of:
  0.02986507 = sum of:
    0.02986507 = weight(_text_:data in 5816) [ClassicSimilarity], result of:
      0.02986507 = score(doc=5816,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24703519 = fieldWeight in 5816, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5816)
  0.25 = coord(1/4)
```
Abstract

Millions of messages are produced on microblog platforms every day, leading to the pressing need for automatic identification of key points from the massive texts. To absorb salient content from the vast bulk of microblog posts, this article focuses on the task of microblog keyphrase extraction. In previous work, most efforts treat messages as independent documents and might suffer from the data sparsity problem exhibited in short and informal microblog posts. On the contrary, we propose to enrich contexts via exploiting conversations initialized by target posts and formed by their replies, which are generally centered around relevant topics to the target posts and therefore helpful for keyphrase identification. Concretely, we present a neural keyphrase extraction framework, which has 2 modules: a conversation context encoder and a keyphrase tagger. The conversation context encoder captures indicative representation from their conversation contexts and feeds the representation into the keyphrase tagger, and the keyphrase tagger extracts salient words from target posts. The 2 modules were trained jointly to optimize the conversation context encoding and keyphrase extraction processes. In the conversation context encoder, we leverage hierarchical structures to capture the word-level indicative representation and message-level indicative representation hierarchically. In both of the modules, we apply character-level representations, which enables the model to explore morphological features and deal with the out-of-vocabulary problem caused by the informal language style of microblog messages. Extensive comparison results on real-life data sets indicate that our model outperforms state-of-the-art models from previous studies.
Driscoll, J.R.; Rajala, D.A.; Shaffer, W.H.: The operation and performance of an artificially intelligent keywording system (1991) 0.01
```
0.0073912274 = product of:
  0.02956491 = sum of:
    0.02956491 = weight(_text_:data in 6681) [ClassicSimilarity], result of:
      0.02956491 = score(doc=6681,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24455236 = fieldWeight in 6681, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=6681)
  0.25 = coord(1/4)
```
Abstract

Presents a new approach to text analysis for automating the key phrase indexing process, using artificial intelligence techniques. This mimics the behaviour of human experts by using a rule base consisting of insertion and deletion rules generated by subject-matter experts. The insertion rules are based on the idea that some phrases found in a text imply or trigger other phrases. The deletion rules apply to semantically ambiguous phrases where text presence alone does not determine appropriateness as a key phrase. The insertion and deletion rules are used to transform a list of found phrases to a list of key phrases for indexing a document. Statistical data are provided to demonstrate the performance of this expert rule based system
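
The transformation from found phrases to key phrases described in this abstract might be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the rule format, the example rules and the name `apply_rules` are all hypothetical. In particular, a deletion rule is assumed here to map an ambiguous phrase to a set of supporting phrases, and the ambiguous phrase is kept only when at least one supporting phrase was also found.

```python
def apply_rules(found_phrases, insertion_rules, deletion_rules):
    """Turn a list of found phrases into a list of key phrases."""
    found = list(dict.fromkeys(found_phrases))  # de-duplicate, keep order
    found_set = set(found)

    key_phrases = []
    for phrase in found:
        support = deletion_rules.get(phrase)
        # Assumed deletion semantics: drop an ambiguous phrase that has
        # no supporting phrase elsewhere in the text.
        if support is not None and not (support & found_set):
            continue
        key_phrases.append(phrase)

    # Insertion: a found phrase can imply (trigger) further key phrases.
    for phrase in found:
        for implied in insertion_rules.get(phrase, []):
            if implied not in key_phrases:
                key_phrases.append(implied)
    return key_phrases

# Hypothetical rules, invented for illustration only.
insertion_rules = {"rule base": ["expert system"]}
deletion_rules = {"expert": {"subject-matter expert"}}

result = apply_rules(["rule base", "expert", "indexing"], insertion_rules, deletion_rules)
print(result)
```

Keeping the rules in plain dictionaries mirrors the idea that subject-matter experts supply them as data, separate from the code that applies them.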
Micco, M.; Popp, R.: Improving library subject access (ILSA) : a theory of clustering based in classification (1994) 0.01
```
0.0073912274 = product of:
  0.02956491 = sum of:
    0.02956491 = weight(_text_:data in 7715) [ClassicSimilarity], result of:
      0.02956491 = score(doc=7715,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24455236 = fieldWeight in 7715, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7715)
  0.25 = coord(1/4)
```
Abstract

The ILSA prototype was developed using an object-oriented multimedia user interface on six NeXT workstations with two databases: the first with 100,000 MARC records and the second with 20,000 additional records enhanced with table of contents data. The items are grouped into subject clusters consisting of the classification number and the first subject heading assigned. Every other distinct keyword in the MARC record is linked to the subject cluster in an automated natural language mapping scheme, which leads the user from the term entered to the controlled vocabulary of the subject clusters in which the term appeared. The use of a hierarchical classification number (Dewey) makes it possible to broaden or narrow a search at will
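
The clustering and keyword mapping outlined in this abstract could look roughly like the sketch below. It assumes simplified dictionary records with hypothetical field names (`dewey`, `subjects`, `keywords`) rather than real MARC fields, and it models broadening a search as matching a shorter Dewey prefix; none of this is taken from the ILSA implementation.

```python
from collections import defaultdict

def build_clusters(records):
    """Group records by (classification number, first subject heading)
    and link every keyword to the clusters in which it appears."""
    clusters = defaultdict(list)
    keyword_index = defaultdict(set)
    for rec in records:
        key = (rec["dewey"], rec["subjects"][0])
        clusters[key].append(rec["id"])
        for keyword in rec["keywords"]:
            keyword_index[keyword.lower()].add(key)
    return clusters, keyword_index

def clusters_under(clusters, dewey_prefix):
    """Broaden or narrow a search by shortening or lengthening the prefix."""
    return sorted(key for key in clusters if key[0].startswith(dewey_prefix))

# Hypothetical sample records, invented for illustration.
records = [
    {"id": 1, "dewey": "025.48", "subjects": ["Subject cataloging", "Indexing"],
     "keywords": ["thesaurus", "access"]},
    {"id": 2, "dewey": "025.04", "subjects": ["Information retrieval"],
     "keywords": ["access", "online"]},
    {"id": 3, "dewey": "006.3", "subjects": ["Artificial intelligence"],
     "keywords": ["expert"]},
]

clusters, keyword_index = build_clusters(records)
print(sorted(keyword_index["access"]))
print(clusters_under(clusters, "025"))
```

A term typed by the user is looked up in `keyword_index` to reach the controlled vocabulary of the matching clusters, and `clusters_under` then widens or tightens the result set along the classification hierarchy.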
Gomez, I.: Coping with the problem of subject classification diversity (1996) 0.01
```
0.0073912274 = product of:
  0.02956491 = sum of:
    0.02956491 = weight(_text_:data in 5074) [ClassicSimilarity], result of:
      0.02956491 = score(doc=5074,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24455236 = fieldWeight in 5074, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5074)
  0.25 = coord(1/4)
```
Abstract

The delimitation of a research field in bibliometric studies presents the problem of the diversity of subject classifications used in the sources of input and output data. Classification of documents according to thematic codes or keywords is the most accurate method, mainly used in specialized bibliographic or patent databases. Classification of journals in disciplines presents lower specificity, and some shortcomings such as the change over time of both journals and disciplines and the increasing interdisciplinarity of research. Standardization of subject classifications emerges as an important point in bibliometric studies in order to allow international comparisons, although flexibility is needed to meet the needs of local studies
Lepsky, K.; Müller, T.; Wille, J.: Metadata improvement for image information retrieval (2010) 0.01
```
0.0073912274 = product of:
  0.02956491 = sum of:
    0.02956491 = weight(_text_:data in 4995) [ClassicSimilarity], result of:
      0.02956491 = score(doc=4995,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24455236 = fieldWeight in 4995, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4995)
  0.25 = coord(1/4)
```
Abstract

This paper discusses the goals and results of the research project Perseus-a as an attempt to improve information retrieval of digital images by automatically connecting them with text-based descriptions. The development uses the image collection of prometheus, the distributed digital image archive for research and studies, the articles of the digitized Reallexikon zur Deutschen Kunstgeschichte, art historical terminological resources and classification data, and an open source system for linguistic and statistic automatic indexing called lingo.

Daudaravicius, V.: A framework for keyphrase extraction from scientific journals (2016) 0.01

```
0.0073912274 = product of:
  0.02956491 = sum of:
    0.02956491 = weight(_text_:data in 2930) [ClassicSimilarity], result of:
      0.02956491 = score(doc=2930,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24455236 = fieldWeight in 2930, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2930)
  0.25 = coord(1/4)
```

Content: Presentation, "Semantics, Analytics, Visualisation: Enhancing Scholarly Data Workshop co-located with the 25th International World Wide Web Conference April 11, 2016 - Montreal, Canada", Montreal 2016.
