Search (65 results, page 1 of 4)

Thenmalar, S.; Geetha, T.V.: Enhanced ontology-based indexing and searching (2014) 0.06
```
0.0639876 = product of:
  0.0959814 = sum of:
    0.083929054 = weight(_text_:index in 1633) [ClassicSimilarity], result of:
      0.083929054 = score(doc=1633,freq=10.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.37784708 = fieldWeight in 1633, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1633)
    0.01205234 = product of:
      0.02410468 = sum of:
        0.02410468 = weight(_text_:22 in 1633) [ClassicSimilarity], result of:
          0.02410468 = score(doc=1633,freq=2.0), product of:
            0.17800546 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05083213 = queryNorm
            0.1354154 = fieldWeight in 1633, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1633)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
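The trace above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the numbers shown; the function names `idf` and `term_score` are illustrative only and are not part of any library API. The `queryNorm`, `fieldNorm`, and `coord` values are copied from the trace rather than derived.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, as in the explain tree
    tf = math.sqrt(freq)                        # tf(freq) = sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the trace for doc 1633
MAX_DOCS = 44218
QUERY_NORM = 0.05083213
FIELD_NORM = 0.02734375

index_part = term_score(10.0, 1520, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 clauses matched: coord(1/2)
date_part = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# 2 of 3 top-level clauses matched: coord(2/3)
total = (index_part + date_part) * (2.0 / 3.0)

print(f"index: {index_part:.6f}  22: {date_part:.6f}  total: {total:.6f}")
```

The result should land close to the 0.0639876 reported in the trace; small differences in the last digits are expected because the search engine computes in single precision.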
Abstract

Purpose - The purpose of this paper is to improve concept-based search by incorporating structural ontological information such as concepts and relations. Generally, semantic-based information retrieval aims to identify relevant information based on the meanings of the query terms or on the context of the terms, and the performance of semantic information retrieval is evaluated through the standard measures of precision and recall. Higher precision means that more of the retrieved documents are (meaningfully) relevant, and lower recall means less coverage of the concepts. Design/methodology/approach - In this paper, the authors enhance the existing ontology-based indexing proposed by Kohler et al. by incorporating sibling information into the index. The index designed by Kohler et al. contains only super- and sub-concepts from the ontology. In addition, in our approach, we focus on two tasks, query expansion and ranking of the expanded queries, to improve the efficiency of the ontology-based search. These tasks make use of ontological concepts and the relations between those concepts to obtain semantically more relevant search results for a given query. Findings - The proposed ontology-based indexing technique is investigated by analysing the coverage of concepts that are populated in the index. Here, we introduce a new measure, called the index enhancement measure, to estimate the coverage of ontological concepts being indexed. We have evaluated the ontology-based search for the tourism domain with tourism documents and a tourism-specific ontology. Search results obtained with and without query expansion are compared to estimate the efficiency of the proposed query expansion task. The ranking is compared with the ORank system to evaluate the performance of our ontology-based search. 
From these analyses, the ontology-based search results show better recall than the other concept-based search systems. The mean average precision of the ontology-based search is found to be 0.79 and its recall 0.65; the ORank system has a mean average precision of 0.62 and a recall of 0.51, while the concept-based search has a mean average precision of 0.56 and a recall of 0.42. Practical implications - When a concept is not present in the domain-specific ontology, the concept cannot be indexed. When the given query term is not available in the ontology, term-based results are retrieved. Originality/value - In addition to super- and sub-concepts, we incorporate the concepts present at the same level (siblings) into the ontological index. The structural information from the ontology is determined for the query expansion. The ranking of the documents depends on the type of the query (single-concept queries, multiple-concept queries and concept-with-relation queries) and the ontological relations that exist in the query and the documents. With this ontological structural information, the search results showed better coverage of concepts with respect to the query.

Date

20. 1.2015 18:30:22

Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.04

```
0.04194404 = product of:
  0.12583213 = sum of:
    0.12583213 = sum of:
      0.056961603 = weight(_text_:classification in 3280) [ClassicSimilarity], result of:
        0.056961603 = score(doc=3280,freq=2.0), product of:
          0.16188543 = queryWeight, product of:
            3.1847067 = idf(docFreq=4974, maxDocs=44218)
            0.05083213 = queryNorm
          0.35186368 = fieldWeight in 3280, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.1847067 = idf(docFreq=4974, maxDocs=44218)
            0.078125 = fieldNorm(doc=3280)
      0.06887052 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
        0.06887052 = score(doc=3280,freq=2.0), product of:
          0.17800546 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05083213 = queryNorm
          0.38690117 = fieldWeight in 3280, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.078125 = fieldNorm(doc=3280)
  0.33333334 = coord(1/3)
```

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou

Jansen, B.; Browne, G.M.: Navigating information spaces : index / mind map / topic map? (2021) 0.04

```
0.040442966 = product of:
  0.1213289 = sum of:
    0.1213289 = weight(_text_:index in 436) [ClassicSimilarity], result of:
      0.1213289 = score(doc=436,freq=4.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.5462205 = fieldWeight in 436, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.0625 = fieldNorm(doc=436)
  0.33333334 = coord(1/3)
```

Abstract: This paper discusses the use of wiki technology to provide a navigation structure for a collection of newspaper clippings. We overview the architecture of the wiki, discuss the navigation structure and pose the question: is the navigation structure an index, and if so, what type, or is it just a linkage structure or topic map. Does such a distinction really matter? Are these definitions in reality function based?

Järvelin, A.; Keskustalo, H.; Sormunen, E.; Saastamoinen, M.; Kettunen, K.: Information retrieval from historical newspaper collections in highly inflectional languages : a query expansion approach (2016) 0.03
```
0.030957699 = product of:
  0.0928731 = sum of:
    0.0928731 = weight(_text_:index in 3223) [ClassicSimilarity], result of:
      0.0928731 = score(doc=3223,freq=6.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.418113 = fieldWeight in 3223, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3223)
  0.33333334 = coord(1/3)
```
Abstract

The aim of the study was to test whether query expansion by approximate string matching methods is beneficial in retrieval from historical newspaper collections in a language rich with compounds and inflectional forms (Finnish). First, approximate string matching methods were used to generate lists of index words most similar to contemporary query terms in a digitized newspaper collection from the 1800s. Top index word variants were categorized to estimate the appropriate query expansion ranges in the retrieval test. Second, the effectiveness of approximate string matching methods, automatically generated inflectional forms, and their combinations were measured in a Cranfield-style test. Finally, a detailed topic-level analysis of test results was conducted. In the index of historical newspaper collection the occurrences of a word typically spread to many linguistic and historical variants along with optical character recognition (OCR) errors. All query expansion methods improved the baseline results. Extensive expansion of around 30 variants for each query word was required to achieve the highest performance improvement. Query expansion based on approximate string matching was superior to using the inflectional forms of the query words, showing that coverage of the different types of variation is more important than precision in handling one type of variation.

Harman, D.: Automatic indexing (1994) 0.03

```
0.028597495 = product of:
  0.08579248 = sum of:
    0.08579248 = weight(_text_:index in 7729) [ClassicSimilarity], result of:
      0.08579248 = score(doc=7729,freq=2.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.3862362 = fieldWeight in 7729, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.0625 = fieldNorm(doc=7729)
  0.33333334 = coord(1/3)
```

Content: Contains the sections: What constitutes a record; What constitutes a word and what 'words' to index; Use of stop lists; Use of suffixing or stemming; Advanced automatic indexing techniques (term weighting, query expansion, the use of multiple-word phrases for indexing)

Otto, A.: Ordnungssysteme als Wissensbasis für die Suche in textbasierten Datenbeständen : dargestellt am Beispiel einer soziologischen Bibliographie (1998) 0.02
```
0.020221483 = product of:
  0.06066445 = sum of:
    0.06066445 = weight(_text_:index in 6625) [ClassicSimilarity], result of:
      0.06066445 = score(doc=6625,freq=4.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.27311024 = fieldWeight in 6625, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.03125 = fieldNorm(doc=6625)
  0.33333334 = coord(1/3)
```
Abstract

A method is presented for using ordering systems to search text-based data collections. "Ordering system" is used here as an umbrella term for any ordered collection of concepts, for example thesauri, classifications and formal systematic schemes. Because thesauri are the most powerful of these ordering systems, they receive particular attention. The contribution is strictly practice-oriented and concentrates on the user interface. The user interface is based on ordering systems that are offered via a WWW interface. Depending on the subject area, the user can select a specific ordering system for the search. In contrast to classical approaches, the ordering systems are not used exclusively for searching descriptor fields but for searching a basic index. When applied to the basic index, the ordering systems are effectively "decoupled" from the original database and from the descriptor fields for which the ordering system was developed. The contents of a database initially play no role in the choice of ordering system. They only become noticeable during the search, in the number of hits: a legal thesaurus naturally finds fewer relevant documents in a medical database than in a legal database, because the concepts represented in the legal thesaurus are more likely to be found in a legal database. The method has a modular design, and its concept provides for downstream semantic retrieval procedures that will improve retrieval effectiveness and efficiency. Thus, from a result set found exclusively by exact string matching, a subsequent step uses semantic analysis to pick out those documents that are relevant to the query. 
The WWW user interface and the use of already existing ordering systems minimize the effort required on the user side. The costs of a search can be reduced both on the input side, because laborious "manual" indexing is no longer needed, and on the output side, because users are provided with easy-to-use search options.
Hancock-Beaulieu, M.: Evaluating the impact of an online library catalogue on subject searching behaviour at the catalogue and at the shelves (1990) 0.02
```
0.017873434 = product of:
  0.0536203 = sum of:
    0.0536203 = weight(_text_:index in 5691) [ClassicSimilarity], result of:
      0.0536203 = score(doc=5691,freq=2.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.24139762 = fieldWeight in 5691, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5691)
  0.33333334 = coord(1/3)
```
Abstract

The second half of a 'before and after' study to evaluate the impact of an online catalogue on subject searching behaviour is reported. A holistic approach is adopted encompassing both catalogue use and browsing at the shelves for catalogue users and non-users. Verbal and non-verbal data were elicited from searchers using a combined methodology including talk-aloud technique, observation and a screen logging facility. An extensive qualitative analysis was carried out correlating expressed topics, search formulation strategies and documents retrieved at the shelves. The online catalogue environment does not appear to have increased the extent of subject searching nor the use of the bibliographic tool. The manual PRECIS index supported a contextual approach for broad and more interactive search formulations whereas the OPAC encouraged a matching approach and narrow formulations with fewer but user-generated formulations. The success rate of the online catalogue was slightly better than that of the manual tools but fewer items were retrieved at the shelves. Non-users of the bibliographic tools seemed to be just as successful. To improve retrieval effectiveness it is suggested that online catalogues should cater for both matching and contextual approaches to searching. Recent research indicates that a more interactive process could be promoted by providing query expansion through a combination of searching aids for matching, for search formulation assistance and for structured contextual retrieval.
Weichselgartner, E.: ZPID bindet Thesaurus in Retrievaloberfläche ein (2006) 0.02
```
0.017873434 = product of:
  0.0536203 = sum of:
    0.0536203 = weight(_text_:index in 5962) [ClassicSimilarity], result of:
      0.0536203 = score(doc=5962,freq=2.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.24139762 = fieldWeight in 5962, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5962)
  0.33333334 = coord(1/3)
```
Abstract

Since 3 July 2006, ZPID has provided an improved search interface for searching the bibliographic psychology database PSYNDEX. The main feature of the new version 1.1 of 'ZPID-Retrieval für PSYNDEX' is the integration of 'PSYNDEX Terms', the controlled vocabulary of psychological terminology. PSYNDEX Terms is based on the 'Thesaurus of Psychological Index Terms' of the American Psychological Association (APA) and currently contains more than 5,400 descriptors. For each descriptor, broader terms, narrower terms and related terms are displayed where available. Users of the search interface can either browse the thesaurus or search specifically for thesaurus terms. If the user's own freely chosen search term does not occur in the thesaurus, the system automatically suggests suitable thesaurus terms. The thesaurus is implemented fully bilingually (German/English), so it also serves as a translation aid. Further improvements to the search interface concern rendering in different web browsers with the aim of accessibility, the extension of the online help with examples of successful search strategies, the option of retrieving in-depth information on special topics (starting with psychological treatment programmes), and the provision of an export filter for EndNote. The target group of ZPID-Retrieval is individuals who cannot use institutional PSYNDEX access, e.g. on a university campus. They can purchase the fee-based retrieval service directly online and are given access within a few minutes. Customers with an existing contract automatically benefit from the improved search interface.
Bergamaschi, S.; Domnori, E.; Guerra, F.; Rota, S.; Lado, R.T.; Velegrakis, Y.: Understanding the semantics of keyword queries on relational data without accessing the instance (2012) 0.02
```
0.017873434 = product of:
  0.0536203 = sum of:
    0.0536203 = weight(_text_:index in 431) [ClassicSimilarity], result of:
      0.0536203 = score(doc=431,freq=2.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.24139762 = fieldWeight in 431, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.0390625 = fieldNorm(doc=431)
  0.33333334 = coord(1/3)
```
Abstract

The birth of the Web has brought an exponential growth to the amount of the information that is freely available to the Internet population, overloading users and entangling their efforts to satisfy their information needs. Web search engines such as Google, Yahoo, or Bing have become popular mainly due to the fact that they offer an easy-to-use query interface (i.e., based on keywords) and an effective and efficient query execution mechanism. The majority of these search engines do not consider information stored on the deep or hidden Web [9,28], despite the fact that the size of the deep Web is estimated to be much bigger than the surface Web [9,47]. There have been a number of systems that record interactions with the deep Web sources or automatically submit queries to them (mainly through their Web form interfaces) in order to index their context. Unfortunately, this technique only partially indexes the data instance. Moreover, it is not possible to take advantage of the query capabilities of data sources, for example, of the relational query features, because their interface is often restricted from the Web form. Besides, Web search engines focus on retrieving documents and not on querying structured sources, so they are unable to access information based on concepts.
Chebil, W.; Soualmia, L.F.; Omri, M.N.; Darmoni, S.F.: Indexing biomedical documents with a possibilistic network (2016) 0.02
```
0.017873434 = product of:
  0.0536203 = sum of:
    0.0536203 = weight(_text_:index in 2854) [ClassicSimilarity], result of:
      0.0536203 = score(doc=2854,freq=2.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.24139762 = fieldWeight in 2854, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2854)
  0.33333334 = coord(1/3)
```
Abstract

In this article, we propose a new approach for indexing biomedical documents based on a possibilistic network that carries out partial matching between documents and biomedical vocabulary. The main contribution of our approach is to deal with the imprecision and uncertainty of the indexing task using possibility theory. We enhance estimation of the similarity between a document and a given concept using the two measures of possibility and necessity. Possibility estimates the extent to which a document is not similar to the concept. The second measure can provide confirmation that the document is similar to the concept. Our contribution also reduces the limitation of partial matching. Although the latter allows extracting from the document other variants of terms than those in dictionaries, it also generates irrelevant information. Our objective is to filter the index using the knowledge provided by the Unified Medical Language System®. Experiments were carried out on different corpora, showing encouraging results (the improvement rate is +26.37% in terms of main average precision when compared with the baseline).

Boyack, K.W.; Wylie, B.N.; Davidson, G.S.: Information Visualization, Human-Computer Interaction, and Cognitive Psychology : Domain Visualizations (2002) 0.02

```
0.016232938 = product of:
  0.04869881 = sum of:
    0.04869881 = product of:
      0.09739762 = sum of:
        0.09739762 = weight(_text_:22 in 1352) [ClassicSimilarity], result of:
          0.09739762 = score(doc=1352,freq=4.0), product of:
            0.17800546 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05083213 = queryNorm
            0.54716086 = fieldWeight in 1352, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1352)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 22. 2.2003 17:25:39
22. 2.2003 18:17:40

Smeaton, A.F.; Rijsbergen, C.J. van: The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.02

```
0.016069788 = product of:
  0.04820936 = sum of:
    0.04820936 = product of:
      0.09641872 = sum of:
        0.09641872 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
          0.09641872 = score(doc=2134,freq=2.0), product of:
            0.17800546 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05083213 = queryNorm
            0.5416616 = fieldWeight in 2134, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 30. 3.2001 13:32:22

Layfield, C.; Azzopardi, J.; Staff, C.: Experiments with document retrieval from small text collections using Latent Semantic Analysis or term similarity with query coordination and automatic relevance feedback (2017) 0.01
```
0.014298747 = product of:
  0.04289624 = sum of:
    0.04289624 = weight(_text_:index in 3478) [ClassicSimilarity], result of:
      0.04289624 = score(doc=3478,freq=2.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.1931181 = fieldWeight in 3478, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.03125 = fieldNorm(doc=3478)
  0.33333334 = coord(1/3)
```
Abstract

One of the problems faced by users of databases containing textual documents is the difficulty in retrieving relevant results due to the diverse vocabulary used in queries and contained in relevant documents, especially when there are only a small number of relevant documents. This problem is known as the Vocabulary Gap. The PIKES team have constructed a small test collection of 331 articles extracted from a blog and a Gold Standard for 35 queries selected from the blog's search log so the results of different approaches to semantic search can be compared. So far, prior approaches include recognising Named Entities in documents and queries, and relations including temporal relations, and representing them as 'semantic layers' in a retrieval system index. In this work, we take two different approaches that do not involve Named Entity Recognition. In the first approach, we process an unannotated version of the PIKES document collection using Latent Semantic Analysis and use a combination of query coordination and automatic relevance feedback with which we outperform prior work. However, this approach is highly dependent on the underlying collection, and is not necessarily scalable to massive collections. In our second approach, we use an LSA Model generated by SEMILAR from a Wikipedia dump to generate a Term Similarity Matrix (TSM). We automatically expand the queries in the PIKES test collection with related terms from the TSM and submit them to a term-by-document matrix derived by indexing the PIKES collection using the Vector Space Model. Coupled with a combination of query coordination and automatic relevance feedback we also outperform prior work with this approach. The advantage of the second approach is that it is independent of the underlying document collection.
Caro Castro, C.; Travieso Rodríguez, C.: Ariadne's thread : knowledge structures for browsing in OPAC's (2003) 0.01
```
0.012511404 = product of:
  0.03753421 = sum of:
    0.03753421 = weight(_text_:index in 2768) [ClassicSimilarity], result of:
      0.03753421 = score(doc=2768,freq=2.0), product of:
        0.2221244 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.05083213 = queryNorm
        0.16897833 = fieldWeight in 2768, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.02734375 = fieldNorm(doc=2768)
  0.33333334 = coord(1/3)
```
Abstract

Subject searching is the most common but also the most problematic kind of searching for end users. The aim of this paper is to check how users' expressions match subject headings and to test whether the knowledge structures used in online catalogs enhance searching effectiveness. A literature review of the difficulties in subject access and of the methods proposed to improve it is also presented. For the empirical analysis, transaction logs from the online catalogs of two university libraries (CISNE and FAMA) were collected. Results show that more than a quarter of user queries are effective due to an alphabetical subject index approach and browsing through hypertextual links. 1. Introduction Since the 1980s, online public access catalogs (OPACs) have become the usual way to access bibliographic information. During the last two decades technological development has helped to extend their use, making access feasible for a body of users that is becoming ever larger and more heterogeneous, and also to incorporate information resources in electronic formats and to interconnect systems. However, technology seems to have developed faster than our knowledge of the tasks to which it has been applied and faster than our capacity to adapt to it. The conceptual model of the OPAC has hardly been modified recently, and to interact with OPACs users still need to combine the same skills and basic knowledge as when they were first introduced (Borgman, 1986, 2000): a) conceptual knowledge to translate the information need into an appropriate query, based on a well-designed mental model of the system, b) semantic and syntactic knowledge to be able to implement that query (access fields, searching type, Boolean logic, etc.) and c) basic technical skills in computing. At present many users have the essential technical skills to make use, with more or less expertise, of a computer. 
This number is substantially reduced when it comes to the conceptual, semantic and syntactic knowledge that is necessary to achieve a moderately satisfactory search. An added difficulty arises in subject searching, as users must make their unknown information needs concrete in terms that the information retrieval system can understand. Many studies have focused on unskilled searchers' difficulties in entering an effective query. The influence of mental models, that is, users' assumptions about the characteristics, structure, contents and operation of the system they interact with, has been analysed (Dillon, 2000; Dimitroff, 2000). Another issue that causes difficulties is vocabulary: how to find the right terms to implement a query and to modify it as the case may be. The characteristics of the terminology and expressions used in searching (Bates, 1993), the match between user terms and the subject headings from the catalog (Carlyle, 1989; Drabenstott, 1996; Drabenstott & Vizine-Goetz, 1994), the incidence of spelling errors (Drabenstott and Weller, 1996; Ferl and Millsap, 1996; Walker and Jones, 1987), users problems

Rekabsaz, N. et al.: Toward optimized multimodal concept indexing (2016) 0.01

```
0.01147842 = product of:
  0.03443526 = sum of:
    0.03443526 = product of:
      0.06887052 = sum of:
        0.06887052 = weight(_text_:22 in 2751) [ClassicSimilarity], result of:
          0.06887052 = score(doc=2751,freq=2.0), product of:
            0.17800546 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05083213 = queryNorm
            0.38690117 = fieldWeight in 2751, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2751)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 1. 2.2016 18:25:22

Kozikowski, P. et al.: Support of part-whole relations in query answering (2016) 0.01

```
0.01147842 = product of:
  0.03443526 = sum of:
    0.03443526 = product of:
      0.06887052 = sum of:
        0.06887052 = weight(_text_:22 in 2754) [ClassicSimilarity], result of:
          0.06887052 = score(doc=2754,freq=2.0), product of:
            0.17800546 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05083213 = queryNorm
            0.38690117 = fieldWeight in 2754, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2754)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 1. 2.2016 18:25:22

Marx, E. et al.: Exploring term networks for semantic search over RDF knowledge graphs (2016) 0.01

```
0.01147842 = product of:
  0.03443526 = sum of:
    0.03443526 = product of:
      0.06887052 = sum of:
        0.06887052 = weight(_text_:22 in 3279) [ClassicSimilarity], result of:
          0.06887052 = score(doc=3279,freq=2.0), product of:
            0.17800546 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05083213 = queryNorm
            0.38690117 = fieldWeight in 3279, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3279)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Source: Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou

Sacco, G.M.: Dynamic taxonomies and guided searches (2006) 0.01

```
0.011363056 = product of:
  0.034089167 = sum of:
    0.034089167 = product of:
      0.06817833 = sum of:
        0.06817833 = weight(_text_:22 in 5295) [ClassicSimilarity], result of:
          0.06817833 = score(doc=5295,freq=4.0), product of:
            0.17800546 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05083213 = queryNorm
            0.38301262 = fieldWeight in 5295, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5295)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 22. 7.2006 17:56:22

Morato, J.; Llorens, J.; Genova, G.; Moreiro, J.A.: Experiments in discourse analysis impact on information classification and retrieval algorithms (2003) 0.01
```
0.010614168 = product of:
  0.031842504 = sum of:
    0.031842504 = product of:
      0.06368501 = sum of:
        0.06368501 = weight(_text_:classification in 1083) [ClassicSimilarity], result of:
          0.06368501 = score(doc=1083,freq=10.0), product of:
            0.16188543 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.05083213 = queryNorm
            0.39339557 = fieldWeight in 1083, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1083)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
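The arithmetic in the score explanation above can be reproduced with a short sketch. It assumes the usual ClassicSimilarity formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))), which agree with the numbers shown here. The function names are illustrative and not part of any library, and the values for queryNorm, fieldNorm and the two coord factors are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                   # tf(freq=10.0) -> 3.1622777
    idf = classic_idf(doc_freq, max_docs)  # idf(docFreq=4974, maxDocs=44218)
    query_weight = idf * query_norm        # "queryWeight" line of the tree
    field_weight = tf * idf * field_norm   # "fieldWeight" line of the tree
    return query_weight * field_weight


# Values copied from the explanation for doc 1083, term "classification".
term_score = classic_term_score(
    freq=10.0,
    doc_freq=4974,
    max_docs=44218,
    query_norm=0.05083213,
    field_norm=0.0390625,
)

# The two coord factors from the tree: coord(1/2) and coord(1/3).
final_score = term_score * (1 / 2) * (1 / 3)
print(final_score)
```

The printed value should be close to the 0.010614168 shown in the listing. Small differences in the last digits are expected, because the listing appears to use single-precision floats while Python uses double precision.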
Abstract

Researchers in indexing and retrieval systems have been advocating the inclusion of more contextual information to improve results. The proliferation of full-text databases and advances in computer storage capacity have made it possible to carry out text analysis by means of linguistic and extra-linguistic knowledge. Since the mid-1980s, research has tended to pay more attention to context, giving discourse analysis a more central role. The research presented in this paper aims to check whether discourse variables have an impact on modern information retrieval and classification algorithms. In order to evaluate this hypothesis, a functional framework for information analysis in an automated environment has been proposed, where n-grams (filtering) and the k-means and Chen's classification algorithms have been tested against sub-collections of documents based on the following discourse variables: "Genre", "Register", "Domain terminology", and "Document structure". The results obtained with the algorithms for the different sub-collections were compared to the MeSH information structure. They show that n-grams filtering does not appear to depend clearly on discourse variables, whereas the k-means classification algorithm does, but only on domain terminology and document structure; Chen's algorithm depends clearly on all of the discourse variables. This information could be used to design better classification algorithms that take discourse variables into account. Other minor conclusions drawn from these results are also presented.

Gnoli, C.; Santis, R. de; Pusterla, L.: Commerce, see also Rhetoric : cross-discipline relationships as authority data for enhanced retrieval (2015) 0.01
```
0.010614168 = product of:
  0.031842504 = sum of:
    0.031842504 = product of:
      0.06368501 = sum of:
        0.06368501 = weight(_text_:classification in 2299) [ClassicSimilarity], result of:
          0.06368501 = score(doc=2299,freq=10.0), product of:
            0.16188543 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.05083213 = queryNorm
            0.39339557 = fieldWeight in 2299, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2299)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Abstract

Subjects in a classification scheme are often related to other subjects belonging to different hierarchies. This problem was already identified by Hugh of Saint Victor (1096?-1141). Even with present-day bibliographic classifications, a user browsing the class of architecture under the hierarchy of arts may miss relevant items classified in building or in civil engineering under the hierarchy of applied sciences. To address these limitations we have developed SciGator, a browsable interface to explore the collections of all scientific libraries at the University of Pavia. Besides showing subclasses of a given class, the interface points users to related classes in the Dewey Decimal Classification, or in other local schemes, and allows for expanded queries that include them. This is made possible by using a special field for related classes in the database structure which models classification authority data. Ontologically, many relationships between classes in different hierarchies are cases of existential dependence. Dependence can occur between disciplines in such disciplinary classifications as Dewey (e.g. architecture existentially depends on building), or between phenomena in such phenomenon-based classifications as the Integrative Levels Classification (e.g. fishing as a human activity existentially depends on fish as a class of organisms). We provide an example of its representation in OWL and discuss some details of it.

Source

Classification and authority control: expanding resource discovery: proceedings of the International UDC Seminar 2015, 29-30 October 2015, Lisbon, Portugal. Eds.: Slavic, A. u. M.I. Cordeiro
