Search (5 results, page 1 of 1)

Did you mean:
rswk_00%3a%22World wide web %2f meinungs%c3%A4u%c3%9fung %2f data mining%22 5
rswk_00%3a%22World wide web %2f meinung%c3%A4u%c3%9fung %2f data mining%22 5

Gödert, W.; Liebig, M.: Maschinelle Indexierung auf dem Prüfstand : Ergebnisse eines Retrievaltests zum MILOS II Projekt (1997) 0.00
```
0.004056596 = product of:
  0.03245277 = sum of:
    0.03245277 = weight(_text_:data in 1174) [ClassicSimilarity], result of:
      0.03245277 = score(doc=1174,freq=4.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.34584928 = fieldWeight in 1174, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1174)
  0.125 = coord(1/8)
```
Abstract

The test ran between Nov 95-Aug 96 in Cologne Fachhochschule fur Bibliothekswesen (College of Librarianship).The test basis was a database of 190,000 book titles published between 1990-95. MILOS II mechanized indexing methods proved helpful in avoiding or reducing numbers of unsatisfied/no result retrieval searches. Retrieval from mechanised indexing is 3 times more successful than from title keyword data. MILOS II also used a standardized semantic vocabulary. Mechanised indexing demands high quality software and output data

Fuhr, N.; Niewelt, B.: ¬Ein Retrievaltest mit automatisch indexierten Dokumenten (1984) 0.00

0.003518027 = product of:
  0.028144216 = sum of:
    0.028144216 = product of:
      0.056288432 = sum of:
        0.056288432 = weight(_text_:22 in 262) [ClassicSimilarity], result of:
          0.056288432 = score(doc=262,freq=2.0), product of:
            0.103918076 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029675366 = queryNorm
            0.5416616 = fieldWeight in 262, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=262)
      0.5 = coord(1/2)
  0.125 = coord(1/8)

Date: 20.10.2000 12:22:23

Scherer, B.: Automatische Indexierung und ihre Anwendung im DFG-Projekt "Gemeinsames Portal für Bibliotheken, Archive und Museen (BAM)" (2003) 0.00
```
0.0032620174 = product of:
  0.02609614 = sum of:
    0.02609614 = product of:
      0.05219228 = sum of:
        0.05219228 = weight(_text_:mining in 4283) [ClassicSimilarity], result of:
          0.05219228 = score(doc=4283,freq=2.0), product of:
            0.16744171 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.029675366 = queryNorm
            0.31170416 = fieldWeight in 4283, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4283)
      0.5 = coord(1/2)
  0.125 = coord(1/8)
```
Abstract

Automatische Indexierung verzeichnet schon seit einigen Jahren aufgrund steigender Informationsflut ein wachsendes Interesse. Allerdings gibt es immer noch Vorbehalte gegenüber der intellektuellen Indexierung in Bezug auf Qualität und größerem Aufwand der Systemimplementierung bzw. -pflege. Neuere Entwicklungen aus dem Bereich des Wissensmanagements, wie beispielsweise Verfahren aus der Künstlichen Intelligenz, der Informationsextraktion, dem Text Mining bzw. der automatischen Klassifikation sollen die automatische Indexierung aufwerten und verbessern. Damit soll eine intelligentere und mehr inhaltsbasierte Erschließung geleistet werden. In dieser Masterarbeit wird außerhalb der Darstellung von Grundlagen und Verfahren der automatischen Indexierung sowie neueren Entwicklungen auch Möglichkeiten der Evaluation dargestellt. Die mögliche Anwendung der automatischen Indexierung im DFG-ProjektGemeinsames Portal für Bibliotheken, Archive und Museen (BAM)" bilden den Schwerpunkt der Arbeit. Im Portal steht die bibliothekarische Erschließung von Texten im Vordergrund. In einem umfangreichen Test werden drei deutsche, linguistische Systeme mit statistischen Verfahren kombiniert (die aber teilweise im System bereits integriert ist) und evaluiert, allerdings nur auf der Basis der ausgegebenen Indexate. Abschließend kann festgestellt werden, dass die Ergebnisse und damit die Qualität (bezogen auf die Indexate) von intellektueller und automatischer Indexierung noch signifikant unterschiedlich sind. Die Gründe liegen in noch zu lösenden semantischen Problemen bzw, in der Obereinstimmung mit Worten aus einem Thesaurus, die von einem automatischen Indexierungssystem nicht immer nachvollzogen werden kann. Eine Inhaltsanreicherung mit den Indexaten zum Vorteil beim Retrieval kann, je nach System oder auch über die Einbindung durch einen Thesaurus, erreicht werden.
Toepfer, M.; Seifert, C.: Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints 0.00
```
0.0028975685 = product of:
  0.023180548 = sum of:
    0.023180548 = weight(_text_:data in 4309) [ClassicSimilarity], result of:
      0.023180548 = score(doc=4309,freq=4.0), product of:
        0.093835 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.029675366 = queryNorm
        0.24703519 = fieldWeight in 4309, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4309)
  0.125 = coord(1/8)
```
Abstract

Semantic annotations have to satisfy quality constraints to be useful for digital libraries, which is particularly challenging on large and diverse datasets. Confidence scores of multi-label classification methods typically refer only to the relevance of particular subjects, disregarding indicators of insufficient content representation at the document-level. Therefore, we propose a novel approach that detects documents rather than concepts where quality criteria are met. Our approach uses a deep, multi-layered regression architecture, which comprises a variety of content-based indicators. We evaluated multiple configurations using text collections from law and economics, where the available content is restricted to very short texts. Notably, we demonstrate that the proposed quality estimation technique can determine subsets of the previously unseen data where considerable gains in document-level recall can be achieved, while upholding precision at the same time. Hence, the approach effectively performs a filtering that ensures high data quality standards in operative information retrieval systems.

Hodges, P.R.: Keyword in title indexes : effectiveness of retrieval in computer searches (1983) 0.00

0.0017590135 = product of:
  0.014072108 = sum of:
    0.014072108 = product of:
      0.028144216 = sum of:
        0.028144216 = weight(_text_:22 in 5001) [ClassicSimilarity], result of:
          0.028144216 = score(doc=5001,freq=2.0), product of:
            0.103918076 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029675366 = queryNorm
            0.2708308 = fieldWeight in 5001, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5001)
      0.5 = coord(1/2)
  0.125 = coord(1/8)

Date: 14. 3.1996 13:22:21

Search (5 results, page 1 of 1)

Authors

Years

Languages

Types