Search (93 results, page 1 of 5)

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.04

0.036520787 = product of:
  0.12782274 = sum of:
    0.08301862 = weight(_text_:management in 402) [ClassicSimilarity], result of:
      0.08301862 = score(doc=402,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.5958457 = fieldWeight in 402, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.125 = fieldNorm(doc=402)
    0.04480412 = product of:
      0.08960824 = sum of:
        0.08960824 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.08960824 = score(doc=402,freq=2.0), product of:
            0.14475311 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041336425 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Source: Information processing and management. 22(1986) no.6, S.465-476
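
The score breakdown shown for the Voorhees record (doc 402) can be re-derived by hand. The following is a minimal sketch, not the search engine's own code. It assumes the classic Lucene TF-IDF definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values printed in the tree. The function names are hypothetical, and queryNorm and fieldNorm are copied from the listing rather than computed.

```python
import math

# Values copied from the explain tree for doc 402.
MAX_DOCS = 44218
QUERY_NORM = 0.041336425
FIELD_NORM = 0.125


def idf(doc_freq, max_docs):
    """Inverse document frequency, assuming idf = 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    """Term frequency factor, assuming tf = sqrt(freq)."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term: queryWeight * fieldWeight, as in the tree."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm              # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Term "management": freq=2.0, docFreq=4130.
management = term_score(2.0, 4130, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Term "22": freq=2.0, docFreq=3622, scaled by the inner coord(1/2).
term_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 2)

# Outer coord(2/7): two of the seven query clauses matched.
total = (management + term_22) * (2 / 7)

print(f"management: {management:.8f}")  # tree reports 0.08301862
print(f"22:         {term_22:.8f}")     # tree reports 0.04480412
print(f"total:      {total:.9f}")       # tree reports 0.036520787
```

The results agree with the listing only up to rounding, since the listed figures are printed at lower precision than Python's floats. Keeping only the management term and applying coord(1/7) instead gives the 0.011859803 listed for the Willett record (doc 2604), which has the same fieldNorm.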

Smiraglia, R.P.; Cai, X.: Tracking the evolution of clustering, machine learning, automatic indexing and automatic classification in knowledge organization (2017) 0.03
```
0.03339241 = product of:
  0.11687343 = sum of:
    0.09869386 = weight(_text_:case in 3627) [ClassicSimilarity], result of:
      0.09869386 = score(doc=3627,freq=10.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.54307353 = fieldWeight in 3627, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3627)
    0.018179566 = product of:
      0.03635913 = sum of:
        0.03635913 = weight(_text_:studies in 3627) [ClassicSimilarity], result of:
          0.03635913 = score(doc=3627,freq=2.0), product of:
            0.16494368 = queryWeight, product of:
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.041336425 = queryNorm
            0.22043361 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9902744 = idf(docFreq=2222, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3627)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```
Abstract

A very important extension of the traditional domain of knowledge organization (KO) arises from attempts to incorporate techniques devised in the computer science domain for automatic concept extraction and for grouping, categorizing, clustering and otherwise organizing knowledge using mechanical means. Four specific terms have emerged to identify the most prevalent techniques: machine learning, clustering, automatic indexing, and automatic classification. Our study presents three domain analytical case analyses in search of answers. The first case relies on citations located using the ISKO-supported "Knowledge Organization Bibliography." The second case relies on works in both Web of Science and SCOPUS. Case three applies co-word analysis and citation analysis to the contents of the papers in the present special issue. We observe scholars involved in "clustering" and "automatic classification" who share common thematic emphases. But we have found no coherence, no common activity and no social semantics. We have not found a research front, or a common teleology within the KO domain. We also have found a lively group of authors who have succeeded in submitting papers to this special issue, and their work quite interestingly aligns with the case studies we report. There is an emphasis on KO for information retrieval; there is much work on clustering (which involves conceptual points within texts) and automatic classification (which involves semantic groupings at the meta-document level).

Stankovic, R. et al.: Indexing of textual databases based on lexical resources : a case study for Serbian (2016) 0.03

0.033222016 = product of:
  0.116277054 = sum of:
    0.08827448 = weight(_text_:case in 2759) [ClassicSimilarity], result of:
      0.08827448 = score(doc=2759,freq=2.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.48573974 = fieldWeight in 2759, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.078125 = fieldNorm(doc=2759)
    0.028002575 = product of:
      0.05600515 = sum of:
        0.05600515 = weight(_text_:22 in 2759) [ClassicSimilarity], result of:
          0.05600515 = score(doc=2759,freq=2.0), product of:
            0.14475311 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041336425 = queryNorm
            0.38690117 = fieldWeight in 2759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2759)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Date: 1. 2.2016 18:25:22

Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.03

0.027390588 = product of:
  0.09586705 = sum of:
    0.062263966 = weight(_text_:management in 58) [ClassicSimilarity], result of:
      0.062263966 = score(doc=58,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.44688427 = fieldWeight in 58, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.09375 = fieldNorm(doc=58)
    0.033603087 = product of:
      0.067206174 = sum of:
        0.067206174 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
          0.067206174 = score(doc=58,freq=2.0), product of:
            0.14475311 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041336425 = queryNorm
            0.46428138 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Date: 14. 6.2015 22:12:44
Source: Deutscher Dokumentartag 1985, Nürnberg, 1.-4.10.1985: Fachinformation: Methodik - Management - Markt; neue Entwicklungen, Berufe, Produkte. Bearb.: H. Strohl-Goebel

Flores, F.N.; Moreira, V.P.: Assessing the impact of stemming accuracy on information retrieval : a multilingual perspective (2016) 0.02

0.024027621 = product of:
  0.08409667 = sum of:
    0.031131983 = weight(_text_:management in 3187) [ClassicSimilarity], result of:
      0.031131983 = score(doc=3187,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.22344214 = fieldWeight in 3187, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.046875 = fieldNorm(doc=3187)
    0.052964687 = weight(_text_:case in 3187) [ClassicSimilarity], result of:
      0.052964687 = score(doc=3187,freq=2.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.29144385 = fieldWeight in 3187, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.046875 = fieldNorm(doc=3187)
  0.2857143 = coord(2/7)

Abstract: The quality of stemming algorithms is typically measured in two different ways: (i) how accurately they map the variant forms of a word to the same stem; or (ii) how much improvement they bring to Information Retrieval systems. In this article, we evaluate various stemming algorithms, in four languages, in terms of accuracy and in terms of their aid to Information Retrieval. The aim is to assess whether the most accurate stemmers are also the ones that bring the biggest gain in Information Retrieval. Experiments in English, French, Portuguese, and Spanish show that this is not always the case, as stemmers with higher error rates yield better retrieval quality. As a byproduct, we also identified the most accurate stemmers and the best for Information Retrieval purposes.
Source: Information processing and management. 52(2016) no.5, S.840-854

SIGIR'92 : Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (1992) 0.01
```
0.014016111 = product of:
  0.04905639 = sum of:
    0.018160323 = weight(_text_:management in 6671) [ClassicSimilarity], result of:
      0.018160323 = score(doc=6671,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.13034125 = fieldWeight in 6671, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.02734375 = fieldNorm(doc=6671)
    0.030896068 = weight(_text_:case in 6671) [ClassicSimilarity], result of:
      0.030896068 = score(doc=6671,freq=2.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.17000891 = fieldWeight in 6671, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.02734375 = fieldNorm(doc=6671)
  0.2857143 = coord(2/7)
```
Content

HARMAN, D.: Relevance feedback revisited; AALBERSBERG, I.J.: Incremental relevance feedback; TAGUE-SUTCLIFFE, J.: Measuring the informativeness of a retrieval process; LEWIS, D.D.: An evaluation of phrasal and clustered representations on a text categorization task; BLOSSEVILLE, M.J., G. HÉBRAIL, M.G. MONTEIL u. N. PÉNOT: Automatic document classification: natural language processing, statistical analysis, and expert system techniques used together; MASAND, B., G. LINOFF u. D. WALTZ: Classifying news stories using memory based reasoning; KEEN, E.M.: Term position ranking: some new test results; CROUCH, C.J. u. B. YANG: Experiments in automatic statistical thesaurus construction; GREFENSTETTE, G.: Use of syntactic context to produce term association lists for text retrieval; ANICK, P.G. u. R.A. FLYNN: Versioning of full-text information retrieval system; BURKOWSKI, F.J.: Retrieval activities in a database consisting of heterogeneous collections; DEERWESTER, S.C., K. WACLENA u. M. LaMAR: A textual object management system; NIE, J.-Y.: Towards a probabilistic modal logic for semantic-based information retrieval; WANG, A.W., S.K.M. WONG u. Y.Y. YAO: An analysis of vector space models based on computational geometry; BARTELL, B.T., G.W. COTTRELL u. R.K. BELEW: Latent semantic indexing is an optimal special case of multidimensional scaling; GLAVITSCH, U. u. P. SCHÄUBLE: A system for retrieving speech documents; MARGULIS, E.L.: N-Poisson document modelling; HESS, M.: An incrementally extensible document retrieval system based on linguistics and logical principles; COOPER, W.S., F.C. GEY u. D.P. DABNEY: Probabilistic retrieval based on staged logistic regression; FUHR, N.: Integration of probabilistic fact and text retrieval; CROFT, B., L.A. SMITH u. H. TURTLE: A loosely-coupled integration of a text retrieval system and an object-oriented database system; DUMAIS, S.T. u. J. NIELSEN: Automating the assignment of submitted manuscripts to reviewers; GOST, M.A. u. M. MASOTTI: Design of an OPAC database to permit different subject searching accesses; ROBERTSON, A.M. u. P. WILLETT: Searching for historical word forms in a database of 17th century English text using spelling correction methods; FOX, E.A., Q.F. CHEN u. L.S. HEATH: A faster algorithm for constructing minimal perfect hash functions; MOFFAT, A. u. J. ZOBEL: Parameterised compression for sparse bitmaps; GRANDI, F., P. TIBERIO u. P. ZEZULA: Frame-sliced partitioned parallel signature files; ALLEN, B.: Cognitive differences in end user searching of a CD-ROM index; SONNENWALD, D.H.: Developing a theory to guide the process of designing information retrieval systems; CUTTING, D.R., J.O. PEDERSEN, D. KARGER u. J.W. TUKEY: Scatter/Gather: a cluster-based approach to browsing large document collections; CHALMERS, M. u. P. CHITSON: Bead: Explorations in information visualization; WILLIAMSON, C. u. B. SHNEIDERMAN: The dynamic HomeFinder: evaluating dynamic queries in a real-estate information exploring system

Willett, P.: Recent trends in hierarchic document clustering : a critical review (1988) 0.01

0.011859803 = product of:
  0.08301862 = sum of:
    0.08301862 = weight(_text_:management in 2604) [ClassicSimilarity], result of:
      0.08301862 = score(doc=2604,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.5958457 = fieldWeight in 2604, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.125 = fieldNorm(doc=2604)
  0.14285715 = coord(1/7)

Source: Information processing and management. 24(1988) no.5, S.577-597

Banerjee, K.; Johnson, M.: Improving access to archival collections with automated entity extraction (2015) 0.01
```
0.010700483 = product of:
  0.07490338 = sum of:
    0.07490338 = weight(_text_:case in 2144) [ClassicSimilarity], result of:
      0.07490338 = score(doc=2144,freq=4.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.41216385 = fieldWeight in 2144, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.046875 = fieldNorm(doc=2144)
  0.14285715 = coord(1/7)
```
Abstract

The complexity and diversity of archival resources make constructing rich metadata records time-consuming and expensive, which in turn limits access to these valuable materials. However, significant automation of the metadata creation process would dramatically reduce the cost of providing access points, improve access to individual resources, and establish connections between resources that would otherwise remain unknown. Using a case study at Oregon Health & Science University as a lens to examine the conceptual and technical challenges associated with automated extraction of access points, we discuss using publicly accessible APIs to extract entities (i.e. people, places, concepts, etc.) from digital and digitized objects. We describe why Linked Open Data is not well suited for a use case such as ours. We conclude with recommendations about how this method can be used in archives as well as for other library applications.

Thiel, T.J.: Automated indexing of information stored on optical disk electronic document image management systems (1994) 0.01

0.010377328 = product of:
  0.07264129 = sum of:
    0.07264129 = weight(_text_:management in 1260) [ClassicSimilarity], result of:
      0.07264129 = score(doc=1260,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.521365 = fieldWeight in 1260, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.109375 = fieldNorm(doc=1260)
  0.14285715 = coord(1/7)

Nohr, H.: Grundlagen der automatischen Indexierung : ein Lehrbuch (2003) 0.01
```
0.009130197 = product of:
  0.031955685 = sum of:
    0.020754656 = weight(_text_:management in 1767) [ClassicSimilarity], result of:
      0.020754656 = score(doc=1767,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.14896142 = fieldWeight in 1767, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.03125 = fieldNorm(doc=1767)
    0.01120103 = product of:
      0.02240206 = sum of:
        0.02240206 = weight(_text_:22 in 1767) [ClassicSimilarity], result of:
          0.02240206 = score(doc=1767,freq=2.0), product of:
            0.14475311 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041336425 = queryNorm
            0.15476047 = fieldWeight in 1767, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1767)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
```
Date

22. 6.2009 12:46:51

Footnote

Reviewed in: nfd 54(2003) H.5, S.314 (W. Ratzek): "In order to extract decision-relevant data from the ever-growing flood of more or less relevant documents, companies, public administrations, and specialist information services must develop, deploy, and maintain effective and efficient filtering systems. This textbook by Holger Nohr offers, for the first time, a basic introduction to the topic of 'automatic indexing'. For, as the introduction puts it: 'How you gather, manage, and use information will determine whether you win or lose' (Bill Gates). The first chapter, 'Introduction', focuses on the fundamentals. It describes the relationships between document management systems, information retrieval, and indexing for planning, decision-making, or innovation processes, in both profit and non-profit organizations. At the end of the introductory chapter, Nohr turns to the debate on intellectual versus automatic indexing and thereby leads into the second chapter, 'Automatic indexing'. Here the author gives an overview of, among other things, problems of automatic language processing and indexing, and various methods of automatic indexing, e.g. simple keyword extraction / full-text inversion, statistical methods, and pattern-matching methods. Nohr then treats the 'methods of automatic indexing' in depth and with many examples in the third and most extensive chapter. The fourth chapter, 'Keyphrase Extraction', takes on a passe-partout status: 'An intermediate stage on the way from automatic indexing to the automatic generation of textual summaries (Automatic Text Summarization) is represented by approaches that extract key phrases from documents (Keyphrase Extraction). The boundaries between the automatic methods of indexing and those of text summarization are fluid.' (p. 91). Nohr describes how this works using the example of NCR's Extractor/Copernic Summarizer.

Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.01

0.009130197 = product of:
  0.031955685 = sum of:
    0.020754656 = weight(_text_:management in 5499) [ClassicSimilarity], result of:
      0.020754656 = score(doc=5499,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.14896142 = fieldWeight in 5499, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.03125 = fieldNorm(doc=5499)
    0.01120103 = product of:
      0.02240206 = sum of:
        0.02240206 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
          0.02240206 = score(doc=5499,freq=2.0), product of:
            0.14475311 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041336425 = queryNorm
            0.15476047 = fieldWeight in 5499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=5499)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)

Date: 20. 1.2015 18:30:22
Source: Aslib journal of information management. 71(2019) no.3, S.415-439

Anderson, J.D.; Pérez-Carballo, J.: The nature of indexing: how humans and machines analyze messages and texts for retrieval : Part I: Research and the nature of human indexing (2001) 0.01

0.008894852 = product of:
  0.062263966 = sum of:
    0.062263966 = weight(_text_:management in 3136) [ClassicSimilarity], result of:
      0.062263966 = score(doc=3136,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.44688427 = fieldWeight in 3136, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.09375 = fieldNorm(doc=3136)
  0.14285715 = coord(1/7)

Source: Information processing and management. 37(2001) no.2, S.231-254

Wolfe, EW.: A case study in automated metadata enhancement : Natural Language Processing in the humanities (2019) 0.01

0.008827448 = product of:
  0.061792135 = sum of:
    0.061792135 = weight(_text_:case in 5236) [ClassicSimilarity], result of:
      0.061792135 = score(doc=5236,freq=2.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.34001783 = fieldWeight in 5236, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5236)
  0.14285715 = coord(1/7)

Advances in intelligent retrieval: Proc. of a conference ... Wadham College, Oxford, 16.-17.4.1985 (1986) 0.01
```
0.007566384 = product of:
  0.052964687 = sum of:
    0.052964687 = weight(_text_:case in 1384) [ClassicSimilarity], result of:
      0.052964687 = score(doc=1384,freq=2.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.29144385 = fieldWeight in 1384, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.046875 = fieldNorm(doc=1384)
  0.14285715 = coord(1/7)
```
Content

Contains the contributions: ADDIS, T.: Extended relational analysis: a design approach to knowledge-based systems; PARKINSON, D.: Supercomputers and non-numeric processing; McGREGOR, D.R. u. J.R. MALONE: An architectural approach to advances in information retrieval; ALLEN, M.J. u. O.S. HARRISON: Word processing and information retrieval: some practical problems; MURTAGH, F.: Clustering and nearest neighborhood searching; ENSER, P.G.B.: Experimenting with the automatic classification of books; TESKEY, N. u. Z. RAZAK: An analysis of ranking for free text retrieval systems; ZARRI, G.P.: Interactive information retrieval: an artificial intelligence approach to deal with biographical data; HANCOX, P. u. F. SMITH: A case system processor for the PRECIS indexing language; ROUAULT, J.: Linguistic methods in information retrieval systems; ARAGON-RAMIREZ, V. u. C.D. PAICE: Design of a system for the online elucidation of natural language search statements; BROOKS, H.M., P.J. DANIELS u. N.J. BELKIN: Problem descriptions and user models: developing an intelligent interface for document retrieval systems; BLACK, W.J., P. HARGREAVES u. P.B. MAYES: HEADS: a cataloguing advisory system; BELL, D.A.: An architecture for integrating data, knowledge, and information bases

Liu, G.Z.: Semantic vector space model : implementation and evaluation (1997) 0.01
```
0.007566384 = product of:
  0.052964687 = sum of:
    0.052964687 = weight(_text_:case in 161) [ClassicSimilarity], result of:
      0.052964687 = score(doc=161,freq=2.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.29144385 = fieldWeight in 161, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.046875 = fieldNorm(doc=161)
  0.14285715 = coord(1/7)
```
Abstract

Presents the Semantic Vector Space Model (SVSM), a text representation and searching technique based on the combination of the Vector Space Model (VSM) with heuristic syntax parsing and distributed representation of semantic case structures. Both documents and queries are represented as semantic matrices. A search mechanism is designed to compute the similarity between two semantic matrices to predict relevancy. A prototype system was built to implement this model by modifying the SMART system and using the Xerox Part of Speech tagger as the pre-processor for indexing. The prototype system was used in an experimental study to evaluate this technique in terms of precision, recall, and effectiveness of relevance ranking. Results show that if documents and queries were too short, the technique was less effective than VSM. But with longer documents and queries, especially when original documents were used as queries, the system based on this technique was found to perform better than SMART.

Hüther, H.: Selix im DFG-Projekt Kascade (1998) 0.01

0.007412377 = product of:
  0.051886637 = sum of:
    0.051886637 = weight(_text_:management in 5151) [ClassicSimilarity], result of:
      0.051886637 = score(doc=5151,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.37240356 = fieldWeight in 5151, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.078125 = fieldNorm(doc=5151)
  0.14285715 = coord(1/7)

Source: Knowledge Management und Kommunikationssysteme: Proceedings des 6. Internationalen Symposiums für Informationswissenschaft (ISI '98) Prag, 3.-7. November 1998 / Hochschulverband für Informationswissenschaft (HI) e.V. Konstanz ; Fachrichtung Informationswissenschaft der Universität des Saarlandes, Saarbrücken. Hrsg.: Harald H. Zimmermann u. Volker Schramm

Anderson, J.D.; Pérez-Carballo, J.: The nature of indexing: how humans and machines analyze messages and texts for retrieval : Part II: Machine indexing, and the allocation of human versus machine effort (2001) 0.01

0.007412377 = product of:
  0.051886637 = sum of:
    0.051886637 = weight(_text_:management in 368) [ClassicSimilarity], result of:
      0.051886637 = score(doc=368,freq=2.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.37240356 = fieldWeight in 368, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.078125 = fieldNorm(doc=368)
  0.14285715 = coord(1/7)

Source: Information processing and management. 37(2001) no.2, S.255-277

Milstead, J.L.: Methodologies for subject analysis in bibliographic databases (1992) 0.01
```
0.0073378794 = product of:
  0.051365152 = sum of:
    0.051365152 = weight(_text_:management in 2311) [ClassicSimilarity], result of:
      0.051365152 = score(doc=2311,freq=4.0), product of:
        0.13932906 = queryWeight, product of:
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.041336425 = queryNorm
        0.36866072 = fieldWeight in 2311, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.3706124 = idf(docFreq=4130, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2311)
  0.14285715 = coord(1/7)
```
Abstract

The goal of the study was to determine the state of the art of subject analysis as applied to large bibliographic data bases. The intent was to gather and evaluate information, casting it in a form that could be applied by management. There was no attempt to determine actual costs or trade-offs among costs and possible benefits. Commercial automatic indexing packages were also reviewed. The overall conclusion was that data base producers should begin working seriously on upgrading their thesauri and codifying their indexing policies as a means of moving toward development of machine aids to indexing, but that fully automatic indexing is not yet ready for wholesale implementation.

Source

Information processing and management. 28(1992) no.3, S.407-431

Vilares, D.; Alonso, M.A.; Gómez-Rodríguez, C.: On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages (2015) 0.01
```
0.00630532 = product of:
  0.04413724 = sum of:
    0.04413724 = weight(_text_:case in 2161) [ClassicSimilarity], result of:
      0.04413724 = score(doc=2161,freq=2.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.24286987 = fieldWeight in 2161, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2161)
  0.14285715 = coord(1/7)
```
Abstract

Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, or their support of a social event. In this context, polarity classification is a subfield of sentiment analysis focused on determining whether the content of a text is objective or subjective, and in the latter case, if it conveys a positive or a negative opinion. Most polarity detection techniques tend to take into account individual terms in the text and even some degree of linguistic knowledge, but they do not usually consider syntactic relations between words. This article explores how relating lexical, syntactic, and psychometric information can be helpful to perform polarity classification on Spanish tweets. We provide an evaluation for both shallow and deep linguistic perspectives. Empirical results show an improved performance of syntactic approaches over pure lexical models when using large training sets to create a classifier, but this tendency is reversed when small training collections are used.

Husevag, A.-S.R.: Named entities in indexing : a case study of TV subtitles and metadata records (2016) 0.01

0.00630532 = product of:
  0.04413724 = sum of:
    0.04413724 = weight(_text_:case in 3105) [ClassicSimilarity], result of:
      0.04413724 = score(doc=3105,freq=2.0), product of:
        0.18173204 = queryWeight, product of:
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.041336425 = queryNorm
        0.24286987 = fieldWeight in 3105, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.3964143 = idf(docFreq=1480, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3105)
  0.14285715 = coord(1/7)
