Search (99 results, page 1 of 5)

  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.10336239 = sum of:
      0.08230056 = product of:
        0.24690168 = sum of:
          0.24690168 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24690168 = score(doc=562,freq=2.0), product of:
              0.43931273 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.051817898 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.021061828 = product of:
        0.042123657 = sum of:
          0.042123657 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.042123657 = score(doc=562,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
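The nested values above are Lucene's "explain" output for ClassicSimilarity (classic TF-IDF) scoring. As a minimal sketch, assuming Lucene's standard formulas — tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), fieldWeight = tf · idf · fieldNorm, queryWeight = idf · queryNorm — the leaf weight for the `_text_:3a` clause in result 1 can be reconstructed from the factors printed in the tree:

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Lucene ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    # tf = sqrt(term frequency within the field)
    return math.sqrt(freq)

def leaf_weight(freq, doc_freq, max_docs, field_norm, query_norm):
    idf = classic_idf(doc_freq, max_docs)
    field_weight = classic_tf(freq) * idf * field_norm  # tf * idf * fieldNorm
    query_weight = idf * query_norm                     # idf * queryNorm
    return query_weight * field_weight                  # score(doc, term)

# Reproduce the weight(_text_:3a in 562) leaf from result 1:
w = leaf_weight(freq=2.0, doc_freq=24, max_docs=44218,
                field_norm=0.046875, query_norm=0.051817898)
print(round(w, 6))  # ~0.246902
```

The reconstructed value matches the 0.24690168 shown in the explain tree to five decimal places, which confirms the tree really is classic TF-IDF with the printed tf, idf, and norm factors.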
  2. Schwarz, C.: THESYS: Thesaurus Syntax System : a fully automatic thesaurus building aid (1988) 0.05
    0.053932853 = product of:
      0.107865706 = sum of:
        0.107865706 = sum of:
          0.05872144 = weight(_text_:indexing in 1361) [ClassicSimilarity], result of:
            0.05872144 = score(doc=1361,freq=2.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.29604656 = fieldWeight in 1361, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1361)
          0.049144268 = weight(_text_:22 in 1361) [ClassicSimilarity], result of:
            0.049144268 = score(doc=1361,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.2708308 = fieldWeight in 1361, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1361)
      0.5 = coord(1/2)
    
    Abstract
    THESYS is based on natural language processing of free-text databases. It yields statistically evaluated correlations between words in the database, and these correlations correspond to traditional thesaurus relations, so the person building a thesaurus is assisted by the proposals THESYS makes. THESYS is being tested on commercial databases under real-world conditions. It is part of a text-processing project at Siemens called TINA (Text-Inhalts-Analyse, i.e. text content analysis). Software from TINA is currently being applied and evaluated by the US Department of Commerce for patent search and indexing (REALIST: REtrieval Aids by Linguistics and STatistics)
    Date
    6. 1.1999 10:22:07
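The outer nodes of these explain trees combine the per-term leaf weights: matching clause scores are summed, then multiplied by a coord factor (matched clauses / total clauses). In result 2, the `indexing` and `22` weights are summed and scaled by coord(1/2) = 0.5. A small sketch of that combination step:

```python
def coord(overlap: int, max_overlap: int) -> float:
    # ClassicSimilarity coord factor: fraction of query clauses that matched
    return overlap / max_overlap

# Result 2: two leaf term weights, combined under coord(1/2)
leaf_weights = [0.05872144, 0.049144268]
score = coord(1, 2) * sum(leaf_weights)
print(round(score, 9))  # ~0.053932854
```

This reproduces the 0.053932853 top-level score of result 2 up to float rounding.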
  3. Rahmstorf, G.: Concept structures for large vocabularies (1998) 0.05
    0.046228163 = product of:
      0.092456326 = sum of:
        0.092456326 = sum of:
          0.050332665 = weight(_text_:indexing in 75) [ClassicSimilarity], result of:
            0.050332665 = score(doc=75,freq=2.0), product of:
              0.19835205 = queryWeight, product of:
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.051817898 = queryNorm
              0.2537542 = fieldWeight in 75, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8278677 = idf(docFreq=2614, maxDocs=44218)
                0.046875 = fieldNorm(doc=75)
          0.042123657 = weight(_text_:22 in 75) [ClassicSimilarity], result of:
            0.042123657 = score(doc=75,freq=2.0), product of:
              0.18145745 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051817898 = queryNorm
              0.23214069 = fieldWeight in 75, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=75)
      0.5 = coord(1/2)
    
    Abstract
    A technology is described which supports the acquisition, visualisation and manipulation of large vocabularies with associated structures. It is used for dictionary production, terminology databases, thesauri, library classification systems, etc. Essential features of the technology are a lexicographic user interface, variable word description, an unlimited list of word readings, a concept language, automatic transformation of formulas into graphic structures, structure manipulation operations and retransformation into formulas. The concept language includes notations for undefined concepts. The structure of defined concepts can be constructed interactively. The technology supports the generation of large vocabularies with structures representing word senses. Concept structures and ordering systems for indexing and retrieval can be constructed separately and connected by associating relations.
    Date
    30.12.2001 19:01:22
  4. Noever, D.; Ciolino, M.: The Turing deception (2022) 0.04
    0.04115028 = product of:
      0.08230056 = sum of:
        0.08230056 = product of:
          0.24690168 = sum of:
            0.24690168 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.24690168 = score(doc=862,freq=2.0), product of:
                0.43931273 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.051817898 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    https://arxiv.org/abs/2212.06721
  5. Lustig, G.: Das Projekt WAI : Wörterbuchentwicklung für automatisches Indexing (1982) 0.03
    0.02936072 = product of:
      0.05872144 = sum of:
        0.05872144 = product of:
          0.11744288 = sum of:
            0.11744288 = weight(_text_:indexing in 33) [ClassicSimilarity], result of:
              0.11744288 = score(doc=33,freq=2.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5920931 = fieldWeight in 33, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.109375 = fieldNorm(doc=33)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  6. Warner, A.J.: The role of linguistic analysis in full-text retrieval (1994) 0.03
    0.02936072 = product of:
      0.05872144 = sum of:
        0.05872144 = product of:
          0.11744288 = sum of:
            0.11744288 = weight(_text_:indexing in 2992) [ClassicSimilarity], result of:
              0.11744288 = score(doc=2992,freq=2.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5920931 = fieldWeight in 2992, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2992)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Challenges in indexing electronic text and images. Ed.: R. Fidel et al.
  7. Chou, C.; Chu, T.: ¬An analysis of BERT (NLP) for assisted subject indexing for Project Gutenberg (2022) 0.03
    0.02936072 = product of:
      0.05872144 = sum of:
        0.05872144 = product of:
          0.11744288 = sum of:
            0.11744288 = weight(_text_:indexing in 1139) [ClassicSimilarity], result of:
              0.11744288 = score(doc=1139,freq=8.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5920931 = fieldWeight in 1139, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1139)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In light of AI (Artificial Intelligence) and NLP (Natural Language Processing) technologies, this article examines the feasibility of using AI/NLP models to enhance the subject indexing of digital resources. While BERT (Bidirectional Encoder Representations from Transformers) models are widely used in scholarly communities, the authors assess whether BERT models can be used for machine-assisted indexing of the Project Gutenberg collection by suggesting Library of Congress subject headings filtered by certain Library of Congress Classification subclass labels. The findings of this study are informative for further research on BERT models to assist with automatic subject indexing for digital library collections.
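The study above ranks candidate LC subject headings (pre-filtered by LCC subclass) against a document using BERT. As a transparent stand-in for the embedding step — this is not the authors' method, and the description and candidate headings below are invented — the shape of that candidate-ranking pipeline can be sketched with a simple bag-of-words cosine similarity:

```python
from collections import Counter
import math

def bow(text: str) -> Counter:
    # Bag-of-words vector; a BERT embedding would replace this step
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def suggest_headings(description, candidate_headings, top_k=3):
    """Rank candidate subject headings by similarity to the description."""
    d = bow(description)
    ranked = sorted(candidate_headings,
                    key=lambda h: cosine(d, bow(h)), reverse=True)
    return ranked[:top_k]

# Hypothetical candidates, standing in for subclass-filtered LCSH terms
candidates = ["Whaling -- Fiction", "Sea stories", "Gardening -- Periodicals"]
print(suggest_headings("a sea voyage hunting a white whale, a sea story",
                       candidates, top_k=2))
```

Swapping `bow`/`cosine` for dense BERT sentence embeddings and their cosine similarity gives the machine-assisted setup the article evaluates.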
  8. Garfield, E.: The relationship between mechanical indexing, structural linguistics and information retrieval (1992) 0.03
    0.029059576 = product of:
      0.05811915 = sum of:
        0.05811915 = product of:
          0.1162383 = sum of:
            0.1162383 = weight(_text_:indexing in 3632) [ClassicSimilarity], result of:
              0.1162383 = score(doc=3632,freq=6.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5860202 = fieldWeight in 3632, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3632)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    It is possible to locate over 60% of indexing terms used in the Current List of Medical Literature by analysing the titles of the articles. Citation indexes contain 'noise' and lack many pertinent citations. Mechanical indexing or analysis of text must begin with some linguistic technique. Discusses Harris' methods of structural linguistics, discourse analysis and transformational analysis. Provides 3 examples with references, abstracts and index entries
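Garfield's 60% figure is a coverage measurement: for each article, how many of its assigned indexing terms can be found in its title. A minimal sketch of that measurement, using an invented example record (the title and term list are not from the Current List of Medical Literature):

```python
def title_term_coverage(assigned_terms, title):
    """Fraction of assigned indexing terms that appear as words in the title."""
    title_words = set(title.lower().split())
    hits = [t for t in assigned_terms if t.lower() in title_words]
    return len(hits) / len(assigned_terms)

# Invented example record
title = "Structural linguistics and mechanical indexing of medical literature"
terms = ["linguistics", "indexing", "radiology"]
print(title_term_coverage(terms, title))  # 2 of 3 terms found
```

Averaging this ratio over a collection yields the kind of statistic Garfield reports; a realistic version would also stem words and match multi-word terms.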
  9. Warner, A.J.: Natural language processing (1987) 0.03
    0.028082438 = product of:
      0.056164876 = sum of:
        0.056164876 = product of:
          0.11232975 = sum of:
            0.11232975 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
              0.11232975 = score(doc=337,freq=2.0), product of:
                0.18145745 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051817898 = queryNorm
                0.61904186 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  10. Smeaton, A.F.: Progress in the application of natural language processing to information retrieval tasks (1992) 0.03
    0.025166333 = product of:
      0.050332665 = sum of:
        0.050332665 = product of:
          0.10066533 = sum of:
            0.10066533 = weight(_text_:indexing in 7080) [ClassicSimilarity], result of:
              0.10066533 = score(doc=7080,freq=2.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5075084 = fieldWeight in 7080, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.09375 = fieldNorm(doc=7080)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Account of recent developments in automatic and semi-automatic text indexing as well as in the generation of thesauri, text retrieval, abstracting and summarization
  11. Hagn-Meincke, L.L.: Sprogspil pa tvaers : sprogfilosofiske teoriers betydning for indeksering og emnesogning (1999) 0.03
    0.025166333 = product of:
      0.050332665 = sum of:
        0.050332665 = product of:
          0.10066533 = sum of:
            0.10066533 = weight(_text_:indexing in 4643) [ClassicSimilarity], result of:
              0.10066533 = score(doc=4643,freq=2.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5075084 = fieldWeight in 4643, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4643)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    Übers. d. Titels: Language-game interferences: the importance of linguistic theories for indexing and subject searching
  12. Godby, C.J.; Reighart, R.R.: The WordSmith Indexing System (2001) 0.03
    0.025166333 = product of:
      0.050332665 = sum of:
        0.050332665 = product of:
          0.10066533 = sum of:
            0.10066533 = weight(_text_:indexing in 1063) [ClassicSimilarity], result of:
              0.10066533 = score(doc=1063,freq=2.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5075084 = fieldWeight in 1063, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1063)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  13. Wright, L.W.; Nardini, H.K.G.; Aronson, A.R.; Rindflesch, T.C.: Hierarchical concept indexing of full-text documents in the Unified Medical Language System Information sources Map (1999) 0.03
    0.025166333 = product of:
      0.050332665 = sum of:
        0.050332665 = product of:
          0.10066533 = sum of:
            0.10066533 = weight(_text_:indexing in 2111) [ClassicSimilarity], result of:
              0.10066533 = score(doc=2111,freq=8.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5075084 = fieldWeight in 2111, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2111)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Full-text documents are a vital and rapidly growing part of online biomedical information. A single large document can contain as much information as a small database, but normally lacks the tight structure and consistent indexing of a database. Retrieval systems will often miss highly relevant parts of a document if the document as a whole appears irrelevant. Access to full-text information is further complicated by the need to search separately many disparate information resources. This research explores how these problems can be addressed by the combined use of 2 techniques: 1) natural language processing for automatic concept-based indexing of full text, and 2) methods for exploiting the structure and hierarchy of full-text documents. We describe methods for applying these techniques to a large collection of full-text documents drawn from the Health Services / Technology Assessment Text (HSTAT) database at the NLM and examine how this hierarchical concept indexing can assist both document- and source-level retrieval in the context of NLM's Information Source Map project
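The second technique above — exploiting document structure so retrieval can land on the relevant section rather than rejecting the whole document — amounts to indexing concepts at nodes of a section tree and returning paths to matching nodes. A minimal sketch, with an invented miniature hierarchy (not drawn from HSTAT):

```python
from dataclasses import dataclass, field

@dataclass
class Section:
    title: str
    concepts: set
    children: list = field(default_factory=list)

def find_sections(node, concept, path=()):
    """Return the paths of all sections indexed with the given concept."""
    path = path + (node.title,)
    hits = [path] if concept in node.concepts else []
    for child in node.children:
        hits += find_sections(child, concept, path)
    return hits

# Invented miniature document hierarchy
doc = Section("Guideline", {"hypertension"}, [
    Section("Diagnosis", {"blood pressure measurement"}),
    Section("Treatment", {"antihypertensive agents"}, [
        Section("Drug therapy", {"diuretics"}),
    ]),
])
print(find_sections(doc, "diuretics"))
# [('Guideline', 'Treatment', 'Drug therapy')]
```

A query for a narrow concept thus retrieves a deep section even when the document as a whole looks irrelevant, which is exactly the failure mode the abstract describes.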
  14. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.02
    0.024572134 = product of:
      0.049144268 = sum of:
        0.049144268 = product of:
          0.098288536 = sum of:
            0.098288536 = weight(_text_:22 in 3164) [ClassicSimilarity], result of:
              0.098288536 = score(doc=3164,freq=2.0), product of:
                0.18145745 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5416616 = fieldWeight in 3164, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3164)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Computational linguistics. 22(1996) no.2, S.217-248
  15. Ruge, G.: A spreading activation network for automatic generation of thesaurus relationships (1991) 0.02
    0.024572134 = product of:
      0.049144268 = sum of:
        0.049144268 = product of:
          0.098288536 = sum of:
            0.098288536 = weight(_text_:22 in 4506) [ClassicSimilarity], result of:
              0.098288536 = score(doc=4506,freq=2.0), product of:
                0.18145745 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5416616 = fieldWeight in 4506, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4506)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    8.10.2000 11:52:22
  16. Somers, H.: Example-based machine translation : Review article (1999) 0.02
    0.024572134 = product of:
      0.049144268 = sum of:
        0.049144268 = product of:
          0.098288536 = sum of:
            0.098288536 = weight(_text_:22 in 6672) [ClassicSimilarity], result of:
              0.098288536 = score(doc=6672,freq=2.0), product of:
                0.18145745 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5416616 = fieldWeight in 6672, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6672)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  17. New tools for human translators (1997) 0.02
    0.024572134 = product of:
      0.049144268 = sum of:
        0.049144268 = product of:
          0.098288536 = sum of:
            0.098288536 = weight(_text_:22 in 1179) [ClassicSimilarity], result of:
              0.098288536 = score(doc=1179,freq=2.0), product of:
                0.18145745 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5416616 = fieldWeight in 1179, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1179)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  18. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.02
    0.024572134 = product of:
      0.049144268 = sum of:
        0.049144268 = product of:
          0.098288536 = sum of:
            0.098288536 = weight(_text_:22 in 3117) [ClassicSimilarity], result of:
              0.098288536 = score(doc=3117,freq=2.0), product of:
                0.18145745 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5416616 = fieldWeight in 3117, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3117)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    28. 2.1999 10:48:22
  19. Der Student aus dem Computer (2023) 0.02
    0.024572134 = product of:
      0.049144268 = sum of:
        0.049144268 = product of:
          0.098288536 = sum of:
            0.098288536 = weight(_text_:22 in 1079) [ClassicSimilarity], result of:
              0.098288536 = score(doc=1079,freq=2.0), product of:
                0.18145745 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051817898 = queryNorm
                0.5416616 = fieldWeight in 1079, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1079)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    27. 1.2023 16:22:55
  20. Fox, C.: Lexical analysis and stoplists (1992) 0.02
    0.023727044 = product of:
      0.04745409 = sum of:
        0.04745409 = product of:
          0.09490818 = sum of:
            0.09490818 = weight(_text_:indexing in 3502) [ClassicSimilarity], result of:
              0.09490818 = score(doc=3502,freq=4.0), product of:
                0.19835205 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.051817898 = queryNorm
                0.47848347 = fieldWeight in 3502, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3502)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Lexical analysis is a fundamental operation in both query processing and automatic indexing, and filtering stoplist words is an important step in the automatic indexing process. Presents basic algorithms and data structures for lexical analysis, and shows how stoplist word removal can be efficiently incorporated into lexical analysis
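The pattern Fox describes — folding stoplist removal into the tokenization pass, with the stoplist held in a structure that supports constant-time lookup — can be sketched as follows (the stoplist contents here are an invented minimal example, not Fox's list):

```python
import re

# A tiny illustrative stoplist; real lists run to hundreds of words
STOPLIST = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})

def tokens(text):
    """Lexical analysis: lowercase word tokens, stoplist filtered inline."""
    for match in re.finditer(r"[a-z0-9]+", text.lower()):
        word = match.group()
        if word not in STOPLIST:  # hash-set membership test is O(1)
            yield word

print(list(tokens("Lexical analysis is a fundamental operation in indexing")))
# ['lexical', 'analysis', 'fundamental', 'operation', 'indexing']
```

Filtering during tokenization avoids a second pass over the token stream, which is the efficiency point of the chapter.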

Languages

  • e 75
  • d 21
  • da 1
  • f 1
  • m 1

Types

  • a 80
  • m 9
  • el 7
  • s 4
  • p 2
  • x 2
  • b 1
  • d 1

Classifications