Search (56 results, page 3 of 3)

  • language_ss:"e"
  • theme_ss:"Automatisches Indexieren"
  1. Liu, G.Z.: Semantic vector space model : implementation and evaluation (1997) 0.01
    0.0073736827 = product of:
      0.014747365 = sum of:
        0.014747365 = product of:
          0.02949473 = sum of:
            0.02949473 = weight(_text_:5 in 161) [ClassicSimilarity], result of:
              0.02949473 = score(doc=161,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.19344449 = fieldWeight in 161, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.046875 = fieldNorm(doc=161)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of the American Society for Information Science. 48(1997) no.5, S.395-417
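    Note: the indented numeric breakdown under each hit is Lucene "explain" output for the ClassicSimilarity model: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), the term score is queryWeight x fieldWeight, and the coord(1/2) factors scale the result. The following minimal Python sketch (illustrative only, not Lucene itself; all names are invented) reproduces the numbers shown for result 1.

      import math

      def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
          # ClassicSimilarity building blocks, as displayed in the explain tree.
          idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))   # ~2.9180994
          tf = math.sqrt(freq)                                 # ~1.4142135 for freq=2.0
          query_weight = idf * query_norm                      # ~0.15247129
          field_weight = tf * idf * field_norm                 # ~0.19344449
          score = query_weight * field_weight                  # ~0.02949473
          for c in coords:                                     # two coord(1/2) factors
              score *= c
          return score

      # Values from result 1 (doc=161), term "_text_:5"
      print(explain_score(freq=2.0, doc_freq=6494, max_docs=44218,
                          query_norm=0.052250203, field_norm=0.046875))

    The printed value (~0.0073737) matches the displayed document score of 0.01 after rounding to two decimals.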
  2. Goller, C.; Löning, J.; Will, T.; Wolff, W.: Automatic document classification : a thorough evaluation of various methods (2000) 0.01
    0.0073736827 = product of:
      0.014747365 = sum of:
        0.014747365 = product of:
          0.02949473 = sum of:
            0.02949473 = weight(_text_:5 in 5480) [ClassicSimilarity], result of:
              0.02949473 = score(doc=5480,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.19344449 = fieldWeight in 5480, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5480)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    (Automatic) document classification is generally defined as content-based assignment of one or more predefined categories to documents. Usually, machine learning, statistical pattern recognition, or neural network approaches are used to construct classifiers automatically. In this paper we thoroughly evaluate a wide variety of these methods on a document classification task for German text. We evaluate different feature construction and selection methods and various classifiers. Our main results are: (1) feature selection is necessary not only to reduce learning and classification time, but also to avoid overfitting (even for Support Vector Machines); (2) surprisingly, our morphological analysis does not improve classification quality compared to a letter 5-gram approach; (3) Support Vector Machines are significantly better than all other classification methods
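    As an illustration of the letter 5-gram baseline the abstract refers to, the sketch below builds character 5-gram features and trains a linear SVM with scikit-learn. It is a hypothetical reconstruction under assumed tools, not the authors' pipeline; the toy documents and labels are invented for the example.

      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.pipeline import make_pipeline
      from sklearn.svm import LinearSVC

      # Toy German documents and labels, invented for illustration only.
      docs = ["Bericht ueber automatische Indexierung von Dokumenten",
              "Einfuehrung in neuronale Netze und Mustererkennung"]
      labels = ["indexing", "neural"]

      # Character 5-grams stand in for morphological analysis (finding 2);
      # a linear SVM is the classifier singled out in finding (3).
      model = make_pipeline(
          CountVectorizer(analyzer="char_wb", ngram_range=(5, 5)),
          LinearSVC(),
      )
      model.fit(docs, labels)
      print(model.predict(["Neue Verfahren der automatischen Indexierung"]))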
  3. Buckley, C.; Allan, J.; Salton, G.: Automatic routing and retrieval using Smart : TREC-2 (1995) 0.01
    0.0073736827 = product of:
      0.014747365 = sum of:
        0.014747365 = product of:
          0.02949473 = sum of:
            0.02949473 = weight(_text_:5 in 5699) [ClassicSimilarity], result of:
              0.02949473 = score(doc=5699,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.19344449 = fieldWeight in 5699, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5699)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The Smart information retrieval project emphasizes completely automatic approaches to the understanding and retrieval of large quantities of text. The work in the TREC-2 environment continues, performing both routing and ad hoc experiments. The ad hoc work extends investigations into combining global similarities, which give an overall indication of how a document matches a query, with local similarities that identify a smaller part of the document matching the query. The performance of ad hoc runs is good, but it is clear that full advantage is not yet being taken of the available local information. The routing experiments use conventional relevance feedback approaches to routing, but with a much greater degree of query expansion than was previously done. The length of a query vector is increased by a factor of 5 to 10 by adding terms found in previously seen relevant documents. This approach improves effectiveness by 30-40% over the original query
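    A rough sketch of the kind of relevance-feedback query expansion described here (terms from previously seen relevant documents are added to the query vector) is given below; the weighting constants and helper names are assumptions for illustration, not the SMART system's actual code.

      from collections import Counter

      def expand_query(query_terms, relevant_docs, alpha=1.0, beta=0.5, max_new_terms=20):
          # Start from the original query weights.
          query = Counter({t: alpha for t in query_terms})
          # Collect term frequencies from documents already judged relevant.
          feedback = Counter()
          for doc in relevant_docs:
              feedback.update(doc.lower().split())
          # Add the most frequent feedback terms, Rocchio-style, to the query vector.
          for term, freq in feedback.most_common(max_new_terms):
              query[term] += beta * freq / len(relevant_docs)
          return query

      expanded = expand_query(
          ["automatic", "routing"],
          ["smart routing experiments with relevance feedback",
           "automatic text retrieval and routing with smart"],
      )
      print(expanded.most_common(5))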
  4. Bloomfield, M.: Indexing : neglected and poorly understood (2001) 0.01
    0.0073736827 = product of:
      0.014747365 = sum of:
        0.014747365 = product of:
          0.02949473 = sum of:
            0.02949473 = weight(_text_:5 in 5439) [ClassicSimilarity], result of:
              0.02949473 = score(doc=5439,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.19344449 = fieldWeight in 5439, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5439)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 6.2001 12:25:05
  5. Fauzi, F.; Belkhatir, M.: Multifaceted conceptual image indexing on the world wide web (2013) 0.01
    0.0073736827 = product of:
      0.014747365 = sum of:
        0.014747365 = product of:
          0.02949473 = sum of:
            0.02949473 = weight(_text_:5 in 2721) [ClassicSimilarity], result of:
              0.02949473 = score(doc=2721,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.19344449 = fieldWeight in 2721, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2721)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 9.2014 10:18:18
  6. Snajder, J.; Dalbelo Basic, B.D.; Tadic, M.: Automatic acquisition of inflectional lexica for morphological normalisation (2008) 0.01
    0.0073736827 = product of:
      0.014747365 = sum of:
        0.014747365 = product of:
          0.02949473 = sum of:
            0.02949473 = weight(_text_:5 in 2910) [ClassicSimilarity], result of:
              0.02949473 = score(doc=2910,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.19344449 = fieldWeight in 2910, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2910)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 44(2008) no.5, S.1720-1731
  7. Greiner-Petter, A.; Schubotz, M.; Cohl, H.S.; Gipp, B.: Semantic preserving bijective mappings for expressions involving special functions between computer algebra systems and document preparation systems (2019) 0.01
    0.0070791813 = product of:
      0.0141583625 = sum of:
        0.0141583625 = product of:
          0.028316725 = sum of:
            0.028316725 = weight(_text_:22 in 5499) [ClassicSimilarity], result of:
              0.028316725 = score(doc=5499,freq=2.0), product of:
                0.18297131 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052250203 = queryNorm
                0.15476047 = fieldWeight in 5499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5499)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20. 1.2015 18:30:22
  8. Golub, K.; Lykke, M.; Tudhope, D.: Enhancing social tagging with automated keywords from the Dewey Decimal Classification (2014) 0.01
    0.0061447355 = product of:
      0.012289471 = sum of:
        0.012289471 = product of:
          0.024578942 = sum of:
            0.024578942 = weight(_text_:5 in 2918) [ClassicSimilarity], result of:
              0.024578942 = score(doc=2918,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.16120374 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2918)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of documentation. 70(2014) no.5, S.801-828
  9. Vlachidis, A.; Tudhope, D.: A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.01
    0.0061447355 = product of:
      0.012289471 = sum of:
        0.012289471 = product of:
          0.024578942 = sum of:
            0.024578942 = weight(_text_:5 in 2895) [ClassicSimilarity], result of:
              0.024578942 = score(doc=2895,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.16120374 = fieldWeight in 2895, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2895)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.5, S.1138-1152
  10. Husevag, A.-S.R.: Named entities in indexing : a case study of TV subtitles and metadata records (2016) 0.01
    0.0061447355 = product of:
      0.012289471 = sum of:
        0.012289471 = product of:
          0.024578942 = sum of:
            0.024578942 = weight(_text_:5 in 3105) [ClassicSimilarity], result of:
              0.024578942 = score(doc=3105,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.16120374 = fieldWeight in 3105, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3105)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Proceedings of the 15th European Networked Knowledge Organization Systems Workshop (NKOS 2016) co-located with the 20th International Conference on Theory and Practice of Digital Libraries 2016 (TPDL 2016), Hannover, Germany, September 9, 2016. Ed. by Philipp Mayr et al. [http://ceur-ws.org/Vol-1676/=urn:nbn:de:0074-1676-5]
  11. Strobel, S.; Marín-Arraiza, P.: Metadata for scientific audiovisual media : current practices and perspectives of the TIB / AV-portal (2015) 0.01
    0.0061447355 = product of:
      0.012289471 = sum of:
        0.012289471 = product of:
          0.024578942 = sum of:
            0.024578942 = weight(_text_:5 in 3667) [ClassicSimilarity], result of:
              0.024578942 = score(doc=3667,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.16120374 = fieldWeight in 3667, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3667)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 6.2017 11:19:32
  12. Li, X.; Zhang, A.; Li, C.; Ouyang, J.; Cai, Y.: Exploring coherent topics by topic modeling with term weighting (2018) 0.01
    0.0061447355 = product of:
      0.012289471 = sum of:
        0.012289471 = product of:
          0.024578942 = sum of:
            0.024578942 = weight(_text_:5 in 5045) [ClassicSimilarity], result of:
              0.024578942 = score(doc=5045,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.16120374 = fieldWeight in 5045, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5045)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    5. 9.2014 10:18:18
  13. Zhang, Y.; Zhang, C.; Li, J.: Joint modeling of characters, words, and conversation contexts for microblog keyphrase extraction (2020) 0.01
    0.0061447355 = product of:
      0.012289471 = sum of:
        0.012289471 = product of:
          0.024578942 = sum of:
            0.024578942 = weight(_text_:5 in 5816) [ClassicSimilarity], result of:
              0.024578942 = score(doc=5816,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.16120374 = fieldWeight in 5816, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5816)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 71(2020) no.5, S.553-567
  14. Kajanan, S.; Bao, Y.; Datta, A.; VanderMeer, D.; Dutta, K.: Efficient automatic search query formulation using phrase-level analysis (2014) 0.00
    0.0049157883 = product of:
      0.009831577 = sum of:
        0.009831577 = product of:
          0.019663153 = sum of:
            0.019663153 = weight(_text_:5 in 1264) [ClassicSimilarity], result of:
              0.019663153 = score(doc=1264,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.128963 = fieldWeight in 1264, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1264)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.5, S.1058-1075
  15. Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences (2017) 0.00
    0.0049157883 = product of:
      0.009831577 = sum of:
        0.009831577 = product of:
          0.019663153 = sum of:
            0.019663153 = weight(_text_:5 in 3081) [ClassicSimilarity], result of:
              0.019663153 = score(doc=3081,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.128963 = fieldWeight in 3081, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3081)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The study analyses the dynamic development and use of thesaurus terms. In addition, it focuses on the factors that influence the number of index terms per document or journal. MeSH and the corresponding database MEDLINE served as the objects of study. The main findings are: 1. The MeSH thesaurus has grown logarithmically through three distinct phases. Such a thesaurus should follow the equation T = 3,076.6 ln(d) - 22,695 + 0.0039d (T = terms, ln = natural logarithm, d = documents). To construct such a thesaurus one therefore needs about 1,600 documents covering different topics within the thesaurus's domain. The dynamic development of thesauri like MeSH requires the introduction of one new term per 256 newly indexed documents. 2. The distribution of thesaurus terms yields three categories: strongly, normally, and rarely used headings. The last group is in a test phase, while the newly added descriptors in the first and second categories drive thesaurus growth. 3. There is a logarithmic relationship between the number of index terms per article and its page count, for articles of one to twenty-one pages. 4. Journal articles that appear in MEDLINE with abstracts receive almost two additional descriptors. 5. The findability of non-English documents in MEDLINE is lower than that of English documents. 6. Articles from journals with an impact factor between 0 and 15 do not receive more index terms than those of the other journals covered by MEDLINE. 7. In an indexing system, different journals carry more or less weight in their findability. The distribution of index terms per page shows that MEDLINE contains three categories of publications; moreover, there are a few strongly favoured journals.
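    Taking the quoted growth equation at face value (with the German decimal and thousands separators converted, i.e. T = 3,076.6 ln(d) - 22,695 + 0.0039d), a quick check shows that T only becomes positive at roughly 1,600 documents, which matches the threshold stated in the abstract; the short sketch below simply evaluates the formula.

      import math

      def thesaurus_terms(d):
          # T = 3,076.6 * ln(d) - 22,695 + 0.0039 * d  (T = terms, d = documents)
          return 3076.6 * math.log(d) - 22695 + 0.0039 * d

      for d in (1_600, 10_000, 100_000, 1_000_000):
          print(d, round(thesaurus_terms(d)))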
  16. Needham, R.M.; Sparck Jones, K.: Keywords and clumps (1985) 0.00
    0.0043013147 = product of:
      0.008602629 = sum of:
        0.008602629 = product of:
          0.017205259 = sum of:
            0.017205259 = weight(_text_:5 in 3645) [ClassicSimilarity], result of:
              0.017205259 = score(doc=3645,freq=2.0), product of:
                0.15247129 = queryWeight, product of:
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.052250203 = queryNorm
                0.11284262 = fieldWeight in 3645, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.9180994 = idf(docFreq=6494, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=3645)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    Original in: Journal of documentation 20(1964) no.1, S.5-15.

Types

  • a 55
  • el 2
  • x 1