Search (6 results, page 1 of 1)

HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.01

```
0.011471494 = product of:
  0.03441448 = sum of:
    0.03441448 = product of:
      0.06882896 = sum of:
        0.06882896 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
          0.06882896 = score(doc=2748,freq=2.0), product of:
            0.17789805 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05080146 = queryNorm
            0.38690117 = fieldWeight in 2748, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=2748)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 1. 2.2016 18:25:22
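The score breakdowns on this page can be recomputed by hand from the factors they list. The sketch below is a minimal illustration, not the search engine's own code: the function name `classic_score` and its parameters are hypothetical, `queryNorm` and `fieldNorm` are copied from the explain output rather than derived, and the idf formula is the one that reproduces the values shown here (`1 + ln(maxDocs / (docFreq + 1))`).

```python
import math


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Recompute a single-term TF-IDF score from the factors in an explain tree.

    Hypothetical helper for illustration; query_norm and field_norm are
    taken as given from the explain output.
    """
    tf = math.sqrt(freq)                             # tf(freq=2.0) -> 1.4142135
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq, maxDocs)
    query_weight = idf * query_norm                  # "queryWeight" line
    field_weight = tf * idf * field_norm             # "fieldWeight" line
    score = query_weight * field_weight              # "weight(...)" line
    for coord in coords:                             # coord(1/2), coord(1/3), ...
        score *= coord
    return score


# Factors from the first result (doc=2748, term "22").
first = classic_score(
    freq=2.0,
    doc_freq=3622,
    max_docs=44218,
    query_norm=0.05080146,
    field_norm=0.078125,
    coords=[1 / 2, 1 / 3],
)
print(f"{first:.6f}")
```

Under these assumptions the result agrees with the listed 0.011471494 up to rounding. The same call with `doc_freq=1406`, the record's own `fieldNorm`, and two `coord(1/3)` factors should reproduce the scores of the results matched on the term "basis".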

Teich, E.; Degaetano-Ortlieb, S.; Fankhauser, P.; Kermes, H.; Lapshinova-Koltunski, E.: ¬The linguistic construal of disciplinarity : a data-mining approach using register features (2016) 0.01
```
0.00740211 = product of:
  0.022206329 = sum of:
    0.022206329 = product of:
      0.06661899 = sum of:
        0.06661899 = weight(_text_:basis in 3015) [ClassicSimilarity], result of:
          0.06661899 = score(doc=3015,freq=2.0), product of:
            0.22594824 = queryWeight, product of:
              4.4476724 = idf(docFreq=1406, maxDocs=44218)
              0.05080146 = queryNorm
            0.2948418 = fieldWeight in 3015, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4476724 = idf(docFreq=1406, maxDocs=44218)
              0.046875 = fieldNorm(doc=3015)
      0.33333334 = coord(1/3)
  0.33333334 = coord(1/3)
```
Abstract

We analyze the linguistic evolution of selected scientific disciplines over a 30-year time span (1970s to 2000s). Our focus is on four highly specialized disciplines at the boundaries of computer science that emerged during that time: computational linguistics, bioinformatics, digital construction, and microelectronics. Our analysis is driven by the question of whether these disciplines develop a distinctive language use, both individually and collectively, over the given time period. The data set is the English Scientific Text Corpus (scitex), which includes texts from the 1970s/1980s and early 2000s. Our theoretical basis is register theory. In terms of methods, we combine corpus-based methods of feature extraction (various aggregated features [part-of-speech based], n-grams, lexico-grammatical patterns) and automatic text classification. The results of our research are directly relevant to the study of linguistic variation and languages for specific purposes (LSP) and have implications for various natural language processing (NLP) tasks, for example, authorship attribution, text mining, or training NLP tools.

Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.01

```
0.006882896 = product of:
  0.020648688 = sum of:
    0.020648688 = product of:
      0.041297376 = sum of:
        0.041297376 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
          0.041297376 = score(doc=690,freq=2.0), product of:
            0.17789805 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05080146 = queryNorm
            0.23214069 = fieldWeight in 690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=690)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 23. 3.2013 13:22:36

Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.01

```
0.006882896 = product of:
  0.020648688 = sum of:
    0.020648688 = product of:
      0.041297376 = sum of:
        0.041297376 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
          0.041297376 = score(doc=2158,freq=2.0), product of:
            0.17789805 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05080146 = queryNorm
            0.23214069 = fieldWeight in 2158, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2158)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 4. 8.2015 19:22:04

Liu, R.-L.: ¬A passage extractor for classification of disease aspect information (2013) 0.01

```
0.005735747 = product of:
  0.01720724 = sum of:
    0.01720724 = product of:
      0.03441448 = sum of:
        0.03441448 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
          0.03441448 = score(doc=1107,freq=2.0), product of:
            0.17789805 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05080146 = queryNorm
            0.19345059 = fieldWeight in 1107, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1107)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 28.10.2013 19:22:57

Groß, T.; Faden, M.: Automatische Indexierung elektronischer Dokumente an der Deutschen Zentralbibliothek für Wirtschaftswissenschaften : Bericht über die Jahrestagung der Internationalen Buchwissenschaftlichen Gesellschaft (2010) 0.00
```
0.0049347403 = product of:
  0.01480422 = sum of:
    0.01480422 = product of:
      0.044412658 = sum of:
        0.044412658 = weight(_text_:basis in 4051) [ClassicSimilarity], result of:
          0.044412658 = score(doc=4051,freq=2.0), product of:
            0.22594824 = queryWeight, product of:
              4.4476724 = idf(docFreq=1406, maxDocs=44218)
              0.05080146 = queryNorm
            0.1965612 = fieldWeight in 4051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4476724 = idf(docFreq=1406, maxDocs=44218)
              0.03125 = fieldNorm(doc=4051)
      0.33333334 = coord(1/3)
  0.33333334 = coord(1/3)
```
Abstract

The increasing availability of digital information in recent years, together with the prospect of a further rise in the so-called data deluge, culminates in a fundamental and growing problem of structuring information. The steady growth of digital information resources on the World Wide Web does ensure access to a wide range of information at any time and from any place; what remains open is structured access, particularly to scholarly resources. Given the rising volume of electronic content, and against the background of stagnating or shrinking staff resources in subject indexing, no library or library network can manage, now or in the future, to capture all digital data, structure it, and relate the items to one another. In the information society of the 21st century, however, it is becoming increasingly important to structure the scholarly information that has disappeared in the flood promptly, appropriately, and completely, and thus to make it usable again as a basis for generating knowledge. For the Deutsche Zentralbibliothek für Wirtschaftswissenschaften (ZBW), an important information infrastructure institution in this field, standardized subject indexing of digital information resources is therefore a decisive and success-critical factor in competition with other information service providers. Because traditional intellectual subject indexing cannot be scaled arbitrarily (as the number of online documents rises, the need for subject specialists rises proportionally if a certain quality standard is to be maintained), other subject indexing methods will be needed in the future. Automated keyword assignment methods are regarded as the only way to make library subject indexing future-proof in the digital age.
In addition, machine-based approaches can help level out the heterogeneities (indexing inconsistencies) between individual indexers and thus contribute to a more homogeneous indexing of the library's holdings.
