Search (7 results, page 1 of 1)

  • theme_ss:"Automatisches Klassifizieren"
  • type_ss:"a"
  • year_i:[2010 TO 2020}
  1. Yilmaz, T.; Ozcan, R.; Altingovde, I.S.; Ulusoy, Ö.: Improving educational web search for question-like queries through subject classification (2019) 0.02
    0.022583602 = product of:
      0.09033441 = sum of:
        0.09033441 = weight(_text_:engines in 5041) [ClassicSimilarity], result of:
          0.09033441 = score(doc=5041,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 5041, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5041)
      0.25 = coord(1/4)
    
    Abstract
    Students use general web search engines as their primary source of research while trying to find answers to school-related questions. Although search engines are highly relevant for the general population, they may return results that are out of educational context. Another rising trend is social community question answering websites, which are the second choice for students who try to get answers from peers online. We attempt to discover possible improvements in educational search by leveraging both of these information sources. For this purpose, we first implement a classifier for educational questions. This classifier is built by an ensemble method that employs several regular learning algorithms and retrieval-based approaches that utilize external resources. We also build a query expander to facilitate classification. We further improve the classification using search engine results and obtain 83.5% accuracy. Although our work is entirely based on the Turkish language, the features could easily be mapped to other languages as well. To find out whether search engine ranking can be improved in the education domain using the classification model, we collect and label a set of query results retrieved from a general web search engine. We propose five ad-hoc methods to improve search ranking based on the idea that the query-document category relation is an indicator of relevance. We evaluate these methods for overall performance, for varying query length, and for factoid and non-factoid queries. We show that some of the methods significantly improve the rankings in the education domain.
  2. Malo, P.; Sinha, A.; Wallenius, J.; Korhonen, P.: Concept-based document classification using Wikipedia and value function (2011) 0.02
    0.015949111 = product of:
      0.063796446 = sum of:
        0.063796446 = product of:
          0.12759289 = sum of:
            0.12759289 = weight(_text_:programming in 4948) [ClassicSimilarity], result of:
              0.12759289 = score(doc=4948,freq=2.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.43455404 = fieldWeight in 4948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4948)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    In this article, we propose a new concept-based method for document classification. The conceptual knowledge associated with the words is drawn from Wikipedia. The purpose is to utilize the abundant semantic relatedness information available in Wikipedia in an efficient value function-based query learning algorithm. The procedure learns the value function by solving a simple linear programming problem formulated using the training documents. The learning involves a step-wise iterative process that helps in generating a value function with an appropriate set of concepts (dimensions) chosen from a collection of concepts. Once the value function is formulated, it is utilized to make a decision between relevance and irrelevance. The value assigned to a particular document from the value function can be further used to rank the documents according to their relevance. Reuters newswire documents have been used to evaluate the efficacy of the procedure. An extensive comparison with other frameworks has been performed. The results are promising.
  3. Piros, A.: Automatic interpretation of complex UDC numbers : towards support for library systems (2015) 0.01
    0.010632741 = product of:
      0.042530965 = sum of:
        0.042530965 = product of:
          0.08506193 = sum of:
            0.08506193 = weight(_text_:programming in 2301) [ClassicSimilarity], result of:
              0.08506193 = score(doc=2301,freq=2.0), product of:
                0.29361802 = queryWeight, product of:
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.04479146 = queryNorm
                0.28970268 = fieldWeight in 2301, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.5552235 = idf(docFreq=170, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2301)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Analytico-synthetic and faceted classifications, such as Universal Decimal Classification (UDC), express the content of documents with complex, pre-combined classification codes. Without classification authority control that would help manage and access structured notations, the use of UDC codes in searching and browsing is limited. Existing UDC parsing solutions are usually created for a particular database system or a specific task and are not widely applicable. The approach described in this paper provides a solution by which the analysis and interpretation of UDC notations are stored in an intermediate format (in this case, XML) by automatic means without any data or information loss. Due to its richness, the output file can be converted into different formats, such as standard mark-up and data exchange formats or simple lists of the recommended entry points of a UDC number. The program can also be used to create authority records containing complex UDC numbers, which can be comprehensively analysed in order to be retrieved effectively. The Java program, as well as the corresponding schema definition it employs, is under continuous development. The current version of the interpreter software is now available online for testing purposes at the following web site: http://interpreter-eto.rhcloud.com. The future plan is to implement conversion methods for standard formats and to create standard online interfaces so that the features of the software can be used as a service. This would allow the algorithm to be employed in both existing and future library systems to analyse UDC numbers without any significant programming effort.
  4. HaCohen-Kerner, Y. et al.: Classification using various machine learning methods and combinations of key-phrases and visual features (2016) 0.01
    0.007585781 = product of:
      0.030343125 = sum of:
        0.030343125 = product of:
          0.06068625 = sum of:
            0.06068625 = weight(_text_:22 in 2748) [ClassicSimilarity], result of:
              0.06068625 = score(doc=2748,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.38690117 = fieldWeight in 2748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2748)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    1. 2.2016 18:25:22
  5. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.00
    0.0045514684 = product of:
      0.018205874 = sum of:
        0.018205874 = product of:
          0.036411747 = sum of:
            0.036411747 = weight(_text_:22 in 690) [ClassicSimilarity], result of:
              0.036411747 = score(doc=690,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.23214069 = fieldWeight in 690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=690)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    23. 3.2013 13:22:36
  6. Egbert, J.; Biber, D.; Davies, M.: Developing a bottom-up, user-based method of web register classification (2015) 0.00
    0.0045514684 = product of:
      0.018205874 = sum of:
        0.018205874 = product of:
          0.036411747 = sum of:
            0.036411747 = weight(_text_:22 in 2158) [ClassicSimilarity], result of:
              0.036411747 = score(doc=2158,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.23214069 = fieldWeight in 2158, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2158)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    4. 8.2015 19:22:04
  7. Liu, R.-L.: A passage extractor for classification of disease aspect information (2013) 0.00
    0.0037928906 = product of:
      0.015171562 = sum of:
        0.015171562 = product of:
          0.030343125 = sum of:
            0.030343125 = weight(_text_:22 in 1107) [ClassicSimilarity], result of:
              0.030343125 = score(doc=1107,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.19345059 = fieldWeight in 1107, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1107)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    28.10.2013 19:22:57
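
The score breakdowns listed under each result can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that appear to match the listed values: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical, and queryNorm, fieldNorm and the coord factors are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Assumed classic idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Assumed classic tf: square root of the raw term frequency."""
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords):
    """Rebuild one explain tree: queryWeight * fieldWeight * coord factors."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    score = query_weight * field_weight
    for coord in coords:                                 # e.g. coord(1/2), coord(1/4)
        score *= coord
    return score


# Values copied from the listing above (not derived here).
QUERY_NORM = 0.04479146
MAX_DOCS = 44218

# Result 1: term "engines", freq=4, docFreq=746, fieldNorm=0.0390625, coord(1/4)
result_1 = explain_score(4.0, 746, MAX_DOCS, QUERY_NORM, 0.0390625, [1 / 4])

# Result 2: term "programming", freq=2, docFreq=170, fieldNorm=0.046875,
# with the nested coord(1/2) and the outer coord(1/4)
result_2 = explain_score(2.0, 170, MAX_DOCS, QUERY_NORM, 0.046875, [1 / 2, 1 / 4])

print(f"result 1: {result_1:.6f}")
print(f"result 2: {result_2:.6f}")
```

Under these assumptions the two values agree with the listed scores for results 1 and 2 to within rounding; the listing itself uses single-precision floats, so the last digits can differ slightly. Results 4 to 7 share the same term and frequency, which is why their ranking is determined by fieldNorm alone.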