Search (138 results, page 1 of 7)

  • theme_ss:"Informetrie"
  1. Zhang, Y.; Wu, M.; Zhang, G.; Lu, J.: Stepping beyond your comfort zone : diffusion-based network analytics for knowledge trajectory recommendation (2023) 0.05
    0.045760885 = product of:
      0.09152177 = sum of:
        0.09152177 = sum of:
          0.056665063 = weight(_text_:learning in 994) [ClassicSimilarity], result of:
            0.056665063 = score(doc=994,freq=2.0), product of:
              0.22973695 = queryWeight, product of:
                4.464877 = idf(docFreq=1382, maxDocs=44218)
                0.05145426 = queryNorm
              0.24665193 = fieldWeight in 994, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.464877 = idf(docFreq=1382, maxDocs=44218)
                0.0390625 = fieldNorm(doc=994)
          0.03485671 = weight(_text_:22 in 994) [ClassicSimilarity], result of:
            0.03485671 = score(doc=994,freq=2.0), product of:
              0.18018405 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05145426 = queryNorm
              0.19345059 = fieldWeight in 994, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=994)
      0.5 = coord(1/2)
    
    Abstract
    Predicting a researcher's knowledge trajectories beyond their current foci can leverage potential inter-/cross-/multi-disciplinary interactions to achieve exploratory innovation. In this study, we present a method of diffusion-based network analytics for knowledge trajectory recommendation. The method begins by constructing a heterogeneous bibliometric network consisting of a co-topic layer and a co-authorship layer. A novel link prediction approach with a diffusion strategy is then used to capture the interactions between social elements (e.g., collaboration) and knowledge elements (e.g., technological similarity) in the process of exploratory innovation. This diffusion strategy differentiates the interactions occurring among homogeneous and heterogeneous nodes in the heterogeneous bibliometric network and weights the strengths of these interactions. Two sets of experiments, one with a local dataset and the other with a global dataset, demonstrate that the proposed method is superior to 10 selected baselines in link prediction, recommender systems, and upstream graph representation learning. A case study recommending knowledge trajectories of information scientists with topical hierarchy and explainable mediators reveals the proposed method's reliability and potential practical uses in broad scenarios.
    Date
    22. 6.2023 18:07:12
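The ClassicSimilarity explain tree above can be checked by hand: for each term, queryWeight = idf × queryNorm, fieldWeight = tf(freq) × idf × fieldNorm with tf = sqrt(freq), and the term score is queryWeight × fieldWeight; the document score is the coord factor times the sum of the term scores. A minimal sketch using the numbers from this result's tree:

```python
import math

def term_score(freq, idf, query_norm, field_norm):
    """One term's score under Lucene ClassicSimilarity (TF-IDF)."""
    tf = math.sqrt(freq)                   # tf(freq=2.0) = 1.4142135
    query_weight = idf * query_norm        # queryWeight
    field_weight = tf * idf * field_norm   # fieldWeight
    return query_weight * field_weight

# The two branches of the explain tree for doc 994:
learning = term_score(2.0, idf=4.464877, query_norm=0.05145426, field_norm=0.0390625)
t22 = term_score(2.0, idf=3.5018296, query_norm=0.05145426, field_norm=0.0390625)
doc_score = 0.5 * (learning + t22)         # coord(1/2) * sum of term scores
# learning ≈ 0.056665, t22 ≈ 0.034857, doc_score ≈ 0.045761
```

The same arithmetic reproduces the remaining explain trees on the page.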
  2. Rivas, A.L.; Deshler, J.D.; Quimby, F.W.; Mohammend, H.O.; Wilson, D.J.; Gonzales, R.N.; Lein, D.H.; Bruso, P.: Interdisciplinary question generation : synthesis and validity analysis of the 1993-1997 bovine mastitis related literature (1998) 0.03
    0.032054603 = product of:
      0.064109206 = sum of:
        0.064109206 = product of:
          0.12821841 = sum of:
            0.12821841 = weight(_text_:learning in 5124) [ClassicSimilarity], result of:
              0.12821841 = score(doc=5124,freq=4.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.55810964 = fieldWeight in 5124, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5124)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Describes research in which interdisciplinary synthesis and validity analysis (ISVA), a structured learning approach that integrates learning and communication theories, meta-analytic evaluation methods, and literature-management technologies, was applied in the context of the 1993-1997 bovine mastitis research literature. The study investigated whether ISVA could facilitate the analysis and synthesis of interdisciplinary knowledge claims and generate projects or research questions.
  3. Coleman, A.: Instruments of cognition : use of citations and Web links in online teaching materials (2005) 0.03
    0.028047776 = product of:
      0.05609555 = sum of:
        0.05609555 = product of:
          0.1121911 = sum of:
            0.1121911 = weight(_text_:learning in 3329) [ClassicSimilarity], result of:
              0.1121911 = score(doc=3329,freq=4.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.48834592 = fieldWeight in 3329, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3329)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Use of citations and Web links embedded in online teaching materials was studied for an undergraduate course. The undergraduate students enrolled in Geographic Information Science for Geography and Regional Development used Web links more often than citations, but clearly did not see them as key to enhancing learning. Current conventions for citing and linking tend to make citations and links invisible. There is some evidence that citations and Web links that are categorized and highlighted in terms of their importance and the function they serve may help student learning in interdisciplinary domains.
  4. Nicholls, P.T.: Empirical validation of Lotka's law (1986) 0.03
    0.027885368 = product of:
      0.055770736 = sum of:
        0.055770736 = product of:
          0.11154147 = sum of:
            0.11154147 = weight(_text_:22 in 5509) [ClassicSimilarity], result of:
              0.11154147 = score(doc=5509,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.61904186 = fieldWeight in 5509, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=5509)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 22(1986), S.417-419
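Lotka's law, which Nicholls' paper validates empirically, states that the number of authors contributing n papers falls off as 1/n^a, with a ≈ 2 in Lotka's original formulation. A toy sketch of the relationship (the exponent and constant below are the textbook values, not Nicholls' fitted parameters):

```python
def lotka_authors(n, c=1.0, a=2.0):
    """Expected relative frequency of authors with exactly n publications
    under Lotka's inverse-power law."""
    return c / n ** a

# With a = 2, authors of 2 papers occur a quarter as often as single-paper authors:
ratio = lotka_authors(2) / lotka_authors(1)
# ratio == 0.25
```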
  5. Nicolaisen, J.: Citation analysis (2007) 0.03
    0.027885368 = product of:
      0.055770736 = sum of:
        0.055770736 = product of:
          0.11154147 = sum of:
            0.11154147 = weight(_text_:22 in 6091) [ClassicSimilarity], result of:
              0.11154147 = score(doc=6091,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.61904186 = fieldWeight in 6091, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=6091)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    13. 7.2008 19:53:22
  6. Fiala, J.: Information flood : fiction and reality (1987) 0.03
    0.027885368 = product of:
      0.055770736 = sum of:
        0.055770736 = product of:
          0.11154147 = sum of:
            0.11154147 = weight(_text_:22 in 1080) [ClassicSimilarity], result of:
              0.11154147 = score(doc=1080,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.61904186 = fieldWeight in 1080, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=1080)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Thermochimica acta. 110(1987), S.11-22
  7. Su, Y.; Han, L.-F.: ¬A new literature growth model : variable exponential growth law of literature (1998) 0.02
    0.024647415 = product of:
      0.04929483 = sum of:
        0.04929483 = product of:
          0.09858966 = sum of:
            0.09858966 = weight(_text_:22 in 3690) [ClassicSimilarity], result of:
              0.09858966 = score(doc=3690,freq=4.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.54716086 = fieldWeight in 3690, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3690)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 5.1999 19:22:35
  8. Van der Veer Martens, B.: Do citation systems represent theories of truth? (2001) 0.02
    0.024647415 = product of:
      0.04929483 = sum of:
        0.04929483 = product of:
          0.09858966 = sum of:
            0.09858966 = weight(_text_:22 in 3925) [ClassicSimilarity], result of:
              0.09858966 = score(doc=3925,freq=4.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.54716086 = fieldWeight in 3925, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3925)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 15:22:28
  9. Rokach, L.; Kalech, M.; Blank, I.; Stern, R.: Who is going to win the next Association for the Advancement of Artificial Intelligence Fellowship Award? : evaluating researchers by mining bibliographic data (2011) 0.02
    0.024536695 = product of:
      0.04907339 = sum of:
        0.04907339 = product of:
          0.09814678 = sum of:
            0.09814678 = weight(_text_:learning in 4945) [ClassicSimilarity], result of:
              0.09814678 = score(doc=4945,freq=6.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.42721373 = fieldWeight in 4945, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4945)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Accurately evaluating a researcher and the quality of his or her work is an important task when decision makers have to decide on such matters as promotions and awards. Publications and citations play a key role in this task, and many previous studies have proposed using measurements based on them for evaluating researchers. Machine learning techniques as a way of enhancing the evaluation process have been relatively unexplored. We propose using a machine learning approach for evaluating researchers. In particular, the proposed method combines the outputs of three learning techniques (logistic regression, decision trees, and artificial neural networks) to obtain a unified prediction with improved accuracy. We conducted several experiments to evaluate the model's ability to: (a) classify researchers in the field of artificial intelligence as Association for the Advancement of Artificial Intelligence (AAAI) fellows and (b) predict the next AAAI fellowship winners. We show that both our classification and prediction methods are more accurate than previous measurement methods, reaching a precision of 96% and a recall of 92%.
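The precision and recall figures quoted at the end of the abstract follow from the standard definitions over true positives, false positives, and false negatives. A minimal sketch (the tp/fp/fn counts are invented for illustration, not taken from the study):

```python
def precision_recall(tp, fp, fn):
    """Precision = tp/(tp+fp); recall = tp/(tp+fn)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical counts for a fellow-classification run:
p, r = precision_recall(tp=24, fp=1, fn=2)
# p = 0.96, r ≈ 0.923
```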
  10. Zuccala, A.; Someren, M. van; Bellen, M. van: ¬A machine-learning approach to coding book reviews as quality indicators : toward a theory of megacitation (2014) 0.02
    0.024536695 = product of:
      0.04907339 = sum of:
        0.04907339 = product of:
          0.09814678 = sum of:
            0.09814678 = weight(_text_:learning in 1530) [ClassicSimilarity], result of:
              0.09814678 = score(doc=1530,freq=6.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.42721373 = fieldWeight in 1530, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1530)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A theory of "megacitation" is introduced and used in an experiment to demonstrate how a qualitative scholarly book review can be converted into a weighted bibliometric indicator. We employ a manual human-coding approach to classify book reviews in the field of history based on reviewers' assessments of a book author's scholarly credibility (SC) and writing style (WS). In total, 100 book reviews were selected from the American Historical Review and coded for their positive/negative valence on these two dimensions. Most were coded as positive (68% for SC and 47% for WS), and there was also a small positive correlation between SC and WS (r = 0.2). We then constructed a classifier, combining both manual design and machine learning, to categorize sentiment-based sentences in history book reviews. The machine classifier produced a matched accuracy (matched to the human coding) of approximately 75% for SC and 64% for WS. WS was found to be more difficult to classify by machine than SC because of the reviewers' use of more subtle language. With further training data, a machine-learning approach could be useful for automatically classifying a large number of history book reviews at once. Weighted megacitations can be especially valuable if they are used in conjunction with regular book/journal citations, and "libcitations" (i.e., library holding counts) for a comprehensive assessment of a book/monograph's scholarly impact.
  11. Ping, Q.; He, J.; Chen, C.: How many ways to use CiteSpace? : a study of user interactive events over 14 months (2017) 0.02
    0.024536695 = product of:
      0.04907339 = sum of:
        0.04907339 = product of:
          0.09814678 = sum of:
            0.09814678 = weight(_text_:learning in 3602) [ClassicSimilarity], result of:
              0.09814678 = score(doc=3602,freq=6.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.42721373 = fieldWeight in 3602, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3602)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Using visual analytic systems effectively may incur a steep learning curve for users, especially for those who have little prior knowledge of either using the tool or accomplishing analytic tasks. How do users deal with a steep learning curve over time? Are there particularly problematic aspects of an analytic process? In this article we investigate these questions through an integrative study of the use of CiteSpace, a visual analytic tool for finding trends and patterns in scientific literature. In particular, we analyze millions of interactive events in logs generated by users worldwide over a 14-month period. The key findings are: (i) three levels of proficiency are identified, namely, level 1: low proficiency, level 2: intermediate proficiency, and level 3: high proficiency, and (ii) behavioral patterns at level 3 result from a more engaging interaction with the system, involving a wider variety of events and being characterized by longer state transition paths, whereas behavioral patterns at levels 1 and 2 seem to focus on learning how to use the tool. This study contributes to the development and evaluation of visual analytic systems in realistic settings and provides a valuable addition to the study of interactive visual analytic processes.
  12. Diodato, V.: Dictionary of bibliometrics (1994) 0.02
    0.024399696 = product of:
      0.04879939 = sum of:
        0.04879939 = product of:
          0.09759878 = sum of:
            0.09759878 = weight(_text_:22 in 5666) [ClassicSimilarity], result of:
              0.09759878 = score(doc=5666,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.5416616 = fieldWeight in 5666, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=5666)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    Rez. in: Journal of library and information science 22(1996) no.2, S.116-117 (L.C. Smith)
  13. Bookstein, A.: Informetric distributions : I. Unified overview (1990) 0.02
    0.024399696 = product of:
      0.04879939 = sum of:
        0.04879939 = product of:
          0.09759878 = sum of:
            0.09759878 = weight(_text_:22 in 6902) [ClassicSimilarity], result of:
              0.09759878 = score(doc=6902,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.5416616 = fieldWeight in 6902, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6902)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 18:55:29
  14. Bookstein, A.: Informetric distributions : II. Resilience to ambiguity (1990) 0.02
    0.024399696 = product of:
      0.04879939 = sum of:
        0.04879939 = product of:
          0.09759878 = sum of:
            0.09759878 = weight(_text_:22 in 4689) [ClassicSimilarity], result of:
              0.09759878 = score(doc=4689,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.5416616 = fieldWeight in 4689, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4689)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 18:55:55
  15. Zheng, X.; Sun, A.: Collecting event-related tweets from twitter stream (2019) 0.02
    0.024040952 = product of:
      0.048081905 = sum of:
        0.048081905 = product of:
          0.09616381 = sum of:
            0.09616381 = weight(_text_:learning in 4672) [ClassicSimilarity], result of:
              0.09616381 = score(doc=4672,freq=4.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.41858223 = fieldWeight in 4672, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4672)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Twitter provides a channel for collecting and publishing instant information on major events like natural disasters. However, the volume of information flowing through Twitter is enormous. For a specific event, messages collected from the Twitter Stream based on either a location constraint or predefined keywords would contain substantial noise. In this article, we propose a method to achieve both high precision and high recall in collecting event-related tweets. Our method involves an automatic keyword generation component and an event-related tweet identification component. For keyword generation, we consider three properties of candidate keywords, namely relevance, coverage, and evolvement. The keyword updating mechanism enables our method to track the main topics of tweets along event development. To minimize annotation effort in identifying event-related tweets, we adopt active learning and incorporate multiple-instance learning, which assigns labels to bags instead of instances (that is, individual tweets). Through experiments on two real-world events, we demonstrate the superiority of our method against state-of-the-art alternatives.
  16. Lewison, G.: ¬The work of the Bibliometrics Research Group (City University) and associates (2005) 0.02
    0.020914026 = product of:
      0.04182805 = sum of:
        0.04182805 = product of:
          0.0836561 = sum of:
            0.0836561 = weight(_text_:22 in 4890) [ClassicSimilarity], result of:
              0.0836561 = score(doc=4890,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.46428138 = fieldWeight in 4890, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4890)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20. 1.2007 17:02:22
  17. Marx, W.; Bornmann, L.: On the problems of dealing with bibliometric data (2014) 0.02
    0.020914026 = product of:
      0.04182805 = sum of:
        0.04182805 = product of:
          0.0836561 = sum of:
            0.0836561 = weight(_text_:22 in 1239) [ClassicSimilarity], result of:
              0.0836561 = score(doc=1239,freq=2.0), product of:
                0.18018405 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05145426 = queryNorm
                0.46428138 = fieldWeight in 1239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1239)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    18. 3.2014 19:13:22
  18. Eijk, C.C. van der; Mulligen, E.M. van; Kors, J.A.; Mons, B.; Berg, J. van den: Constructing an associative concept space for literature-based discovery (2004) 0.02
    0.01699952 = product of:
      0.03399904 = sum of:
        0.03399904 = product of:
          0.06799808 = sum of:
            0.06799808 = weight(_text_:learning in 2228) [ClassicSimilarity], result of:
              0.06799808 = score(doc=2228,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.29598233 = fieldWeight in 2228, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2228)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Scientific literature is often fragmented, which implies that certain scientific questions can only be answered by combining information from various articles. In this paper, a new algorithm is proposed for finding associations between related concepts present in literature. To this end, concepts are mapped to a multidimensional space by a Hebbian type of learning algorithm using co-occurrence data as input. The resulting concept space allows exploration of the neighborhood of a concept and finding potentially novel relationships between concepts. The obtained information retrieval system is useful for finding literature supporting hypotheses and for discovering previously unknown relationships between concepts. Tests on artificial data show the potential of the proposed methodology. In addition, preliminary tests on a set of Medline abstracts yield promising results.
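The Hebbian-type learning described above strengthens the association between concepts each time they co-occur. The sketch below illustrates only that general principle; the pairwise-weight representation, update rule, and learning rate are assumptions for illustration, not the authors' actual algorithm (which maps concepts into a multidimensional space):

```python
from collections import defaultdict

def hebbian_update(weights, cooccurring_concepts, lr=0.1):
    """Strengthen pairwise association weights for concepts seen together
    in one document ("cells that fire together wire together")."""
    for i, a in enumerate(cooccurring_concepts):
        for b in cooccurring_concepts[i + 1:]:
            pair = tuple(sorted((a, b)))
            weights[pair] += lr
    return weights

w = defaultdict(float)
for doc in [["gene", "protein"], ["gene", "protein", "disease"]]:
    hebbian_update(w, doc)
# w[("gene", "protein")] grows to 0.2 after co-occurring in both documents
```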
  19. Huang, X.; Peng, F,; An, A.; Schuurmans, D.: Dynamic Web log session identification with statistical language models (2004) 0.02
    0.01699952 = product of:
      0.03399904 = sum of:
        0.03399904 = product of:
          0.06799808 = sum of:
            0.06799808 = weight(_text_:learning in 3096) [ClassicSimilarity], result of:
              0.06799808 = score(doc=3096,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.29598233 = fieldWeight in 3096, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3096)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We present a novel session identification method based on statistical language modeling. Unlike standard timeout methods, which use fixed time thresholds for session identification, we use an information-theoretic approach that yields more robust results for identifying session boundaries. We evaluate our new approach by learning interesting association rules from the segmented session files. We then compare the performance of our approach to three standard session identification methods (the standard timeout method, the reference length method, and the maximal forward reference method) and find that our statistical language modeling approach generally yields superior results. However, as with every method, the performance of our technique varies with changing parameter settings. Therefore, we also analyze the influence of the two key factors in our language-modeling-based approach: the choice of smoothing technique and the language model order. We find that all standard smoothing techniques, save one, perform well, and that performance is robust to language model order.
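For contrast, the standard timeout method the authors benchmark against is easy to state: start a new session whenever the gap between consecutive requests exceeds a fixed threshold. A minimal sketch (the 30-minute default is the common convention in Web-log analysis, not necessarily the paper's setting):

```python
def timeout_sessions(timestamps, threshold=1800):
    """Split a sorted list of request timestamps (in seconds) into sessions
    wherever the gap between consecutive requests exceeds the threshold."""
    sessions, current = [], [timestamps[0]]
    for prev, ts in zip(timestamps, timestamps[1:]):
        if ts - prev > threshold:
            sessions.append(current)
            current = []
        current.append(ts)
    sessions.append(current)
    return sessions

sessions = timeout_sessions([0, 60, 120, 4000, 4100])
# → [[0, 60, 120], [4000, 4100]]
```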
  20. Shibata, N.; Kajikawa, Y.; Sakata, I.: Link prediction in citation networks (2012) 0.02
    0.01699952 = product of:
      0.03399904 = sum of:
        0.03399904 = product of:
          0.06799808 = sum of:
            0.06799808 = weight(_text_:learning in 4964) [ClassicSimilarity], result of:
              0.06799808 = score(doc=4964,freq=2.0), product of:
                0.22973695 = queryWeight, product of:
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.05145426 = queryNorm
                0.29598233 = fieldWeight in 4964, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.464877 = idf(docFreq=1382, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4964)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this article, we build models to predict the existence of citations among papers by formulating link prediction for 5 large-scale datasets of citation networks. The supervised machine-learning model is applied with 11 features. As a result, our learner performs very well, with F1 values between 0.74 and 0.82. Three features in particular, the link-based Jaccard coefficient, the difference in betweenness centrality, and the cosine similarity of term frequency-inverse document frequency vectors, largely affect the predictions of citations. The results also indicate that different models are required for different types of research areas: research fields with a single issue or research fields with multiple issues. In the case of research fields with multiple issues, there are barriers among research fields, because our results indicate that papers tend to be cited locally within each research field. Therefore, one must consider the typology of targeted research areas when building models for link prediction in citation networks.
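The link-based Jaccard coefficient named among the top features scores a candidate pair by the overlap of the two papers' neighborhoods in the citation network. A minimal sketch on a toy graph (the paper IDs and neighbor sets are invented for illustration):

```python
def jaccard_coefficient(neighbors_a, neighbors_b):
    """Jaccard similarity of two papers' neighbor sets: |A ∩ B| / |A ∪ B|."""
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0
    return len(neighbors_a & neighbors_b) / len(union)

# Papers p1 and p2 share two of four distinct neighbors:
score = jaccard_coefficient({"p3", "p4", "p5"}, {"p4", "p5", "p6"})
# score == 0.5
```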

Languages

  • e 129
  • d 8
  • ro 1

Types

  • a 135
  • m 3
  • el 1
  • s 1