Search (326 results, page 1 of 17)

  • language_ss:"e"
  • year_i:[2020 TO 2030}
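These active facets use Lucene/Solr field-query syntax; the range year_i:[2020 TO 2030} includes 2020 but excludes 2030. As a minimal, hypothetical sketch (the endpoint, core name and main query term are assumptions, not taken from this page), the same constraints would typically reach a Solr-style backend as repeated filter-query (fq) parameters:

    # Hypothetical sketch: the two active facets expressed as Solr-style fq parameters.
    # Field names are copied from the facets above; the main query term, the debug
    # flag usage and the endpoint path are illustrative assumptions.
    from urllib.parse import urlencode

    params = [
        ("q", "_text_:data"),             # illustrative main query term
        ("fq", 'language_ss:"e"'),        # facet: English-language records only
        ("fq", "year_i:[2020 TO 2030}"),  # facet: year 2020 (inclusive) to 2030 (exclusive)
        ("debugQuery", "true"),           # request per-document explain output,
                                          # i.e. score breakdowns like the ones below
    ]
    print(urlencode(params))              # query string for e.g. /solr/<core>/select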
  1. Candela, G.: An automatic data quality approach to assess semantic data from cultural heritage institutions (2023) 0.05
    0.045274492 = product of:
      0.090548985 = sum of:
        0.072418936 = weight(_text_:data in 997) [ClassicSimilarity], result of:
          0.072418936 = score(doc=997,freq=12.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.59902847 = fieldWeight in 997, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=997)
        0.01813005 = product of:
          0.0362601 = sum of:
            0.0362601 = weight(_text_:22 in 997) [ClassicSimilarity], result of:
              0.0362601 = score(doc=997,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.2708308 = fieldWeight in 997, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=997)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In recent years, cultural heritage institutions have been exploring the benefits of applying Linked Open Data to their catalogs and digital materials. Innovative and creative methods have emerged to publish and reuse digital contents to promote computational access, such as the concepts of Labs and Collections as Data. Data quality has become a requirement for researchers and training methods based on artificial intelligence and machine learning. This article explores how the quality of Linked Open Data made available by cultural heritage institutions can be automatically assessed. The results obtained can be useful for other institutions who wish to publish and assess their collections.
    Date
    22. 6.2023 18:23:31
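    Scoring note
    The relevance breakdown above is Lucene's ClassicSimilarity (TF-IDF) explain output: each matching term contributes queryWeight * fieldWeight, where queryWeight = idf * queryNorm and fieldWeight = sqrt(termFreq) * idf * fieldNorm, and the total is scaled by coord factors for the fraction of query clauses that matched. The sketch below is a minimal illustration, not the catalog's code; the constants are copied from the breakdown for doc 997, the helper names are invented rather than Lucene's API, and it reproduces the displayed 0.045274492 only up to floating-point rounding.

      import math

      def idf(doc_freq, max_docs):
          # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_weight(freq, doc_freq, query_norm, field_norm, max_docs=44218):
          query_weight = idf(doc_freq, max_docs) * query_norm                    # idf * queryNorm
          field_weight = math.sqrt(freq) * idf(doc_freq, max_docs) * field_norm  # tf * idf * fieldNorm
          return query_weight * field_weight

      QUERY_NORM, FIELD_NORM = 0.03823278, 0.0546875   # values shown in the breakdown above

      w_data = term_weight(freq=12, doc_freq=5088,
                           query_norm=QUERY_NORM, field_norm=FIELD_NORM)         # ~0.07242 for _text_:data
      w_22 = term_weight(freq=2, doc_freq=3622,
                         query_norm=QUERY_NORM, field_norm=FIELD_NORM) * 0.5     # inner coord(1/2), ~0.01813
      score = (w_data + w_22) * 0.5                                              # outer coord(2/4)
      print(score)                                                               # ~0.045274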
  2. Jia, J.: From data to knowledge : the relationships between vocabularies, linked data and knowledge graphs (2021) 0.04
    0.04454566 = product of:
      0.08909132 = sum of:
        0.07614129 = weight(_text_:data in 106) [ClassicSimilarity], result of:
          0.07614129 = score(doc=106,freq=26.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.6298187 = fieldWeight in 106, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=106)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 106) [ClassicSimilarity], result of:
              0.02590007 = score(doc=106,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 106, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=106)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose: The purpose of this paper is to identify the concepts, component parts and relationships between vocabularies, linked data and knowledge graphs (KGs) from the perspectives of data and knowledge transitions. Design/methodology/approach: This paper uses conceptual analysis methods. This study focuses on distinguishing concepts and analyzing composition and intercorrelations to explore data and knowledge transitions. Findings: Vocabularies are the cornerstone for accurately building understanding of the meaning of data. Vocabularies provide for a data-sharing model and play an important role in supporting the semantic expression of linked data and defining the schema layer; they are also used for entity recognition, alignment and linkage for KGs. KGs, which consist of a schema layer and a data layer, are presented as cubes that organically combine vocabularies, linked data and big data. Originality/value: This paper first describes the composition of vocabularies, linked data and KGs. More importantly, this paper innovatively analyzes and summarizes the interrelatedness of these factors, which comes from frequent interactions between data and knowledge. The three factors empower each other and can ultimately empower the Semantic Web.
    Date
    22. 1.2021 14:24:32
  3. Palsdottir, A.: Data literacy and management of research data : a prerequisite for the sharing of research data (2021) 0.04
    0.043889567 = product of:
      0.087779135 = sum of:
        0.07741911 = weight(_text_:data in 183) [ClassicSimilarity], result of:
          0.07741911 = score(doc=183,freq=42.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.6403884 = fieldWeight in 183, product of:
              6.4807405 = tf(freq=42.0), with freq of:
                42.0 = termFreq=42.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=183)
        0.010360028 = product of:
          0.020720055 = sum of:
            0.020720055 = weight(_text_:22 in 183) [ClassicSimilarity], result of:
              0.020720055 = score(doc=183,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.15476047 = fieldWeight in 183, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=183)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose: The purpose of this paper is to investigate the knowledge and attitude about research data management, the use of data management methods and the perceived need for support, in relation to participants' field of research. Design/methodology/approach: This is a quantitative study. Data were collected by an email survey and sent to 792 academic researchers and doctoral students. The total response rate was 18% (N = 139). The measurement instrument consisted of six sets of questions: about data management plans, the assignment of additional information to research data, about metadata, standard file naming systems, training in data management methods and the storing of research data. Findings: The main finding is that knowledge about the procedures of data management is limited, and data management is not a normal practice in the researcher's work. They were, however, in general, of the opinion that the university should take the lead by recommending and offering access to the necessary tools of data management. Taken together, the results indicate that there is an urgent need to increase the researchers' understanding of the importance of data management that is based on professional knowledge and to provide them with resources and training that enable them to make effective and productive use of data management methods. Research limitations/implications: The survey was sent to all members of the population rather than to a sample of it. Because of the response rate, the results cannot be generalized to all researchers at the university. Nevertheless, the findings may provide an important understanding of their research data procedures, in particular what characterizes their knowledge about data management and attitude towards it. Practical implications: Awareness of these issues is essential for information specialists at academic libraries, together with other units within the universities, to be able to design infrastructures and develop services that suit the needs of the research community. The findings can be used to develop data policies and services, based on professional knowledge of best practices and recognized standards, that assist the research community in data management. Originality/value: The study contributes to the existing literature about research data management by examining the results by participants' field of research. Recognition of the issues is critical in order for information specialists, in collaboration with universities, to design relevant infrastructures and services for academics and doctoral students that can promote their research data management.
    Date
    20. 1.2015 18:30:22
  4. Cerda-Cosme, R.; Méndez, E.: Analysis of shared research data in Spanish scientific papers about COVID-19 : a first approach (2023) 0.04
    0.041494917 = product of:
      0.082989834 = sum of:
        0.0700398 = weight(_text_:data in 916) [ClassicSimilarity], result of:
          0.0700398 = score(doc=916,freq=22.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.5793489 = fieldWeight in 916, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=916)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 916) [ClassicSimilarity], result of:
              0.02590007 = score(doc=916,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 916, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=916)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    During the coronavirus pandemic, changes in the way science is done and shared occurred, which motivates meta-research to help understand science communication in crises and improve its effectiveness. The objective is to study how many Spanish scientific papers on COVID-19 published during 2020 share their research data. This is a qualitative and descriptive study applying nine attributes: (a) availability, (b) accessibility, (c) format, (d) licensing, (e) linkage, (f) funding, (g) editorial policy, (h) content, and (i) statistics. We analyzed 1,340 papers; 1,173 (87.5%) did not have research data. A total of 12.5% share their research data, of which 2.1% share their data in repositories, 5% share their data through a simple request, 0.2% do not have permission to share their data, and 5.2% share their data as supplementary material. There is a small percentage that shares their research data; however, it demonstrates the researchers' poor knowledge of how to properly share their research data and their lack of knowledge of what research data is.
    Date
    21. 3.2023 19:22:02
  5. Ilhan, A.; Fietkiewicz, K.J.: Data privacy-related behavior and concerns of activity tracking technology users from Germany and the USA (2021) 0.04
    0.039865177 = product of:
      0.079730354 = sum of:
        0.06678032 = weight(_text_:data in 180) [ClassicSimilarity], result of:
          0.06678032 = score(doc=180,freq=20.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.5523875 = fieldWeight in 180, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=180)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 180) [ClassicSimilarity], result of:
              0.02590007 = score(doc=180,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 180, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=180)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose: This investigation aims to examine the differences and similarities between activity tracking technology users from two regions (the USA and Germany) in their intended privacy-related behavior. The focus lies on data handling after hypothetical discontinuance of use, data protection and privacy policy seeking, and privacy concerns. Design/methodology/approach: The data was collected through an online survey in 2019. In order to identify significant differences between participants from Germany and the USA, the chi-squared test and the Mann-Whitney U test were applied. Findings: The intensity of several privacy-related concerns was significantly different between the two groups. The majority of the participants did not inform themselves about the respective data privacy policies or terms and conditions before installing an activity tracking application. The majority of the German participants knew that they could request the deletion of all their collected data. In contrast, only 35% out of 68 participants from the US knew about this option. Research limitations/implications: This study intends to raise awareness about managing the collected health and fitness data after stopping to use activity tracking technologies. Furthermore, to reduce privacy and security concerns, the involvement of the government, companies and users is necessary to handle and share data more considerably and in a sustainable way. Originality/value: This study sheds light on users of activity tracking technologies from a broad perspective (here, participants from the USA and Germany). It incorporates not only concerns and the privacy paradox but (intended) user behavior, including seeking information on data protection and privacy policy and handling data after hypothetical discontinuance of use of the technology.
    Date
    20. 1.2015 18:30:22
  6. Wu, P.F.: Veni, vidi, vici? : On the rise of scrape-and-report scholarship in online reviews research (2023) 0.03
    0.03466899 = product of:
      0.06933798 = sum of:
        0.051207926 = weight(_text_:data in 896) [ClassicSimilarity], result of:
          0.051207926 = score(doc=896,freq=6.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.42357713 = fieldWeight in 896, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=896)
        0.01813005 = product of:
          0.0362601 = sum of:
            0.0362601 = weight(_text_:22 in 896) [ClassicSimilarity], result of:
              0.0362601 = score(doc=896,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.2708308 = fieldWeight in 896, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=896)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    JASIST has in recent years received many submissions reporting data analytics based on "Big Data" of online reviews scraped from various platforms. By outlining major issues in this type of scrape-and-report scholarship and providing a set of recommendations, this essay encourages online reviews researchers to look at Big Data with a critical eye and treat online reviews as a sociotechnical "thing" produced within the fabric of sociomaterial life.
    Date
    22. 1.2023 18:33:53
  7. Das, S.; Paik, J.H.: Gender tagging of named entities using retrieval-assisted multi-context aggregation : an unsupervised approach (2023) 0.03
    0.025689062 = product of:
      0.051378123 = sum of:
        0.035838082 = weight(_text_:data in 941) [ClassicSimilarity], result of:
          0.035838082 = score(doc=941,freq=4.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.29644224 = fieldWeight in 941, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=941)
        0.015540041 = product of:
          0.031080082 = sum of:
            0.031080082 = weight(_text_:22 in 941) [ClassicSimilarity], result of:
              0.031080082 = score(doc=941,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.23214069 = fieldWeight in 941, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=941)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Inferring the gender of named entities present in a text has several practical applications in information sciences. Existing approaches toward name gender identification rely exclusively on using the gender distributions from labeled data. In the absence of such labeled data, these methods fail. In this article, we propose a two-stage model that is able to infer the gender of names present in text without requiring explicit name-gender labels. We use coreference resolution as the backbone for our proposed model. To aid coreference resolution where the existing contextual information does not suffice, we use a retrieval-assisted context aggregation framework. We demonstrate that state-of-the-art name gender inference is possible without supervision. Our proposed method matches or outperforms several supervised approaches and commercially used methods on five English language datasets from different domains.
    Date
    22. 3.2023 12:00:14
  8. Kang, M.: Dual paths to continuous online knowledge sharing : a repetitive behavior perspective (2020) 0.02
    0.024763562 = product of:
      0.049527124 = sum of:
        0.03657709 = weight(_text_:data in 5985) [ClassicSimilarity], result of:
          0.03657709 = score(doc=5985,freq=6.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.30255508 = fieldWeight in 5985, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5985)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 5985) [ClassicSimilarity], result of:
              0.02590007 = score(doc=5985,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 5985, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5985)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose: Continuous knowledge sharing by active users, who are highly active in answering questions, is crucial to the sustenance of social question-and-answer (Q&A) sites. The purpose of this paper is to examine such knowledge sharing considering reason-based elaborate decision and habit-based automated cognitive processes. Design/methodology/approach: To verify the research hypotheses, survey data on subjective intentions and web-crawled data on objective behavior are utilized. The sample size is 337 with a response rate of 27.2 percent. Negative binomial and hierarchical linear regressions are used given the skewed distribution of the dependent variable (i.e. the number of answers). Findings: Both elaborate decision (linking satisfaction, intentions and continuance behavior) and automated cognitive processes (linking past and continuance behavior) are significant and substitutable. Research limitations/implications: By measuring both subjective intentions and objective behavior, it verifies a detailed mechanism linking continuance intentions, past behavior and continuous knowledge sharing. The significant influence of automated cognitive processes implies that online knowledge sharing is habitual for active users. Practical implications: Understanding that online knowledge sharing is habitual is imperative to maintaining continuous knowledge sharing by active users. Knowledge sharing trends should be monitored to check if the frequency of sharing decreases. Social Q&A sites should intervene to restore knowledge sharing behavior through personalized incentives. Originality/value: This is the first study utilizing both subjective intentions and objective behavior data in the context of online knowledge sharing. It also introduces habit-based automated cognitive processes to this context. This approach extends the current understanding of continuous online knowledge sharing behavior.
    Date
    20. 1.2015 18:30:22
  9. Becker, C.; Maemura, E.; Moles, N.: The design and use of assessment frameworks in digital curation (2020) 0.02
    0.023845656 = product of:
      0.09538262 = sum of:
        0.09538262 = weight(_text_:becker in 5508) [ClassicSimilarity], result of:
          0.09538262 = score(doc=5508,freq=2.0), product of:
            0.25693014 = queryWeight, product of:
              6.7201533 = idf(docFreq=144, maxDocs=44218)
              0.03823278 = queryNorm
            0.3712395 = fieldWeight in 5508, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.7201533 = idf(docFreq=144, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5508)
      0.25 = coord(1/4)
    
  10. Jiao, H.; Qiu, Y.; Ma, X.; Yang, B.: Dissemination effect of data papers on scientific datasets (2024) 0.02
    0.02301258 = product of:
      0.09205032 = sum of:
        0.09205032 = weight(_text_:data in 1204) [ClassicSimilarity], result of:
          0.09205032 = score(doc=1204,freq=38.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.7614136 = fieldWeight in 1204, product of:
              6.164414 = tf(freq=38.0), with freq of:
                38.0 = termFreq=38.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1204)
      0.25 = coord(1/4)
    
    Abstract
    Open data as an integral part of the open science movement enhances the openness and sharing of scientific datasets. Nevertheless, the normative utilization of data journals, data papers, scientific datasets, and data citations necessitates further research. This study aims to investigate the citation practices associated with data papers and to explore the role of data papers in disseminating scientific datasets. Dataset accession numbers from NCBI databases were employed to analyze the prevalence of data citations for data papers from PubMed Central. A dataset citation practice identification rule was subsequently established. The findings indicate a consistent growth in the number of biomedical data journals published in recent years, with data papers gaining attention and recognition as both publications and data sources. Although the use of data papers as citation sources for data remains relatively rare, there has been a steady increase in data paper citations for data utilization through formal data citations. Furthermore, the increasing proportion of datasets reported in data papers that are employed for analytical purposes highlights the distinct value of data papers in facilitating the dissemination and reuse of datasets to support novel research.
  11. Li, K.; Greenberg, J.; Dunic, J.: Data objects and documenting scientific processes : an analysis of data events in biodiversity data papers (2020) 0.02
    0.022398803 = product of:
      0.08959521 = sum of:
        0.08959521 = weight(_text_:data in 5615) [ClassicSimilarity], result of:
          0.08959521 = score(doc=5615,freq=36.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.7411056 = fieldWeight in 5615, product of:
              6.0 = tf(freq=36.0), with freq of:
                36.0 = termFreq=36.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5615)
      0.25 = coord(1/4)
    
    Abstract
    The data paper, an emerging scholarly genre, describes research data sets and is intended to bridge the gap between the publication of research data and scientific articles. Research examining how data papers report data events, such as data transactions and manipulations, is limited. The research reported in this article addresses this limitation and investigates how data events are inscribed in data papers. A content analysis was conducted examining the full texts of 82 data papers, drawn from the curated list of data papers connected to the Global Biodiversity Information Facility. Data events recorded for each paper were organized into a set of 17 categories. Many of these categories are described together in the same sentence, which indicates the messiness of data events in the laboratory space. The findings challenge the degree to which data papers are a distinct genre compared to research articles and describe data-centric research processes in a thorough way. This article also discusses how our results could inform a better data publication ecosystem in the future.
  12. Schöpfel, J.; Farace, D.; Prost, H.; Zane, A.; Hjoerland, B.: Data documents (2021) 0.02
    0.021946255 = product of:
      0.08778502 = sum of:
        0.08778502 = weight(_text_:data in 586) [ClassicSimilarity], result of:
          0.08778502 = score(doc=586,freq=24.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.7261322 = fieldWeight in 586, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=586)
      0.25 = coord(1/4)
    
    Abstract
    This article presents and discusses different kinds of data documents, including data sets, data studies, data papers and data journals. It provides descriptive and bibliometric data on different kinds of data documents and discusses the theoretical and philosophical problems by classifying documents according to the DIKW model (data documents, information documents, knowledge documents and wisdom documents). Data documents are, on the one hand, an established category today, even with its own data citation index (DCI). On the other hand, data documents have blurred boundaries in relation to other kinds of documents and seem sometimes to be understood from the problematic philosophical assumption that a datum can be understood as "a single, fixed truth, valid for everyone, everywhere, at all times".
  13. Bossaller, J.; Million, A.J.: ¬The research data life cycle, legacy data, and dilemmas in research data management (2023) 0.02
    0.021946255 = product of:
      0.08778502 = sum of:
        0.08778502 = weight(_text_:data in 966) [ClassicSimilarity], result of:
          0.08778502 = score(doc=966,freq=24.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.7261322 = fieldWeight in 966, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=966)
      0.25 = coord(1/4)
    
    Abstract
    This paper presents findings from an interview study of research data managers in academic data archives. Our study examined policies and professional autonomy with a focus on dilemmas encountered in everyday work by data managers. We found that dilemmas arose at every stage of the research data lifecycle, and legacy data presents particularly vexing challenges. The iFields' emphasis on knowledge organization and representation provides insight into how data, used by scientists, are used to create knowledge. The iFields' disciplinary emphasis also encompasses the sociotechnical complexity of dilemmas that we found arise in research data management. Therefore, we posit that iSchools are positioned to contribute to data science education by teaching about ethics and infrastructure used to collect, organize, and disseminate data through problem-based learning.
  14. Huang, T.; Nie, R.; Zhao, Y.: Archival knowledge in the field of personal archiving : an exploratory study based on grounded theory (2021) 0.02
    0.021407552 = product of:
      0.042815104 = sum of:
        0.02986507 = weight(_text_:data in 173) [ClassicSimilarity], result of:
          0.02986507 = score(doc=173,freq=4.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.24703519 = fieldWeight in 173, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=173)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 173) [ClassicSimilarity], result of:
              0.02590007 = score(doc=173,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 173, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=173)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose: The purpose of this paper is to propose a theoretical framework to illustrate the archival knowledge applied by archivists in their personal archiving (PA) and the mechanism of the application of archival knowledge in their PA. Design/methodology/approach: The grounded theory methodology was adopted. For data collection, in-depth interviews were conducted with 21 archivists in China. Data analysis was performed using open coding, axial coding and selective coding to organise the archival knowledge composition of PA and to develop the awareness-knowledge-action (AKA) integration model of archival knowledge application in the field of PA, according to the principles of grounded theory. Findings: The archival knowledge involved in the field of PA comprises four principal categories: documentation, arrangement, preservation and appraisal. Three interactive factors are involved in archivists' archival knowledge application in the field of PA behaviour: awareness, knowledge and action, which form a pattern of awareness leading, knowledge guidance and action innovation; archivists' PA practice is flexible and innovative. The paper underscores the need to improve archival literacy among the general public. Originality/value: The study constructs a theoretical framework to identify the specialised archival knowledge and skills of PA, which is able to provide solutions for non-specialist PA, and develops an AKA model to explain the interaction relationships between awareness, knowledge and action in the field of PA.
    Date
    22. 1.2021 14:20:27
  15. Guo, T.; Bai, X.; Zhen, S.; Abid, S.; Xia, F.: Lost at starting line : predicting maladaptation of university freshmen based on educational big data (2023) 0.02
    0.021407552 = product of:
      0.042815104 = sum of:
        0.02986507 = weight(_text_:data in 1194) [ClassicSimilarity], result of:
          0.02986507 = score(doc=1194,freq=4.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.24703519 = fieldWeight in 1194, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1194)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 1194) [ClassicSimilarity], result of:
              0.02590007 = score(doc=1194,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 1194, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1194)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The transition from secondary education to higher education could be challenging for most freshmen. For students who fail to adjust to university life smoothly, their status may worsen if the university cannot offer timely and proper guidance. Helping students adapt to university life is a long-term goal for any academic institution. Therefore, understanding the nature of the maladaptation phenomenon and the early prediction of "at-risk" students are crucial tasks that urgently need to be tackled effectively. This article aims to analyze the relevant factors that affect the maladaptation phenomenon and predict this phenomenon in advance. We develop a prediction framework (MAladaptive STudEnt pRediction, MASTER) for the early prediction of students with maladaptation. First, our framework uses the SMOTE (Synthetic Minority Oversampling Technique) algorithm to solve the data label imbalance issue. Moreover, a novel ensemble algorithm, priority forest, is proposed for outputting ranks instead of binary results, which enables us to perform proactive interventions in a prioritized manner where limited education resources are available. Experimental results on real-world education datasets demonstrate that the MASTER framework outperforms other state-of-the-art methods.
    Date
    27.12.2022 18:34:22
  16. Hagen, L.; Patel, M.; Luna-Reyes, L.: Human-supervised data science framework for city governments : a design science approach (2023) 0.02
    0.02101194 = product of:
      0.08404776 = sum of:
        0.08404776 = weight(_text_:data in 1016) [ClassicSimilarity], result of:
          0.08404776 = score(doc=1016,freq=22.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.6952187 = fieldWeight in 1016, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=1016)
      0.25 = coord(1/4)
    
    Abstract
    The importance of involving humans in the data science process has been widely discussed in the literature. However, studies lack details on how to involve humans in the process. Using a design science approach, this paper proposes and evaluates a human-supervised data science framework in the context of local governments. Our findings suggest that the involvement of a stakeholder group, public managers in this case, in the data science process enhanced the quality of data science outcomes. Public managers' detailed knowledge of both the data and the context was beneficial for improving future data science infrastructure. In addition, the study suggests that local governments can harness the value of data-driven approaches to policy and decision making through focalized investments in improving data and data science infrastructure, which includes the culture and processes necessary to incorporate data science and analytics into the decision-making process.
  17. Yoon, A.; Copeland, A.: Toward community-inclusive data ecosystems : challenges and opportunities of open data for community-based organizations (2020) 0.02
    0.020447217 = product of:
      0.08178887 = sum of:
        0.08178887 = weight(_text_:data in 5) [ClassicSimilarity], result of:
          0.08178887 = score(doc=5,freq=30.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.6765338 = fieldWeight in 5, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5)
      0.25 = coord(1/4)
    
    Abstract
    The benefits of open data for helping to address societal problems and strengthen communities are well recognized; unfortunately, previous studies have found that smaller communities are often excluded from the current data ecosystem because of existing technological, technical, cognitive, and practical barriers. This study aims to investigate the process of communities' data use for community development and decision-making, focusing on the opportunities and challenges of data for communities. From interviews with 25 staff from community-based organizations (CBOs) in nine small, medium, and large cities in the United States, the findings of this study describe data's role in supporting communities' development while reporting several major challenges that hinder CBOs' data use: difficulty accessing data, limitations of open data (un-local nature, excluding essential data from being open), limited data capacity (especially in data literacy skills), and difficulties using and accessing existing data infrastructures. Our findings suggest opportunities for addressing these challenges, particularly by creating educational programming, building partnerships within data ecosystems, and bringing community voices forward in current data ecosystems, which are critical to realizing data's potential for all citizens.
  18. Fonseca, F.: Whether or when : the question on the use of theories in data science (2021) 0.02
    0.020447217 = product of:
      0.08178887 = sum of:
        0.08178887 = weight(_text_:data in 409) [ClassicSimilarity], result of:
          0.08178887 = score(doc=409,freq=30.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.6765338 = fieldWeight in 409, product of:
              5.477226 = tf(freq=30.0), with freq of:
                30.0 = termFreq=30.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=409)
      0.25 = coord(1/4)
    
    Abstract
    Data Science can be considered a technique or a science. As a technique, it is more interested in the "what" than in the "why" of data. It does not need theories that explain how things work, it just needs the results. As a science, however, working strictly from data and without theories contradicts the post-empiricist view of science. In this view, theories come before data and data is used to corroborate or falsify theories. Nevertheless, one of the most controversial statements about Data Science is that it is a science that can work without theories. In this conceptual paper, we focus on the science aspect of Data Science. How is Data Science as a science? We propose a three-phased view of Data Science that shows that different theories have different roles in each of the phases we consider. We focus on when theories are used in Data Science rather than the controversy of whether theories are used in Data Science or not. In the end, we will see that the statement "Data Science works without theories" is better put as "in some of its phases, Data Science works without the theories that originally motivated the creation of the data."
  19. Geras, A.; Siudem, G.; Gagolewski, M.: Should we introduce a dislike button for academic articles? (2020) 0.02
    0.020440696 = product of:
      0.04088139 = sum of:
        0.02534135 = weight(_text_:data in 5620) [ClassicSimilarity], result of:
          0.02534135 = score(doc=5620,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.2096163 = fieldWeight in 5620, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=5620)
        0.015540041 = product of:
          0.031080082 = sum of:
            0.031080082 = weight(_text_:22 in 5620) [ClassicSimilarity], result of:
              0.031080082 = score(doc=5620,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.23214069 = fieldWeight in 5620, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5620)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    There is a mutual resemblance between the behavior of users of Stack Exchange and the dynamics of the citation accumulation process in the scientific community, which enabled us to tackle the outwardly intractable problem of assessing the impact of introducing "negative" citations. Although the most frequent reason to cite an article is to highlight the connection between the two publications, researchers sometimes mention an earlier work to cast a negative light. When computing citation-based scores, for instance the h-index, information about the reason why an article was mentioned is neglected. Therefore, it can be questioned whether these indices describe scientific achievements accurately. In this article we shed light on the problem of "negative" citations, analyzing data from Stack Exchange and, to draw more universal conclusions, we derive an approximation of citation scores. Here we show that the quantified influence of introducing negative citations is of lesser importance and that they could be used as an indicator of where the attention of the scientific community is allocated.
    Date
    6. 1.2020 18:10:22
  20. Lorentzen, D.G.: Bridging polarised Twitter discussions : the interactions of the users in the middle (2021) 0.02
    0.020440696 = product of:
      0.04088139 = sum of:
        0.02534135 = weight(_text_:data in 182) [ClassicSimilarity], result of:
          0.02534135 = score(doc=182,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.2096163 = fieldWeight in 182, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=182)
        0.015540041 = product of:
          0.031080082 = sum of:
            0.031080082 = weight(_text_:22 in 182) [ClassicSimilarity], result of:
              0.031080082 = score(doc=182,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.23214069 = fieldWeight in 182, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=182)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose: The purpose of the paper is to analyse the interactions of bridging users in Twitter discussions about vaccination. Design/methodology/approach: Conversational threads were collected through filtering the Twitter stream using keywords and the most active participants in the conversations. Following data collection and anonymisation of tweets and user profiles, a retweet network was created to find users bridging the main clusters. Four conversations were selected, ranging from 456 to 1,983 tweets long, and then analysed through content analysis. Findings: Although different opinions met in the discussions, a consensus was rarely built. Many sub-threads involved insults and criticism, and participants seemed not interested in shifting their positions. However, examples of reasoned discussions were also found. Originality/value: The study analyses conversations on Twitter, which are rarely studied. The focus on the interactions of bridging users adds to the uniqueness of the paper.
    Date
    20. 1.2015 18:30:22

Types

  • a 311
  • el 22
  • m 7
  • p 5
  • s 1
  • x 1