What does Transparent Science mean?
Scientific claims should be subject to scrutiny by other researchers and the public at large. An important requirement for such scrutiny is that researchers make their claims transparent in a way that other researchers are able to use easily available resources to form a complete understanding of the methods that were used by the original researchers. In the social sciences, especially given the personal computing and Internet revolutions and the wide availability of data and processing power, it is essential that data, code, and analyses be transparent.
Christensen (2016, p. 5)
Difference Between Open Science and Transparent Science
While Open Science aims at making research outputs available to the public, Transparent Science's goal is to ensure that the results are reproducible. In other words, to be open in a meaningful way and to achieve real openness of research data, science has to be transparent.
Essentials of Transparent Science
- Document your data collection and share your data (see also the knowledge base’s section on data sharing)
- To ensure transparent science and comprehensible data, you should document your data collection and analyses; this entails, for example, your hypotheses (which should be formulated before data collection!), research goals, research method, population, sample size and characteristics, power analysis results, data collection method, etc.
- The reproducibility of your results increases substantially if the data underlying the publication are shared
- Note: If the data collection process itself is reproducible, the benefit of providing the data underlying your publication is small. Thus, data sharing may not be necessary in the following cases:
- for secondary research: if only existing research data are used that can be retrieved with a persistent identifier
- for simulation studies: if the data generating code is provided and the amount of generated data is too large to be efficiently shared
- Always document the instruments and software you used for collecting and analyzing your data
- You should always state which versions were used. Note that this documentation is indispensable if you want to ensure that other researchers can reproduce your results. For example, in Mplus the default option for the treatment of missing values changed between Version 6 and 6.1 (see Mplus Version History). As a consequence, researchers using Version 6 or earlier will obtain different results if they execute the same code on exactly the same data.
- For commercial statistical software, such as SPSS, documenting the version will be quite easy (assuming all calculations are carried out using the same SPSS version).
- For R, the version of your R installation and the versions of all packages that are used have to be documented. You can simply execute the sessionInfo() command after all packages are loaded to generate this overview. Moreover, the packrat package helps you manage these dependencies.
- For Mplus the software version should be reported. Fortunately, your .out-file will include this documentation automatically.
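For an R-based analysis, this version documentation can be generated in a few lines. The following sketch is an illustration, not a prescribed workflow; the output file name and the loaded package are arbitrary choices, and all packages used in the analysis should be loaded before the call:

```r
# Load every package used in the analysis first, so that
# its version appears in the session information.
library(MASS)

# sessionInfo() reports the R version, the platform, and the
# versions of all attached packages.
info <- sessionInfo()
print(info)

# Save the overview to a text file that can be shared
# alongside the analysis scripts and data.
writeLines(capture.output(print(info)), "session-info.txt")
```

Sharing this file together with syntax and data lets other researchers check whether version differences could explain diverging results.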
If syntax and data are stored or published in order to document data processing or statistical analyses, this syntax has to be executable. This may seem trivial, but experience shows that it is not. Among other things, you should ensure that:
- filenames inside the syntax match the names of the files that are provided for download
- all programs and packages that are needed in order to run the syntax are documented (including version numbers)
- syntax runs without errors
- results based on the published research outputs (syntax and data) do not differ from the results published in the article
A possible solution for combining executable syntax and documentation is R Markdown in combination with knitr (see Mair's introductory article).
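A minimal sketch of such a document might look as follows (the title, file name, and chunk labels are illustrative assumptions). Knitting it executes the R chunks and weaves their results into the report, so the published documentation and the published syntax cannot drift apart:

````markdown
---
title: "Analysis for: Example Study"
output: html_document
---

## Data preparation

```{r load-data}
# Read the shared data file; this name must match the name
# of the data file provided for download.
dat <- read.csv("example-data.csv")
summary(dat)
```

## Session information

```{r session-info}
# Document R and package versions for reproducibility.
sessionInfo()
```
````

Rendering the file (e.g., with the Knit button in RStudio) also serves as a check that the syntax runs without errors from start to finish.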
Best Practices and Guides on Transparent Science
- Project TIER offers best practices for Stata and R users as well as a software-neutral version (with a focus on analyses, organization of files, and documentation of syntax)
- Berkeley Initiative for Transparency in the Social Sciences offers a Manual of Best Practices in Transparent Social Science Research (by Garret Christensen, 2016)
- Pre-Registrations. Pre-registrations of studies are increasingly demanded. Some journals already accept only pre-registered articles. A core element of these pre-registered reports is the pre-analysis plan. See, for example, the guidance for pre-registration and analysis of the OSF preregistration challenge. Pre-registrations can be generated easily using aspredicted.org. For clinical trials, established platforms like clinicaltrials.gov exist. A comprehensive introduction to pre-registrations is provided by Schönbrodt, Scheel and Stachl (n.d.).
- Standard Operating Procedures (SOP). Standard Operating Procedures can be an efficient way of establishing default practices (e.g., for statistical analyses). If these practices are specified in separate documents, researchers no longer have to outline every single aspect of their data analyses in pre-analysis plans. Therefore, SOPs minimize the effort required for the creation of these plans and increase the transparency of research practices.
- Collection of Standard Operating Procedures by the Open Science Framework
- Example SOPs of the experimental research group led by Donald P. Green at Columbia University, as well as an article outlining the benefits of standard operating procedures, are freely available on GitHub: https://github.com/acoppock/Green-Lab-SOP
- The German Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF) offers templates for SOPs which are aimed at medical research projects.
References
- Christensen, G. (2016). Manual of best practices in transparent social science research. Retrieved from https://github.com/garretchristensen/BestPracticesManual/blob/master/Manual.pdf
Further Resources
- Badges. The Open Science Framework promotes Badges to Acknowledge Open Practices. The badges introduce an open science standard as well as a visual icon that makes it easy to see whether a study has been conducted following open science principles. There are different badges for achieving different open science goals:
- Open Data badge: earned for making your data publicly available in order to make reproduction and reuse possible
- Open Materials badge: earned for making the components of the research methodology publicly available
- Preregistered/Preregistered + Analysis Plan badges: earned for preregistering research without/with an analysis plan for the research
- CRediT. This initiative proposed a Contributor Roles Taxonomy which aims at giving each contributor to a research output the credit they deserve. See, for example, this video about the integration of CRediT in Aries.
- The Netzwerk der Open Science Initiativen (NOSI) provides information on German Open Science initiatives
- Lakens, D. (2016, October 28). Open Science and Good Research Practices. Retrieved from https://mfr.osf.io/render?url=https://osf.io/h4ue5/?action=download%26mode=render