Transparent Science

What does Transparent Science mean?

Scientific claims should be subject to scrutiny by other researchers and the public at large. An important requirement for such scrutiny is that researchers make their claims transparent in a way that other researchers are able to use easily available resources to form a complete understanding of the methods that were used by the original researchers. In the social sciences, especially given the personal computing and Internet revolutions and the wide availability of data and processing power, it is essential that data, code, and analyses be transparent.

Christensen (2016, p. 5)

Difference Between Open Science and Transparent Science

While Open Science aims at making research outputs available to the public, Transparent Science’s goal is to ensure that the results are reproducible. In other words, for research data to be open in a meaningful way, science has to be transparent.

Essentials of Transparent Science

  • Document your data collection and share your data (see also the knowledge base’s section on data sharing)
    • To make your research transparent and your data comprehensible, document your data collection and analyses; this includes, for example, your hypotheses (which should be formulated before data collection!), research goals, research method, population, sample size and characteristics, power analysis results, data collection method, etc.
    • Sharing the data underlying a publication greatly increases the reproducibility of its results
    • Note: If the data collection process itself is reproducible, the benefit of providing the data underlying your publication is small. Data sharing may therefore not be necessary in the following cases:
      • for secondary research: if only existing research data are used that can be retrieved with a persistent identifier
      • for simulation studies: if the data generating code is provided and the amount of generated data is too large to be efficiently shared
  • Always document the instruments and software you used for collecting and analyzing your data
    • You should always state which versions were used; this documentation is essential if other researchers are to reproduce your results. For example, in Mplus the default treatment of missing values changed between versions 6 and 6.1 (see the Mplus Version History). As a consequence, researchers using version 6 or earlier will obtain different results when executing the same code on exactly the same data.
      • For commercial statistical software, such as SPSS, documenting the version will be quite easy (assuming all calculations are carried out using the same SPSS version).
      • For R, the version of your R installation and the versions of all packages used have to be documented. Simply execute the sessionInfo() command after all packages are loaded to generate this overview. Moreover, the packrat package helps you manage these dependencies.
      • For Mplus, the software version should be reported; fortunately, your .out file includes this information automatically.
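As a minimal sketch of the documentation step above (in R, since sessionInfo() and packrat are R tools; the output filename is just an example), you can write the session information to a file that is shared alongside your analysis scripts:

```r
# Load all packages used by the analysis first, e.g.:
# library(lavaan)

# Capture the R version, platform, and versions of all loaded packages,
# and store them next to the analysis scripts
writeLines(capture.output(sessionInfo()), "sessionInfo.txt")

# packrat can additionally snapshot package versions so that others
# can restore the same package library:
# packrat::init()   # initialize a private, versioned package library
```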

If syntax and data are stored or published to document data processing or statistical analyses, the syntax has to be executable. This may seem trivial, but experience shows that it is not. Among other things, you should ensure that:

  • filenames referenced inside the syntax match the names of the files provided for download
  • all programs and packages that are needed in order to run the syntax are documented (including version numbers)
  • syntax runs without errors
  • results produced by the published research outputs (syntax and data) do not differ from the results published in the article

A possible solution for combining executable syntax and documentation is an R Markdown script that uses knitr (see Mair’s introductory article).
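A minimal .Rmd skeleton might look as follows (the title, file name, and variable names are illustrative placeholders); knitting the file re-runs the analysis and regenerates the reported results, so documentation and syntax cannot drift apart:

````markdown
---
title: "Analysis for: <paper title>"
output: html_document
---

```{r setup}
# Document the software environment in the rendered report
sessionInfo()
```

```{r analysis}
# "data.csv" stands in for the shared data file
dat <- read.csv("data.csv")
summary(lm(y ~ x, data = dat))
```
````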

Best Practices and Guides on Transparent Science


Further Resources

    • Badges. The Open Science Framework promotes Badges to Acknowledge Open Practices. The badges introduce an open science standard as well as a visual icon that lets readers see whether a study has been conducted following open science principles. There are different badges for different open science ‘goals’:
      • Open Data badge: earned for making your data publicly available in order to make reproduction and reuse possible
      • Open Materials badge: earned for making the components of the research methodology publicly available
      • Preregistered/Preregistered + Analysis Plan badges: earned for preregistering research without/with an analysis plan for the research