Manuscript etiquette

Quality control

I care about the time of my colleagues and collaborators. Therefore, I want to demand as little effort from them as possible while making it easy for them to give meaningful comments on the work.

The checklist below is meant to be run through before sending out a draft. It leads to less time being spent on the obvious and on practicalities, and more time going directly into content-centered comments and proposals for revision.

1. Make ‘feeling’ notes
Often, while writing or revising, I already get the feeling that a certain comment will arise or that something is unclear, but I cannot yet pinpoint what it is exactly. Make a small mark, then go back later and take a moment to think about where you stumbled.

2. Read every page out loud
This serves as a check on language as well as on the flow of thoughts and words.

3. Check reference list
Any weird formatting?

4. Tables and in-text numbers
Check against sources (prevent mistakes). Do counts sum up correctly? Do percentages add up? Are the numbers correct and the same in tables & text?

5. Have all previous comments been addressed?
Consider adding a version guide listing the requests and how they have been implemented.

Productivity

Making revisions in a manuscript is a laborious task. Revisions are often delayed by the busy schedules of co-authors, who only manage to give their comments after additional reminders.

I was amazed by a colleague who told me that, when he was working in a two-person research group, they were able to send out full manuscripts within two days. Because their communication was so straightforward and their focus maximized, discussion happened easily and textual adaptations could readily be implemented.

Therefore, I have decided to adopt a similar two-author-centered strategy for my manuscripts. Although I usually collaborate with multiple people and thus have more co-authors, I try to produce a very good first draft with only one other colleague, in close collaboration and with rapid communication. Only once we are both satisfied do we send the manuscript out to the other co-authors for comments and revisions.
In subsequent rounds of comments, the same principle is applied again, so that good intermediate versions are always produced before they are shared with the other contributors.

In this way, I strive to make sure that every contributor’s time is used efficiently, and that working through a draft manuscript becomes less laborious, because there are naturally fewer comments.

More reading

Why are papers rejected? Read these common reasons

Anticipate how others will peer review your work. Help from Matt Might on how to peer review.

How to respond to peer review. Again a great resource from Matt.

My collection of reads on specific statistical problems

  • Common statistical mistakes you should avoid (link)
  • What makes a statistical analysis wrong? (link)
  • P values in small sample sizes (link)
  • The common usage of a Welch-type t-test (link)
  • t-test assumptions (link)
  • A good review of the terminological uncertainties concerning dependent / independent variables (link)
  • Wilcoxon significance with similar means (link), check out the ancient median test (wikipedia)
  • Alphas, P-Values, and Confidence Intervals: Oh My! (link)
  • What is the Lan-DeMets approach to interim analysis? Calculation of corrected p-values for increased Type I error (link)
  • Stop criteria for clinical trials: O’Brien–Fleming–type boundaries (link)
  • How to choose a statistical test (link and link2)
  • Analysis of covariance (here and here)
  • Logistic Regression in R (link), estimated regression equation (link)
  • 6 Types of Dependent Variables that will Never Meet the GLM Normality Assumption (link)
  • Spearman vs Pearson correlation for non-normally distributed data (link)
  • Median vs mean: when the median doesn’t mean what it means (link) and the median isn’t the message (link)
  • Shifting from hypothesis testing to estimation techniques. “The new statistics.” Cumming (2014) Psychological Science (PDF link)
  • Understanding Bayes: A Look at the Likelihood (link) by Alex Etz
  • Krzywinski M, Altman N. Points of significance: Power and sample size. Nat Meth. 2013 Nov 26;10(12):1139–40.
  • Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. Nature Publishing Group; 2013 Apr 10;14(5):365–76.
  • Statistical inference is only mostly wrong (link)
  • Interpreting regression output in R
  • Beginners Guide to Bayesian statistics (PDF)
  • The confusing case of choosing x- and y-axes (link)
  • General linear models: an introduction (link)
  • Growth curve analysis in R (PDF)
  • Generating a nested case-control cohort study using Epi in R (link)
  • Understanding MatchIt output in R (StackExchange)
  • Matching R package (paper PDF)
  • optmatch on CRAN (PDF)
  • What is a propensity score (NCBI) and intro for observational studies (NCBI)


Improving reproducibility: approaching the individual researcher

Science is in the midst of a reproducibility crisis – that’s old news anno 2015. However, less is known amongst the research community about what the individual scientist can contribute towards making scientific findings more robust.

Yet, it is at the level of the individual researcher where changes to research routines can transform a flawed current system into a better one, with improved reproducibility and integrity of research. In other words, by adapting their day-to-day research practices, each researcher has the chance to make important contributions.

So what exactly can the individual researcher do?

Experimental design

  • Publish your experimental design and data analysis strategy online before conducting the study. Discuss and improve it openly with colleagues and collaborators. Depending on the experimental characteristics, the study plan can be placed on a repository (see below) or into an academic journal.
  • Register your study before commencing it. This counts especially for all studies involving humans. Go to clinicaltrials.gov
  • Consider a multi-center approach for experimental studies as well, even for animal studies.
  • Plan an advanced statistical analysis. Don’t focus on a p-value only; describe confidence intervals and effect sizes. More statistical tips from Cumming (free access) and on my statistics resources collection. Consult a statistician early!
  • Include a sustainable science statement in project proposals and grant requests. Signal to others what you do to improve the integrity of your research. Set your own standards. Guidance from the COS badges project.
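The advice above on planning the statistical analysis beyond a bare p-value can be made concrete with a small sketch in R (the language recommended elsewhere in this collection). The effect size, sample sizes, and simulated data below are hypothetical illustrations, not recommendations for any particular study:

```r
# A-priori power calculation: subjects per group needed to detect a
# hypothetical effect of Cohen's d = 0.5 with 80% power at alpha = 0.05
plan <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
cat("n per group:", ceiling(plan$n), "\n")

# After the experiment: report the effect size with its confidence
# interval, not only the p-value (simulated data as a stand-in)
set.seed(1)
control   <- rnorm(64, mean = 0,   sd = 1)
treatment <- rnorm(64, mean = 0.5, sd = 1)

tt <- t.test(treatment, control)
pooled_sd <- sqrt((var(treatment) + var(control)) / 2)
cohens_d  <- (mean(treatment) - mean(control)) / pooled_sd

cat("95% CI of the mean difference:", round(tt$conf.int, 2), "\n")
cat("Cohen's d:", round(cohens_d, 2), "\n")
```

Pre-registering a script like this alongside the study plan makes the analysis strategy auditable before any data are collected.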

Doing the experiment

  • Conduct the experiment triple-blinded. This also applies to laboratory research.
  • Use an electronic laboratory notebook. Fully searchable, never lose anything, improves collaboration. Recommendations: Labguru as a simple version; the advanced go with Open Science Framework / GitHub. Consider having your lab notebook entirely public or even collaborative (open notebook science).

Analysis, interpretation and write-up/presentation

  • Publish your dataset. This increases the transparency of your work, though it is often in conflict with privacy issues. Use an online repository (read below), or choose a journal, e.g. Scientific Data.
  • Publish your analysis. This can be done for all statistical programs. R is unquestionably the best solution for this so far, especially when you’re using Rmarkdown and knitr. Shiny allows the user to interact with the data.
  • Publish negative findings. If you’d like it as a full manuscript, use e.g. PLOS or the Journal of Negative Results. Don’t want to write it up to manuscript level? Put it on Open Science Framework or figshare!
  • Validate your findings. Verify! For example, use a secondary technique that may strengthen confidence in the present results. Have a collaborator replicate the study in their lab. Find an independent reviewer for your statistical analysis.
  • Follow a reporting guideline when writing up for publication. Guidelines are available for RCTs, observational studies, systematic reviews/meta-analyses, case reports, animal studies, qualitative research, and economic evaluations at the Equator Network.
  • Add a reproducibility declaration to your manuscript. In the methodology section or at the end of the paper. State where the data, code, and analysis can be found. If you want, choose a Zenodo DOI with an embargo before publication.
  • Publish your article pre-print online. Discuss it with others in the field (open pre-publication peer review). Have them repeat analyses; let them interact with your data. At the latest, do this upon submission to a journal.
  • Submit your article to a journal with open peer review. This usually means that the full reviews are published online together with the final article to promote transparency.
  • Make presentations available online. Posters and seminars are readily available and publishing them is a low-effort way to get more input from other researchers. For example at figshare, but choose whatever you like.
  • Publish your article post-print for open access. Let the world access your research for free. Many publishers allow you to put your manuscript on your own website as long as you don’t use their print PDF. You can see an example here. Whether your publisher allows this can be checked on RoMEO.
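The Rmarkdown workflow recommended under “Publish your analysis” can look as minimal as the sketch below: one document that regenerates every number and figure each time it is knitted, so the published results can never drift away from the dataset. The file name and variable names are hypothetical placeholders:

````markdown
---
title: "Analysis of study X (illustrative sketch)"
output: html_document
---

```{r load-data}
# All results below are recomputed from the raw data on every knit
dat <- read.csv("data/measurements.csv")  # hypothetical file
```

```{r model}
# Effect estimate with confidence interval, shown directly in the report
fit <- lm(outcome ~ treatment, data = dat)
confint(fit)
```
````

Knitting this source with knitr produces the HTML report; publishing both the .Rmd file and the data lets any reader re-run the entire analysis.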

Change your microsystem

  • Promote departmental research integrity policies. Expect every graduate thesis to have minimal standards in terms of reproducibility and integrity. Require senior researchers to peer-review in light of the same standards.
  • Improve scientific training at your institutions. Include sessions on open science, data sharing, etc. Expand the statistical literacy of your trainees.

Online scientific data and project repositories

Where to store datasets, files and documents?

  • Open Science Framework. Offers collaborative projects, wikis. Projects can be public or private.
  • Figshare. Put everything on the web, have it timestamped and DOI-citable.
  • Zenodo. Gives you a DOI for almost everything on the web, stored on their servers or elsewhere. Also offers an ‘embargo’ version, which only makes contents available after publication.
  • GitHub. Steep learning curve, yet best tool out there. Collaborate, make a wiki, share files, excellent version control.
  • arXiv and bioRxiv. These are pre-print archives of research.
  • PeerJ. Both pre-print publications (free) and life-long open access publishing.

Best practices / examples

  • Departmental change: introduce a department open science committee. Example of Felix Schoenbrodt at Psychology / LMU Munich. Develop standards on student evaluation, career & tenure decisions.
  • Commit yourself towards teaching your grad students good research practices. The Reproducibility PI Manifesto of Lorena Barba with reference to the science code manifesto.
  • A great example on full online publication of data, analysis, and paper: How much of the world is wood? by FitzJohn and colleagues.
  • “Is my brilliant idea any good? I am not sure, so I’ve pre-printed it on PeerJ”. Read on Keil’s website.

Some people change the world. Follow them

  • rOpenSci. “We are changing how science works through open data”.
  • Center for Open Science. “We foster the openness, integrity, and reproducibility of scientific research”. Few know that they offer free statistical consulting and online workshops.

Summing up

All the solutions above were chosen because (1) they are concrete actions, (2) they can be readily introduced into the daily work of a scientist, and (3) they should become the new methodological standard.

I have not included many other emerging tools (e.g. scientific markdown, open version control) because they are too technical at this moment and not yet accessible and usable for a broad research community.

While publication of a project is often regarded as the project’s end, it should rather be seen as its beginning. Science needs to be discussed and critically evaluated; nobody likes to produce papers that are never read. It has been shown that articles adopting a more robust methodological strategy are cited more often.

A note on paradigm-based research

Traditional research theory teaches that science evolves over the long term. Cumulative evidence eventually leads to paradigm shifts that change our models of thought (Kuhn). At the same time, experimental results can never prove a theory, only provide evidence for or against it (falsification, Popper). While replication of a study may increase our confidence in its results, this does not necessarily make the results a closer approximation of the truth. Rather, we provide new data points that challenge previous interpretations (models) and lead us to a better understanding of the uncertainties involved.

Bayesian theory embraces this function of certainty and uncertainty. It can be argued that null-hypothesis testing leads to a dichotomous scientific worldview (true / false). Bayesian statistics, on the other hand, allow a view based on probabilities for or against a clearly defined model, relative to the current state of evidence. This also requires the researcher to specifically define an alternative model (paradigm), i.e. to ask the right questions.

Commit yourself: sign a declaration

A model from several open science symposia has been to sign a declaration of intention, committing the signing parties to open science practices in their future work. Such statements embrace the responsibility of individual researchers to work to high quality standards.

A team of researchers at LMU Munich led by Felix Schoenbrodt has published an open statement, Voluntary Commitment to Research Transparency and Open Science (blog | OSF), that can be used to signal to yourself and to other researchers/co-authors: I’m determined. See for yourself, and make your commitment!

 

Changelog

August 31, 2015: Added note on paradigm-based research after discussion of the post with @markomanka.

September 28, 2015: Added ‘sign a declaration’ at suggestion of @nicebread303.

Detection of elevated INR by thromboelastometry and thromboelastography in warfarin treated patients and healthy controls.

Study resource & repository

Detection of elevated INR by thromboelastometry and thromboelastography in warfarin treated patients and healthy controls.
David E. Schmidt, Margareta Holmstrom, Ammar Majeed, Doris Naslin, Hakan Wallen, Anna Agren.
Thrombosis Research (2015).


Thrombosis Research allows publishing the final manuscript as a post-print.

>> Retrieve the final manuscript as PDF

Abstract

Introduction: The diagnostic potential of whole blood viscoelastic tests thrombelastography (TEG®) and thrombelastometry (ROTEM®) to detect warfarin-induced INR elevation remains elusive.

Methods: Viscoelastic tests were performed in 107 patients on warfarin and 89 healthy controls. Tests were activated by kaolin for TEG, and ellagic acid (INTEM) or tissue factor (EXTEM) for ROTEM.

Results: Viscoelastic tests revealed significant differences in clotting profiles between controls and warfarin-treated patients. Compared with healthy controls, patients treated with warfarin had prolonged EXTEM clotting time and TEG reaction time (p < 0.001), both of which were also increased beyond the reference range. Increased INR values correlated with EXTEM CT (Spearman rho = 0.87) and TEG R-time (rho = 0.73). EXTEM CT had a sensitivity and specificity of 0.89 and 1.00, respectively, to detect elevated INR above 1.2 units, with positive and negative predictive values (PPV and NPV) of 1.00 and 0.88, respectively. Similarly, TEG R-time had a sensitivity and specificity of 0.86 and 0.87, respectively, with a PPV of 0.89 and an NPV of 0.83. The corresponding receiver operating characteristic area under the curve was 0.99 (95% confidence interval [CI], 0.99 – 1.00) for EXTEM CT and 0.94 (95% CI, 0.91 – 0.97) for TEG R-time.

Conclusions: Tissue factor-activated viscoelastic testing (EXTEM) revealed individuals with warfarin-induced INR elevation accurately, while TEG – activated through the intrinsic pathway – still was of acceptable diagnostic value. Further studies are required to evaluate the diagnostic potential of viscoelastic tests in relation to standard laboratory tests in other mixed patient populations, where the PPV and NPV may be inferior.

DOI to publisher PDF: http://dx.doi.org/10.1016/j.thromres.2015.02.022

 

Circulating endothelial cells in coronary artery disease and acute coronary syndrome.

Study resource & repository

Circulating endothelial cells in coronary artery disease and acute coronary syndrome.
David E. Schmidt, Marco Manca, Imo E. Hoefer.
Trends in Cardiovascular Medicine (2015).


Trends in Cardiovascular Medicine allows publishing the final manuscript as a pre-print.

Abstract
Circulating endothelial cells (CEC) have been put forward as a promising biomarker for diagnosis and prognosis of coronary artery disease and acute coronary syndromes. This review entails current insights into the physiology and pathobiology of CEC, including their relationship with circulating endothelial progenitor cells and endothelial microparticles. Additionally, we present a comprehensive overview of the diagnostic and prognostic value of CEC quantification, as well as possibilities for improvement, for example by inclusion of CEC morphology, transcriptomics, and proteomics. The current state of knowledge calls out for improved counting methods and consensus on a validated cell definition. Finally, our review accentuates the importance of large, well-designed population-based prospective studies that will have to show the clinical value of CEC as cardiovascular biomarker.

DOI to publisher PDF: http://dx.doi.org/10.1016/j.tcm.2015.01.013

Collection: How to do a successful PhD

Successful PhD students
http://matt.might.net/articles/successful-phd-students/

A thesis proposal is a contract
http://matt.might.net/articles/advice-for-phd-thesis-proposals/

Productivity tips for academics
http://matt.might.net/articles/productivity-tips-hints-hacks-tricks-for-grad-students-academics/

10 reasons PhD students fail
http://matt.might.net/articles/ways-to-fail-a-phd/

12 resolutions for grad students
http://matt.might.net/articles/grad-student-resolutions/

Why choose a PhD?
http://blogs.nature.com/naturejobs/2013/10/17/back-to-school-why-choose-a-phd?WT.mc_id=FBK_NatureJobs

http://blogs.nature.com/naturejobs/2013/10/18/the-involuntary-phd

Write a PhD thesis in record time and keep your sanity while you do it

Part 1/3
https://www.youtube.com/watch?v=SeZRsYy3kRc

Full lecture (no slides in view): https://www.youtube.com/watch?v=4MkRMp3roKQ

How I wrote a PhD thesis in 3 months


Best advice from this article

  • clear yourself from distractions
  • when you’re mindfucked, get out, walk a bit (or take a run)
  • daily minimum = 500 words (you can meet this even on the least productive day)
  • routine for starting in the morning and going home in the evening
  • re-read and revise often (whether directly or after hours/a day, that’s your choice)

How to do a PhD: top 10 tips


How to choose a topic

How to choose a thesis topic

 

Towards the end of a PhD

7 Career Killing Mistakes PhDs Make That Keep Them Poor And Unhappy

Free Open Access Meducation for beginners [Revised 2016]

Free Open Access Meducation (FOAM) is an international movement of physicians engaging in the production and distribution of free and openly accessible online resources about medicine. The goal is to educate colleagues, learners, and whoever is interested.

The great thing is that you can often learn from masters in their professions, even if you are on the other end of the globe. The content is often so good that it exceeds the efforts of many of my university lecturers.

It’s all about medical knowledge, skills, and clinical reflection. These are usually delivered in a fun way, so that you enjoy exploring the contents. The biggest community so far is Emergency Medicine / Critical Care, but other specialties are beginning to pick it up as well. Anybody who produces FOAM content will extend their own knowledge base. Additionally, their colleagues will be able to learn.

It’s probably best to see some FOAM in action.

To get you started: three of my favorite talks

  1. Healthcare in Hippocrates’ Shadow by David Newman, delivered at SMACC (link)
  2. Karim Brohi on Tranexamic Acid in Trauma (link)
  3. The Day I Didn’t Use Ultrasound by Mike Mallin hosted on Emcrit (link)

My favorite podcasts

More online resources

The deeper you dive, the more you find. The contents at this moment are already outrageously diverse and often of really high quality. Focus and persistence with one or two of the options above is the winning strategy.

R resources

This collection contains

  • beginner stuff first and
  • advanced stuff later.

For the book fans

Dalgaard, Peter. Introductory Statistics with R. 2nd ed. New York: Springer, 2008. ISBN 978-0-387-79053-4.

Download the simpleR / Using R for Introductory Statistics manual by John Verzani! SimpleR (PDF 2.1 Mb)

Distance learning

edX offers free online courses from Harvard, Karolinska (good!) and Microsoft in the form of massive open online courses (MOOCs).

R course on datacamp.

Tutorials

Misc

  • R search engine: rseek
  • Some R tips for beginners: impatient R
  • Another collection of R resources for beginners at PortfolioProbe
  • The R resource on Stack Overflow
  • OpenIntro Statistics: Labs for R (understand statistics through applied data analysis with R). They also offer a free college-level textbook on statistics!
  • Creating publication quality graphs with R (link)
  • Power & Sample Size Curves on quick-r
  • Google Refine is a great program for cleaning data & getting datasets into usable formats.
  • Effective frameworks for thinking about data analysis/data science problems in R – RStudio webinar (Vimeo)
  • Escape the Land of LaTeX/Word for Statistical Reporting: The Ecosystem of R Markdown Webinar – RStudio webinar (Vimeo)

Resources for Medical Statistics

Three great books

Statistics Done Wrong by Alex Reinhart. A big favourite of mine. Available for free online at http://www.statisticsdonewrong.com/ If anybody bought the book, please let me know about the added value.

How to Lie with Statistics by Darrell Huff. Review at ALiEM bookclub, and available online as a free download (PDF). There’s a cool overview of letters on the subject. The book Naked Statistics (Economist review) was dedicated to it; however, it by no means approaches the original.

OpenIntro Statistics: a college level textbook downloadable as free PDF!

Three good medical journal resources

Statistical Notes in the British Medical Journal (BMJ). Overview available here and here

JAMA Guide to Statistics and methods (here). Paywall for non-subscribers

Methodologie van onderzoek (Research methodology). Nederlands Tijdschrift voor Geneeskunde (NTvG). Available here (in Dutch)

Nature Collection: Statistics for biologists. here

Websites

Good statistical behaviour, a great PLOS blog article on ‘mind your p’s, RRs and NNTs’, available here.

One R tip a day http://onertipaday.blogspot.nl/

Wikipedia Statistics portal https://en.wikipedia.org/wiki/Portal:Statistics

And a great idea

Keep a Statistics Diary: write one note on statistics every day. Andrew Gelman’s take and Alex Etz’s take

 
