Enhance your critical thinking by looking for SMART results – Phil Kearney.

Critical thinking is one of the most important skills for a sport scientist to develop. With the rapid growth in the number of published research papers and concerns over the quality of published research, sport science students need to learn how to critically evaluate scientific articles. One effective critical thinking technique is breaking a complex question into smaller, more manageable questions.

On a recent episode of the Normal Curves podcast, Kristin Sainani introduced a helpful mnemonic for evaluating study results: S.M.A.R.T. This acronym helps the reader to break the big question – What did this study find? – into five focused checks:

S – Signal

Ask: Does the signal exceed the noise?
In other words, is the observed effect likely real or just random variation? With conventional inferential statistical tests, we are interested in establishing whether two groups differ on some variable or whether an intervention has improved a group’s performance from one timepoint to another. But there is always “noise” in the data too. Inferential statistics help us see the signal through the noise. Look for statistical significance (p-values, confidence intervals) and consider whether the effect is discernible from background noise. A strong signal (typically, p < 0.05) suggests the study detected something more than chance fluctuations. 

M – Magnitude

Ask: Is the effect big enough to matter?
Statistical significance alone isn’t enough – tiny effects can be “significant” but practically irrelevant. For example, consider a study that compares two different forms of visualisation exercise in 1000 basketball players: imaging yourself as though on camera (external imagery) or looking through your own eyes (internal imagery). The results of this hypothetical study show that players using external imagery make 1 percentage point more of their free throws than those using internal imagery, and – because the sample size is very large – this difference is statistically significant at p < 0.05. Statistically, the signal is real – the difference is unlikely to be due to chance. But practically, 1 percentage point is tiny. For a player who shoots 100 free throws, that’s only one extra made shot. With even the most fouled players averaging about 10 free throws per game, such an improvement is unlikely to affect game outcomes or justify changing training protocols; there is probably a better investment of time and effort for the player and sport scientist. In addition to checking the Signal (statistical significance), check the descriptive statistics, effect sizes (e.g., Cohen’s d, odds ratios) and/or confidence intervals. Then ask: Would this difference make a real-world impact on sport performance?
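
The sketch below replays the hypothetical basketball example in Python; the 1-point difference, the amount of player-to-player variation and the group sizes are all invented, but it shows how a large sample can make a trivial difference statistically significant while Cohen’s d reveals how small it is:

```python
# A rough sketch of the hypothetical basketball example. The 1-point difference,
# the 6-point spread and the group sizes are invented numbers for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical free-throw percentages: external imagery around 76%, internal around 75%.
external = rng.normal(loc=76, scale=6, size=1000)
internal = rng.normal(loc=75, scale=6, size=1000)

t_stat, p_value = stats.ttest_ind(external, internal)

# Cohen's d: the group difference expressed in pooled standard-deviation units.
pooled_sd = np.sqrt((external.var(ddof=1) + internal.var(ddof=1)) / 2)
cohens_d = (external.mean() - internal.mean()) / pooled_sd

print(f"p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
# With samples this large, a ~1-point difference will typically come out well below
# p = 0.05 (a clear Signal), yet d sits around 0.15-0.20, a small effect by
# conventional benchmarks and unlikely to change game outcomes.
```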

A – Alternative Explanations

Ask: Could something else explain these results?
Even if you can discern the signal from the noise and the magnitude of the effect suggests a real-world impact, that does not mean the difference was caused by what the researchers manipulated. A core part of the scientific process is to consider what other factors might have produced any observed effect. These other factors can enter at many different stages of an experiment. For example, were participants randomly allocated to groups, or could some bias have been introduced here? Were the groups equivalent on key characteristics such as gender, experience, etc.? Was a reliable assessment used? Were the assessors blinded to which group participants were in, so as to remove (unconscious) expectation effects? If you can rule out plausible alternative explanations for the observed effect, you increase your confidence in the finding.

R – Reality Check

Ask: Do the results pass the smell test?
Errors in statistical reporting or illustrations are not uncommon in published research. For example, a bar chart might show 70 participants when the sample size is described as 50. Other errors may be less obvious, but on closer inspection the reported numbers simply do not make sense. Consider 5 participants all completing a Likert-type item with responses ranging from 1 to 5. If they all answer 1, you get the minimum possible mean of 1; if they all answer 5, you get the maximum possible mean of 5. A mean of 0.5 or 6 is obviously not possible because 1 is the lowest possible answer and 5 is the highest. But there are also less obvious restrictions on what the mean can be for any combination of participant responses. Answers of 1, 3, 3, 4, and 5 give a mean of 3.2. Answers of 1, 3, 3, 3, and 5 give a mean of 3. But there is no possible combination of 5 participants scoring a 1-5 Likert-type scale that gives a mean of 3.1: with 5 responses, only means ending in .0, .2, .4, .6 and .8 are possible. Tests such as GRIM and GRIMMER are designed to check the plausibility of means and standard deviations reported in papers, and when applied within sport science they have on occasion documented serious flaws. While it is not necessary to run all papers through these rigorous checks, it is certainly worthwhile checking for any obvious red flags.
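
You can play with this granularity logic yourself. The sketch below enumerates every mean that five responses on a 1-5 scale can produce and wraps the same idea in a simplified GRIM-style check (illustration only; the published GRIM test handles arbitrary sample sizes and reporting precision):

```python
# A sketch of the granularity logic behind GRIM, for the 5-participant, 1-5 scale
# example above. (Illustration only; the published GRIM test generalises this idea.)
from itertools import combinations_with_replacement

n = 5
possible_means = sorted({sum(combo) / n
                         for combo in combinations_with_replacement(range(1, 6), n)})
print(possible_means)
# Every achievable mean is a multiple of 1/5 = 0.2, so a reported mean of 3.1 cannot occur.

def grim_check(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """Is a reported mean consistent with n integer responses (a simplified GRIM-style check)?"""
    return round(round(reported_mean * n) / n, decimals) == round(reported_mean, decimals)

print(grim_check(3.2, n=5))  # True:  e.g. responses 1, 3, 3, 4, 5
print(grim_check(3.1, n=5))  # False: no combination of 5 integer responses gives 3.1
```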

T – Trustworthiness (or Transparency)

Ask: Were the researchers open about their data and process?
Transparency builds trust. There are many ways in which researchers can enhance a reader’s trust in their study. For example, look for pre-registration of the study where a detailed plan, including hypotheses, data collection methods, and analysis plans, is shared on a public registry such as the Open Science Framework before data collection begins. Ideally, researchers should share their raw data and any code used within the analysis to facilitate tests of reproducibility (reproducibility means getting the same results by running the same analysis on the same data, whereas replication means getting the same results by running the same study with new data). Look to see if the authors clearly distinguished between confirmatory and exploratory analyses. Exploratory research involves exploring a broad set of questions in relation to a dataset to detect patterns that might be present. Although a useful part of the broader scientific process, there is a high chance that claims based on exploratory analyses are incorrect because of random variation. As such, any patterns identified in exploratory analyses need to be tested in a more carefully controlled confirmatory study before researchers can claim they have observed an effect. Clearly stating which analyses were exploratory and which were confirmatory provides valuable additional information to the reader and is thus an important indicator of transparency. Together, these practices of pre-registration, sharing data and distinguishing between confirmatory and exploratory analyses increase a reader’s confidence in the reported findings.

For a deeper dive into the evaluation of scientific papers, listen to the full episode of the Normal Curves podcast (S.M.A.R.T. is introduced at 32 minutes 46 seconds).

Next time you evaluate the results of a scientific study, think S.M.A.R.T. This mnemonic should help you think critically and avoid being misled by flashy but flawed results.

 

Dr. Phil Kearney is the Course Leader for the MSc Applied Sports Coaching at the University of Limerick. A Fellow of the Higher Education Authority, his teaching and research centres on the domain of skill acquisition, particularly as it relates to youth sport. A regular contributor to RTÉ Brainstorm, Phil is a member of the Gaelic Athletic Association’s Player & Coach Development Advisory Group, an Associate Editor for Perceptual and Motor Skills and is on the Editorial Board of the Journal of Motor Learning and Development. Phil is a co-founder of Movement and Skill Acquisition Ireland.
