If anything, the 2020 coronavirus pandemic has shown us that we don’t do a good job at interpreting medical literature. I expect news outlets to get the headlines wrong. They always do. Eating chocolate cake for breakfast isn’t going to make you lose weight. Drinking 8 glasses of wine a day won’t make you look younger (it might make you care less about the way you look). The media’s job is to grab your attention with a flashy headline. They’re good at that. They aren’t so good at dissecting studies. It’s often not the scientists’ fault; they usually weren’t trying to draw the conclusion the headline claims.
The latest one is a face mask study from Duke University. The headline claims “Wearing a neck gaiter may be worse than no mask at all, researchers find.” I think it’s a fine study, but it doesn’t say what you might think it says. Let’s go through the steps I usually take when I see a study quoted in a pop-science article.
1. Find the Source
Most of these pop-science articles are based on a real study of some kind. Sometimes the article links the source directly. Often, the sources are tricky to find. Pop-science press often says things like “A study done out of Johns Hopkins found. . .” They don’t always mention the author’s name or the name of the study. If you’re familiar with PubMed you can usually find it. Sometimes, especially during the pandemic, the “studies” are actually preprints, and they can be harder to find (https://preprints.org and https://www.medrxiv.org/ are two good places to look for health sciences; I’ll list some more at the bottom). Sometimes, I have to bounce around to a few different media takes on the article to get enough information to find it. You can always just email the journalist and ask too. They are usually happy to help (and it’s not always the journalist that writes the headline). In this case, the mask article was published in Science Advances and is free to read: https://advances.sciencemag.org/content/early/2020/08/07/sciadv.abd3083
2. Find the Question the Researchers were Answering
Every study has a thing it’s looking for. Sometimes the researchers find other interesting things along the way, but they all have one thing they set out to find. Those other things are usually not as significant as the main thing they were looking for. Is the thing they are looking for the same thing the headline claims? Let’s look.
The headline claims “Wearing a neck gaiter may be worse than no mask at all, researchers find.” That makes it seem like the question was “Which face masks are more effective?”, “Are neck gaiters effective?”, or maybe even “What is the relative efficacy of different face masks?” Let’s see if this study was designed to answer a question like that.
In the introduction, the authors write (emphasis mine):
“While some textiles used for mask fabrication have been characterized, the performance of actual masks in a practical setting needs to be considered. The work we report here describes a measurement method that can be used to improve evaluation in order to guide mask selection and purchase decisions”
“Below we describe the measurement method and demonstrate its capabilities for mask testing. In this application, we do not attempt a comprehensive survey of all possible mask designs or a systematic study of all use cases. We merely demonstrated our method on a variety of commonly available masks and mask alternatives with one speaker, and a subset of these masks were tested with four speakers. Even from these limited demonstration studies, important general characteristics can be extracted by performing a relative comparison between different face masks and their transmission of droplets.”
That doesn’t really sound like the question the headline was answering. It sounds to me like they are testing the testing method, right? Their question seems to be more like, “Is our method a valid method for evaluating masks?” Let’s see what their method and design were. We can see if the study would be adequate to answer the headline question from that.
They had someone wear a face mask and speak into a laser beam in a dark enclosure. They recorded the light scatter from the droplets with a cell phone camera. They tested 14 types of masks. Some masks were tested by only one person. Four conditions (a cotton mask, a surgical mask, a bandana, and the no-mask control) were tested by four speakers.
They don’t compare material weave or fit (but the N95 they used was fit tested). They do discuss that the fleece neck gaiter transmitted a larger number of droplets than the control trial (though not significantly larger). They think gaiters disperse larger droplets into several smaller droplets, therefore increasing the droplet count. That’s where the headline “Wearing a neck gaiter may be worse than no mask at all, researchers find” comes from.
I’m not sure that one speaker and one neck gaiter have me convinced. This study really isn’t designed to tell us that. It tells us maybe we need to do some more tests on that type of mask before we recommend it. That’s what these other findings usually do. They tell us, “Maybe we should look into that.”
3. Look At the “Power” of the Study
There are things that biostatisticians look at: was this the right study design for the problem, did they pick the appropriate statistical test, are they calculating the right p-value, was there a type-1 or type-2 error. Those are things you should consider if you’re making a medical judgment off the study (see my biostatistics section for some basic tips). However, if you’re just trying to learn about a topic, you can get a good estimation of what we call the study’s power without even knowing what test they should have chosen.
Technically, power is the probability of rejecting the null hypothesis when, in fact, it is false. I know, null what? It’s just asking if your study is good enough to pick up a difference between two groups if there is a real difference between those groups. The biggest tip-off is usually the sample size. If I have 100 people and I give 50 people a Tylenol and 50 people a cube of sugar, I have a better chance of picking up a side effect from the Tylenol than if I had only 10 people and gave 5 people a Tylenol. With 1000 people, I’d have an even better chance. Just because I didn’t pick up a side effect in those 10 people, doesn’t mean that one doesn’t exist. I could miss even a significant side effect if my sample was too small.
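To put rough numbers on the Tylenol example, here is a minimal sketch. It is not a formal power calculation: it only asks how likely it is that at least one person in the Tylenol group shows the side effect. The 5% side-effect rate and the function name are assumptions I made up for illustration, not figures from any study.

```python
def chance_of_seeing_side_effect(true_rate, group_size):
    """Probability that at least one person in the group shows the side effect.

    Assumes each person independently has the same chance (true_rate) of
    showing it. This is a simplification, not a full power analysis.
    """
    chance_nobody_shows_it = (1 - true_rate) ** group_size
    return 1 - chance_nobody_shows_it


# Hypothetical side-effect rate, chosen only for illustration.
ASSUMED_RATE = 0.05

# Tylenol group sizes from the example: 5 of 10, 50 of 100, 500 of 1000.
for group_size in (5, 50, 500):
    chance = chance_of_seeing_side_effect(ASSUMED_RATE, group_size)
    print(f"{group_size} people on Tylenol: {chance:.1%} chance of seeing it at all")
```

Under that assumed rate, the chance works out to roughly 23% with 5 people, 92% with 50, and essentially 100% with 500. With the smallest group you would most likely see nothing at all, even though the side effect is real.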
The sample size here was four speakers on four masks and one speaker on 14 masks (the gaiter only had one speaker). I’d say it’s a pretty small sample to make any real decisions.
Here’s a fun one for practicing this quick and dirty estimation of power. It has been widely circulated as “proof” that asymptomatic patients don’t spread coronavirus: “A study on infectivity of asymptomatic SARS-CoV-2 carriers.” It’s also free to read. So, do we think this study is powerful enough to “prove” that claim?
This one is pretty easy. I saw someone claiming the study had 455 participants. If you just quickly read the abstract, you might think so too. If you read the paper, you’ll see that they followed ONE person. Just one. One asymptomatic person was contact traced to 455 people (who has that many friends and family? Some of them were at the hospital the patient was at) and didn’t transmit it to any of them. To me, that isn’t even close to adequate to say no asymptomatic carriers spread coronavirus (and the study authors don’t claim it is; it’s a case study). Can you find another interesting thing about the study that the headlines leave out?
The patient, and pretty much everyone else, wore a mask when they made contact. That’s neither here nor there, but I’ve seen media report this story and leave that out. Those extra, possibly significant, tidbits are things to look out for too.
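Back to the sample size of one for a second. Here is a small sketch of why following a single carrier can’t settle the question. The fractions below are hypothetical values I picked for illustration, and the function name is mine; none of this comes from the paper. It also assumes carriers are picked at random and either transmit or don’t, which is a big simplification.

```python
def chance_no_carrier_transmits(fraction_who_transmit, carriers_followed):
    """Chance that none of the followed carriers turns out to be a transmitter.

    Assumes each carrier is independently a transmitter with probability
    fraction_who_transmit. A toy model, not an epidemiological one.
    """
    return (1 - fraction_who_transmit) ** carriers_followed


# Hypothetical fractions of asymptomatic carriers who transmit.
for fraction in (0.1, 0.5, 0.9):
    one = chance_no_carrier_transmits(fraction, 1)
    thirty = chance_no_carrier_transmits(fraction, 30)
    print(f"If {fraction:.0%} transmit: following 1 carrier finds no "
          f"transmitter {one:.0%} of the time; following 30 does so "
          f"{thirty:.2%} of the time")
```

Even in a made-up world where half of all asymptomatic carriers transmit, following one carrier would turn up a non-transmitter half the time. One carrier simply can’t tell those worlds apart.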
4. Is it significant to you?
If this study had 455 participants, we might have had to look a little deeper. I like to look at the study population in that case for a quick and dirty, no maths, check. Are they like my patients? Were they all 20-year-old males? Did they have something else in common (like were they all wearing masks)? Sometimes study designs are aimed at a specific population. They may have shown a significant difference in that population, but those results might not apply to everyone. That’s often the case with conflicting studies. The conclusions are different because the populations were different.
In this particular study, we can look at the way droplets were detected and the characteristics of the one speaker. The one speaker that tested all the masks spread significantly more droplets than the other speakers. Could that impact the data? How does that speaker compare with most of us? Does the average person spread like the speaker who tested all masks or more like the speakers who tested just the four?
As for the measurement, the study only looked at the number (and somewhat size, but not really) of the droplets, and not how far or in what direction they spread. That information is probably important.
5. Is there prior data?
If you want to take your search a bit further, you can see if there is prior data to support or detract from this study. A study done in June had data demonstrating that, while bandanas (similar to gaiters) let more droplets through, they spread less than 4 feet, while uncovered faces spread droplets 8 feet. That study has some limitations too, but that’s probably important information to consider.
6. Is there missing data?
What else would we need to make the claim that gaiters were worse than nothing at all? I think you probably need to know more about the fabric. They don’t talk much about the fabric weave tightness or how the various fabrics compare. In order to adequately test a hypothesis that gaiters are worse than nothing at all, you would have to test a wide variety of gaiters made of popular fabrics. Gaiters come in all kinds of weaves. In order to test a hypothesis that the style is worse than other masks, it would be best to try masks with the same fabric type and weight. Maybe the “cotton” style would show worse results when made with the gaiter fleece. We don’t know if it’s the fabric, the style, or the speaker.
7. What do you do?
With all that information, you can now decide what to believe. Do I believe neck gaiters are worse than no mask at all? I think they are probably better than nothing, but they may not be the best protection. We need more data. That’s why places like Disney have banned them for use in the parks. I personally would feel comfortable wearing one in a situation where I thought I would be able to adequately social distance, but thought I might occasionally come in contact with people (like running in the park). I would not wear one in a situation where I thought I would have occasion to come in contact with more people (like in Wal-Mart), but that’s just me. The great thing about actually looking at the data this way is that you can be fully informed to make your own choices.
Preprint Servers
https://preprints.org – general
https://www.medrxiv.org/ – medical
https://therapoid.net/ – life science
https://mindrxiv.org/ – mind and contemplative practices
https://psyarxiv.com/ – psychological
https://osf.io/preprints/focusarchive – ultrasound
https://osf.io/preprints/nutrixiv – nutrition
https://www.sportrxiv.org/ – sports and sports medicine
https://www.biorxiv.org/ – general biology
https://arxiv.org/ – mostly physics, computing, etc
https://peerj.com/preprints/ – biomedical sciences
https://figshare.com/ – all sciences