23andMe Is Working to Make DNA Data More Diverse

By Kristen V Brown

When you mail off a sample of your spit to find out about your ancestry, companies like 23andMe compare your DNA to that of other people around the world, seeing how closely your genes match the genes of people in, say, Norway, in order to deduce whether your ancestors might have been Norwegian, too.

Since our earliest days, humans have mixed and migrated and gotten it on, so figuring out where someone’s ancestors hailed from is not an exact science. But if, like me, you’re a person whose recent ancestors were not from Europe, those results are even less likely to be accurate. Ditto for all the powerful genomics research that promises to usher in the era of personalised medicine. That’s because, so far, the vast majority of genomic data that has been collected has come from people who are white.

A new initiative from 23andMe seeks to fix this, offering funding and support to scientists who work in communities that are underrepresented in genetic research. The programme is twofold: 23andMe will fund these scientists, thus helping to increase the diversity of genetic data in biomedical research, while the scientists will in turn provide 23andMe with DNA data to help make its ancestry reports more accurate for customers.

“It’s a win-win,” senior director of research Joanna Mountain told Gizmodo. “We get access to the genetic data and so do the researchers.”

Scientific research in places like Africa is often ethically fraught, since local communities do not always benefit from participating, or even learn the results of the work. Mountain said that the company plans to fund a few such studies every year, evaluating the scientific merit of the work as well as how the research benefits the community being studied. Preference will be given to projects that also involve local researchers and community members. Researchers must obtain consent from participants, and the work must undergo review by ethics boards both at the researchers’ institutions and in the countries where the research is being conducted.

Currently, the reference data set that 23andMe uses to determine a person’s ancestry includes 150 different regions. But while the test can deduce whether the ancestors of someone with Scandinavian heritage likely came from Denmark, Iceland, Norway, or Sweden, the most specific result a person whose ancestors lived in South Africa can get is “Broadly Sub-Saharan African.”

The new Populations Collaborations programme is not the first effort 23andMe has launched to correct the problem of diversity in its reference population. The Global Genetics Project, for example, launched in February and doled out complimentary kits to people in the US whose grandparents were born in regions not well represented in 23andMe’s data. Similarly, the African Genetics Project aimed to increase genomic data from Africa. Such efforts have paid off. In February, the company announced a major update that increased the regions represented on its test from 31 to 150. The test now includes much more detail for areas including Africa and Asia. Still, the company has a long list of nations that are not yet represented in its data. And larger sets of data from Europe mean those results are still more likely to be accurate.

Mountain said that for years scientists have approached 23andMe and asked to collaborate on one-off studies of populations in places like the Democratic Republic of the Congo, Honduras, and Angola. The new programme is a way to scale those efforts up.

“If you aren’t studying people in Southern Africa, how are you going to get information about them?” said Mountain, who used to conduct genetic research in Africa herself. “And if we miss studying those populations, it misconstrues all of our other data, too.”

One study of the Greenlandic Inuit population, for example, found a variation in a single DNA base pair that affected height. It turned out the same variation impacted height in Europeans, too, much more so than other previously identified variations, but researchers hadn’t noticed it in other studies because it was not all that common among Europeans.

One 2016 analysis found that 81 percent of participants in genome-wide association studies were of European descent.

For researchers, the big benefit is financing.

“Even though everyone knows that sampling diverse populations is important, there aren’t that many financial grants to cover it,” Brenna Henn, a population geneticist at UC Davis, told Gizmodo.

Practically speaking, genotyping people in places like Southern Africa, where Henn works, is expensive. It doesn’t just mean travelling to those places and spending the money to analyse the data, but also spending lots of time on the ground establishing trust with people who might have little understanding of DNA.

“This program could really fill a gap,” said Henn.

“Studying Europeans gives you one answer to these questions, but that answer may not be applicable to other populations.”

She pointed to skin colour, which has been widely studied in populations of European descent, but, until a study by Henn last year, had never been studied in African populations. It turned out that there were many more genes that contributed to skin pigmentation than previously identified. The work reframed common ideas about how genetics and skin colour work. But to get there, researchers spent seven years with the KhoeSan people in order to collect data from 400 people.

Meanwhile, said Mountain, 23andMe gains the access that researchers already have in such locations. Samples of just 500 people from one location, she said, could allow 23andMe to add the region to its reports.

“We’re limiting ourselves by mostly studying people from Europe,” said Mountain.