Microsoft is Storing Data on 10 Million Strands of DNA

By Jamie Condliffe

Deoxyribonucleic acid (DNA) contains the information that defines life, but it can also be used to store digital content. Now, Microsoft has announced that it’s seriously investigating the technique as a means of storing data, paying a bioscience company to create ten million strands of DNA for digital storage.

Microsoft has partnered with the San Francisco-based biotech startup Twist to investigate DNA data storage. It’s purchased the rights to ten million strands of DNA on which it will encode data, to assess the technique as a long-term, secure data storage system. Microsoft simply hands over the data as a digital DNA sequence, which Twist then creates in physical form using synthetic biology. Twist then hands the DNA back to Microsoft to play around with. It’s not clear how much data Microsoft is trying to store, but as IEEE Spectrum points out, a single gram of the long-chain molecules can store a zettabyte of data, which is a trillion gigabytes.

DNA is built up of four bases: adenine (A), thymine (T), guanine (G) and cytosine (C). They appear in sequence along the long chain of the DNA molecule, usually encoding information that’s used to define biological function. To store data in a strand of DNA, scientists simply have to convert the ones and zeroes that make up a digital file into combinations of the four basic building blocks. A team of scientists from the University of Washington recently showed that it was able to store images within DNA and then retrieve them perfectly.
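To make the idea concrete, here is a minimal sketch in Python of how bits could map to bases. It assumes the simplest possible scheme, two bits per base (00 = A, 01 = C, 10 = G, 11 = T). That mapping and the function names are illustrative assumptions, not the encoding Microsoft, Twist or the University of Washington team actually uses; real schemes are likely to add error correction and avoid sequences that are hard to synthesise or read back.

```python
# Hypothetical two-bits-per-base mapping, for illustration only.
# The position of each base in this string is the bit pair it stands for:
# A = 00, C = 01, G = 10, T = 11.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    strand = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into the original bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    data = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError for anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return bytes(data)
```

With this mapping the single byte 0x1B (binary 00011011) becomes the strand ACGT, and decoding that strand gives the original byte back. Each base carries two bits, so a file of n bytes needs 4n bases before any redundancy is added.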

Amusingly, Twist’s CEO Emily Leproust told IEEE Spectrum that the company doesn’t know what any of the data is, because it doesn’t know how the original files are encoded into the DNA sequence. Cat GIFs, maybe? [Twist via IEEE Spectrum]