Peter Ney (University of Washington), Luis Ceze (University of Washington), Tadayoshi Kohno (University of Washington)
Here, we evaluate the security of a consumer-facing, third-party genetic analysis service called GEDmatch, which specializes in genetic genealogy: a field that uses genetic data to identify relatives. GEDmatch is one of the most prominent third-party genetic genealogy services due to its size (over 1 million genetic data files) and the large role it now plays in criminal investigations. In this work, we focus on security risks particular to genetic genealogy, namely relative matching queries -- the algorithms used to identify genetic relatives -- and the resulting relative predictions. We experimentally demonstrate that GEDmatch is vulnerable to a number of attacks by an adversary that only uploads normally formatted genetic data files and runs relative matching queries. Using a small number of specifically designed files and queries, an attacker can extract a large percentage of the genetic markers from other users; 92% of markers can be extracted with 98% accuracy, including hundreds of medically sensitive markers. We also find that an adversary can construct genetic data files that falsely appear to be relatives of other samples in the database; in certain situations, these false relatives can be used to make the de-identification of genetic data more difficult. These vulnerabilities exist because of particular design choices meant to improve functionality. However, our results show how security and the goals of genetic genealogy can come into conflict. We conclude with a discussion of the broader impact of these results on the entire consumer genetic testing community and provide recommendations for genetic genealogy services.