Privacy is a societal issue, not a personal one

The recent SCOTUS decision in the Maryland v. King case has produced a lot of interesting follow-up commentary. A sampling is linked below:

Panopticon, keep your eyes on the word (Ronald Collins, SCOTUSBlog)
DNA Fingerprinting as Routine Arrest Booking Procedure Upheld as Anticipated (Jennifer Wagner, Genomic Law Report)
Supreme Court Fails the Fourth Amendment Test (Barry Friedman, Slate)
The Coming California Sequels (Hank Greely, Stanford Law School)
A few thoughts on Maryland v King (Orin Kerr, The Volokh Conspiracy)
The dark side of DNA evidence (Jason Silverstein, The Nation)

A primary theme in many of these commentaries is the issue of privacy. To what extent do you as an individual have the right to keep your genetic profile (or, more accurately, CODIS profile in the Maryland v. King case) private?

Right on the heels of the Supreme Court decision came the even bigger story (based on media coverage) of the NSA data monitoring program known as PRISM, leaked by Edward Snowden. Again, the central theme in the reporting and commentary on PRISM is the issue of privacy. To what extent do you, as an individual, have the right to engage in social behavior (phone calls, internet usage) unobserved?

When I taught personal genomics this past spring, personal privacy was a major issue my students read about and discussed. For nearly every student in the class it was a primary concern, either at the beginning of the class (based on their preconceived notions of genomics) or at the end of the class (brought on by readings outlining the challenges of maintaining genetic anonymity). What I want to point out now is that every student who raised privacy as a concern framed it from the perspective of the individual. As I tried to convey to my students, from both a genetic and an anthropological perspective, the concern with individual privacy, rather than a more collective valuing of privacy, is misguided.

The go-to example of genetic privacy that gets brought up by students is access to the genomic profile of high-profile public individuals. What if the American people knew in 1984 that Ronald Reagan was an APOE4[1] carrier and had a higher risk of developing Alzheimer’s (I don’t know, by the way, that Ronald Reagan was an APOE4 carrier…he quite possibly wasn’t)? What if Barack Obama, with a known history of smoking, released his genomic profile, enabling a clinical risk assessment[2] of his health prior to his election? Or what if someone got ahold of a stray hair from Senator Elizabeth Warren and tried to test the idea that she has Native American ancestry?

These are all interesting thought questions to consider, but I would argue they do little to further a practical stance on privacy. In any of the above cases, if anyone were really motivated to take those steps, it would be nearly impossible to prevent such actions (though one could certainly critique the usefulness of whatever information was generated). The reason is that as individuals we find it very hard to keep track of our DNA. We shed it throughout our lives through hair, skin, feces and saliva (a fact which is very valuable for primatologists and other biologists). That all of the above individuals are public figures, who regularly engage with large crowds of people, merely adds to the challenge of maintaining anonymity against a concerted effort to reveal such data. Even if an individual were able to maintain their own secrecy, presumably through Osama bin Laden-like seclusion, it would be possible to get a pretty good sense of that individual’s DNA by getting access to the DNA of close relatives and imputing[3] the likely DNA of the individual in question. Indeed, this is a tool used by forensic geneticists to identify the remains of individuals who lack alternative identifying marks, such as mass burial victims or, in the example above, the positive genetic ID of Osama bin Laden. Just as your DNA reveals important information about your relatives, their DNA can also be used to get a good sense of your genetic profile (much like a family medical history). One of the books we read in the class was Here is a Human Being, by Misha Angrist (@MishaAngrist), a recurring theme of which is the difficulty (and limited value) of maintaining genomic anonymity (and this was published well before the January 2013 Gymrek, et al. [4] Science paper on re-identification of DNA profiles).

So is privacy a lost cause not worth guarding? No. Privacy is an important value and one that should be protected, even if the individual is not the primary locus of concern. The problem is not that an individual’s privacy might be compromised, but that whole groups of people might be passively (or actively) discriminated against based on identifiable and trackable information. This is where the NSA and Maryland v. King stories converge.

The fear should not be that you are being targeted because you are person X (that is really difficult to prevent), but rather that you are being indiscriminately targeted because you are a person with trait Y. If someone wants your DNA, or if someone wants to spy on you, because of who you are, they probably can. But you shouldn’t be subject to discrimination just because you are an I1a1 mtDNA haplogroup carrier (my mtDNA haplogroup) or because you regularly make international phone calls to a particular part of the world.

Much of contemporary genomic research, much like the apparent NSA program, relies on a faith in “big data.” This is the idea that complex problems can be solved by using intensive computational methods on vast amounts of data. This, interestingly, was a recurring theme in another text we read for our genomics seminar, The Decision Tree, by Thomas Goetz (@tgoetz). Big data, however, do not inherently make complex issues tractable. Those complex issues remain complex. The genetic and environmental architecture that underlies the development of cancer, or heart disease, or neurological conditions like Alzheimer’s, is complex. The limited gains in clinical approaches made by a decade of genome-wide association studies (GWAS) reflect not a failure of genomics so much as the reality of the complexity of these conditions. I am not a terrorism expert, but I would guess you can analogize the complexity of a terrorist (in terms of causation) to the complexity of a condition like heart disease. Big data make central tendencies in systems easier to identify, but they don’t increase the scope of central tendencies in complex patterns. Sometimes there is no easy solution. Just to use a recent example, the NSA wasn’t able to successfully identify the Tsarnaev brothers’ intentions despite a lot of prompting that could have pointed the agency in that direction.

There is a distinction I have tried to draw out repeatedly on this blog between the production of information and the production of knowledge. Since the Human Genome Project, and particularly since the development of next-generation sequencing technologies, we have produced a VAST amount of genomic information (an increase of information more than an order of magnitude greater than Moore’s Law). And that information has certainly led to huge improvements in what we “know” about how the genome works and is organized, particularly at a structural level. But having the data is only the first step towards understanding the hows and whys of human phenotypic variation. Those data make it easy to draw lots of correlations and associations, which is what GWAS have done again and again, but they do not provide a magic solution to a complete understanding of cancer, or intelligence (if you want to define what intelligence is), or terrorism.

Complexity creates limits not only on what we know, but also on the ease with which we can develop new knowledge. From an individual privacy standpoint, this should be encouraging. There is only so much you can glean from looking at your DNA, much as I would guess there is only so much that can be gleaned by looking at your phone calls.

However, what can be done to a vastly greater degree is to classify you on the basis of a given set of genetic variants or, in the case of the NSA, social network patterns. From an anthropological genetic standpoint, this is where the real danger sits. The public, and often those of us in the field, still tend to think of individual genetic variants as good or bad. Angelina Jolie had a double mastectomy because of her “flawed” BRCA1 gene. In some limited cases, this characterization may be correct, in the sense that a gene simply does not function. More often, what we are identifying as good or bad is variation in how a gene functions and interacts with other genes and the environment in which it sits, something that makes a simple good/bad binary misleading. And yet…the appeal of such a binary classification makes it possible to suddenly identify groups of people who share good or bad genes at a greater or lesser frequency.

Another reference that students come back to again and again is the spectre of the eugenics movement and some kind of dystopian Gattaca-like future. The future is not going to be Gattaca (genes alone do not produce that degree of certainty). But it is worth probing the relationship between eugenics and genomic knowledge a little further. The really insidious part of the eugenics history, largely based in the US and UK (Germany was a great importer of American and British eugenics law), was how prevalent actively eugenicist ideas were within society. The notion that we should use hereditary knowledge to improve the human species was held by prominent writers, politicians and scientists on both the right and left ends of the spectrum. And why not? Why not use the knowledge we have? In the case of the early 20th century eugenics movement, a reason not to was that the “knowledge” of that era regarding heredity was extremely limited. It was also mediated by existing societal structures that were sexist, racist, xenophobic and homophobic. One hundred years later, on the tailwind of the Human Genome Project, we certainly know a lot more about heredity. But knowing more does not mean we have a complete understanding of the topic or even an actionable understanding of the topic (and it also doesn’t mean our current knowledge is any less mediated by existing social structures). Indeed, a complete understanding of the role of human heredity in shaping phenotypes requires vast exploration of variation on a case-by-case basis. Big data alone are not enough. Big data are good at making us think we know a lot, though, and that is a big problem (big data are also good at serving as a giant magnet for limited funding and resources).

*****

1. Poirier, et al. Apolipoprotein E polymorphism and Alzheimer’s disease, The Lancet, Volume 342, Issue 8873, 18 September 1993, Pages 697–699 DOI:http://dx.doi.org/10.1016/0140-6736(93)91705-Q

2. Ashley, et al. Clinical assessment incorporating a personal genome, The Lancet, Volume 375, Issue 9725, 1–7 May 2010, Pages 1525–1535 DOI:http://dx.doi.org/10.1016/S0140-6736(10)60452-7

3. Li, et al. Genotype Imputation. Annu. Rev. Genomics Hum. Genet. 2009. 10:387–406. DOI:10.1146/annurev.genom.9.081307.164242

4. Gymrek, et al. Identifying Personal Genomes by Surname Inference. Science 18 January 2013: Vol. 339 no. 6117 pp. 321–324 DOI:10.1126/science.1229566

About Adam Van Arsdale

I am a biological anthropologist with a specialization in paleoanthropology. My research focuses on the pattern of evolutionary change in humans over the past two million years, with an emphasis on the early evolution and dispersal of our genus, Homo. My work spans a number of areas including comparative anatomy, genetics and demography.