This is the follow-up to a piece I posted earlier on what exactly we mean by fossil data. In that piece, I suggested that at a primary level, fossil data is the fossil itself together with its associated context. This primary data is nearly impossible to employ in any meaningful way in the context of hypothesis testing. Instead, we simplify the complexity of the fossil by looking at some representation of the fossil, whether it be anatomical, metric, or a 3D representation. How we construct the “data” presented by fossils in turn establishes what is meant by “access” to fossil data.
For several years, the NSF has required fund recipients to specify how they will grant access to the data produced in the funded project. There are limited guidelines as to what this means, however, and there is a lot of variation in what form it takes in practice. So what does access mean?
In a perfect world, we would all be able to pass through time and collectively share in the joy and amazement of every fossil discovery. We would be able to minimize the uncertainty around the exact stratigraphic positions of fossils and correctly identify their associations. Obviously that will not happen. One bit of access that is impossible to grant is that which goes along with the moment and important details of discovery. Instead, there is a great emphasis on documenting the details surrounding a discovery, though less consistency in how those details ultimately get published (if at all) and disseminated.
Beyond the discovery, access becomes tricky as it can take a variety of forms, dependent in part on the nature of the fossils themselves. Under the best circumstances, the fossil itself is made available to researchers to examine and record secondary data from. Despite surviving the Earth’s changes over millions of years in some cases, the half-life of your typical fossil in the hands of a researcher is much reduced. Accidents happen, fossils get dropped, metal calipers and sloppy measurement efforts lead to damage and alteration of the original fossil. As a result, the handling of fossils is kept to a necessary minimum. Destructive analysis (e.g. stable isotope sampling) is used only when there are sufficient grounds for doing so.
Nearly every time I have worked with primary fossil materials I have been questioned by curators about how I feel about virtual curation through images and particularly 3D CT-scan data. If it allows me to get measurement or other spatial data from the fossil without handling the fossil…wonderful. But it doesn’t replace having the actual fossil present. Our eyes and our brain provide a different, and complementary, visual insight into a fossil. Nevertheless, the use of CT-data for fossil analysis is increasingly becoming the professional standard.
Technology has a double-edged effect on issues of access. In theory, the creation of virtual fossil representations preserves the fossils themselves for future researchers while simultaneously making fossil data more mobile (i.e. digital format) and accessible. In practice, however, technology requires an investment in infrastructure, both in data acquisition and processing, that is seldom distributed equally.
Consider, for example, this fantastic Google map of high output genomic sequencing platforms (h/t Holly Bik):
There are essentially as many down the road from me in Boston/Cambridge (focused on the Broad Institute) as there are in the entire Southern Hemisphere. The situation is probably not as stark when it comes to high-res CT devices capable of processing fossils, but establishing common technological formats for CT-data that make such data widely available remains a challenge. The NESPOS project is a nice example of an attempt to do just that, but such projects are the exception.
This becomes an even greater challenge when, as is the case for many fossils, reconstructions are done virtually using CT-data. These reconstructions are typically a combination of mathematical fitting and visual jigsaw work, making an exegesis of the process from an outside perspective impossible. Such reconstructions are not that different from the historical norm, in which one person’s reconstruction became the de facto version of a given fossil, with the difference being the accessibility of the raw data. Raw CT-data of fragmentary fossil specimens are not as widespread as casts of fragmentary specimens, making independent reconstructions based on the latter more accessible. When I was in graduate school, nearly everyone who passed through our lab spent some time attempting to produce a satisfactory reconstruction of ER-1470 based on neurocranium, maxillary and mid-facial casts.
The nature of paleoanthropological data, fundamentally the fossils themselves, creates problems for representation and analysis. The rarity of fossils also places an inherent constraint on access. New technology that is capable of rendering fossils virtually and digitally has the potential to dramatically expand access, but only if an effort is made to actually prioritize access by standardizing data and pursuing open platform methodologies.
Infrastructure and data transmission are a challenge, but at a bare minimum the data should be archived on physical media and housed at the institution that houses the specimen. Unfortunately, museums (in my experience) are not yet consistent in asking for this basic bit of information, and even when it is requested scientists are pretty horrible about complying.
(I’ve really enjoyed this series of posts, BTW)
Good point, Andy. And part of the issue with hominin fossils of course is that the vast majority of them are found in countries that have far less capacity to invest heavily in infrastructure of this sort.