JRA1 - Del. 4.3


Aims

The central aim of WP4 was to create an online predictive tool for curators. Originally termed PrediCtoR, the tool was moved to a new website, thermal-age.eu, following the shift to next-generation sequencing as the primary tool for sequencing DNA from museum remains.

Thermal-age.eu

1. Calculates the thermal history of a site and uses this to predict DNA fragment length.

2. Keeps indexed records of past calculations so these may be published and subject to scrutiny.

3. Collects data from users to help refine the quality of predictions.

4. Provides detailed explanations and supporting materials to help users understand the strengths and limitations of the numbers we can produce.

Deliverable 4.3

Thermal-age.eu is the site designed to help collections managers and users quantify the risks associated with destructive analysis of specimens. The website, originally to be entitled PrediCtoR, was to predict the amplification success of PCR, and in particular to highlight the importance of sample size versus PCR amplicon length. It appears non-intuitive to most researchers that copy number scales in direct proportion to the size of the sample, whilst the survival of DNA fragments decreases exponentially with fragment length. PrediCtoR was therefore a web tool to encourage researchers to reduce sample size for destructive analysis (see WP6).

Changes in DNA sequencing technologies (so-called next-generation sequencing platforms) required us to change the focus of the site. The site now reports predicted fragment length, rather than the copy number of a PCR fragment of a specific length. As a result, the original objective in D4.3 of normalising results to sample size is no longer relevant to users and, because next-generation sequencing has largely replaced PCR experiments, data on sample size are not available from those undertaking experimental work. Instead, the model now reports a probability distribution for the recovery of different fragment lengths, which is more useful as a tool for excluding specimens where DNA has degraded below a recoverable threshold.
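The scaling argument behind the original PrediCtoR design can be illustrated with a minimal sketch. It assumes a simple random-fragmentation model in which the chance that a template of length L survives intact falls off as exp(-λL). The function name, the copies-per-milligram figure and the value of λ are hypothetical illustration values, not parameters taken from thermal-age.eu.

```python
import math

def expected_intact_copies(sample_mg, amplicon_bp, lam_per_bp, copies_per_mg=1000.0):
    """Expected number of intact templates of a given length.

    Copy number is proportional to sample mass, while the intact
    fraction decays exponentially with amplicon length (assumed model).
    """
    return copies_per_mg * sample_mg * math.exp(-lam_per_bp * amplicon_bp)

lam = 0.02  # hypothetical damage probability per base pair

full_sample_long = expected_intact_copies(100, 200, lam)   # 100 mg, 200 bp target
half_sample_long = expected_intact_copies(50, 200, lam)    # 50 mg, 200 bp target
half_sample_short = expected_intact_copies(50, 100, lam)   # 50 mg, 100 bp target

print(f"100 mg, 200 bp: {full_sample_long:.0f} intact copies")
print(f" 50 mg, 200 bp: {half_sample_long:.0f} intact copies")
print(f" 50 mg, 100 bp: {half_sample_short:.0f} intact copies")
```

Under these assumed values, halving the sample halves the expected copy number, whereas halving the amplicon length more than compensates for the smaller sample.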

Refinement of the web tool and model code

Following user feedback and in response to the advances in technology discussed above, both the web tool and the underlying modelling software were refined as follows:

1. Improved temperature resolution, switching from 1° x 1° resolution (ISLSCP) to ~1 km x 1 km (0.05° x 0.05°, WorldClim).

2. Improved altitude correction, using WorldClim, Google Maps API and Wikipedia (e.g. elevation for central Brussels: PMIP = 128 m, WorldClim = 62.5 m, Google = 64.84 m [153 m resolution]).

3. Ability to search for all places on Wikipedia with a latitude and longitude: this has the benefit (over searching e.g. Google Maps API for places) of including many archaeological sites and geographical features in addition to place names. Search results can be previewed on the map, and selecting one automatically fills in the elevation and a brief description of the place.

4. Improved soil types input - using a sliding scale to estimate thermal diffusivity in different soils of typical granularity based on a collection of known values given soil type and water content. The model can use multiple soil layers.

5. Improved processing speed by refining calculation software in parallel with adding new features and ability to process arbitrarily large datasets in a spreadsheet.

6. Ensuring the site is mobile-compatible, for use on phones and tablet computers.

7. Values previously entered into any job can be loaded instantly into any screen of the wizard, allowing users to easily run additional jobs where some of the input is the same as in previous runs (e.g. where more than one specimen comes from the same site).

8. Enabling large numbers of specimens to be added using a spreadsheet input (thermal-age.eu generates a ‘quick start’ spreadsheet based on the user’s requirements, including the correct column headings and example rows to show how to enter data correctly).

9. Enabling many users to simultaneously queue up spreadsheets as well as wizard runs for a single specimen. The processing of large (spreadsheet) jobs is suspended and resumed such that the longer a job has been running, the lower its priority against competing jobs in the queue. This means smaller jobs are always turned around as quickly as possible while the system cannot be “blocked” by one very large job.

10. Providing a Dashboard which lists all your activity on the site and shows the status of currently running jobs. This is especially useful as large spreadsheets of results can take some time to process.

11. Providing an interactive report in PDF form, which can be printed (see 4 in Roll-out).

12. Providing a means of making searches publicly accessible (including an ability to embargo the results to a set date - and change this at will).
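The queueing behaviour described in item 9 can be sketched as follows. This is not the site's actual scheduler, only a minimal illustration that assumes work is done in fixed-size slices and that the job with the least accumulated processing time always runs next. The job names, row counts and slice size are hypothetical.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Job:
    elapsed: int                            # work already done; lowest value runs first
    seq: int                                # tie-breaker: submission order
    name: str = field(compare=False)
    remaining: int = field(compare=False)   # rows still to process

def run_queue(jobs, slice_size=10):
    """Process (name, total_rows) jobs in slices; return names in completion order."""
    heap = [Job(0, i, name, rows) for i, (name, rows) in enumerate(jobs)]
    heapq.heapify(heap)
    finished = []
    while heap:
        job = heapq.heappop(heap)           # job with least accumulated run time
        done = min(slice_size, job.remaining)
        job.remaining -= done
        job.elapsed += done                 # priority drops as run time grows
        if job.remaining == 0:
            finished.append(job.name)
        else:
            heapq.heappush(heap, job)       # suspend and requeue the job
    return finished

order = run_queue([
    ("big_spreadsheet", 1000),
    ("wizard_run", 1),
    ("small_spreadsheet", 30),
])
print(order)
```

Even though the large spreadsheet is submitted first in this sketch, the single wizard run and the small spreadsheet complete before it, and the large job still finishes once the queue is otherwise empty.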

User-entered database of results & analysis

Researchers are able to upload actual results of DNA sequencing and compare them against previously run predictions, allowing data capture and comparison between predicted and measured DNA survival. These can optionally be published en masse, again with the option to embargo results.

Once results are made public they are assigned a permanent canonical URL (suitable for publication) and may not then be removed from the site.

A graph showing a comparison between the predicted and actual experimental results (where these have been provided) is automatically generated. An example is shown to the right. This provides a qualitative indicator of the quality of predictions (which appears strong) as well as giving an overview of the relative predicted and actual survival of DNA. The “traffic lights” colour coding indicates preservation from good (green) to complete destruction (black).

Improving access to collections

Thermal-age.eu enables curators to assess the likelihood of DNA preservation, the original aim of the project. However, the Synthesys II management team realised that a more effective method of using the tool was to offload the analysis onto the researcher collecting material. An interactive PDF report was therefore produced (which enables each of the figures to be downloaded in PDF, PNG, or SVG format).

This has a number of advantages:

1) If the sample is identified as unsuitable, the request will never be made.

2) The researcher gains an insight into where and when the samples s/he is interested in analysing are predicted to fail.

3) The reporting tool offers a way to report on thermal-age.eu data, highlighting instances in which the predicted fragment length is incorrect.

How good is it?

The tool is currently slightly ahead of its time, as relatively few studies have reported DNA fragment length. Google Scholar lists three articles that report having used the tool. A total of 145 unique visitors have made 237 visits to the site, spending an average of 6 minutes per visit, which is the time needed to run a minimum of one analysis.

However, work with leading researchers in the field of ancient DNA suggests that the results are very promising. The following excerpt from a recent submission to a high-profile publication illustrates the steps within thermal-age.eu and the quality of the prediction (references not given).

A thermal age (Smith et al., 2003), the equivalent age if held at a constant 10 °C, of 4200 years was estimated for the sample, based upon an effective burial temperature Teff of 4.3 °C. This estimation assumed the sample was buried to 1 m in soil with a thermal diffusivity of 0.029 m² day⁻¹ (silt-loam, 10% water) for the 12,740 years (until 1968; maximum age 12,722 CAL BP 2σ C.I.), then held in highly variable conditions (15 °C +/- 7 °C) until the present day. Seasonal fluctuations in monthly temperature at the site were estimated from the WorldClim dataset (Hijmans et al., 2005).
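The relationship between effective burial temperature and thermal age can be sketched with the Arrhenius equation. The activation energy used below (127 kJ/mol) is an assumption made for illustration and is not stated in this report. The function names are likewise hypothetical, and the sketch ignores the post-1968 storage period.

```python
import math

R = 8.314462618            # gas constant, J mol^-1 K^-1
REFERENCE_TEMP_C = 10.0    # thermal age is expressed relative to 10 °C

def relative_rate(temp_c, activation_energy_j_mol):
    """Reaction rate at temp_c relative to the rate at 10 °C (Arrhenius)."""
    t = temp_c + 273.15
    t_ref = REFERENCE_TEMP_C + 273.15
    return math.exp(-activation_energy_j_mol / R * (1.0 / t - 1.0 / t_ref))

def thermal_age_years(age_years, effective_temp_c, activation_energy_j_mol):
    """Years at 10 °C that would cause the same damage as age_years at Teff."""
    return age_years * relative_rate(effective_temp_c, activation_energy_j_mol)

ASSUMED_EA = 127_000.0     # J/mol; assumed value, not taken from the report

burial_thermal_age = thermal_age_years(12_740, 4.3, ASSUMED_EA)
print(f"Thermal age of burial period: {burial_thermal_age:.0f} years at 10 °C")
```

With this assumed activation energy, the burial period alone gives a thermal age close to the 4200 years quoted in the excerpt.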

The altitude difference between the 1 x 1 km squares of WorldClim and the site was corrected by comparing the altitude of the WorldClim grid with the altitude from the DEM in GoogleEarth (1552 m) and applying a standard environmental lapse rate of 6.49 °C/1,000 m. The extent to which temperature decreased beyond the Holocene was estimated from the difference in the 1° × 1° PMIP2 grid for the region at three time intervals: Modern (pre-industrial), Holocene (6 ka) and LGM (Braconnot et al., 2007). These data were correlated against the equivalent time intervals from the Bintanja et al. (2005) curve, and the correlation used to transform the latter temperature series to reflect the temperature change in this region. Using the rate data estimated in Allentoft et al. (2012), our thermal age estimates slightly under-predict the extent of degradation (rate of 0.99E-6 yr⁻¹) when compared with that estimated from the observed fragment length (1.3E-6 yr⁻¹).
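The altitude correction described above amounts to a linear adjustment of the gridded temperature. In this minimal sketch the function name and the grid-cell temperature and altitude are hypothetical. Only the lapse rate and the 1552 m site altitude come from the excerpt.

```python
LAPSE_RATE_C_PER_M = 6.49 / 1000.0  # standard environmental lapse rate

def altitude_corrected_temp(grid_temp_c, grid_alt_m, site_alt_m):
    """Shift a gridded temperature to the altitude of the site.

    Temperature falls as altitude rises, so a site above the grid
    cell's mean altitude is colder than the gridded value.
    """
    return grid_temp_c - LAPSE_RATE_C_PER_M * (site_alt_m - grid_alt_m)

# Hypothetical grid cell: 5.0 °C mean temperature at 1400 m mean altitude.
site_temp = altitude_corrected_temp(5.0, 1400.0, 1552.0)
print(f"Corrected site temperature: {site_temp:.2f} °C")
```

In this hypothetical case a site 152 m above the grid cell's mean altitude comes out about 1 °C colder than the gridded value.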

We do not have detailed data on the net thermal diffusivity of the soil, precise burial depth, burial depth over time, nor storage history post 1968. The sensitivity of the model to these estimated factors is illustrated by the fact that if the burial depth is reduced to 0.5 m, the estimated rate (1.4E-6 yr⁻¹) is higher than the observed value.
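One reason burial depth matters is that seasonal temperature swings are damped as they travel down through the soil. The sketch below uses the standard solution for periodic heat conduction in a uniform half-space, in which the amplitude decays as exp(-z/d) with damping depth d = sqrt(αP/π). The surface amplitude is a hypothetical value, and a single uniform soil layer is assumed, which is simpler than the multi-layer model the site supports.

```python
import math

def damping_depth_m(diffusivity_m2_per_day, period_days=365.25):
    """Depth at which a periodic surface signal falls to 1/e of its amplitude."""
    return math.sqrt(diffusivity_m2_per_day * period_days / math.pi)

def amplitude_at_depth(surface_amplitude_c, depth_m, diffusivity_m2_per_day):
    """Amplitude of the annual temperature cycle at a given burial depth."""
    d = damping_depth_m(diffusivity_m2_per_day)
    return surface_amplitude_c * math.exp(-depth_m / d)

DIFFUSIVITY = 0.029        # m^2 per day, silt-loam value quoted in the excerpt
SURFACE_AMPLITUDE = 10.0   # °C, hypothetical seasonal amplitude at the surface

for depth in (0.5, 1.0):
    amp = amplitude_at_depth(SURFACE_AMPLITUDE, depth, DIFFUSIVITY)
    print(f"{depth:.1f} m: seasonal amplitude {amp:.1f} °C")
```

At 0.5 m the sample experiences a larger share of the seasonal swing than at 1 m. Because degradation rates rise steeply with temperature, warm summers count for more than cold winters, which would be consistent with the higher rate estimated for the shallower burial.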

Achievements

Scheduled Deliverables

4.1 Production of pilot website for PrediCtoR (now http://thermal-age.eu)

4.2 Proof of concept using both temperature plus large DNA datasets & DNA validation (presented as report)

4.3 Software refined: thermal modelling, normalise to sample size; development and testing of user-entered database & analysis and reporting

Other benefits

Funding obtained as a result of JRA1 & JRA3 activities

Systematics Association (201112) £9 501. Collagen: the Barcode of Death. PI Matthew Collins (York: JRA 1), co-I with Sam Turvey (IoZ).

What next?

The site is now live and stable, generating reports, and requesting feedback on predictions (to improve the accuracy of the site). We would like to increase the number of types of tissue modelled to include (for example) dried plants and insects.

We are seeking additional funding to support this - either via a UK funded CASE studentship between York and the NHM or a fellowship application.
