Data QA/QC and DIVA products for the North Sea

EMODnet Thematic Lot n° 4 - Chemistry
Martin M. Larsen (AU-DCE)
Date: 12/09/2014
QA/QC and DIVA report
Contents
Introduction
1. Common methodology for data QA/QC
   Additional step:
   Broad-range check values in the Mediterranean
2. Common rules for products generation
3. General guidelines for DIVA settings
Introduction
This data report describes the steps taken, based on the common methodology for data QA/QC adopted at
the expert meeting on 3 September in Paris.
The final dataset from 28 October 2014 was imported into ODV version 4.3.3 (PC version, manually
updated) and aggregated. The final checks were done using ODV version 4.3.4 (mainly the Mac version).
1. Common methodology for data QA/QC
ODV data: 626529/626611
According to the EMODnet website there are 139464 results for fertilizers (nutrients), 101986 for
silicates and 103577 for chlorophyll.
No data with QF 6 were found. No depths < 0 were found. A number of results equal to or below 0 were
found: Danish data for some years were reported as negative values, on the biologists' reasoning that
results below the detection limit (DL) would then average out to 0. These data were instead
transformed to half-DL values. The concentration ranges were checked on the full dataset, and the
number of values above a "check value" (the highest reasonable concentration in coastal waters) was
counted to see how much data would be lost.
Table 1: First check of concentration ranges

Concentration ranges   Max found   Check value   # above   Comment
PO4                    110 µM      50 µM         2859
NO2                    97 µM       20 µM         591
NO3                    646 µM      100 µM        1004
NO23                   658 µM      100 µM        11171     Reported NO2+NO3 results only
NH4                    878 µM      200 µM        409
Si                     696 µM      200 µM        4287
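The first check of Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually run in ODV: the check values come from Table 1, while the function name `count_above_check` and the sample concentrations are invented for the example.

```python
# Check values in µM, taken from Table 1.
CHECK_VALUES_UM = {"PO4": 50, "NO2": 20, "NO3": 100, "NO23": 100, "NH4": 200, "Si": 200}


def count_above_check(values_by_nutrient, check_values=CHECK_VALUES_UM):
    """Return {nutrient: (max found, number of values above the check value)}."""
    summary = {}
    for nutrient, values in values_by_nutrient.items():
        limit = check_values[nutrient]
        n_above = sum(1 for v in values if v > limit)
        summary[nutrient] = (max(values), n_above)
    return summary


# Invented sample concentrations (µM), for illustration only.
sample = {"PO4": [0.3, 1.2, 110.0, 55.0], "NO2": [0.1, 0.4, 19.9]}
summary = count_above_check(sample)
print(summary)
```

Counting before flagging shows how much data each check value would remove, which is the purpose of the "# above" column.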
199583 duplicates in 85666 groups were found. A re-run with depth as distinguishing parameter and
export of EDMO code and local CDI no. resulted in 200341 duplicate stations in 86637 groups.
Additional checks for Nutrients
The nutrients were checked visually and all looked reasonable: in most cases concentrations were
lowest below 100 m (PO4, NOx, NH4), whereas silicate showed the opposite pattern, with the highest
concentrations in the deep waters.
Typically the highest concentrations were found in the upper 5 meters.
Only two PO4 results with QF 2 (probably OK) were judged to be probably not OK: they showed jumps of
a factor of 10 to 200, with lower concentrations both above and below the suspected outlier value.
For the other nutrients, the concentration profiles look reasonable and no further exclusions were
deemed necessary.
Table 2: PO4 results with QF flag changed to 4

Accession no.   "Jump" factor   Excluded value
180932          200             PO4 = 200
180511          10-20           PO4 = 100
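The profiles in this report were inspected visually. As a rough sketch of the same criterion, the Python below flags a value that is at least a given factor higher than both its vertical neighbours. The function name `find_spikes`, the default factor of 10 (the lower end of the jumps in Table 2) and the sample profile are assumptions made for illustration.

```python
def find_spikes(profile, factor=10.0):
    """Return indices of values at least `factor` times both vertical neighbours.

    `profile` is a list of concentrations ordered by depth. The first and last
    samples are never flagged because they have only one neighbour.
    """
    spikes = []
    for i in range(1, len(profile) - 1):
        above, value, below = profile[i - 1], profile[i], profile[i + 1]
        # Require positive neighbours so that the factor comparison is meaningful.
        if above > 0 and below > 0 and value >= factor * above and value >= factor * below:
            spikes.append(i)
    return spikes


# Invented PO4 profile (µM) with one suspicious value at index 2.
po4_profile = [0.4, 0.5, 100.0, 0.6, 0.7]
spike_indices = find_spikes(po4_profile)
print(spike_indices)
```

A flagged index is only a candidate for QF 4; the profile should still be inspected before the flag is changed.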
Nitrite was found to be larger than nitrate in a few samples, mostly from the PHABMO II cruise (May
2003, EDMO 486 = Ifremer). For a few other results with a very large nitrite/nitrate ratio, the
nitrite QF was set to 4.
Table 3: NO2 results with QF flag changed to 4

Accession no.   EDMO code, cruise   Excluded value
180932          32, NMMP0105M
206956          545, 77CB1997
One case of very high nitrite concentrations showed an increase with depth and was therefore kept,
despite 3-6 µM nitrite against 0.1 µM nitrate in the deepest samples (cruise 77CB1992, accession
206454, Saltøfjord, EDMO 545).
The QF for results where nitrate is ~0.02 with positive nitrite was not changed, as these results are
close to the detection limit (DL); the DL for nitrate is higher than for nitrite, so the uncertainty
in the low range is larger.
Inorganic and total N or P were compared, with the following suggestion for setting of flags.
The ratio of inorganic to total is defined as:
RP = OP/TP (ortho phosphate/total phosphorus)
RN = (NH4 + NOx [+ other N-inorganic species, e.g. urea])/TN
Table 4: Inorganic to total ratio checks

Ratio of inorganic to total   QF value      Description
1 – 1.15                      Not changed   <15%* difference, data could be correct within
                                            uncertainty of measurements
1.15 – 2                      Probably 4    Data are very probably incorrect for inorganic or total
                                            measurement. Inspect profile data.
>2                            4             Some of the data are likely incorrect
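The ratio bands of Table 4 translate directly into a small helper. The sketch below is illustrative only: the function names are hypothetical, the treatment of the exact boundaries 1.15 and 2 (assigned to the lower band) is an assumption because the table lists them in both bands, and ratios below 1 are treated as "not changed" since inorganic below total is the expected case.

```python
def inorganic_to_total_ratio(inorganic_species, total):
    """RP or RN: sum of the inorganic species divided by the total (same units)."""
    return sum(inorganic_species) / total


def suggest_flag(ratio):
    """Map an inorganic/total ratio to the QF suggestion of Table 4."""
    if ratio > 2:
        return "4"
    if ratio > 1.15:
        return "probably 4"  # inspect the profile before changing the flag
    return "not changed"


# Invented example: NH4 = 0.5 µM, NOx = 1.5 µM, TN = 1.0 µM gives RN = 2.0.
rn = inorganic_to_total_ratio([0.5, 1.5], 1.0)
print(rn, suggest_flag(rn))
```

The helper only says that something is wrong in the pair; deciding whether the inorganic or the total value gets the flag still requires looking at the profile, as in the tables that follow.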
The PO4/TP ratio was found to be bad for several stations. Each station with a ratio > 3 was
inspected and the following corrections were made (Table 5).
Table 5: Check of phosphorus ratios RP

Accession/EDMO   Cruise          Station        Ratio       Comment
629061/1181      Haithabu 2004   OM225019 (B)   3.4; 36.4   TP value (0.39) changed to QF4,
                                                            OP considered ok
                                 OM704 (B)      6.6         TP value (0.39) changed to QF4,
                                                            OP considered ok
                                 OM225003 (B)   4.9         TP value (0.39) changed to QF4
62013/729        0               NOR5503 (B)    8.4         NH4 very high (main part of TN),
                                                            suggests bottom water impacted
                                                            by something, TP QF changed to 4
633710/2537      LLUR 2010       225004 (B)     5.7         No indication of higher P in
                                                            bottom water, OP QF changed to 4
228680           BE2009/20A      W10 (B)        3.8         Dissolved tot. P, TP set to PO4
An OP/TP ratio between 2 and 3 was found for only 9 further samples. An OP/TP ratio between 1.2 and 2
was found for 65 samples in all.
Nitrogen ratios were checked: 9 incidences of 1.15-2 and three above 2 were found (Table 5).
Table 5: Check of nitrogen ratios RN

Accession/EDMO   Cruise                       Station       Ratio   Comment
637642/1850      Pelagia 64PE364              43 (B)        12      TN value at 11 m very low, TN
                                                                    result changed to 10 from 0.10
65155/729        0                            NOR7715 (B)   2.7     TN value lower at 14 m, NO3
                                                                    similar (higher), TN QF flag set to 4
626215/          Celtic Explorer CE/11/010B   41 (B)        2.2     TN value similar at 6 and x m, NO3
                                                                    only at 6 m, QF NO3 set to 4
606004 stations exported; 30995 duplicate stations in 28918 groups found.
Values under detection limits and 0 values
Values under the detection limit (namely with QF=6, or in Danish samples values <0) that have an
accompanying measured value of 0 are incorrect. These data values need to be replaced with ½ of the
detection limit for the technique.
Table 6: Detection limit suggestions from AU-DCE (accredited methods) as valid for the North Sea
area

Nutrient   Detection limit   Expected relative standard deviation
NO2        0.04 µM           7%
NO3        0.1 µM            7%
NOx        0.1 µM            7%
NH4        0.3 µM            7%
TN         1 µM              12.5%
OP         0.06 µM           5%
TP         0.1 µM            10%
SiO4       0.2 µM            4%
A general replacement value of 0.05 µM for inorganic N-species could be applied (0.5 µM for TN), and
0.03/0.05 µM for P (OP/TP).
Table 7: Detection limit suggestions from OGS Laboratory of Marine Chemistry (accredited
methods) as valid for the Adriatic Sea

Nutrient   Detection limit   Expected relative standard deviation
NO2        0.0015 µM         10%
NO3        0.01 µM           3%
NOx        0.01 µM           7%
NH4        0.04 µM           12%
TN         1 µM              1%
OP         0.02 µM           5%
TP         0.02 µM           8%
SiO4       0.016 µM          3%
Measured values of 0 with QF=1 or 2: if the detection limit is known, the values should be replaced
by ½ of that detection limit and QF set to 6. Otherwise use the detection limits defined above,
replace the data values with ½ of the detection limit and set QF=6.
ODV derived-variable expression:
#1 1 < 0.5 #1 IFTE
[if conc. < DL then DL/2 else conc.; in this case #1 = TN, DL = 1 µmol/l]
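Outside ODV, the same replacement rule can be sketched in Python. This is an illustration under stated assumptions, not the workflow used for the dataset: the detection limits are the AU-DCE suggestions of Table 6, the function name `apply_half_dl` is hypothetical, and the restriction to values flagged 1, 2 or 6 is an assumption made so that values already flagged as bad keep their flag.

```python
# Detection limits in µM, from the AU-DCE suggestions in Table 6.
DETECTION_LIMITS_UM = {
    "NO2": 0.04, "NO3": 0.1, "NOx": 0.1, "NH4": 0.3,
    "TN": 1.0, "OP": 0.06, "TP": 0.1, "SiO4": 0.2,
}

# Assumption: only values with these SeaDataNet flags are candidates for replacement.
REPLACEABLE_FLAGS = {"1", "2", "6"}


def apply_half_dl(value, qf, nutrient, limits=DETECTION_LIMITS_UM):
    """Return (value, qf) with values below the DL replaced by DL/2 and QF set to 6."""
    dl = limits[nutrient]
    if qf in REPLACEABLE_FLAGS and value < dl:
        return dl / 2, "6"
    return value, qf


print(apply_half_dl(0.0, "1", "TN"))     # zero with a good flag
print(apply_half_dl(-0.02, "2", "NO3"))  # negative value as in some Danish data
print(apply_half_dl(12.0, "1", "TN"))    # above the DL, left untouched
```

Setting QF=6 together with the half-DL value keeps the replaced results recognisable as below-detection-limit data in later steps.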
As a result, it is suggested to involve an internal regional expert to review the aggregated data set.
Finally, make an ODV regional collection (Export SDN ODV collection) as the basis for Diva and other
products, and provide a copy to Maris including a detailed QC report. The same report should be sent
to the originator of erroneous data together with the corrected data.
Broad-range check values in the North Sea
The data availability on a substance by substance basis
Substance       No. of datapoints (total)   Range                 Outlier range (typically QF 4)   Number of outliers
O2              4 911 756                   -500 -> 1000 µmol/l   >1000                            1
Chlorophyll-a   662 107                     0 -> 692              >700                             159
Si              5 646 601                   0 -> 695              >700                             102 983

Phosphate nutrients
PO4             716 463                     0-155                 200-1669                         82
Total P         153 409                     0-1000                1000-5071                        174

Nitrogen nutrients (NO2, NO3, NO2+3, NH4, Total N), values in the order given:
No. of datapoints (total):   276 339; 361 028; 311 744; 253 078
Range:                       0-658; 0-1100; 0-644; 0-1000
Outlier range:               1100-10367; 650-2502; 1000-1621
Number of outliers:          0; 87; 102; 21
2. Common rules for products generation
I’m having some difficulties with getting the dataset to DIVA in a way that makes sense. Despite the
tricks learned at Stareso, I still just get errors when doing the same exports as on Stareso, using the
new version of ODV, and changing Depth to dybde where it shouldn’t be. I will contact Jean to solve
this problem ASAP, and continue with the products…
Martin
Timing: 10-year moving average from 1960 to 2014, by season and levels
Seasons: defined per region (check definition...)
Vertical layers: defined per region (the Baltic uses HELCOM standard depths, the Med uses IODE
standard depths… are these the same? Please check)
Seasons as adopted in the Mediterranean: winter (January to March), spring (April to June), summer
(July to September) and autumn (October to December).
IODE standard levels as adopted in the Mediterranean: 0, 5, 10, 20, 30, 40, 50, 75, 100, 125, 150, 200,
250, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1750, 2000, 2500, 3000,
3500, 4000, 4500, 5000. Please check whether these are the same in the 5 regions.
Variable list: already defined as NO3, NOx, Total Nitrogen, PO4, Total Phosphorus, SiO4, NH4. Can we
consider also adding Dissolved Oxygen and Chlorophyll-a to complete the water column?
Diva parameters/settings are defined in the following chapter.
Masking for Diva maps: mask where the relative error field exceeds 0.3 and 0.5.
Use only data with QF=1, 2, 6 for Diva.
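The timing and season rules above can be made concrete with a short Python sketch. The season definition is the Mediterranean one quoted above; the function names are hypothetical, and the choice to keep only windows that lie fully inside 1960-2014 (with inclusive end years) is an assumption, since the text does not say how the ends of the period are handled.

```python
SEASONS = ("winter", "spring", "summer", "autumn")


def season_of(month):
    """Season as adopted in the Mediterranean: Jan-Mar winter, Apr-Jun spring, and so on."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return SEASONS[(month - 1) // 3]


def decade_windows(first_year=1960, last_year=2014, length=10):
    """List (start, end) years of every sliding window, moving one year at a time."""
    return [(start, start + length - 1)
            for start in range(first_year, last_year - length + 2)]


windows = decade_windows()
print(windows[0], windows[-1], len(windows))
print(season_of(2), season_of(11))
```

Each product is then one combination of window, season, vertical level and variable, restricted to data with QF 1, 2 or 6.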
3. General guidelines for DIVA settings
While performing a DIVA analysis, the following steps have to be followed.
1. Domain definition and topography: should be ok (check that the resolution is neither too fine
nor too coarse). Masking by definition of regions should be left until the very end, if any.
Eliminate lowlands right from the start.
2. Output resolution: Decades with sliding window every year, by seasons. Regional definition of
vertical levels and seasons.
Seasons as adopted in the Mediterranean: winter (January to March), spring (April to June),
summer (July to September) and autumn (October to December).
IODE standard levels as adopted in the Mediterranean: 0, 5, 10, 20, 30, 40, 50, 75, 100, 125,
150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1750,
2000, 2500, 3000, 3500, 4000, 4500, 5000. Please check whether these are the same in the 5 regions.
3. Data sets: Aggregated data: make sure data outside of your region are eliminated (via option in
driver). QF selection: make sure that the SeaDataNet QF scheme is used (Diva is not aware of QF
schemes, it only selects data from a list of QF values to be accepted). In the aggregated data set
there should be no 0 flags, so only 1, 2 and 6 are retained; the others are too dangerous to use.
4. Background fields: make sure it has correct vertical coherence. Use climatological average (all
data for a given season, with large L, low SN and possibly detrending).
5. Statistical parameter: SN and L: optimize but take with a grain of salt and provide reasonable
bounds. VERTICAL coherence (via option -30 in driver).
6. Outliers: use the outlier elimination function ONLY if you are very confident in the statistical
parameters and the quality of your products (final fine-tuning), or if you see a few bullseyes in
the analysis (too many bullseyes indicate a too high SNR). In all cases check whether a reasonable
amount of data is flagged as "outliers".
7. Error fields: always mask the results where relative error field exceeds 0.3 and 0.5 using the
same approach as in SDN and EMODnet Pilot (zero means analysis is expected to be perfect, 1
means the analysis has an expected error as large as your first guess, the reference field).
8. Advanced features: use advection if you have info (provide velocity fields) or if you really have
currents that are coastal (use second parameter in driver to create pseudo along-coast
velocities). Detrending: if trends in years are expected. Change of variables: especially for
concentrations: apply log or logit.
9. Checking:
• Work on 4D netCDF file (the one which will be published and includes already masked fields).
• Vertical coherence via vertical sections
• Presence of bullseye or other artifacts (too high SN or too small L, suspect data)
• Verify the data coverage field to make sure you did not "lose" some data
• Look at Output/3Danalysis/Variablename.Metainfo.txt
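The error-field masking of step 7 can be illustrated with a minimal Python sketch. It is not how DIVA itself applies the mask: the function name `mask_by_relative_error`, the use of nested lists with None for masked cells, and the small grids are assumptions made for the example.

```python
def mask_by_relative_error(analysis, rel_error, threshold):
    """Return a copy of `analysis` with None where the relative error exceeds `threshold`.

    Both inputs are 2D grids (lists of rows) of the same shape. A relative error
    of 0 means a perfect analysis, 1 means an error as large as the first guess.
    """
    return [
        [value if err <= threshold else None for value, err in zip(value_row, error_row)]
        for value_row, error_row in zip(analysis, rel_error)
    ]


# Invented 2 x 2 grids, for illustration only.
analysis = [[1.0, 2.0], [3.0, 4.0]]
rel_error = [[0.1, 0.4], [0.6, 0.3]]

masked_03 = mask_by_relative_error(analysis, rel_error, 0.3)
masked_05 = mask_by_relative_error(analysis, rel_error, 0.5)
print(masked_03)
print(masked_05)
```

Producing both masked versions side by side makes it easy to see how much of the domain is lost at the stricter 0.3 threshold compared with 0.5.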
Discussion: Reiner mentioned the good practice of looking at residuals. ULg will add an automatic
plot of residuals and a global indicator of whether the residuals follow the expected distribution.
If not, a warning message will be issued.
Next Diva workshop: 3-7 November, Calvi. Intensive work; if you have special requests or questions,
send them beforehand, in particular for features which presently cannot be exploited through the
driver options.
Example of DIVA settings for fine-tuning when more or less satisfied:
Data extraction: 0 = do nothing
boundary lines and coastlines generation: 0 = nothing
cleaning data on mesh: 4 = 1 + outliers elimination
minimal number of data in a layer: 0
Parameters estimation and vertical filtering: -30
Minimal L (larger than output grid spacing): 0.25
Maximal L (domain length): 10
Minimal SN: 0.1
Maximal SN: 3
Analysis and reference field: 1
Note: SN= Signal to Noise ratio; L= correlation Length
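The minimal and maximal L and SN values in the example act as the "reasonable bounds" asked for in step 5. The Python sketch below only illustrates the idea of keeping an optimized estimate inside such bounds; it makes no claim about how the DIVA driver applies them internally, and the names `BOUNDS` and `clamp_estimate` are hypothetical.

```python
# Bounds taken from the example settings above.
BOUNDS = {"L": (0.25, 10.0), "SN": (0.1, 3.0)}


def clamp_estimate(name, estimate, bounds=BOUNDS):
    """Keep an optimized parameter estimate within its minimal and maximal value."""
    low, high = bounds[name]
    return min(max(estimate, low), high)


print(clamp_estimate("L", 0.1))   # below the minimal L
print(clamp_estimate("SN", 7.5))  # above the maximal SN
print(clamp_estimate("L", 2.0))   # already inside the bounds
```

An estimate that ends up on a bound is worth a second look, in line with the advice to take the optimized values with a grain of salt.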