EngDBDesign_2011_3_15

The Engineering Database
Assumptions and Considerations About the Data
The database does not contain ITAR data
The content of the database has no proprietary information
Expected data volume: 1 GByte/day (figure from FOS)
Different parameters have different types: integer, floating point (single or
double precision?), string
Different parameters have different cadences
Same parameter may have different cadences
Required time stamp precision
o For SDP: seconds (requests are expected to add a small padding)
o For Calibration: the highest sampling rate, so that 2 consecutive samples
can be distinguished (since the highest rate is 64 ms, I assume this is the
required precision)
Number of different parameters in the database ~ 20,000
Typical time_range = 100 s … 1000 s (JV, PG), similar to the typical exposure
time. The range may be longer when combining exposures, or if data is
requested per visit or per observation; other possible values are up to 900,
7,200, or 10,000 s (MS, DS). Visit time limit = 50 hours = 180,000 s
Exception: If thermal history is important, calibration pipelines may need
data streaming (Probably no need to worry until it happens) (PG)
For a given parameter and range, how many values are expected in the series?
Depends on cadence
o High-rate = 64 ms (about 15.6 samples/s, 112,500 samples in 7,200 s) (MS, DS)
or 4 samples/s (PG)
o Low-rate = ??
Out of the 20,000, what is the distribution of high-rate and low-rate parameters?
o High-rate applies only to a few
Out of the 20,000, what is the distribution by type (integer, string, float, double)?
Out of the 20,000, which parameters will usually be queried?
Out of the 20,000, which parameters are usually queried using aggregate
functions?
Size of the parameter list for simultaneous retrieval [p1, p2, ..., pn]:
n = 5 … 50 on average
Out of the 20,000, only a few hundred will be retrieved frequently (mainly by
the calibration software).
Some parameters will be retrieved a few times over their lifetime
Most parameters will NEVER be retrieved
It may be necessary to add modeled parameters based on long history to the
database to optimize access
Compression options may need to be considered
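The sample counts quoted above follow directly from the cadence. A minimal sketch of the arithmetic, assuming a fixed sampling period and no gaps in the telemetry (the function name and constants are hypothetical, introduced only for this example):

```python
def expected_samples(time_range_s: int, period_ms: int) -> int:
    """Number of whole samples in a time range for a fixed sampling period.

    Integer millisecond arithmetic avoids floating-point rounding surprises.
    """
    return (time_range_s * 1000) // period_ms


HIGH_RATE_PERIOD_MS = 64    # 64 ms high-rate cadence (MS, DS)
PG_PERIOD_MS = 250          # 4 samples/s alternative figure (PG)

# Time ranges taken from the notes: typical exposures, 7,200 s, visit limit.
for time_range_s in (100, 1000, 7200, 180_000):
    print(
        time_range_s,
        expected_samples(time_range_s, HIGH_RATE_PERIOD_MS),
        expected_samples(time_range_s, PG_PERIOD_MS),
    )
```

At the 64 ms cadence a 7,200 s range gives 112,500 samples per parameter, which matches the figure in the notes; the 50 hour visit limit gives 2,812,500.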
Requirements on the Service Interface
REST or similar
No SOAP
Internal (Science Processing & Calibration Pipeline) and External (External
Calibration, Archive User Interface) access
Provides authentication and access restriction to prevent abusive behavior
that would cause a “denial of service”
Optimized for Calibration pipelines
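As an illustration of what a REST-style request for a time range might look like, the sketch below builds a request URL. The host, path layout, and query parameter names are all assumptions made for this example; the requirements above only ask for REST or similar.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL; the real service location is not specified here.
BASE_URL = "https://engdb.example.org/api"


def values_url(parameter: str, t_start: str, t_end: str) -> str:
    """Build a GET URL for the values of one parameter in a time range.

    The resource path and the 'start'/'end' query names are assumed.
    """
    query = urlencode({"start": t_start, "end": t_end})
    # Percent-encode the parameter name so it is safe as a path segment.
    return f"{BASE_URL}/parameters/{quote(parameter, safe='')}/values?{query}"


print(values_url("TEMP_1", "2011-03-15T00:00:00", "2011-03-15T00:16:40"))
```

Keeping each request a plain GET on a resource path would let internal pipelines and external users share the same interface, with authentication and rate restriction applied in front of it.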
The 20 Queries
1. What are the engineering parameters, type, domain, and units?
GetParameters(*, type=1, domain=1, units=1)
2. What are the engineering parameters associated with a given instrument, type,
domain, and units?
GetParameters(instrument = ‘instrument’, type=1, domain=1, units = 1)
3. What are the engineering parameters with a name matching the expression
‘*blah*’, and their type, domain, and units?
GetParameters (regexpression = ‘*blah*’, type=1, domain=1, units=1)
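The three catalog queries above can be sketched against an in-memory catalog. The record fields, the made-up sample entries, and the use of shell-style wildcards (which is how the ‘*blah*’ pattern reads) are assumptions for illustration only:

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass(frozen=True)
class Parameter:
    name: str
    instrument: str
    type: str
    domain: str
    units: str


# Made-up catalog entries, only to exercise the function below.
CATALOG = [
    Parameter("INSTA_TEMP_1", "INSTA", "float", "-50..50", "C"),
    Parameter("INSTA_MODE", "INSTA", "string", "IDLE|EXPOSE", ""),
    Parameter("INSTB_TEMP_1", "INSTB", "float", "-50..50", "C"),
]


def get_parameters(
    catalog: List[Parameter],
    instrument: Optional[str] = None,
    pattern: str = "*",
) -> List[Parameter]:
    """Return catalog entries filtered by instrument and by name pattern.

    With the defaults this is query 1 (all parameters); passing an
    instrument gives query 2; passing a wildcard pattern gives query 3.
    """
    return [
        p for p in catalog
        if (instrument is None or p.instrument == instrument)
        and fnmatchcase(p.name, pattern)
    ]


for p in get_parameters(CATALOG, pattern="*TEMP*"):
    print(p.name, p.type, p.domain, p.units)
```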
4. Given a parameter, what are its values between time_start and time_end ordered
by time?
GetValues_Parameter (p, [ts, te])
5. Given a list of parameters, what are their values between time_start and time_end
ordered by time?
GetValues_ParameterList ([p1, p2, …pn], [ts, te])
Notes on results format:
R = [(p1, (t1, v1), (t2, v2), …(tz, vz)), (p2, (t1, v1), (t2, v2), …(tz, vz)) , …, (pn, (t1, v1),
(t2, v2), …(tz, vz))]
When parameters share the time stamp t, the result set could be requested to follow
a more compact format
R = [(t1, ((p1, v1), (p2, v1), …, (pn, v1))),
(t2, ((p1, v2), (p2, v2), …, (pn, v2))),
...,
(tz, ((p1, vz), (p2, vz), …, (pn, vz)))]
Another possible compact format is
R = [t_start, t_sample, ((p1, (v1, v2, …, vz)), (p2, (v1, v2, …, vz)), …,
(pn, (v1, v2, …, vz)))]
where each parameter has z = (te-ts)/t_sample values.
Note: in all cases p1, p2, …, pn could be omitted, in which case the values
follow the order of the input parameter list
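The three result layouts described above can be sketched in Python as follows. The function names, the in-memory store, the integer millisecond time stamps, and the half-open interval ts <= t < te (chosen so that z = (te-ts)/t_sample holds) are assumptions for illustration, not part of the design.

```python
from bisect import bisect_left


def get_values(series, ts, te):
    """Samples of one parameter with ts <= t < te; series is sorted by time."""
    times = [t for t, _ in series]
    return series[bisect_left(times, ts):bisect_left(times, te)]


def get_values_list(store, names, ts, te):
    """First layout: [(p1, [(t1, v1), ...]), (p2, [(t1, v1), ...]), ...]."""
    return [(name, get_values(store[name], ts, te)) for name in names]


def group_by_time(result):
    """Second layout: [(t1, [v of p1, v of p2, ...]), (t2, [...]), ...].

    Parameter names are omitted; values follow the input parameter order.
    Only valid when all parameters share the same time stamps.
    """
    if not result:
        return []
    columns = [samples for _, samples in result]
    time_axis = [t for t, _ in columns[0]]
    for samples in columns[1:]:
        if [t for t, _ in samples] != time_axis:
            raise ValueError("parameters do not share time stamps")
    return [(t, [samples[i][1] for samples in columns])
            for i, t in enumerate(time_axis)]


def pack_regular(result, t_sample):
    """Third layout: (t_start, t_sample, [[v1, ..., vz] per parameter]).

    Only valid when the shared time stamps are regularly spaced.
    """
    grouped = group_by_time(result)
    if not grouped:
        raise ValueError("no samples in range")
    t_start = grouped[0][0]
    for i, (t, _) in enumerate(grouped):
        if t != t_start + i * t_sample:
            raise ValueError("samples are not regularly spaced")
    return (t_start, t_sample,
            [[v for _, v in samples] for _, samples in result])


# Made-up data: two parameters sampled every 64 ms.
store = {
    "p1": [(0, 1.0), (64, 1.5), (128, 2.0), (192, 2.5)],
    "p2": [(0, 10), (64, 11), (128, 12), (192, 13)],
}
result = get_values_list(store, ["p1", "p2"], 64, 256)
print(result)
print(group_by_time(result))
print(pack_regular(result, 64))
```

The third layout carries each time stamp only implicitly, so it is the smallest of the three, but it is usable only for parameters with a single regular cadence over the requested range.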
6. Given a parameter, what are the avg, min, max, median, and range values
between time_start and time_end?
GetAvgValue_Parameter (p, [ts, te])
GetMedianValue_Parameter (p, [ts, te])
GetMinValue_Parameter (p, [ts, te])
GetMaxValue_Parameter (p, [ts, te])
GetRangeValue_Parameter (p, [ts, te])
7. Given a list of parameters, what are the avg, min, max, median, range values
between time_start and time_end?
GetAvgValue_ParameterList ([p1, p2, …pn], [ts, te])
GetMedianValue_ParameterList ([p1, p2, …pn], [ts, te])
GetMinValue_ParameterList ([p1, p2, …pn], [ts, te])
GetMaxValue_ParameterList ([p1, p2, …pn], [ts, te])
GetRangeValue_ParameterList ([p1, p2, …pn], [ts, te])
Note: Requests for aggregates could also be packed as
GetStatistics([Avg, Median, Min, Max, Range], [p1, p2, …, pn], [ts, te])
R = [(Avg(p1), Median(p1), Min(p1), Max(p1), Range(p1)),
(Avg(p2), Median(p2), Min(p2), Max(p2), Range(p2)),
…
(Avg(pn), Median(pn), Min(pn), Max(pn), Range(pn))]
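A packed aggregate request of this kind can be sketched as below. The in-memory store, the function name, and the half-open interval ts <= t < te are assumptions for illustration; each result row lists the aggregates in the order they were requested.

```python
from statistics import mean, median

# Aggregate names from the notes mapped to plain Python implementations.
AGGREGATES = {
    "Avg": mean,
    "Median": median,
    "Min": min,
    "Max": max,
    "Range": lambda values: max(values) - min(values),
}


def get_statistics(aggregates, store, names, ts, te):
    """Return one tuple of aggregate values per parameter, in input order."""
    rows = []
    for name in names:
        values = [v for t, v in store[name] if ts <= t < te]
        if not values:
            raise ValueError(f"no samples for {name} in the requested range")
        rows.append(tuple(AGGREGATES[a](values) for a in aggregates))
    return rows


# Made-up data for one parameter; the last sample falls outside the range.
store = {"p1": [(0, 1.0), (1, 3.0), (2, 2.0), (3, 10.0)]}
print(get_statistics(["Avg", "Median", "Min", "Max", "Range"],
                     store, ["p1"], 0, 3))
```

Computing all requested aggregates in one pass over the selected samples avoids re-reading the same time range once per aggregate, which is the main argument for the packed form.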