D1.3 Appendix 1.4b Water Data Model and Analytics

Confidential draft – Do not distribute
Integrating multiple analytic modules around the Operational Intelligence
Platform (OIP)
The case of water distribution
Claude Le Pape, Alfredo Samperio, Gratien Bonvin
Draft, June 2, 2014
The investigation of analytic problems related to water distribution as part of the Arrowhead
project and the examination of an exploratory use case presented by a customer of Schneider
Electric strongly suggest the need to integrate multiple “analytic” modules around a common
data basis.
In parallel, the ongoing development of the OI Platform (which aims to manage in a unified manner
the types of data often encountered in Schneider Electric, starting with time series, and to gather
analytic services) suggests that such a common data basis should be compatible or interfaced with
the OI Platform.
In this document, we propose a relational data model inspired by (i) the EPANET water
distribution standard, (ii) the OI platform, and (iii) other elements from the Arrowhead project
which we believe could be generalized and linked once and for all to the OI platform. We also
describe various analytic problems which we believe could be addressed from this basis, using
specific software already available within Schneider Electric (e.g., hydraulic simulation and
optimization), more generic software under development within Schneider Electric (e.g., for
demand prediction), or software that could be available from external partners (e.g., from
Artelys for planning and demand-response management). Part of the interest of facilitating the
integration of analytic components around a common basis consists in enabling an easier
evaluation of external components, in comparison or in complement to our current offer.
Let us note that we focus here on offline analytics, i.e., analytics aimed at planning actions in
advance. These shall be complemented with real-time control analytics (e.g., from
SpecificEnergy for real-time pump selection), which we will not consider in the present
document.
In its current version, this document is intended as a draft aimed at triggering
discussion, in order to decide how to go further. Comments and suggestions for improvement
are most welcome.
1. Proposed Data Model
We describe a proposed data model in relational form. As much as possible, we have tried to
stick to the concepts of the EPANET model. We have allowed ourselves to deviate from EPANET
whenever there was a clear advantage in doing so, either to better represent elements of the
exploratory use case alluded to above, or to adopt concepts used in the OI Platform and the
Arrowhead project.[1]
1.1. Network Nodes: Junctions and Tanks
A network in EPANET is described in terms of Nodes and Links.
We focus on two types of Nodes, i.e., Junctions and Tanks.
For both types, the coordinates of the node are generally useful, for analytic calculations and/or
to enable some geometrical display of the network. The coordinates relate to an arbitrary origin
and are expressed in meters. At this point, to keep things simple, we ignore the impact of the
height of the water in a tank and use the altitude of the tank as the only relevant height
parameter.
In both cases, we have allowed a DemandPattern to be attached to the node.[2] The
DemandPattern describes either a history or a prediction of how much water leaves the
node over time, in general to serve end customers. Predicting the DemandPattern of a given
node is one of the main analytic functions we consider.
For optimization concerns, it might also be useful to associate to a source node a function
describing the cost of a cubic meter of water at this node. This will be done through the
association of a WaterTariff object to the node. When no water can be produced at a given
node (or injected from a non-described node) the WaterTariff attribute is null.
[1] As we are not specialists of EPANET, we may have missed important elements or opportunities to remain closer to
the EPANET model, while allowing easy integration with the OI Platform. For this first version, we have also tried to
keep things simple, making simplifying assumptions which could be criticized. As already mentioned, the contents
of this document are proposed as a starting point, open for debate.
[2] In our understanding, EPANET enables the specification of demands only for junctions and not for tanks. Strictly
speaking this would be sufficient, as a “tank” might simply be linked to a “junction”. However, we feel it could be
appropriate to simplify the network description and allow a tank to be considered as a consuming node of the
network.
Unlike junctions, tanks are places in which water can be stored. The minimal data needed
to manage this storage include the maximal volume of water storable in the tank, the minimal
volume that shall remain in the tank at all times (by default, 0), and the current (initial) volume.
In the relational model we propose, the JUNCTION table includes the following columns:
• A JUNCTION_ID which unambiguously identifies the junction.
• The X_COORDINATE of the junction.
• The Y_COORDINATE of the junction.
• The Z_COORDINATE (altitude) of the junction.
• An optional DEMAND_PATTERN_ID identifying a demand pattern for the junction.
  When no demand pattern is provided (DEMAND_PATTERN_ID = null), it is assumed
  that the junction is merely an intermediate node in the network, from which water is
  forwarded to other nodes.
• An optional WATER_TARIFF_ID identifying a cost function for the water at the
  junction if the junction is a source.
The TANK table includes the following columns:
• A TANK_ID which unambiguously identifies the tank.
• The X_COORDINATE of the tank.
• The Y_COORDINATE of the tank.
• The Z_COORDINATE (altitude) of the tank.
• The minimal volume VOLUME_MIN to be kept at all times in the tank.
• The maximal volume VOLUME_MAX that can be kept in the tank.
• The INITIAL_VOLUME at the beginning of the overall time period under
  consideration.
• An optional DEMAND_PATTERN_ID identifying a demand pattern for the tank.
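As an illustration, the JUNCTION and TANK tables described above could be declared as follows. This is only a minimal sketch using Python's built-in sqlite3 module: the column names come from the lists above, whereas the SQL types, the NOT NULL choices, the default of 0 for VOLUME_MIN as a column default, and the CHECK constraint are assumptions made for the example.

```python
import sqlite3

# Column names follow the JUNCTION and TANK descriptions in the text.
# SQL types, NOT NULL choices and the CHECK constraint are assumptions.
SCHEMA = """
CREATE TABLE JUNCTION (
    JUNCTION_ID       TEXT PRIMARY KEY,
    X_COORDINATE      REAL NOT NULL,  -- meters, arbitrary origin
    Y_COORDINATE      REAL NOT NULL,  -- meters, arbitrary origin
    Z_COORDINATE      REAL NOT NULL,  -- altitude, meters
    DEMAND_PATTERN_ID TEXT,           -- NULL: intermediate node
    WATER_TARIFF_ID   TEXT            -- NULL: the junction is not a source
);
CREATE TABLE TANK (
    TANK_ID           TEXT PRIMARY KEY,
    X_COORDINATE      REAL NOT NULL,
    Y_COORDINATE      REAL NOT NULL,
    Z_COORDINATE      REAL NOT NULL,
    VOLUME_MIN        REAL NOT NULL DEFAULT 0,
    VOLUME_MAX        REAL NOT NULL,
    INITIAL_VOLUME    REAL NOT NULL,
    DEMAND_PATTERN_ID TEXT,
    CHECK (VOLUME_MIN <= INITIAL_VOLUME AND INITIAL_VOLUME <= VOLUME_MAX)
);
"""


def create_node_tables(conn):
    """Create the two node tables in the given SQLite connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_node_tables(conn)
# A tank declared without VOLUME_MIN falls back to the default of 0.
conn.execute(
    "INSERT INTO TANK (TANK_ID, X_COORDINATE, Y_COORDINATE, Z_COORDINATE,"
    " VOLUME_MAX, INITIAL_VOLUME) VALUES ('T1', 0, 0, 120, 500, 250)"
)
```

The CHECK constraint simply rejects a tank whose initial volume lies outside its declared bounds; whether such validation belongs in the database or in the analytic modules is left open here.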
1.2. Network Links: Pipes, Pumps, Pumping Stations, Valves
Nodes are connected by several types of Links. At this stage, we assume the network is
oriented; hence each Link has a START_NODE and an END_NODE.
A Pipe is a passive Link between its START_NODE and its END_NODE. Energy losses occur in
a Pipe depending on various parameters including its LENGTH and DIAMETER, as well as on
the MATERIAL constituting the Pipe.
The PIPE table contains the following columns:
• A PIPE_ID which unambiguously identifies the pipe.
• A START_NODE_ID where multiple links might join before the pipe.
• An END_NODE_ID from where multiple links might branch after the pipe.
• A MATERIAL_ID identifying the material constituting the pipe.
• The LENGTH of the pipe.
• The DIAMETER of the pipe.[3]
A Pump is an active Link in which a motor converts electrical power into
mechanical power, which is used to pump water from the START_NODE to the
END_NODE. The most important characteristics of a Pump describe the relations between the
electrical power, the mechanical power, and the flow. In addition, a Pump can be controlled by
a VARIABLE_SPEED drive.[4]
The PUMP table contains the following columns:
• A PUMP_ID which unambiguously identifies the pump.
• A START_NODE_ID where multiple links might join before the pump.
• An END_NODE_ID from where multiple links might branch after the pump.
• A Boolean VARIABLE_SPEED indicating whether the pump is controllable or not.
• Two limits FLOW_MIN and FLOW_MAX providing the minimal and maximal flow
  recommended by the pump manufacturer to maintain the health of the pump.
• The minimal power POWER_MIN of the pump and the POWER_SLOPE describing the
  dependency between the mechanical power used by the pump and the water flow
  enabled by it (when the pump is not controlled by a drive). In practice, these can be
  determined from the minimal and maximal flows recommended by the pump
  manufacturer and two curves: the FLOW_TO_HEAD curve providing the relation
  between the flow enabled by the pump and the pressure (expressed in meters) and the
  FLOW_TO_EFFICIENCY curve characteristic of the pump.
• A MOTOR_EFFICIENCY factor between 0.0 and 1.0.
[3] Aging models might be associated with pipes, suggesting the use of additional characteristics such as
INSTALLATION_TIME. At this point, it is however still unclear to us which data could be effectively useful (e.g.,
characteristics of the material). Aging models will have to be defined in a second version of this document.
[4] Aging models might be associated with pumps, suggesting the use of additional characteristics such as
INSTALLATION_TIME. At this point, it is however still unclear to us which data could be effectively useful (age
of the pump, age of the engine, etc.). Aging models will have to be defined in a second version of this document.
A PumpingStation consists of one or several pumps in parallel, i.e., with the same
START_NODE and END_NODE. The characteristics of a PumpingStation can be inferred
from the characteristics of the individual pumps. Hence, the PUMPING_STATION table is
optional. When it is provided, it includes:
• A PUMPING_STATION_ID which unambiguously identifies the pumping station.
• A START_NODE_ID where multiple links might join before the pumping station.
• An END_NODE_ID from where multiple links might branch after the pumping station.
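Since a pumping station is defined as a set of pumps sharing the same start and end nodes, the optional PUMPING_STATION rows could be derived from the PUMP rows by a simple grouping. The sketch below assumes pumps are available as dictionaries keyed by the column names above; the generated identifiers ("PS1", "PS2", ...), the PUMP_IDS field, and the aggregated FLOW_MAX (taken as the sum of the individual limits, since parallel flows add up) are hypothetical additions for the example, not columns of the proposed model.

```python
from collections import defaultdict


def infer_pumping_stations(pumps):
    """Group pumps in parallel, i.e., with identical start and end nodes."""
    groups = defaultdict(list)
    for pump in pumps:
        key = (pump["START_NODE_ID"], pump["END_NODE_ID"])
        groups[key].append(pump)

    stations = []
    ordered = sorted(groups.items(), key=lambda item: item[0])
    for index, ((start, end), members) in enumerate(ordered, start=1):
        stations.append({
            "PUMPING_STATION_ID": f"PS{index}",  # hypothetical naming scheme
            "START_NODE_ID": start,
            "END_NODE_ID": end,
            # Hypothetical helper fields, not part of the proposed table:
            "PUMP_IDS": [p["PUMP_ID"] for p in members],
            "FLOW_MAX": sum(p["FLOW_MAX"] for p in members),
        })
    return stations


pumps = [
    {"PUMP_ID": "P1", "START_NODE_ID": "N1", "END_NODE_ID": "N2", "FLOW_MAX": 0.5},
    {"PUMP_ID": "P2", "START_NODE_ID": "N1", "END_NODE_ID": "N2", "FLOW_MAX": 0.25},
    {"PUMP_ID": "P3", "START_NODE_ID": "N2", "END_NODE_ID": "N3", "FLOW_MAX": 0.5},
]
stations = infer_pumping_stations(pumps)
```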
A Valve is a Link in which water flow can be limited. At this stage, we associate no specific
parameter with a Valve and assume the Valve can be used to set any upper limit on the
flow.
The VALVE table contains the following columns:
• A VALVE_ID which unambiguously identifies the valve.
• A START_NODE_ID where multiple links might join before the valve.
• An END_NODE_ID from where multiple links might branch after the valve.[5]
1.3. Materials and ageing models
At this stage, the MATERIAL table contains the following columns:
• A MATERIAL_ID which unambiguously identifies the material.
• Its DARCY_FRICTION_FACTOR used in classical models for estimating energy losses
  in a pipe.
According to the Darcy–Weisbach equation, the pressure loss in a Pipe can be written as
follows:
$$\Delta p = f_D \cdot \frac{L}{D} \cdot \frac{\rho V^2}{2}$$
Where:
• $L$ is the LENGTH of the Pipe.
• $D$ is the DIAMETER of the Pipe.
• $V$ is the velocity of the water flow in the Pipe (in m/s), which can also be written as $Q/S$,
  where $Q$ is the flow of water in the pipe (in m³/s) and $S = \pi (D/2)^2$ is the section of the pipe
  (in m²).
• $\rho$ is the density of the water in kg/m³, hence 1000.
• $f_D$ is the dimensionless DARCY_FRICTION_FACTOR. In reality, this factor depends on
  the relative roughness of the pipe and on the speed of water in the pipe. In first
  approximation, however, this can be supposed constant and associated to the pipe
  material.
[5] In practice, there are multiple types of valves, depending on whether:
(1) the opening is set to a specific value, then flow and pressure drop follow hydraulic equations;
(2) the valve is controlled to maintain a given pressure drop;
(3) the valve is controlled in order to maintain a given flow.
In this version, we assume that the flow is controllable. We might introduce different types of valves in the future.
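The Darcy–Weisbach pressure loss described in this section can be evaluated directly from the PIPE and MATERIAL columns. The following is a minimal sketch, assuming SI units throughout (flow in m³/s, LENGTH and DIAMETER in meters, result in pascals); the function and parameter names are chosen for the example only.

```python
import math

WATER_DENSITY = 1000.0  # kg/m^3, as stated in the text


def pipe_pressure_loss(flow, length, diameter, friction_factor,
                       density=WATER_DENSITY):
    """Darcy-Weisbach pressure loss (Pa) for a flow given in m^3/s."""
    section = math.pi * (diameter / 2.0) ** 2  # S, in m^2
    velocity = flow / section                  # V = Q / S, in m/s
    return friction_factor * (length / diameter) * density * velocity ** 2 / 2.0


# Example: a 1000 m pipe of diameter 0.5 m carrying water at 1 m/s.
example_flow = math.pi * 0.25 ** 2  # flow giving V = 1 m/s for D = 0.5 m
example_loss = pipe_pressure_loss(example_flow, length=1000.0, diameter=0.5,
                                  friction_factor=0.02)
```

Because the loss grows with the square of the velocity, doubling the flow in a given pipe multiplies the loss by four, which is why the friction term matters most on the heavily loaded branches.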
We expect that Materials will also be used to describe ageing models. At this point, this is
still to be explored.
1.4. Demand Patterns
A DemandPattern is a time series defining an expected output flow (to final customers or to
another non-represented portion of the network) from a given node. When appropriate, the
flow can be defined to be periodic over a given time period and renewed from one year to the
next, possibly scaled by a given ANNUAL_RENEWAL_FACTOR.
In the relational model we propose, the WaterDemands table can be used to specify water
demand patterns. It includes the following columns:
• A WATER_DEMAND_TIME_SERIES_ID which unambiguously identifies the time
  series.
• A START_TIME.
• An END_TIME.
• The FLOW between the given START_TIME and the given END_TIME.
• An optional PERIODICITY (e.g., “NONE”, “DAY”, “WEEKDAY”, “WEEKEND”) indicating
  that the given demand element repeats itself periodically. When this column is not used,
  it is assumed that there is no periodic repetition of the demand.
• An optional PERIOD_START_TIME and an optional PERIOD_END_TIME limiting the
  extent over which the periodical repetition applies.
• An optional ANNUAL_RENEWAL Boolean (0 or 1) indicating whether the given demand
  element repeats itself every year. When this column is not used, it is assumed that there
  is no annual repetition.
• An optional ANNUAL_RENEWAL_FACTOR indicating that the given demand element
  repeats itself every year, multiplied by the given factor.
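To make the intended use of these columns more concrete, the sketch below looks up the flow that applies at a given instant. It is only one possible reading of the table: it assumes that rows are dictionaries holding Python datetime values, that intervals are half-open, that rows do not overlap, that a “DAY” element repeats the time-of-day window of its START_TIME and END_TIME (not crossing midnight) on every day of the period, and that a missing match means no demand. “WEEKDAY”, “WEEKEND” and the annual renewal columns are deliberately not handled.

```python
from datetime import datetime


def demand_flow(rows, when):
    """Return the FLOW applicable at `when`, or 0.0 if no row matches.

    Assumed semantics (not fixed by the text): half-open intervals,
    non-overlapping rows, daily windows that do not cross midnight.
    """
    for row in rows:
        periodicity = row.get("PERIODICITY") or "NONE"
        if periodicity == "NONE":
            if row["START_TIME"] <= when < row["END_TIME"]:
                return row["FLOW"]
        elif periodicity == "DAY":
            # Optional limits on the extent of the periodic repetition.
            first = row.get("PERIOD_START_TIME") or row["START_TIME"]
            last = row.get("PERIOD_END_TIME")
            if when < first or (last is not None and when >= last):
                continue
            if row["START_TIME"].time() <= when.time() < row["END_TIME"].time():
                return row["FLOW"]
        # Other periodicities are ignored in this sketch.
    return 0.0


rows = [
    {"START_TIME": datetime(2014, 6, 2, 6, 0),
     "END_TIME": datetime(2014, 6, 2, 9, 0),
     "FLOW": 0.02, "PERIODICITY": "DAY",
     "PERIOD_START_TIME": datetime(2014, 6, 2),
     "PERIOD_END_TIME": datetime(2014, 7, 1)},
    {"START_TIME": datetime(2014, 6, 2, 12, 0),
     "END_TIME": datetime(2014, 6, 2, 14, 0),
     "FLOW": 0.05},
]
morning_flow = demand_flow(rows, datetime(2014, 6, 10, 7, 30))
```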
1.5. Tariffs
Tariff descriptions can be used both for water costs and electricity costs. A Tariff is
described as a time series of curves, enabling the cost to vary with the flow of water or the
electrical power that is used. The TARIFFS table includes the following columns:[6]
• A TARIFF_TIME_SERIES_ID which unambiguously identifies the time series.
• A START_TIME.
• An END_TIME.
• Six columns describing the curve that applies from the given START_TIME to the given
  END_TIME: CAPACITY_MIN, CAPACITY_MAX, COST_MIN, COST_MAX, FIXED_COST,
  and VARIABLE_COST.
    o When the power or flow equals CAPACITY_MIN, the cost for being at this
      power or flow level for one unit of time is COST_MIN.
    o As soon as CAPACITY_MIN is exceeded, i.e., the power or flow becomes
      > CAPACITY_MIN, a penalty corresponding to the given FIXED_COST is paid.
      FIXED_COST is often equal to 0. The corresponding column is optional.
    o Between CAPACITY_MIN and CAPACITY_MAX, the cost grows from
      (COST_MIN + FIXED_COST) to COST_MAX as a quadratic function of the
      power or flow with the given VARIABLE_COST as initial slope. In usual cases,
      COST_MAX – (COST_MIN + FIXED_COST) = VARIABLE_COST *
      (CAPACITY_MAX – CAPACITY_MIN) and the cost grows linearly with the
      capacity.
    o When the power or flow equals CAPACITY_MAX, the cost for being at this
      power or flow level for one unit of time is COST_MAX.
• An optional PERIODICITY (e.g., “NONE”, “DAY”, “WEEKDAY”, “WEEKEND”) indicating
  that the given tariff element repeats itself periodically. When this column is not used, it
  is assumed that there is no periodic repetition of the tariff.
• An optional PERIOD_START_TIME and an optional PERIOD_END_TIME limiting the
  extent over which the periodical repetition applies.
• An optional ANNUAL_RENEWAL Boolean (0 or 1) indicating whether the given tariff
  element repeats itself every year. When this column is not used, it is assumed that there
  is no annual repetition.
• An optional ANNUAL_RENEWAL_FACTOR indicating that the given tariff element
  repeats itself every year, with all costs multiplied by the given factor.
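The cost curve defined by the six columns can be written as a single function of the power or flow level. The sketch below is one way to do so, under the reading that the quadratic starts at (COST_MIN + FIXED_COST) with slope VARIABLE_COST and reaches COST_MAX at CAPACITY_MAX, which fixes its curvature. The function name, the argument order and the choice to raise an error outside the capacity range are assumptions of the example.

```python
def tariff_cost(level, capacity_min, capacity_max, cost_min, cost_max,
                variable_cost, fixed_cost=0.0):
    """Cost per unit of time for being at the given power or flow level."""
    if not capacity_min <= level <= capacity_max:
        raise ValueError("level outside [CAPACITY_MIN, CAPACITY_MAX]")
    if level == capacity_min:
        return cost_min  # the fixed cost is only paid once the minimum is exceeded
    span = capacity_max - capacity_min
    base = cost_min + fixed_cost
    # Curvature chosen so that the curve reaches COST_MAX at CAPACITY_MAX;
    # it is 0 in the usual linear case.
    curvature = (cost_max - base - variable_cost * span) / span ** 2
    delta = level - capacity_min
    return base + variable_cost * delta + curvature * delta ** 2


# Usual linear case: COST_MAX - COST_MIN = VARIABLE_COST * (CAPACITY_MAX - CAPACITY_MIN).
linear_cost = tariff_cost(40.0, 0.0, 100.0, cost_min=0.0, cost_max=50.0,
                          variable_cost=0.5)
# Convex case: same initial slope, but a higher COST_MAX.
convex_cost = tariff_cost(50.0, 0.0, 100.0, cost_min=0.0, cost_max=100.0,
                          variable_cost=0.5)
```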
[6] Let us note that an additional table might be necessary if we want to incorporate a choice of contract (and, in
particular, of contracted power) in the optimization problem.
2. Analytic Modules
This section presents three analytical components considered at this point.
2.1. Demand Prediction
The demand prediction component aims at extending a given water demand pattern into the
future. A prediction model linking demand with other variables (e.g., weather conditions) is first
learned. Then the model is used to extend a given demand pattern over a given period of time.
A more precise specification of such an analytic component will be provided in another
document in preparation.
2.2. Pumping Plan Optimization / Planning for Demand Response
Multiple options for the optimization of pumping plans and demand response could be
considered. In this section, we attempt to describe an approximate “simple” model which
would make sense in the exploratory use case we are aware of. An open question is whether
the approximations we make are reasonable. In particular, we ignore all transient factors and
proceed as if a steady-state approach could be used over a given number of individual time periods.
Given are $H$ time periods $\mathrm{PERIOD}_1, \mathrm{PERIOD}_2, \dots, \mathrm{PERIOD}_H$
• With start time $st_t$ and end time $et_t$ ($1 \le t \le H$).
• With electricity cost (tariff) over the period. To ease the following description, we will in
  this section restrict ourselves to linear tariffs and assume that for each period $t$, a cost
  per kWh $c_t$ is given.
Given are $N$ water towers (tanks) $\mathrm{TOWER}_1, \mathrm{TOWER}_2, \dots, \mathrm{TOWER}_N$
• With minimal and maximal volumes $vmin_i$ and $vmax_i$ ($1 \le i \le N$)
    o The minimum is supposed to be given. However, it would be interesting to study
      how the energy cost and the non-delivery risks vary with this minimum.
• With a (predicted) water consumption profile $PF_1, PF_2, \dots, PF_N$
    o $PF_i$ is a deterministic function specifying a consumption $c_{i,t}$ for all $t$ in $\{1, \dots, H\}$
    o Later we may want to play with a probabilistic function and introduce a notion of
      robustness of the plan with respect to variability of the demand. We ignore such
      a potential extension for the moment.
Before each water tower $\mathrm{TOWER}_i$ there is a valve $\mathrm{VALVE}_i$, which can limit the flow, and a pipe
$\mathrm{PIPE}_i$. The goal is to define for each period $t$ in $\{1, \dots, H\}$ the flow $F_{i,t}$ between the pumping station
and the water tower $\mathrm{TOWER}_i$ in a way that guarantees that the demand will be satisfied (in the
deterministic version) and that minimizes cost.
The volume $V_{i,t}$ in the water tower $\mathrm{TOWER}_i$ at the end of period $\mathrm{PERIOD}_t$ is obtained as follows:
• $V_{i,t} = V_{i,t-1} + F_{i,t} \cdot (et_t - st_t) - c_{i,t}$
• We impose $vmin_i \le V_{i,t} \le vmax_i$
• $V_{i,0}$ is the initial volume at the beginning of the first period. This value is given.
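The volume balance and its bounds can be checked for a candidate plan with a few lines of code. The sketch below handles a single water tower; it assumes that flows, period durations and consumptions use consistent units (for instance m³/s, seconds and m³), and the function name and the choice to raise an error on a bound violation are assumptions of the example.

```python
def tank_volumes(initial_volume, flows, consumptions, durations,
                 volume_min, volume_max):
    """Apply V_t = V_{t-1} + F_t * (et_t - st_t) - c_t over all periods.

    Returns the list [V_1, ..., V_H]; raises ValueError as soon as a
    volume leaves the interval [volume_min, volume_max].
    """
    volumes = []
    volume = initial_volume
    periods = zip(flows, consumptions, durations)
    for t, (flow, consumption, duration) in enumerate(periods, start=1):
        volume = volume + flow * duration - consumption
        if not volume_min <= volume <= volume_max:
            raise ValueError(f"volume bound violated at end of period {t}: {volume}")
        volumes.append(volume)
    return volumes


# Two periods of 100 time units: fill during the first one, draw during the second.
example_volumes = tank_volumes(initial_volume=100.0, flows=[0.5, 0.0],
                               consumptions=[20.0, 30.0], durations=[100.0, 100.0],
                               volume_min=0.0, volume_max=200.0)
```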
The discharge pressure $PR_t$ at the end node of the pumping station that is needed during period
$t$ depends on the flows $F_{i,t}$ as follows:
• $PR_t \ge \mathrm{FORMULA}(F_{i,t})$
• We want to vary the formula, using more or less precise models with influence on three
  factors: (i) the amount of data needed, and hence the cost of the solution
  implementation; (ii) the computation time; (iii) the precision of the results. The key point
  is that if approximate models lead to pumping schedules which are close to the pumping
  schedules that would be obtained with more precise models, then the approximate
  models are acceptable.
• Several elements shall be considered.
    o Precise physical models are likely to need a lot of data on pipe characteristics:
      can we avoid this need?
    o Can dynamics be ignored without degrading the approximation too much?
    o Theoretically, the needed pressure also depends on the altitudes of the water
      towers for which the valve is open: can we ignore this?
    o An interesting option would consist in building a data-driven model (e.g., we
      build from past data a table enabling us to approximate the actual function) rather
      than using a physical model.
    o If the pressure is never much higher than the minimal hydrostatic pressure
      needed, an option might be to proceed as if the pressure were constant or a simple
      linear or piecewise linear function of the total flow.
When the pumping station is directly linked to each water tower, one specific model we may
use is the following:
$$PR_t \ge \rho \, g \, h_i + dff_i \cdot \frac{L_i}{D_i} \cdot \frac{\rho \, V_{i,t}^2}{2}$$
for each $i$ where
• $\rho$ is the density of the water in kg/m³, hence 1000.
• $g$ is the gravitational acceleration (9.81 m/s²).
• $h_i$ is the difference of altitude between the water tower $\mathrm{TOWER}_i$ and the pumping
  station.
• $dff_i$ is the DARCY_FRICTION_FACTOR of the material of the pipe $\mathrm{PIPE}_i$.
• $L_i$ is the LENGTH of $\mathrm{PIPE}_i$.
• $D_i$ is the DIAMETER of $\mathrm{PIPE}_i$.
• $V_{i,t}$ is here the velocity of the water flow in the pipe (not the tank volume), i.e.,
  $V_{i,t} = F_{i,t} / (\pi (D_i/2)^2)$.
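For the directly linked case, the smallest discharge pressure satisfying the inequality for every tower is the maximum of the right-hand sides. The sketch below computes it for one period, applying the inequality to each tower as written (including towers receiving no flow, for which only the hydrostatic term remains). Towers are assumed to be dictionaries whose keys HEIGHT, LENGTH, DIAMETER and DARCY_FRICTION_FACTOR are names chosen for the example, with SI units throughout.

```python
import math

RHO = 1000.0  # density of water, kg/m^3
G = 9.81      # gravitational acceleration, m/s^2


def required_discharge_pressure(towers, flows):
    """Smallest PR_t (Pa) satisfying the inequality for every tower i."""
    required = 0.0
    for tower, flow in zip(towers, flows):
        section = math.pi * (tower["DIAMETER"] / 2.0) ** 2
        velocity = flow / section
        static = RHO * G * tower["HEIGHT"]  # hydrostatic term
        friction = (tower["DARCY_FRICTION_FACTOR"]
                    * (tower["LENGTH"] / tower["DIAMETER"])
                    * RHO * velocity ** 2 / 2.0)
        required = max(required, static + friction)
    return required


towers = [
    {"HEIGHT": 30.0, "LENGTH": 1000.0, "DIAMETER": 0.5, "DARCY_FRICTION_FACTOR": 0.02},
    {"HEIGHT": 40.0, "LENGTH": 500.0, "DIAMETER": 0.5, "DARCY_FRICTION_FACTOR": 0.02},
]
unit_velocity_flow = math.pi * 0.25 ** 2  # flow giving 1 m/s in a 0.5 m pipe
pressure = required_discharge_pressure(towers, [unit_velocity_flow, 0.0])
```

In this example the higher tower dominates even though it receives no flow, which illustrates the open question raised above about ignoring the altitudes of towers whose valve is closed.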
When there are intermediate pipes and junctions, the same formula has to be used iteratively
from the tanks to compute the discharge pressure at each junction. At each junction, the
application of the inequality for each outgoing pipe guarantees that the most constraining
branch is taken into account.
If the pumps had no loss, the power $\mathrm{POWER}_t$ needed over time period $t$ would be $PR_t \cdot \sum_i F_{i,t}$.[7]
Taking into account the efficiency of pumps brings an additional difficulty. In practice, each
pump $\mathrm{PUMP}_j$ contributes a flow $Q_{j,t}$, with $\sum_i F_{i,t} = \sum_j Q_{j,t}$. When there is no drive, the
mechanical power deployed by each pump $\mathrm{PUMP}_j$ is roughly of the form:
$$\mathrm{POWER\_MIN}_j + \mathrm{POWER\_SLOPE}_j \cdot Q_{j,t}$$
Taking into account the efficiency of the motor leads to:
$$\mathrm{POWER}_t = \sum_j \frac{1}{\mathrm{MOTOR\_EFFICIENCY}_j} \cdot \left(\mathrm{POWER\_MIN}_j + \mathrm{POWER\_SLOPE}_j \cdot Q_{j,t}\right)$$
The total energy cost to minimize is equal to $\sum_t \mathrm{POWER}_t \cdot (et_t - st_t) \cdot c_t$.
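The objective can be evaluated for a given pumping plan as sketched below. The example assumes that powers are expressed in kW, period durations in hours and unit costs per kWh, so that the product is a monetary amount; it also assumes that a pump with zero flow is stopped and draws no power, which the formula itself does not state. The function name and the layout of the inputs are choices made for the example.

```python
def pumping_energy_cost(pump_flows, pumps, durations, unit_costs):
    """Sum over periods of POWER_t * (et_t - st_t) * c_t.

    pump_flows[t][j] is the flow Q_{j,t} of pump j during period t.
    """
    total = 0.0
    for flows_t, duration, unit_cost in zip(pump_flows, durations, unit_costs):
        power = 0.0
        for pump, flow in zip(pumps, flows_t):
            if flow <= 0.0:
                continue  # assumption: a stopped pump draws no power
            mechanical = pump["POWER_MIN"] + pump["POWER_SLOPE"] * flow
            power += mechanical / pump["MOTOR_EFFICIENCY"]
        total += power * duration * unit_cost
    return total


pumps = [{"POWER_MIN": 10.0, "POWER_SLOPE": 100.0, "MOTOR_EFFICIENCY": 0.8}]
# One pump running during the first hour only, with a cheaper first period.
plan_cost = pumping_energy_cost(pump_flows=[[0.1], [0.0]], pumps=pumps,
                                durations=[1.0, 1.0], unit_costs=[0.1, 0.2])
```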
Once a pumping plan is obtained, studying the opportunities of demand-response could be
done in multiple ways, e.g., by varying the electricity tariff or using the framework previously
developed by Schneider Electric and Artelys.
2.3. Network Simulation
At this point, we do hope (but this needs to be checked) that a network description in the
proposed relational model can be used as an input to perform simulations using the hydraulic
tools available in Schneider Electric. This would enable us to link these tools with the OI
Platform and hence with other analytic tools developed on top of the platform (e.g., demand
prediction).
A more precise specification of such a link needs to be written in the future.
[7] With some constraints on the possible values of $\mathrm{POWER}_t$ depending on the characteristics of
the pumps, use of drives, etc. In particular, in the absence of drive, $\mathrm{POWER}_t$ would take its
values in a discrete set $\{p_0 = 0, p_1, p_2, \dots\}$.