Distributed Power System State Estimation
Raffi Sevlian¹ and Umnouy Ponsukcharoen²
1. Department of Electrical Engineering, Stanford University, Stanford, CA
2. Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA
{raffisev, umnouyp}@stanford.edu
June 17, 2012
Abstract

The need for real-time power system state estimation to be performed on a large number of smart grid components makes the use of distributed algorithms very attractive. This report explores the current state of the art in distributed power system state estimation algorithms and evaluates their performance on multiple IEEE benchmark transmission and distribution systems as well as randomly generated graphs. Three distributed estimation algorithms are evaluated in terms of convergence, communication cost, computational cost, and robustness to communication errors on the IEEE 14 bus network. A convergence analysis is also performed for one algorithm on general graphs.
1 Introduction
Power system state estimation (PSSE) refers to obtaining the voltage phasors of all system buses at a given moment. This is generally performed by making many redundant observations of the power flows through the network, then performing inference to determine the underlying phasor values. In the early days, PSSE was performed in a centralized data processing center which aggregated all the observations and computed a global solution for the entire network [8, 7]. However, there is a move towards expanded system sensing capabilities as well as higher rates of estimation, making decentralized estimation very attractive. Also, the new grid structure is open to two-way electricity flow and distributed generation down to the distribution level. Hence, power system state estimation needs to be performed at a large scale, where centralized operation might not be as effective. Distributed power system state estimation algorithms allow system operators to deal with large scale problems by dividing the measurements and buses into control areas. Each control area collects its measurements, performs its own state estimation, and exchanges information with other control areas.
Recently proposed methods for solving for state estimates come from the sensor networking literature [9, 4]. This report provides an experimental study of the use of these two techniques. It also evaluates the performance of recent work in distributed subgradient based optimization [1]. The study is divided into two parts. In the first part, three algorithms are implemented and their performance compared with a centralized estimator. Comparisons are made on four performance indicators of interest: (1) convergence behavior, (2) computational complexity, (3) communication cost, and (4) local observability requirements. In the second part, we analyze the performance of these techniques on different tree networks and present an empirical model for one of the algorithms.
2 Power System State Estimation
The centralized state estimation problem is the following: given an underlying state of the network X = {(V_0), (θ_1, V_1), . . . , (θ_N, V_N)}, which represents the voltage magnitude and angle at every bus except the reference, the state estimator uses a set of measurements of the form z_k = h_k(X) + ε_k to construct its state estimate x̂. Assuming normally distributed errors ε ∼ N(0, R) (R_ii = σ_i²) on each measurement, the general estimator is

x̂ = argmin_X Σ_{k=1}^{M} ((z_k − h_k(X)) / σ_k)²    (1)
For the sake of simplicity, we only consider power measurements for the entire system. In the fully nonlinear AC power flow, a state estimator receives the following sets of measurements. Real and reactive power injection at bus i:

P_i = V_i Σ_{j∈N(i)} V_j (G_ij cos(θ_ij) + B_ij sin(θ_ij))    (2)

Q_i = V_i Σ_{j∈N(i)} V_j (G_ij sin(θ_ij) − B_ij cos(θ_ij))    (3)

Real and reactive branch flow from bus i to bus j:

P_ij = V_i² g_ij + V_i V_j (g_ij cos(θ_ij) + b_ij sin(θ_ij))    (4)

Q_ij = −V_i² b_ij + V_i V_j (g_ij sin(θ_ij) − b_ij cos(θ_ij))    (5)

where the admittance of branch (i, j) is y_ij = g_ij + j b_ij and θ_ij = θ_i − θ_j.
In this paper, however, we solve the state estimation problem based on linearized DC power flow. In linearized DC power flow, we assume first that all bus voltage magnitudes are close to 1.0 per unit, and second that all transmission lines are lossless and the nodal voltage phase angle differences are small. Applying these assumptions to the nonlinear measurements in eqs. (2) and (4), we obtain the following equations relating bus angles to real bus injection and branch flow power:

P_i = Σ_{j∈N(i)} B_ij (θ_i − θ_j)    (6)

P_ij = b_ij (θ_i − θ_j)    (7)
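As a quick numerical sanity check on this linearization, the following Python sketch (with an illustrative susceptance value, not taken from any benchmark) compares the lossless AC branch flow b_ij sin(θ_ij) with its DC approximation b_ij θ_ij:

```python
import numpy as np

# Compare the lossless AC branch flow (eq. (4) with V_i = V_j = 1, g_ij = 0)
# against the DC approximation of eq. (7) for a few angle differences.
b_ij = 5.0  # illustrative branch susceptance
for theta in [0.01, 0.05, 0.1, 0.3]:
    p_ac = b_ij * np.sin(theta)  # AC flow on the lossless line
    p_dc = b_ij * theta          # DC linearization
    print(f"theta={theta:4.2f}  AC={p_ac:.4f}  DC={p_dc:.4f}  err={abs(p_ac - p_dc):.5f}")
```

The approximation error stays below about one percent for angle differences up to roughly 0.25 rad, which is why the DC model is reasonable for transmission systems.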
The linearized measurement model allows us to represent the measurement set as well as the central estimator as a simple weighted least squares formulation. From the relations in eqs. (6)-(7), the measurement vector and system state vector are related by Z = HX + ε. Here Z ∈ R^{n_obs} is the column vector containing all bus injection and branch flow measurements (n_obs = M = n_bus + n_branch), and H ∈ R^{n_obs × n_bus} is the matrix containing the coefficients of eqs. (6)-(7). From the DC modeling assumptions, the state vector contains only bus angles, x = {θ_1, . . . , θ_N}. From now on, the state variable x and the bus angle vector θ will be used interchangeably. We can therefore simplify eq. (1) to the following centralized weighted linear least squares estimate:

x̂_c = (H^T R^{−1} H)^{−1} H^T R^{−1} Z    (8)

Note that in order to estimate the state vector, M must be at least n_bus − 1 and rank(H^T H) = n_bus − 1. To make the system of equations full rank, we assume that bus 0 is always the slack bus with phase angle zero. We solve for each angle relative to θ_0; this leads to a reduced sensing matrix H_r ∈ R^{n_obs × (n_bus − 1)}, which is H with the first column removed. Applying eq. (8) with H_r results in the following:
[θ_1 − θ_0 . . . θ_N − θ_0]^T = (H_r^T R^{−1} H_r)^{−1} H_r^T R^{−1} Z    (9)
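To make eq. (9) concrete, here is a minimal numpy sketch, assuming a toy 3-bus network with invented susceptances and measurements (not the IEEE 14 bus data), that builds the reduced matrix H_r and computes the weighted least squares estimate:

```python
import numpy as np

# Minimal sketch of the centralized DC-WLS estimator of eq. (9).
# Toy 3-bus network: bus 0 is the slack, branches (0,1) and (1,2).
# Columns of H_r are [theta_1, theta_2] (slack column removed).
b01, b12 = 5.0, 4.0  # illustrative branch susceptances
H_r = np.array([
    [-b01,        0.0 ],  # branch flow P_01 = b01 * (theta_0 - theta_1)
    [ b12,       -b12 ],  # branch flow P_12 = b12 * (theta_1 - theta_2)
    [ b01 + b12, -b12 ],  # injection  P_1  = b01*(th1-th0) + b12*(th1-th2)
])
R = np.diag([1e-4, 1e-4, 1e-4])    # diagonal error covariance
z = np.array([-0.50, 0.20, 0.70])  # illustrative measurement vector

W = np.linalg.inv(R)  # R^{-1}
# Solve (H_r^T R^-1 H_r) x = H_r^T R^-1 z rather than forming the inverse.
x_hat = np.linalg.solve(H_r.T @ W @ H_r, H_r.T @ W @ z)
print("angles relative to the slack bus:", x_hat)
```

Solving the normal equations directly avoids the explicit inverse in eq. (9) and is the numerically preferable route.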
3 Distributed Estimation Algorithms for Power System State Estimation
Distributed algorithms for estimation work by splitting large problems into many locally computable problems. This can be illustrated with the model of the IEEE 14 bus network shown in Figure 1. As opposed to the centralized estimator, which requires each sensor to transmit information to a single location where inference is then performed, decentralized algorithms work by distributing the observations and computation into different domains. Here, each area receives observations of variables local to itself as well as variables it shares with neighboring areas. For the 14 bus example, there are four domains, where each domain has access to only specific observations. Each domain then uses specifically designed messages from its neighbors to compute estimates of its unknown variables and possibly the unknown variables contained in other areas. Local observability of an area is an important assumption in many distributed techniques. If an area is locally observable, then estimation can occur without the rest of the network. Many early and heuristic methods required local observability; the algorithms presented here do not. In the following sections we introduce the various algorithms and illustrate their behavior using the 14 bus network.
Figure 1: Computational abstraction of the 14 bus IEEE benchmark network, splitting nodes into different computational domains. Block markers on edges represent branch flow measurements. Circle markers on vertices represent bus injection measurements. The network contains 26 measurements and is overdetermined.
3.1 CSE
The algorithm introduced in [9] for use in distributed power system state estimation originates from work on distributed estimation in sensor networks [3]. In CSE, each area builds an estimate of the entire state of the system. In each step, an area computes a local estimate of the global state and then transmits the full estimate to all neighbors in its communication network. Note that in CSE, each area's state vector is the full set of unknown variables. Therefore the message sent from area k to area l at time t, for example, is the estimate x̂_k^t ∈ R^{n_bus}, provided l is in the communication set of k, l ∈ N_k. With a given set of messages from its neighbors, area k performs the following update:

x̂_k^{t+1} = x̂_k^t − a [ b Σ_{l∈N_k} (x̂_k^t − x̂_l^t) − H_k^T (z_k − H_k x̂_k^t) ]    (10)
This method puts no limitations on the required communication architecture; that is, the structure of N_k is independent of the topology of the power system. Also, since the second term of eq. (10) does not involve any matrix inversion, the algorithm does not assume any local observability. Therefore, it can be used in a fully decentralized manner, in that each bus can be a computing area. No implicit inversion takes place, since the update term is similar to a gradient descent direction.
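The update of eq. (10) is simple enough to state in a few lines of Python; the sketch below assumes each area holds a full-length global estimate together with its local (H_k, z_k), and the gains a and b are placeholders rather than the tuned values of Section 4:

```python
import numpy as np

def cse_step(x_hat, H, z, neighbors, a=1e-3, b=1.0):
    """One synchronous CSE sweep, eq. (10).

    x_hat:     list of length-n_bus arrays (one global estimate per area)
    H, z:      local measurement matrices and measurement vectors
    neighbors: neighbors[k] lists the communication neighbors of area k
    """
    new = []
    for k, xk in enumerate(x_hat):
        consensus = sum(xk - x_hat[l] for l in neighbors[k])  # disagreement term
        innovation = H[k].T @ (z[k] - H[k] @ xk)              # local data term
        new.append(xk - a * (b * consensus - innovation))
    return new
```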
Analysis of the algorithm is based mainly on introducing an aggregate state x̂(t) = [x̂_1^T(t) . . . x̂_{n_area}^T(t)]^T. With this, the evolution of the estimate vector becomes

x̂(t+1) = (I_{n_bus·n_area} − ab(L ⊗ I_{n_bus})) x̂(t) + a D_H (Z − D_H^T x̂(t))    (11)

Here L is the graph Laplacian of the communication topology, ⊗ represents a Kronecker product over matrices, and I_M represents an identity matrix in R^{M×M}. Finally, the matrix D_H is the block diagonal matrix

D_H = blockdiag(H_1^T, . . . , H_{n_area}^T)

Note the similarity of the update step with that of recursive least squares (RLS) filters in standard filter theory [2], as well as with gradient descent. In standard recursive filter theory, state updates are of the form

x̂(t+1) = x̂(t) + A_t (y_t − H x̂(t))    (12)

With this, we can show that the error process evolves as e_{t+1} = (I − A_t H) e_t. In the case of the CSE algorithm, we can use a similar method to show that the error process evolves as e_{t+1} = (I_{n_bus·n_area} − ab(L ⊗ I_{n_bus}) − a D_H D_H^T) e_t. Like RLS filters, this technique has an associated Riccati equation relating error covariance as a function of iteration.
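A practical consequence of this error recursion is a simple numerical convergence check: the spectral radius of the error transition matrix must be below one. A small sketch, assuming a two-area system with invented measurement matrices:

```python
import numpy as np

# Check convergence of the CSE error recursion for a 2-area, 2-bus example:
# rho(I - ab (L kron I) - a D_H D_H^T) < 1 is required.
a, b = 1e-2, 1.0
L = np.array([[1.0, -1.0], [-1.0, 1.0]])               # path-graph Laplacian
H = [np.array([[1.0, 0.0]]), np.array([[1.0, -1.0]])]  # illustrative local rows
n_bus = 2
D_H = np.block([[H[0].T, np.zeros((n_bus, 1))],
                [np.zeros((n_bus, 1)), H[1].T]])       # blockdiag(H_1^T, H_2^T)
T = np.eye(2 * n_bus) - a * b * np.kron(L, np.eye(n_bus)) - a * D_H @ D_H.T
rho = max(abs(np.linalg.eigvals(T)))
print(f"spectral radius = {rho:.4f} ->", "converges" if rho < 1 else "diverges")
```

Global observability (Σ_k H_k^T H_k full rank) together with a connected communication graph makes the subtracted matrix positive definite, so a small enough gain a yields ρ < 1.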
3.2 ADMM
The ADMM algorithm was first introduced to power system state estimation in [4]. It relies on formulating an augmented Lagrangian of the combined estimation problem. Given the global optimization problem that needs to be solved in eq. (1), we can decompose the objective function into L separate objective functions, each dependent on its own set of variables, by introducing an extra set of constraints:

min_x  Σ_{k=1}^{L} ||z_k − H_k x_k||²
s.t.   x_k[l] = x_kl,  ∀l ∈ N_k
From this we can formulate the augmented Lagrangian for the system. This is given for the problem as

L({x_k}, {x_kl}, {v_kl}) = Σ_{k=1}^{n_area} [ ||z_k − H_k x_k||² + Σ_{l∈N_k} ( v_kl^T (x_k[l] − x_kl) + c ||x_k[l] − x_kl||² ) ]
                         = Σ_{k=1}^{n_area} L_k({x_k}, {x_kl}, {v_kl})    (13)
The alternating direction term comes from the fact that the augmented Lagrangian in eq. (13) is minimized by each area partially minimizing its local Lagrangian L_k({x_k}, {x_kl}, {v_kl}) and then exchanging dual variables in a distributed fashion:

x_k^{t+1} = argmin_{x_k} L_k({x_k}, {x_kl^t}, {v_kl^t})    (14)

x_kl^{t+1} = argmin_{x_kl} L({x^{t+1}}, {x_kl}, {v_kl^t})    (15)

v_kl^{t+1} = v_kl^t + c (x_k^{t+1}[l] − x_kl^{t+1})    (16)
Note that in ADMM, each area's state vector is the local set of unknown variables as well as variables that affect the observation set in the area. For example, the state vector for area 1 is the local unknowns [θ_1, θ_2, θ_5] as well as [θ_4, θ_6], since the bus power observed at bus 5 is a function of buses 4 and 6. As shown in [4], this reduces to the following set of recursions:

x_k^{t+1} = (H_k^T H_k + c D_k)^{−1} (H_k^T z_k + c D_k p_k^t)    (17)

s_k^{t+1}[i] = (1 / |N_k^i|) Σ_{l∈N_k^i} x_l^{t+1}[i]    (18)

p_k^{t+1}[i] = p_k^t[i] + s_k^{t+1}[i] − (x_k^t[i] + s_k^t[i]) / 2    (19)

Here, x_l[i] denotes the entry of x_l corresponding to x_k[i], defined for all l ∈ N_k^i. N_k^i is defined as the set of all areas which share the i-th element of area k's state vector (x_k[i]). D_k is a diagonal matrix whose (i, i) entry is |N_k^i|. With an appropriate choice of c, each control area's x_k^t converges to the corresponding subset of the whole-system estimate. Combining values from all areas, one can obtain the whole-system estimate.
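The recursions (17)-(19) decompose naturally into three small functions. The sketch below, assuming every area stores (H_k, z_k, D_k, p_k) and that the shared entries of neighboring x-updates have already been gathered, is a minimal rendering:

```python
import numpy as np

def admm_x_update(H_k, z_k, D_k, p_k, c=9.0):
    """Eq. (17): local regularized least-squares solve."""
    A = H_k.T @ H_k + c * D_k
    return np.linalg.solve(A, H_k.T @ z_k + c * D_k @ p_k)

def admm_s_update(shared_values):
    """Eq. (18): average, entry by entry, the copies of each shared state
    entry held by the areas in N_k^i. shared_values[i] is the list of
    values of entry i gathered from those areas."""
    return np.array([sum(vals) / len(vals) for vals in shared_values])

def admm_p_update(p_old, s_new, x_old, s_old):
    """Eq. (19): correction step on the auxiliary variable."""
    return p_old + s_new - (x_old + s_old) / 2.0
```

Only the shared entries of x_k travel between areas, which is why the ADMM message size in Section 5 scales with max_i |x_i| rather than with n_bus.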
3.3 Distributed Dual Averaging
In distributed dual averaging (DDA), estimates are constructed by each node calculating a local estimate of the global subgradient and then sharing it with neighbors defined by a communication graph. In the case of the power system estimation problem, the communication graph is independent of the topology of the power network, as shown in Figure 1. This makes DDA similar to the CSE algorithm in that regard. Specifically, each node i ∈ V has a neighbor set N(i) = {j ∈ V | (i, j) ∈ E}.
The core algorithm is defined for a general optimization problem as follows. Consider a generic convex objective function of the form

min_{x∈X}  (1/n) Σ_{i=1}^{n} f_i(x)

We must assume that f_i(x) is convex but not necessarily smooth. At iteration t, each node k ∈ V computes an element g_k^t ∈ ∂f_k(x_k^t), the subdifferential of the local function f_k, and receives the parameters w_j^t, j ∈ N(k). It updates its estimate of the solution x_k^t based on a combination of the current subgradient and the messages from its neighbors, via a projection operation Π. The algorithm is therefore

w_k^{t+1} = Σ_{l∈N(k)} p_kl w_l^t + g_k^t    (21)

x_k^{t+1} = Π_X^ψ(w_k^{t+1}, α_t)    (22)
The projection operator Π is defined as

Π_X^ψ(z, α) = argmin_{x∈X} ( ⟨z, x⟩ + (1/α) ψ(x) )    (24)

Here ψ is the proximal function, which in this study is set to the canonical proximal function ψ(x) = (1/2)||x||₂², as stated in [1].
In the case of distributed linear estimation for power system state estimation, the algorithm reduces to the following. The objective function is now f_k(x) = (z_k − H_k x)^T R_k^{−1} (z_k − H_k x), which gives g_k^t = −2 H_k^T R_k^{−1} (z_k − H_k x_k^t). Next, Π_X^ψ(z, α) = argmin_{x∈X} ( z^T x + (1/2α) x^T x ). In the unconstrained case, the solution becomes x_k^{t+1} = −α_{t+1} w_k^{t+1}. Since x is constrained (here to [−π/4, π/4]), we need to constrain the update. The final recursion becomes

w_k^{t+1} = Σ_{l∈N(k)} p_kl w_l^t + g_k^t    (25)

x_k^{t+1} = min{ max{ −α_t w_k^{t+1}, −π/4 }, π/4 }    (26)
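A compact Python sketch of the recursion (25)-(26), assuming a doubly stochastic mixing matrix P on the communication graph and an illustrative decaying step size α_t (the tuned α of Section 4 is not reproduced here):

```python
import numpy as np

def dda_step(w, x, H, z, R_inv, P, t, clip=np.pi / 4):
    """One synchronous DDA sweep, eqs. (25)-(26), for all nodes."""
    n = len(w)
    alpha = 1.0 / np.sqrt(t + 1)  # assumed step-size schedule
    w_new, x_new = [], []
    for k in range(n):
        g = -2.0 * H[k].T @ R_inv[k] @ (z[k] - H[k] @ x[k])  # local gradient
        wk = sum(P[k, l] * w[l] for l in range(n)) + g       # eq. (25)
        w_new.append(wk)
        x_new.append(np.clip(-alpha * wk, -clip, clip))      # eq. (26)
    return w_new, x_new
```

Note that P[k, l] is zero for non-neighbors, so the inner sum only mixes messages actually received from N(k).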
4 Numerical Experiments with the IEEE 14 Bus Network
The centralized and the three distributed algorithms are numerically tested using MATLAB. The power system used in the test is the IEEE 14-bus system shown in Figure 1. The associated admittance matrix and true underlying power states/measurements are obtained using MATPOWER and verified against the benchmark source. In the IEEE 14-bus grid, measurement sites and types are shown in the figure.

An abstraction of the buses, measurements, and control areas is also shown in Figure 1. The four rectangles represent local control areas, with a total of 23 measurements in p.u., where a box on an edge represents a branch flow measurement. The redundancy ratio is 23/14 = 1.64; therefore, the system is very likely to have a unique solution. The communication between the areas forms a fully connected graph.
4.1 Preliminary test and calibration
Figure 2: Performance of all three algorithms under no observation noise or communication noise. The MSE shown is between the current estimate and the underlying true power state. The title on each plot indicates the levels of measurement and communication noise in the simulation.
In the preliminary test, we draw measurement values for the IEEE 14-bus system directly from the true measurement values, without adding simulated errors. Hence, we expect all algorithms to give the true underlying power state as a solution. Here we run the iterations until any pair of estimate and true underlying value differs by less than a specified threshold. The results are shown in Figure 2.
• Centralized algorithm: using eq. (9), we found that x̂_c agrees with the true underlying state vector obtained from MATPOWER. That means all pieces of the centralized algorithm and related information are consistent.

• CSE algorithm: using formula (10), with tuning until a = 8e−10 and b = 1e10, we found that the iterations terminated in 13,626 iterations, and the estimates from all control areas agree with the true underlying state vector obtained from MATPOWER.

• Distributed Dual Averaging algorithm: using formula (26), with α_0 = 2.66e10 as computed from the formula discussed above and with extra tuning until α_NEW = 200 α_0, we found that the algorithm terminated in 495,536 iterations, and the estimates from all control areas agree with the true underlying state vector obtained from MATPOWER.

• ADMM-based algorithm: using recursion (17), with tuning until c = 9, we found that the iterations terminate in 34 iterations, and the estimates from all control areas agree with the true underlying state vector obtained from MATPOWER.
4.2 Convergence under Measurement and Communication Noise
To study the convergence behavior of the three iterative algorithms, we simulate measurements from the IEEE 14-bus system with observation noise as well as communication noise. The convergence behavior is shown by calculating the error with respect to the centralized solution x̂_c, defined as e_{k,c}^r = (1/N) ||x̂_c − x_k^r||₂, and the error with respect to the true underlying state θ, defined as e_{k,θ}^r = (1/N) ||θ − x_k^r||₂. Here N is the size of the state vector. Note that we do not use stopping criteria here, in order to see the convergence behavior across a large number of iterations. The error curves obtained from the IEEE 14-bus network are shown below.
4.2.1 Measurement Noise Only
Figure 3: Performance of the three algorithms under observation noise only. The measurement noise levels are σ_HIGH = 0.01 and σ_LOW = 0.00001. The title on each plot indicates the levels of measurement and communication noise in the simulation.
Convergence behavior for the measurement-noise-only experiments is shown in Figure 3. In terms of convergence behavior, the ADMM algorithm is the most effective method. The convergence rate is high and the number of steps to reach the cut-off level is about 35 iterations, while the other two methods require three to four more orders of magnitude in the number of iterations. Moreover, the convergence rate of the ADMM algorithm in terms of e_{k,c} tends to be linear after reaching the cut-off level of MSE ≈ σ. The CSE algorithm tends to be flat after reaching this cut-off level, while the dual averaging algorithm has a very flat convergence rate long before reaching the cut-off level. It would be interesting to see why the convergence behavior of the dual averaging algorithm has a sudden increase and then becomes flat.
4.2.2 Communication Noise Only
Figure 4: Performance of the three algorithms under communication noise only. The communication noise levels are σ_HIGH = 0.01 and σ_LOW = 0.001. The title on each plot indicates the levels of measurement and communication noise in the simulation.
We now simulate the three methods using perfect observations and additive Gaussian noise in the messages transmitted between the areas. The results are shown in Figure 4. In terms of convergence behavior, the ADMM algorithm is still the most effective method, since for both low and high communication noise the MSE reaches its lowest point quickly. It is interesting to note, however, that when the technique is run for the same number of iterations as DDA and CSE, the accuracy tends to decrease. In these experiments we also see the benefit of a decreasing step size in the estimate updates: in the high-noise, no-measurement-noise case, as the iterations tend to infinity, the DDA and CSE algorithms attain higher accuracy, since they weight down more recent messages, as opposed to standard ADMM, which weighs them equally with previous estimates. However, for a practical distributed estimation technique, messaging times are finite, so these asymptotic advantages seem of only theoretical interest for this application.
4.2.3 Measurement and Communication Noise
Here we simulate the three methods with noise in the messages as well as in the observations. The results are shown in Figure 5. We use the same variances as in Sections 4.2.1 and 4.2.2 and only present the cases where both noise sources are at the low level and where both are at the high level. In the first experiment we see again that, given enough time, the CSE algorithm will outperform ADMM and DDA in terms of forming consensus; however, as in the other situations, ADMM converges many orders of magnitude faster.
Figure 5: Performance of the three algorithms under both measurement and communication noise, with noise levels σ_HIGH = 0.01 and σ_LOW = 0.001. The title on each plot indicates the levels of measurement and communication noise in the simulation.
5 Computational and Communication Cost
In this section we give an overview of the computational and communication costs and times of the three algorithms, based on their formulations. To assess computation and communication cost (time), we simulate measurements from the IEEE 14-bus system with no noise, as in the experiment of Section 4.1. Here we estimate the computation cost and communication cost (time) for this experiment and extend the estimates to general settings.

To measure the computational complexity, we count the number of basic mathematical operations needed until the algorithm converges with the criterion set above. Note that we assume there is no computational cost for inverting R, since R is diagonal. We separate the analysis of computational complexity into pre-operational (fixed) and operational (marginal) computation cost.

To measure the communication cost, we define the maximum cost of sending one number from a local control area to the central control center to be c_{l→c}, and the maximum cost of sending one number from a local control area to a neighboring local area to be c_{l→l}. We also assume there is no communication cost to collect local measurements within a local control area. Notice that the communication cost is proportional to the number and size of messages sent during the algorithm.

Similarly, to measure the communication time, we define the maximum time of sending one number from a local control area to the central control center to be t_{l→c}, and the maximum time of sending one number from a local control area to a neighboring local area to be t_{l→l}. We also assume there is no communication time to collect local measurements within local control areas, and that there is no bandwidth congestion within the communication network. Notice that without a congestion problem, the communication time is not necessarily proportional to the number and size of messages sent during the algorithm. We also define d to be the maximum degree in the communication network graph.
• Centralized algorithm: we use eq. (9) for the analysis.
Computational complexity: the pre-operational computational cost is dominated by the matrix-matrix multiplication and matrix inversion: O(M n_bus² + n_bus³). The operational computational complexity for each run is dominated by matrix-vector multiplication: O(M n_bus + n_bus²).
Communication cost: the communication cost here is the cost of sending all local measurements to the central control area; hence it is bounded above by O(M c_{l→c}).
Communication time: the communication time is bounded above by O(t_{l→c}).
• CSE algorithm: we use eq. (10). With a = 8e−10 and b = 1e10, we found that n_iterate = 14,184 for low observation noise. It would be interesting to find a complexity bound on the number of iterations required for a given ε of accuracy.
Computational complexity: there is no pre-operational computational complexity for this algorithm. The operational complexity for each run is dominated by matrix-vector multiplication and the summations of information from neighbors: O(n_area n_iterate n_bus (max_i |z_i| + d)).
Communication cost: the communication cost here is the cost of sending the whole estimate vector to the neighbors; hence it is bounded above by O(n_iterate d n_area n_bus c_{l→l}).
Communication time: the communication time is bounded above by O(n_iterate t_{l→l}).
• Distributed Dual Averaging algorithm: we use eq. (26), with α_new = 200 α_0 as obtained in Section 4.1; the algorithm terminated after 509,334 iterations.
Computational complexity: there is no pre-operational computational complexity for this algorithm. The operational complexity for each run is dominated by matrix-vector multiplication and the summations of information from neighbors: O(n_area n_iterate n_bus (max_i |z_i| + d)).
Communication cost: the communication cost here is the cost of sending the whole vector w_k to the neighbors; hence it grows as O(n_iterate d n_area n_bus c_{l→l}).
Communication time: the communication time grows as O(n_iterate t_{l→l}).
• ADMM-based algorithm: we use the recursions (17)-(19), with c = 9 as obtained in the previous part. We found that the iterations now terminate in n_iterate = 33.
Computational complexity: the pre-operational computational complexity is dominated by the matrix-matrix multiplication and matrix inversion: O(n_area (max_i |z_i|)² + (max_i |x_i|)³). The operational complexity for each run is dominated by matrix-vector multiplication and the summations of information from neighbors: O(n_area n_iterate (max_i |x_i|)(max_i |z_i| + d)).
Communication cost: the communication cost here is the cost of sending components (not the whole vector) of x_k to the neighbors, and is bounded above by O(n_iterate d n_area (max_i |x_i|) c_{l→l}). However, to make every control area know the whole system state, extra communication is required to send the incomplete state information to all other control areas through local communication channels. For this 4-control-area system, with careful counting, we found that this extra communication cost is bounded by O(n_area² (max_i |x_i|) c_{l→l}). So the total communication cost is O(n_iterate d n_area (max_i |x_i|) c_{l→l} + n_area² (max_i |x_i|) c_{l→l}).
Communication time: the communication time is bounded by O(n_area t_{l→l}). However, to make every control area know the whole system state, extra time is required to send the incomplete state information to all other control areas through local communication channels. This extra communication time is also bounded by O(n_area t_{l→l}), so the total communication time is bounded by O(n_area t_{l→l}).
Algorithm     Pre-Operational                              Operational
Centralized   O(M n_bus² + n_bus³)                         O(M n_bus + n_bus²)
CSE           0                                            O(n_area n_iterate n_bus (max_i |z_i| + d))
DDA           0                                            O(n_area n_iterate n_bus (max_i |z_i| + d))
ADMM          O(n_area (max_i |z_i|)² + (max_i |x_i|)³)    O(n_area n_iterate (max_i |x_i|)(max_i |z_i| + d))

Table 1: Computational complexity of the four algorithms, separated into pre-operational and operational complexity.
Algorithm     Communication Cost                                                          Communication Time
Centralized   O(M c_{l→c})                                                                O(t_{l→c})
CSE           O(n_iterate d n_area n_bus c_{l→l})                                         O(n_iterate t_{l→l})
DDA           O(n_iterate d n_area n_bus c_{l→l})                                         O(n_iterate t_{l→l})
ADMM          O(n_iterate d n_area (max_i |x_i|) c_{l→l} + n_area² (max_i |x_i|) c_{l→l})  O(n_area t_{l→l})

Table 2: Communication cost and time of the four algorithms.
The computational complexity and communication time/cost for each algorithm are summarized in Tables 1 and 2, which reflect the numerical results from the IEEE 14-bus system.
5.1 Discussion
Pre-operational computational complexity: the CSE and dual averaging algorithms are the best among the four algorithms, while the centralized algorithm is the worst. This is because the centralized algorithm requires a large matrix inversion.

Operational computational complexity: the centralized algorithm is the best among the four algorithms, while the dual averaging algorithm is the worst. This is because the dual averaging algorithm requires a high number of iterations.

Overall computational complexity: in real-world applications, the operational computational complexity is more critical than the pre-operational computational complexity. Once the pre-operational computation is done, the result can be stored and reused unless there is an update in the system (e.g., a bus connection line is cut). The numerical results here imply that, in terms of overall computational complexity, the centralized algorithm is more advantageous than the distributed algorithms. Among the distributed algorithms, the ADMM algorithm is the best. In the general setting, especially when the problem is large, the ADMM algorithm may be as good as or better than the centralized algorithm if n_iterate is of the same order as the problem size. In [4], for the larger IEEE 118-bus system, n_iterate does not grow with the problem size: the 118-bus system reaches 10⁻⁴ accuracy in e_{k,c} after 10 iterations. A theoretical justification is needed to show how n_iterate in the ADMM algorithm actually relates to the problem size.
Communication time (cost): in a small communication network, we may assume that c_{l→c} ∼ c_{l→l} and t_{l→c} ∼ t_{l→l}; then the centralized algorithm is preferable. When the communication network is large, it is possible that c_{l→c} ≫ c_{l→l} and t_{l→c} ≫ t_{l→l}; then the distributed algorithms might be preferable. According to the numbers we computed, the CSE and dual averaging algorithms are at a disadvantage due to their large number of iterations and large message sizes. The ADMM algorithm is a good candidate whose communication time/cost might be comparable to the centralized algorithm. We need a model of the relationship between the local-to-local and local-to-central parameters when the problem is large, as well as of how n_iterate in the ADMM algorithm relates to the problem size. Roughly speaking, if the local-to-central parameters grow as fast as n_iterate, the ADMM algorithm breaks even with the centralized algorithm.
Local observability: none of the algorithms presented here requires local observability. In CSE, no explicit inversion takes place; therefore we do not require H_k^T H_k to be full rank for all k. In [9] the authors illustrate numerically how, despite a lack of local observability, the iterations reach the global solution. The CSE algorithm only requires global observability, that is, Σ_k H_k^T H_k needs to be full rank. ADMM does not require local observability, since the matrix inversion step is performed with pseudo-measurements from the area's neighbors. For the dual averaging algorithm, this is a new result whose mathematical proof one may explore.
6 Convergence on Tree Networks
Due to the proliferation of sensing and computation capabilities on the distribution grid, exploring the convergence properties of distributed estimation on tree-like graphs seems an interesting direction of study. We simulated the PSSE problem using ADMM only; it would be fruitful to explore the other techniques as well, but their large convergence times limited the study to the faster-converging technique. In the simulations, we tested randomly generated trees of size N = 2, 5, 10, 20. The algorithm used for random tree generation is documented in [6]. For these simulations, the ground truth state vectors were randomly generated, as were the observation matrices. We assumed that each bus made complete power observations of the entire system; that is, each bus had a bus injection measurement as well as a branch flow measurement. This was chosen to reduce the variability of the convergence results. Convergence was measured by the number of iterations taken for e_{c,k} < ε. We chose ε values of 0.01 and 0.001.
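The experimental pipeline is easy to reproduce in outline. The sketch below, using a simple random-attachment generator as a stand-in for the algorithm of [6], builds a random tree, forms its Laplacian, and extracts the second-smallest eigenvalue λ_2 plotted in Figure 6:

```python
import numpy as np

def random_tree_laplacian(n, rng):
    """Random labeled tree on n nodes: attach each new node to a
    uniformly chosen earlier node, accumulating the graph Laplacian."""
    L = np.zeros((n, n))
    for v in range(1, n):
        u = int(rng.integers(0, v))
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    return L

rng = np.random.default_rng(0)
L = random_tree_laplacian(10, rng)
lam2 = np.sort(np.linalg.eigvalsh(L))[1]  # algebraic connectivity
print("lambda_2 =", lam2)
```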
Figure 6: (Top left) Number of iterations required for e_c^k < 0.001 to be satisfied vs. tree size. (Top right) Number of iterations required for e_c^k < 0.001 to be satisfied vs. λ_2. (Bottom left) Number of iterations required for e_c^k < 0.01 to be satisfied vs. tree size. (Bottom right) Number of iterations required for e_c^k < 0.01 to be satisfied vs. λ_2.
Figure 6 illustrates the convergence properties of ADMM on randomly generated tree networks. The first apparent result is that there is a large variation in termination time for a given graph size. In the second set of plots, we see the variation of minimum iteration counts vs. the second-smallest eigenvalue λ_2 of the system. The log-log scale of the plots shows a linear relationship. Also, there is high variation in the termination time conditioned on λ_2. This is because each randomly generated tree had a randomly generated bus matrix as well as observations; this variation might be causing the additional spread. At this moment, there has been no study of the effect of graph properties on the ADMM algorithm assuming single-bus areas. However, the experimental results point towards a polynomial relationship O(λ_2(L)^α). The simulation data is used to fit a simple regression model. In the two experiments, the fitted exponents were α_{ε=0.01} = −0.52143 and α_{ε=0.001} = −0.51850. The 95% confidence intervals under linear regression are α_{ε=0.001}^{95} = [−0.56, −0.48] and α_{ε=0.01}^{95} = [−0.56, −0.47]. It would be of theoretical interest to derive this bound from first principles; however, that is left to future work.
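The power-law fit itself is a one-step linear regression in log-log coordinates. A sketch with placeholder data (the recorded simulation outputs are not reproduced here):

```python
import numpy as np

# Fit iterations ~ C * lambda_2^alpha by regressing log(iterations)
# on log(lambda_2); the data values below are illustrative placeholders.
lam2 = np.array([0.05, 0.10, 0.30, 0.80, 1.50])
iters = np.array([2900, 2100, 1100, 650, 480])

X = np.column_stack([np.ones_like(lam2), np.log(lam2)])
y = np.log(iters)
beta, res, _, _ = np.linalg.lstsq(X, y, rcond=None)
alpha = beta[1]

# Crude 95% interval from the residual variance (normal approximation).
n = len(y)
s2 = res[0] / (n - 2)
cov = s2 * np.linalg.inv(X.T @ X)
half = 1.96 * np.sqrt(cov[1, 1])
print(f"alpha = {alpha:.3f}, 95% CI = [{alpha - half:.3f}, {alpha + half:.3f}]")
```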
Table 3 shows the mean number of iterations required before termination for different-sized trees. The results hint towards a linear relationship; however, there is a very large variance, as shown in Figure 6.
Tree Size    ε = .01    ε = .001
2            32         20
5            125        116
10           542        1083
20           2198       2931

Table 3: Mean number of iterations required for convergence for trees of size N.
Figure 7: Mean computation time for various sizes of trees.
An interesting question is how much of the variation is explainable by the sensing matrix (randomly generated for each test), the observations (randomly generated for each test), and θ (randomly generated for each test). More study is needed to understand the fundamental evolution of the ADMM algorithm. One avenue of interest would be interpreting the entire algorithm as a linear dynamical system. It can be shown that, using the state vector below, the ADMM recursions can be written as

[x_1(t+1), s_1(t+1), s_1(t), p_1(t+1), . . . , x_M(t+1), s_M(t+1), s_M(t), p_M(t+1)]^T
    = A [x_1(t), s_1(t), s_1(t−1), p_1(t), . . . , x_M(t), s_M(t), s_M(t−1), p_M(t)]^T + B z
This system can be treated in the same manner as other stochastic approximation problems. It would be interesting to apply methodologies similar to those used in [5] to uncover the empirically determined results relating convergence and graph spectrum, as well as to introduce damped update schemes to combat communication noise.
7 Conclusion
In Section 4, we investigated the CSE and ADMM algorithms, which have appeared in the distributed power system estimation literature, as well as the distributed dual averaging method. It turns out that CSE and the dual averaging algorithm suffer from a large number of iterations and large message sizes. The ADMM algorithm is experimentally justified to be a good algorithm for power system state estimation for several reasons: it converges fastest among the three algorithms, and it requires low computational complexity, communication cost, and time. Local observability is not required for the ADMM algorithm. The only poor performance indicator for the ADMM algorithm is its tolerance to communication noise: it is less tolerant than the CSE algorithm, yet this effect is small for moderate termination steps. In comparison with the centralized algorithm, the ADMM algorithm may be more efficient in terms of computational complexity, communication time, and cost when the problem size is large. Although [4] expects the number of iterations in power system state estimation to grow more slowly than the size of the whole network, we found a large increase as the size of the network grew in the case of trees. Future work should also include experimental results on connected graphs to correspond with future theoretical findings.
References

[1] J. Duchi, A. Agarwal, and M. Wainwright. Dual averaging for distributed optimization: convergence analysis and network scaling. IEEE Transactions on Automatic Control, 2010.

[2] S. Haykin. Adaptive Filter Theory. 2003.

[3] S. Kar, J. M. F. Moura, and K. Ramanan. Distributed parameter estimation in sensor networks: nonlinear observation models and imperfect communication. arXiv preprint arXiv:0809.0009, 2008.

[4] V. Kekatos and G. B. Giannakis. Distributed robust power system state estimation. arXiv preprint arXiv:1204.0991, 2012.

[5] R. Rajagopal and M. J. Wainwright. Network-based consensus averaging with general noisy channels. IEEE Transactions on Signal Processing, 59(1):373-385, 2011.

[6] A. Rodionov and H. Choo. On generating random network structures: trees. In Computational Science - ICCS 2003, 2003.

[7] F. C. Schweppe. Power system static-state estimation, part III: implementation. IEEE Transactions on Power Apparatus and Systems, (1):130-135, 1970.

[8] F. C. Schweppe and D. B. Rom. Power system static-state estimation, part II: approximate model. IEEE Transactions on Power Apparatus and Systems, (1):125-130, 1970.

[9] L. Xie, D. H. Choi, and S. Kar. Cooperative distributed state estimation: local observability relaxed. In IEEE Power and Energy Society General Meeting, 2011.