DAME Dependability and Security Study: Progress Report
Howard Chivers, University of York
Practical Security for e-Science Projects, 25 November 2003

This talk presents my personal perspective, not the considered view of the project or any of its partners. Credit and thanks must go to busy developers and industrial partners, who have been consistently helpful and generous with their time, and to Martyn Fletcher, who is the primary author of the study deliverables.

Contents
- DAME Introduction
- The Method: Dependability and Security
- Stage One: System Context
- Stage Two: Asset Analysis
- Summary

DAME
[Diagram: DAME overview – engine flight data from London Airport and New York Airport, an airline office, a maintenance centre, and American and European data centers, linked by a grid.]

Project Aims
- Develop a Grid-enabled diagnostic system.
- Demonstrate this on the Rolls-Royce AeroEngine diagnostics problem:
  – A Diagnostic Grid
  – Grid management tools for unstructured data
  – A practical application demonstrator
- Develop the understanding needed for industrial deployment:
  – Grid middleware and application/services layer integration
  – Scalability and deployment options
  – Security and dependability issues

Challenges
- Support on-line diagnostic workflow in real time.
- Deal with the data from 1000s of engines in operation.
- Prove the distributed pattern-matching methodology.
- Address customer concerns about grids, including scalability and security.
- Demonstrate the business case for the technology.

Why use a grid?
- Implementing a distributed, integrated workflow has considerable potential customer value.
- The workflow requires collaboration between multiple stakeholders.
- An integrated business process is needed to provide evidence for any diagnosis, and traceability to subsequent action.
- The data is high volume, and is distributed between stakeholders' sites (e.g. maintenance, factory, airports).
- The variable computing load makes resource sharing attractive for some processes.

DAME – Project Partners
- Universities: University of York, University of Sheffield, University of Oxford, University of Leeds.
- Industrial: Rolls-Royce Aeroengines, Data Systems and Solutions, Cybula.
- Infrastructure: White Rose Grid, National e-Science Support Centre.

Developers
[Diagram: developers and components – Sheffield and York (modelling and decision support; data mining services), Leeds (grid middleware services), Oxford (engine data store, engine data database, workflow), deployed on the White Rose Grid (WRG, GT3/GT2). Components include: EngineModel-G (BD25 engine model wrapped as a Grid Service); AURA-G and its database (pattern search); the Zmod data search facility and Zmod Viewer (browser-based viewer for zmod files); the DAME GUI (browser-based GUI to DAME services); DataStore-G (simulates the arrival and storage of QUOTE data); CBRAnalysis-G (CBR advisor); the SDM database; collaboration tools (toolset for multi-user collaboration); the DAME workbench; the WRG sign-on portal, GT3 security service and proxy management; a browser-based workflow tool compliant with the Resource Broker; DataVisualiser (JChart viewer for XTO output); XTO-G (XTO plug-ins via a Grid Service); and the Resource Broker (GT2 service scheduling workflow tasks on WRG resources). Most components are GT3 services.]

Analysis Approach: Dependability & Security

Purpose of the Study
- Provide analysis to enable ultimate deployment of DAME in the engine domain.
- Provide analysis as a basis for deployment in other domains.
- Contribute to Grid community research in dependability and security.
Dependability and Security
- Attributes:
  – Reliability
  – Safety
  – Maintainability
  – Security (Confidentiality, Integrity, Availability)
- Attributes have varying significance in different systems.

Security (Risk) Analysis
- Focus on risk to the overall business process.
- Process (see the previous talk by Jonathan Moffett):
  – Define system context:
    » Boundary / actors / assets / external assumptions.
  – Analyse assets:
    » Identify impact / threat for each.
    » Attacker's perspective.
    » Vulnerabilities.
    » Identify likelihood.
- From the matrix, identify unacceptable deployment risks; for example, high impact combined with high likelihood needs to be reduced. (An illustrative sketch of this screening step follows at the end of this section.)

Security (Risk) Analysis
[Diagram: the risk analysis process – the system context (system boundary, external assumptions, actors, assets) feeds an asset analysis (threats, attackers' perspective, vulnerabilities), producing an impact x likelihood matrix rated L/M/H.]

Dependability Analysis
- The high-level analysis for complex systems developed at York is rooted in the need for safety cases for layered systems.
[Diagram: layered system – distributed services (Service 0 ... Service N) sit above an analysis interface; below it are the distributed middleware infrastructure and distributed hardware infrastructure, which form the component under analysis.]

High-level Analysis of a Complex System
- Focuses on infrastructure.
- Approach at York (based on FMEA – Failure Modes and Effects Analysis – plus SHARD – Software Hazard Analysis and Resolution in Design):
  – Define high-level functions at a specified interface.
  – Apply guidewords (omission, commission, etc.) to identify undesirable situations.
  – Determine cause and effect.
  – Derive requirements to prevent or mitigate them.
- Satisfying the derived requirements provides dependability.

Choice of Method
- The approaches have complementary strengths.
- In combination:
  – Use security risk analysis to establish whole-system issues.
  – Use the 'high-level analysis' to deal with non-security attributes, and to feed infrastructure vulnerabilities into the main risk analysis.
  – A combined study minimises project cost and customer involvement.
- Take advantage of other sources of vulnerability information.

Observations
- The security risk method provides a useful overall framework... but in many projects a wider set of attributes will be needed.
- Using both forms of analysis explicitly deals with the flexible deployment of applications envisaged in the grid... but it remains to be seen whether the interface requirements between applications and infrastructure are mature enough to allow dependability analysis.
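Before moving to Stage One, here is a minimal sketch of the matrix screening step described under 'Security (Risk) Analysis' above. It is purely illustrative and not part of the DAME deliverables: the asset names are taken from the context model, but the ratings and the acceptance rule are assumptions.

```python
# Illustrative sketch only: screening assets by impact x likelihood, as in the
# risk analysis process above. The ratings and the acceptance rule below are
# assumptions for illustration, not results from the DAME study.

RATING = {"L": 1, "M": 2, "H": 3}

# (asset, impact, likelihood) - impact judged in business terms,
# likelihood from the attacker's perspective and known vulnerabilities.
assessments = [
    ("CBRRuleSet",       "H", "H"),  # e.g. confidentiality of diagnostic algorithms
    ("EngineDataRecord", "M", "M"),  # e.g. integrity of raw engine data
    ("WorkflowRecord",   "M", "L"),  # e.g. provenance of diagnostic decisions
]

def unacceptable(impact: str, likelihood: str) -> bool:
    """Example rule from the slides: high impact and high likelihood must be
    reduced. Real acceptance thresholds are a business decision."""
    return RATING[impact] >= 3 and RATING[likelihood] >= 3

for asset, impact, likelihood in assessments:
    verdict = "reduce before deployment" if unacceptable(impact, likelihood) else "acceptable"
    print(f"{asset:18} impact={impact} likelihood={likelihood} -> {verdict}")
```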
Stage One: System Context

Context
[Diagram: the risk analysis process, repeated from the Method section, locating the system context stage.]

System Context
- System Context document (DAME/York/TR/03.007):
  – Business process.
  – System boundary.
  – Actors (primary and supporting).
  – Assets (service and data).
  – Service interactions.
  – External assumptions.
- Purpose:
  – Provides a concise reference – allows stakeholders to agree on a description of the system.
  – Identifies assets: services and data
    » ... but not hardware?

Actors & System Context
[Use-case diagram: actors and system context. Actors: Airline / Maintenance Contractor (at Airport); Engine Manufacturer (RR); Domain Expert (DE) – engine expert; Maintenance Engineer (ME); Maintenance Analyst (MA) – maintenance expert; Ground Support System; Engine Data Center (EDC) – DS&S; Service Data Manager (SDM), including Workscope Generator – RR; Data Center (DS&S); Engine Maintenance, Repair and Overhaul (MRO) Facility (RR / Contractor); miscellaneous providers; and the Distributed Aircraft Maintenance Environment (DAME) itself. Activities include: download and upload engine data; local diagnosis; DAME diagnosis; investigate using remote / distributed tools and services; request advice from the DE or MA and provide diagnosis / prognosis / advice; perform inspections and minor repairs; remove an engine and dispatch it for major overhaul; return an overhauled engine to service; update engine records.]

Service Assets
[UML diagram: service assets – EngineDataStore-G, ArrivalNotification, EngineDataCenter, Encoder-G, Portal / Collaboration Environment, ZModViewer-G, WorkflowManager, SDM-G, AURA-G, RoleDatabase, XTO-G, Chart-G, MyProxy, EngineModel-G, CBRAnalysis-G, DataBaseMiner-G and CBRWorkflowAdvisor-G, with their interactions (storing and retrieving Engine Data Records from QUOTE / GSS, DAME results and annotations; encoding data; searching for patterns; extracting tracked orders; visualising engine data; modelling the engine; diagnosing faults; searching for clusters; and advising on workflow). The EDC contains various independent tools and facilities – only the EngineDataStore is shown here.]

Data Assets
[UML diagram: data assets – EngineDataRecord, EncodedData, AURAEncodedData, AURAResult, XTOFeatureResult, QUOTEFeatureResult, ZmodViewerResult, ChartResult, EngineModelResult, CBRResult, Case, CBRRuleSet, WorkflowRule, WorkFlowRuleSet, SuggestedWorkflow, WorkflowRecord, SDMRecord, TrackedOrder, Annotations, UserView, User, UserRole, Role, Engine, Airframe, Flight and FlightEvent, with their relationships.]

Service & Data Co-deployment
[Diagram: an example of service and data co-deployment – the CBR analyser gets maintenance data from an SDMRecord, uses an AURAResult and the CBRRuleSet, and produces a CBRResult.]

Context: Method
- Business use-cases and an initial service diagram were derived from the design documents.
- Aim for a deployment-neutral description.
- Checks (an illustrative sketch of such a check follows at the end of this stage):
  – Build and check data and service models from the interactions specified in the use-cases.
  – Is the data required by each service consistent with the data model?
  – Do members of the project, and its customers, think this represents their system?

Context: Method (2)
- Control granularity:
  – Services at deployment granularity.
  – Data, sufficient to distinguish between different use or origin.
  – Assets must be meaningful to customers to allow a discussion of threat and impact.
- Result:
  – 24 data types and 14 services.
  – Contrast with:
    » the 'initial brainstorm' meeting: 4 data types and 4 services;
    » the service diagram on an earlier slide: 3 data types and 13 services (2 of them different!).

Observations
- Methodological analysis is necessary.
- Need to be flexible about representations and models to align with project methods.
- Control:
  – Granularity.
  – Avoid mechanisms; keep to requirements.
- The 'grid' nature may make it difficult to establish hardware assets – this may be a problem or a blessing, but it needs to be recognised.
- The system is 'virtual' – need to be explicit about the management needed.
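The following is a minimal sketch of the second check listed under 'Context: Method' – that the data each service uses or produces actually appears in the data model. The service-to-data mapping shown is a hypothetical fragment, not the full model of 14 services and 24 data types.

```python
# Illustrative sketch: cross-checking the service model against the data model,
# as in the "Context: Method" checks above. The fragment below is hypothetical;
# the real DAME context model has 14 services and 24 data types.

data_model = {"EngineDataRecord", "AURAResult", "CBRRuleSet", "CBRResult", "SDMRecord"}

# Data each service uses or produces, taken from the use-case interactions.
service_model = {
    "AURA-G":        {"uses": {"EngineDataRecord"},
                      "produces": {"AURAResult"}},
    "CBRAnalysis-G": {"uses": {"AURAResult", "CBRRuleSet", "SDMRecord"},
                      "produces": {"CBRResult"}},
}

def undefined_data(services: dict, model: set) -> dict:
    """Return, per service, any data types it references that the data model lacks."""
    gaps = {}
    for name, io in services.items():
        unknown = (io["uses"] | io["produces"]) - model
        if unknown:
            gaps[name] = unknown
    return gaps

print(undefined_data(service_model, data_model) or "service and data models are consistent")
```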
Stage Two: Asset Analysis

Asset Analysis
- Just started.
- Generated a pro-forma of assets and generic concerns.
- Reviewed with industrial partners:
  – Reviewed the system context document.
  – Preliminary asset analysis – assigned concerns and impacts to:
    » data assets;
    » service assets.
- Need to document and confirm results with the project and industrial partners.

Process
- Keyword list to prompt discussion on each asset: execution, confidentiality, integrity, availability, privacy, completeness, provenance, non-repudiation...
- Only about half these categories were used, and not all for every asset.
- Impact rating: L/M/H in business terms:
  – L: significant cost.
  – M: impact on the company bottom line.
  – H: long-term impact on the company bottom line.
- (An illustrative sketch of a pro-forma entry closes these notes.)

Typical Concerns
- Confidentiality of key industrial properties – the most critical, at present, are algorithms.
- Integrity of data used to make business decisions.
- Provenance of critical decisions made using the system.

Observations
- New system requirements will probably emerge from this study:
  – Finer-grained control of users within roles.
  – The need for provenance for data items as well as decisions (workflows).
  – The possible separation of different types of raw data to facilitate grid processing.
  – The need to audit services in the (virtual) system.
- Need to be careful about responsibilities when data or services are shared with other systems – e.g. long-term data integrity for some data items is important, but outside DAME.

Observations (2)
- The customers have real security concerns – this is not a system where all parts will be allowed to 'run anywhere'.
  – Security analysis informs deployment options.
- Keywords (e.g. 'integrity') are very broad – need to record the actual concern in each case.
- Linking impact (L/M/H) to business criteria helps prevent 'drift' of assessments.

Summary

Documents Produced
- Discussion / working documents:
  – DAME Initial Dependability Assessment, DAME/York/TR/03.001 – from the meeting with industrial partners on 17th March 2003.
  – Analysis of the Grid – Phillipa Conmy.
  – Security Risk Brief – Howard Chivers.
  – Options for Merging Dependability and Security Analysis – Howard Chivers. This includes a neutral terminology.
  – DAME Dependability and Security: Asset Analysis pro-forma.
- DAME Dependability and Security: System Context Document – DAME/York/TR/03.007.

Future Work
- Complete the System Context document and the asset analysis.
- Assess vulnerabilities, including the use of the high-level analysis of functions and dependability keyword analysis.
- Produce the likelihood–impact matrix.
- Target unacceptable risks.
- Identify deployment constraints and requirements.
- Identify mitigation mechanisms, e.g. encryption, access controls, replication, etc.

Final Observations
- Security risk analysis is best carried out as an integrated part of the system design:
  – The context can be part of the standard system documentation.
  – Deployment and other design trade-offs can be made early.
  – The security analysis will highlight requirements that might otherwise be missed.

Final Observations (2)
- The grid nature of the problem introduces new challenges: DAME is a 'virtual system':
  – Mapping to hardware is deferred.
  – There are requirements for administration of the 'virtual' system, as well as of individual resources.
- Appropriate security is essential before systems of this sort can be exploited commercially.
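As a closing illustration of the Stage Two process described above: each asset is walked through the keyword list, only the relevant keywords attract a recorded concern, and each impact is tied to the L/M/H business criteria. The field names and example entries below are hypothetical, not taken from the DAME pro-forma.

```python
# Illustrative sketch of an asset-analysis pro-forma entry, following the
# Stage Two process above: keyword-prompted concerns, each recorded with the
# actual concern (not just the broad keyword) and an impact tied to business
# criteria. Field names and the example entries are hypothetical.

from dataclasses import dataclass, field

KEYWORDS = {"execution", "confidentiality", "integrity", "availability",
            "privacy", "completeness", "provenance", "non-repudiation"}

IMPACT_CRITERIA = {  # linking ratings to business terms helps prevent 'drift'
    "L": "significant cost",
    "M": "impact on the company bottom line",
    "H": "long-term impact on the company bottom line",
}

@dataclass
class Concern:
    keyword: str      # which keyword prompted the discussion
    description: str  # the actual concern raised for this asset
    impact: str       # L / M / H, judged against IMPACT_CRITERIA

@dataclass
class AssetEntry:
    asset: str
    concerns: list = field(default_factory=list)

    def add(self, keyword: str, description: str, impact: str) -> None:
        assert keyword in KEYWORDS and impact in IMPACT_CRITERIA
        self.concerns.append(Concern(keyword, description, impact))

entry = AssetEntry("CBRRuleSet")
entry.add("confidentiality", "encodes proprietary diagnostic know-how", "H")
entry.add("integrity", "corrupted rules could mislead maintenance decisions", "M")
for c in entry.concerns:
    print(f"{entry.asset}: [{c.keyword}/{c.impact}] {c.description} ({IMPACT_CRITERIA[c.impact]})")
```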