DISTRIBUTED - KBTA: A DISTRIBUTED FRAMEWORK FOR EFFICIENT COMPUTATION OF KNOWLEDGE-BASED TEMPORAL ABSTRACTIONS

ARD

Arkady Mishiev
Maor Guetta

Table of Contents

1 Introduction
1.1 Vision
1.2 The Problem Domain
1.3 Stakeholders
1.4 Software Context
1.5 System Interfaces
1.5.1 Hardware Interfaces – N/A
1.5.2 Software Interfaces
1.5.3 Events
2 Functional Requirements
2.1 The system should be initialized with defined knowledge base
2.2 Handle query requests
2.3 Manage array of available Computational Units
2.4 Distribution of computational process by (from paper)
2.5 Scheduling
3 Non-Functional Requirements
3.1 Throughput
3.2 Safety
4 Usage Scenarios
4.1 Use Cases
5 Risks
6 Appendices

1 INTRODUCTION

Using the KBTA method in the security domain involves an extremely large number of computations on temporal data in order to derive abstractions. The performance of these computations is limited by the characteristics of the specific machine that runs the KBTA framework. Since performance is crucial for the KBTA process in the security domain, there is motivation to improve the framework so that it is useful in practice.

1.1 Vision

The goal of our project is to develop a framework that manages parallel computation of Knowledge-Based Temporal Abstractions (KBTA) by distributing the computation to stand-alone computational units.

1.2 The Problem Domain

1.2.1 Major components of the system

Control Unit - Receives queries from the user and dispatches them to the CUs.
CU - Computational Unit, which performs a KBTA process on the queries. CUs are stand-alone machines with the KBTA framework installed on them.
Visualization Exploration Application - The GUI application, which allows submitting queries and displays the processing results (shown as User Interface in the block diagram below).
CU Monitoring GUI - Displays up-to-date information about the CUs' activity.

1.2.2 The following diagram describes the interaction between the major components

[Block diagram. Labels: User Interface, Control Unit, Computational Units, Query Module, CU Monitoring GUI, Comp. Unit, Scheduler, Database, Monitoring Module]

1.3 Stakeholders

1.3.1 Users – Researchers and security experts
1.3.2 Customers – Deutsche Telekom
1.3.3 Sponsors – Deutsche Telekom

1.4 Software Context

Description of major system components

Control Unit - Receives queries from the user and dispatches them to the CUs. The Control Unit also displays the activity state of each registered Computational Unit.
CU - Computational Unit, which performs a KBTA process on the queries. CUs are stand-alone machines with the KBTA framework installed on them.
Visualization Exploration Application - The GUI application, which allows submitting queries and displays the processing results (shown as User Interface in the system diagram).
CU Monitoring GUI - Displays up-to-date information about the CUs' activity.

Input

The input of the system is:
1. A query from the user.
2. The result of query processing by a CU.
3. Registration/unregistration requests from a CU.

1.5 System Interfaces

1.5.1 Hardware Interfaces – N/A

1.5.2 Software Interfaces

1. The system will implement the API of the Visualization Exploration Application (GUI) in order to interact with the user.
2. The system will use a predefined protocol in order to interact with the Computational Units.

1.5.3 Events

The Control Unit will get:
- "Accept query" events from the User Interface
- "Register/Unregister me" events from a CU
- "Accept processing results" events from a CU
- "Update my activity state" events from a CU

The CU Monitoring GUI will get:
- "Accept Plan for processing" events from the Control Unit
- "Ping" events from the Control Unit

2 FUNCTIONAL REQUIREMENTS

2.1 Initializing

| # | Function | Description |
|---|----------|-------------|
| 1 | Configurability | The system should read the following information from configuration files: the IP address of the database; which Knowledge set to use; where the Knowledge resides. |
| 2 | Connection with the database | Set up the connection with the database. |
| 3 | Connection with the knowledge base | Set up the connection with the knowledge base and retrieve the knowledge. |
| 4 | Preprocessing | Process the knowledge and extract the information that is relevant to the task analysis process (e.g., semantic links between concepts). |

2.2 Handling Query (Analyzing Query and Generating Plans for CUs)

| # | Function | Description |
|---|----------|-------------|
| 1 | Receive Query | The system should be able to receive queries issued using the Visual Exploration Module or another application. The query is described in Appendix A. |
| 2 | Analyzing the query | The system should be able to analyze the query on the basis of: 1. Knowledge; 2. Characteristics in Database; 3. Available CUs. |
| 3 | Generating plans for a processed query | On the basis of the analysis result, the system should generate plans for each CU. |
| 4 | Dispatch Plan | Dispatch a plan to any available CU. |
| 5 | Get results from CU | The system should be able to receive the results of the computational process from all CUs. |
| 6 | Aggregation | The system should aggregate the results from all CUs. |
| 7 | Return results | The system should return the results of the computation process. |

2.3 CU Management

| # | Function | Description |
|---|----------|-------------|
| 1 | Register | Registering a specific CU with the Control Unit. |
| 2 | Monitor | Each registered CU should be able to report its current state. |
| 3 | Monitor the state of connection with CU | The system should be able to validate the connectivity state of each registered CU. |
| 4 | Unregister | Removing a CU from the list of available units (the opposite of #1). |
| 5 | CU State Visualization | All CU states and activities are visualized in a GUI dedicated to this purpose. |
| 6 | Logging | The system should record all significant activity information, such as: when a query was received; query descriptions; how many CUs were available and when; when the results were received; failure reasons and details. |

3 NON-FUNCTIONAL REQUIREMENTS

3.1 Performance Constraints

3.1.1 Speed

Initialization - The initialization process should take less than X seconds.

3.1.2 Reliability

The system's reliability will be measured by comparing the ratio of the threat detection percentage to the false alarm percentage in the system with the same ratio in a single CU. The two ratios should be equal or at least very close.

3.1.3 Portability

Applicability - In the near future, the system is not expected to be used with any existing software components other than the Computational Units (CU) and the Visualization Exploration Tool.

3.1.4 Usability

Simplicity - The query format should be as simple and concise as possible.
Configurability - The system should be configurable by advanced users through a configuration file.
Error notification - In case of a failure, the system should issue a notification about it.
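As an illustration of the Configurability requirements (2.1 #1 and 3.1.4), the following is a minimal Java sketch and not the project's actual implementation. The class name KbtaConfig and the three property keys are assumptions made for this example; the ARD only names the three pieces of information to be read (database IP address, Knowledge set, Knowledge location).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for the three settings named in requirement 2.1 #1. */
class KbtaConfig {
    // Key names are assumptions; the ARD does not define them.
    static final String DB_ADDRESS = "database.address";
    static final String KNOWLEDGE_SET = "knowledge.set";
    static final String KNOWLEDGE_LOCATION = "knowledge.location";

    final String databaseAddress;
    final String knowledgeSet;
    final String knowledgeLocation;

    private KbtaConfig(String databaseAddress, String knowledgeSet, String knowledgeLocation) {
        this.databaseAddress = databaseAddress;
        this.knowledgeSet = knowledgeSet;
        this.knowledgeLocation = knowledgeLocation;
    }

    /** Reads the settings in java.util.Properties format from the given source. */
    static KbtaConfig load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read configuration", e);
        }
        return new KbtaConfig(
                require(props, DB_ADDRESS),
                require(props, KNOWLEDGE_SET),
                require(props, KNOWLEDGE_LOCATION));
    }

    /** Fails fast with a readable message when a key is absent or blank. */
    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing configuration key: " + key);
        }
        return value.trim();
    }
}
```

In use, a caller might pass a java.io.FileReader for the configuration file and treat an IllegalArgumentException as an initialization failure to be reported, in line with the error notification requirement.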
3.1.5 Availability

Running period - Until the system receives a command from the user to stop, it should run continuously, regardless of the state of any CU or of the Visualization Exploration Tool.

3.2 Special restrictions and limitations

3.2.1 Compatibility

The system should be compatible with the CUs and the Visualization Exploration Tool.

3.2.2 Modularity

The whole system, and especially the Decision Making Engine, should be modular. The Decision Making Engine should be easy to replace.

3.2.3 Logging

All activities and events of the system should be recorded by the Logging module.

3.2.4 Programming Language

The system will be implemented on the Java Platform.

4 USAGE SCENARIOS

4.1 User Profiles

Actors:
1. User/Security Expert/Visualization Exploration Application
2. CU – KBTA Computational Unit

Goals:

User (Knowledge Expert)
1. Send queries to the system
2. Get the results of query processing

CU
1. Get a Plan from the system
2. Return the results of query processing
3. Register/Unregister at the Coordinator
4. Report its activity state
5. Respond to the connectivity state query (echo request/is alive)

4.2 Use Case Diagram

4.3 Use Cases

4.3.1 Register CU

Actors: Stand-alone Computational Unit.
Description: The system registers a connected CU.
Trigger: The CU is initialized.
Pre-conditions: The CU has loaded all necessary information from the Database and Knowledge base configuration.
Post-conditions: The CU is registered in the list of available CUs and can be monitored by the Monitoring GUI.
Covered Requirements: 2.3.1, 2.3.2.

4.3.2 Check Connection State

Actors: Stand-alone Computational Unit.
Description: The system pings the specified CU in order to decide whether it is still alive.
Trigger: Predefined time interval.
Pre-conditions: The CU is already registered.
Post-conditions: The state of the CU remains as it was before the check.
Covered Requirements: 2.3.3.

4.3.3 Handle Query

Actors: Knowledge Expert.
Description: The system handles a query that should be processed.
Trigger: The Knowledge Expert performs a check of the target system.
Pre-conditions: The system is initialized.
Post-conditions: The system begins to process the query.
Covered Requirements: 2.1.1 – 2.1.4, 2.2.1, 2.2.2.

4.3.4 Analyze a query

Actors: Stand-alone Computational Unit.
Description: The system preprocesses the received query on the basis of: raw data, the Knowledge base, and the list of available CUs.
Trigger: The Knowledge Expert performs a check of the target system.
Pre-conditions: The query has been received by the system.
Post-conditions: The system has a list of Plans, one for each available CU.
Covered Requirements: 2.1.1 – 2.1.4, 2.2.1 – 2.2.3.

4.3.5 Dispatch Plan

Actors: Stand-alone Computational Unit.
Description: The system dispatches the plan to the CU chosen in the previous steps by the Distribution Strategy.
Trigger: The Knowledge Expert performs a check of the target system.
Pre-conditions:
1. A query was received by the system.
2. The system analyzed the query.
3. The system performed the distribution process.
4. The CU is registered and ready to get plans.
Post-conditions: The CU begins to process the plan.
Covered Requirements: 2.1.1 – 2.1.4, 2.2.1 – 2.2.4.

4.3.6 Handle Computation Results

Actors: Stand-alone Computational Unit.
Description: The CU returns the results of the computation process to the system.
Trigger: The CU completes the computation process.
Pre-conditions: The system handled the query and dispatched the corresponding plan to the CU.
Post-conditions: The CU is ready to process the next task.
Covered Requirements: 2.2.5.

4.3.7 Report Activity State

Actors: Stand-alone Computational Unit.
Description: The CU reports its activity state.
Trigger:
1. The CU began some activity.
2. The CU finished some activity.
Pre-conditions: The CU is registered in the system.
Post-conditions: The reported state is updated in the system. The Monitoring GUI has updated the state of the specified CU.
Covered Requirements: 2.3.2, 2.3.5, 2.3.6.
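Use cases 4.3.4 to 4.3.7 can be read together as one flow: analyze the query into one Plan per available CU, dispatch each Plan, collect the results, and track each CU's activity state. The Java sketch below illustrates that flow under several loudly stated assumptions: the class ControlUnitSketch, the CuState values, the "#part i/n" plan format, and the in-process Function that stands in for a remote CU are all hypothetical. The ARD does not define the Distribution Strategy or the CU protocol, and a real system would dispatch over the network and in parallel rather than in a loop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical, single-process sketch of the Control Unit flow in 4.3.4 to 4.3.7. */
class ControlUnitSketch {
    // Assumed activity states; the ARD does not enumerate them.
    enum CuState { IDLE, BUSY }

    // Last reported state of each registered CU (4.3.1, 4.3.7).
    private final Map<String, CuState> states = new LinkedHashMap<>();
    // Stand-in for the remote KBTA process: maps a plan to its result.
    private final Map<String, Function<String, String>> workers = new LinkedHashMap<>();

    void register(String cuId, Function<String, String> worker) {
        states.put(cuId, CuState.IDLE);
        workers.put(cuId, worker);
    }

    /** Report Activity State (4.3.7): only registered CUs may report. */
    void reportState(String cuId, CuState state) {
        if (!states.containsKey(cuId)) {
            throw new IllegalStateException("CU is not registered: " + cuId);
        }
        states.put(cuId, state);
    }

    CuState stateOf(String cuId) {
        return states.get(cuId);
    }

    /** Analyze a query (4.3.4): one plan per available (idle) CU. The split is a placeholder. */
    Map<String, String> analyze(String query) {
        List<String> available = new ArrayList<>();
        for (Map.Entry<String, CuState> e : states.entrySet()) {
            if (e.getValue() == CuState.IDLE) {
                available.add(e.getKey());
            }
        }
        if (available.isEmpty()) {
            throw new IllegalStateException("No CU is available");
        }
        Map<String, String> plans = new LinkedHashMap<>();
        for (int i = 0; i < available.size(); i++) {
            plans.put(available.get(i), query + "#part" + (i + 1) + "/" + available.size());
        }
        return plans;
    }

    /** Dispatch each plan (4.3.5), collect its result (4.3.6), and aggregate into one list. */
    List<String> handleQuery(String query) {
        List<String> aggregated = new ArrayList<>();
        for (Map.Entry<String, String> plan : analyze(query).entrySet()) {
            String cuId = plan.getKey();
            reportState(cuId, CuState.BUSY);
            aggregated.add(workers.get(cuId).apply(plan.getValue()));
            reportState(cuId, CuState.IDLE);
        }
        return aggregated;
    }
}
```

Keeping analyze separate from handleQuery mirrors the modularity requirement in 3.2.2: the placeholder split could be swapped for a different strategy without touching dispatch or aggregation.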
4.3.8 Unregister CU

Actors: Stand-alone Computational Unit.
Description: The system removes the specified CU from the list.
Trigger:
1. The connection with the CU has failed.
2. The CU was shut down by the operator.
Pre-conditions: The specified CU is registered.
Post-conditions: The specified CU is removed from the list of available units.
Covered Requirements: 2.3.2, 2.3.4, 2.3.5, 2.3.6.

5 RISKS – N/A

6 APPENDICES

6.1 Glossary

Coordinator
CU
KBTA
KBTA Process
Knowledge
Knowledge Characteristics
Knowledge Expert
Plan
Query
Visualization Exploration Application