Multimedia Systems Security: Video Data Analysis for Security Applications and Securing Video Data
Dr. Bhavani Thuraisingham
September 2007

Outline
• Data Mining for Security Applications
• Video Analysis
• Suspicious Event Detection
• Access Control
• Privacy Preserving Surveillance
• Secure Third Party Publication of Video Data
• Malicious Code Detection
• Directions and Opportunities

Acknowledgments
• Professor Latifur Khan for data mining applications and Malicious Code Detection
• Prof. Elisa Bertino (Purdue) and Prof. Jianping Fan (UNCC) for Privacy Preserving Video Analysis
• Prof. Elisa Bertino, Prof. Elena Ferrari (Milan/Como) and Prof. Barbara Carminati (Milan/Como) for Secure Third Party Publication
• Students at the University of Texas at Dallas

Data Mining for Security Applications
• Data mining has many applications in cyber security and national security
• Intrusion detection, worm detection, firewall policy management
• Counter-terrorism applications and surveillance
• Fraud detection, insider threat analysis
• Need to enforce security but at the same time ensure privacy

Problems Addressed
• Huge amounts of video data are available in the security domain
• Analysis is usually done off-line using “human eyes”
• Need for tools to aid the human analyst (pointing out areas in video where unusual activity occurs)
• Need to control access to the video data
• Need to securely publish video data
• Need to ensure that the data is not maliciously corrupted

Video Analysis for Security: The Semantic Gap
• The disconnect between the low-level features a machine sees when a video is input into it and the high-level semantic concepts (or events) a human being sees when looking at a video clip
• Low-level features: color, texture, shape
• High-level semantic concepts: presentation, newscast, boxing match

Our Approach
• Event Representation: Estimate the distribution of pixel intensity change
• Event Comparison: Contrast the event representation of different video sequences to determine if they contain similar
semantic event content.
• Event Detection: Use manually labeled training video sequences to classify unlabeled video sequences

Event Representation, Comparison, Detection
• Event representation measures the quantity and type of changes occurring within a scene
• A video event is represented as a set of x, y and t intensity gradient histograms over several temporal scales
• Histograms are normalized and smoothed
• Event comparison determines whether two video sequences contain similar high-level semantic concepts (events):

$$D^2 = \frac{1}{3L} \sum_{k,l,i} \frac{\left[h_{1k}^{l}(i) - h_{2k}^{l}(i)\right]^2}{h_{1k}^{l}(i) + h_{2k}^{l}(i)}$$

  Here k ranges over the x, y and t gradient directions, l over the L temporal scales, and i over the histogram bins.
• The comparison produces a number that indicates how close the two compared events are to one another; the lower the number, the closer the two events
• A robust event detection system should be able to:
  • Recognize an event with reduced sensitivity to actor (e.g. clothing or skin tone) or background lighting variation
  • Segment an unlabeled video containing multiple events into event-specific segments

Labeled Video Events
• These events are manually labeled and used to classify unknown events
[Figure: sample frames labeled Walking1, Running1, Waving2]

Labeled Video Events (pairwise event distances)

           walking1  walking2  walking3  running1  running2  running3  running4  waving2
walking1   0         0.27625   0.24508   1.2262    1.383     0.97472   1.3791    10.961
walking2   0.27625   0         0.17888   1.4757    1.5003    1.2908    1.541     10.581
walking3   0.24508   0.17888   0         1.1298    1.0933    0.88604   1.1221    10.231
running1   1.2262    1.4757    1.1298    0         0.43829   0.30451   0.39823   14.469
running2   1.383     1.5003    1.0933    0.43829   0         0.23804   0.10761   15.05
running3   0.97472   1.2908    0.88604   0.30451   0.23804   0         0.20489   14.2
running4   1.3791    1.541     1.1221    0.39823   0.10761   0.20489   0         15.607
waving2    10.961    10.581    10.231    14.469    15.05     14.2      15.607    0

Experiment #1
• Problem: Recognize and classify events irrespective of direction (right-to-left, left-to-right) and with reduced sensitivity to spatial variations (clothing)
• “Disguised Events”: events similar to the testing data except that the subject is dressed differently
• Compare classification to “truth” (manual labeling)

Experiment #1: Disguised Walking 1
walking1   0.97653
walking2   0.45154
walking3   0.59608
running1   1.5476
running2   1.4633
running3   1.5724
running4   1.5406
waving2    12.225
Classification: Walking

Experiment #1: Disguised Running 1
walking1   1.411
walking2   1.3841
walking3   1.0637
running1   0.56724
running2   0.97417
running3   0.93587
running4   1.0957
waving2    11.629
Classification: Running

XML Video Annotation
• Using the event detection scheme, we generate a video description document detailing the event composition of a specific video sequence
• This XML annotation document may be replaced by a more robust computer-understandable format (e.g. the VEML video event ontology language)

<?xml version="1.0" encoding="UTF-8"?>
<videoclip>
  <Filename>H:\Research\MainEvent\Movies\test_runningandwaving.AVI</Filename>
  <Length>600</Length>
  <Event>
    <Name>unknown</Name>
    <Start>1</Start>
    <Duration>106</Duration>
  </Event>
  <Event>
    <Name>walking</Name>
    <Start>107</Start>
    <Duration>6</Duration>
  </Event>
</videoclip>

Video Analysis Tool
• Takes the annotation document as input and organizes the corresponding video segments accordingly
• Functions as an aid to a surveillance analyst searching for “suspicious” events within a stream of video data
• Activity of interest may be defined dynamically by the analyst while the utility is running and flagged for analysis

Access Control: Authorization Objects
• Authorization objects are the actual video data to which we wish to restrict access; each is represented as a 7-value tuple
• This tuple contains information about the content of a particular video object
• Some of this content information is high-level semantic information such as events and objects, stored as a set of concepts taken from a “closed-world” hierarchical taxonomy that relates these concepts to one another
• Other content information, such as location and timestamp, is represented as a special data type that allows more meaningful specification of this unique kind of content
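
As a minimal sketch of the idea above (not the authors' implementation), an authorization object could be modeled as a record whose event and object fields are validated against a closed-world taxonomy. The field names, the toy taxonomy and the `generalizations` helper are assumptions made for illustration. The slides do not enumerate the seven tuple components, so only attributes named in this deck are included.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, List, Tuple

# Hypothetical closed-world taxonomy (child concept -> parent concept).
# The concrete entries are illustrative, not taken from the authors' system.
TAXONOMY = {
    "walking": "mobile event",
    "running": "mobile event",
    "mobile event": "video event",
    "car": "vehicle",
    "vehicle": "video object",
}
KNOWN_CONCEPTS = set(TAXONOMY) | set(TAXONOMY.values())


@dataclass(frozen=True)
class VideoObject:
    """Assumed subset of the authorization-object tuple."""
    object_id: str
    object_type: str                 # e.g. a node of the video object hierarchy
    events: FrozenSet[str]           # semantic events occurring in the video
    objects: FrozenSet[str]          # semantic objects contained in the video
    location: Tuple[float, float]    # (latitude, longitude) in degrees
    timestamp: datetime              # real-world capture time

    def __post_init__(self):
        # Closed world: every concept must already exist in the taxonomy.
        unknown = (self.events | self.objects) - KNOWN_CONCEPTS
        if unknown:
            raise ValueError(f"concepts outside the taxonomy: {sorted(unknown)}")
        lat, lon = self.location
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError("location must be valid earth coordinates")


def generalizations(concept: str) -> List[str]:
    """Return the concept followed by all of its ancestors in the taxonomy."""
    chain = [concept]
    while chain[-1] in TAXONOMY:
        chain.append(TAXONOMY[chain[-1]])
    return chain
```

Rejecting unknown concepts at construction time is one way to keep the taxonomy closed. A policy written against a general concept can then be matched against the ancestors of a specific one.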

Access Control: Video Object Hierarchy
[Figure: hierarchy with nodes Surveillance Object, Video Camera, Hallway Camera, Lobby Camera, Still Camera, Satellite Image, Aerial Image]

Access Control: Other Concepts
• Events is the set of semantic events occurring within the video object
• Objects is the set of semantic objects contained within the video object
• Location is the term indicating the geographic earth coordinates where the surveillance video object was captured
• Timestamp is the term describing the real-world time when the video was captured

Access Control: Event and Object Hierarchies
[Figure: two hierarchies. Video Object nodes: Toy, Ball, Frisbee, Vehicle, Car, Truck. Video Event nodes: Mobile Event, Stationary Event, Walking, Running, Jumping, Waving]

Video Object Expressions
• Video object expressions describe the objects to which access control is to be applied
• These expressions are expanded and made more robust so that a video object may be specified not only by its object ID but also by any of its attributes or a combination of them
• This is similar to querying a relational database with a complex SQL query that selects a particular set of records
• We use access functions to reference the different components of our surveillance video objects in our expressions

Authorization Subjects
We use the concept of user credentials to authorize users. That is, each user entity, in addition to having a unique user ID or belonging to a group, also possesses a set of credentials. Each credential is an instantiation of a credential type: a template that defines the set of credential attributes and whether each is optional or obligatory. Specific values are assigned to these attributes when a new user instantiates the credential type. A subject may instantiate any number of credential types.
These credential types are defined in a credential type hierarchy that relates each credential type to the others.

Access Control: Credential Type Hierarchy
[Figure: hierarchy with nodes Person, Security Officer, Police, Maintenance Staff, Guard, Database Administrator, Patrolman, Captain]

Access Control: Authorizations
• Authorizations are what allow us to specify our access control policy for the objects in our video surveillance database
• Derived authorizations: the properties of the hierarchical taxonomies used in defining surveillance video object types, semantic event types and semantic object types can be used to obtain implicit authorizations from the explicit authorizations specified in the access control policy base
• Additionally, the relationships between the various privilege modes allow further authorizations to be derived

Access Control Algorithm
• User requests for surveillance video objects must be compared to the policy base of object authorizations before access can be granted
• If the user request is not for a specific object but rather a query for a particular set of objects, the system must be able to reconcile the query criteria with the objects existing in the database
• If the user request is authorized for some part (but not all) of the surveillance video object, then instead of denying access entirely it is possible to post-process the data after retrieval and release only the authorized portions to the user
• Hence our access control process has three major components: authorization; retrieval; and post-processing and delivery
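
The authorization and post-processing steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm. The hierarchy contents, the `POLICY` set, and the rule that an explicit authorization propagates downward to more specific credential types and event concepts are all hypothetical. Privilege modes and the retrieval step are omitted.

```python
# Hypothetical hierarchies (child -> parent); the entries are illustrative only.
CREDENTIAL_PARENT = {
    "guard": "security officer",
    "security officer": "person",
    "patrolman": "police",
    "police": "person",
}
EVENT_PARENT = {
    "walking": "mobile event",
    "running": "mobile event",
    "mobile event": "video event",
}

# Explicit policy base: (credential type, event concept) pairs that may be viewed.
POLICY = {("security officer", "mobile event")}


def ancestors(node, parent):
    """Yield node and every ancestor of node in a child -> parent map."""
    while node is not None:
        yield node
        node = parent.get(node)


def is_authorized(credential, event):
    """True if an explicit authorization exists for this pair or is derived
    from an ancestor credential type and/or an ancestor event concept
    (assumed downward propagation)."""
    return any(
        (c, e) in POLICY
        for c in ancestors(credential, CREDENTIAL_PARENT)
        for e in ancestors(event, EVENT_PARENT)
    )


def post_process(credential, segments):
    """Keep only the (event, start, duration) segments the subject may view."""
    return [seg for seg in segments if is_authorized(credential, seg[0])]


if __name__ == "__main__":
    segments = [("unknown", 1, 106), ("walking", 107, 6)]
    print(post_process("guard", segments))
```

With these assumed hierarchies, a guard inherits the security officer's authorization on mobile events, so only the walking segment is released. A patrolman has no matching authorization and receives nothing.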

Access Control Policies: Extensions
• Policies based on content, associations, time, and event
• Policy engine that evaluates the policies for consistency
• Enforcement engine for enforcing the policies
• Distributed policies: objects at different locations, taken together, are sensitive

System Architecture for Access Control
[Figure labels: User, Pull/Query, Push/result, X-Access, X-Admin, Admin Tools, Policy base, Credential base, Video XML Documents]

Third-Party Architecture
• The Owner is the producer of information; it specifies access control policies on the video objects
• The Publisher is responsible for managing (a portion of) the Owner's information and answering subject queries
• Goal: Untrusted Publisher with respect to authenticity and completeness checking
[Figure labels: XML Source, Credential base, policy base, SE-XML, Owner, credentials, Publisher, Reply document, Query, User/Subject]

Security Enhanced Video XML Document
[Figure labels: XML Document, SE-XML Document]
• Policy Information
• Merkle Signature

Privacy Preserving Video Analysis
• A recent survey at Times Square found 500 visible surveillance cameras in the area and a total of 2500 in New York City
• This means there is a large volume of surveillance video to be inspected manually by security personnel
• We need to carry out surveillance but at the same time ensure the privacy of individuals who are good citizens

System Use
[Figure labels, in order: Raw video surveillance data; Face Detection and Face Derecognizing system; Faces of trusted people derecognized to preserve privacy; Suspicious Event Detection System; Manual Inspection of video data; Suspicious events found; Suspicious people found; Report of security personnel; Comprehensive security report listing suspicious events and people detected]

Detecting Malicious Code
• Content-based approaches consider only machine code (byte code)
• Is it possible to consider higher-level source code for malicious code detection?
• Yes: disassemble the binary executable and retrieve the assembly program
• Extract important features from the assembly program
• Combine with machine-code features
• Extract both binary n-gram features and assembly n-gram features
[Figure: Hybrid Feature Retrieval (HFR), with Training and Testing phases]

Summary and Directions
• We have proposed an event representation, comparison and detection scheme
• Working toward bridging the semantic gap and enabling more efficient video analysis
• More rigorous experimental testing of concepts
• Refine event classification through use of multiple machine learning algorithms (e.g. neural networks, decision trees); experimentally determine the optimal algorithm
• Develop a model allowing definition of simultaneous events within the same video sequence
• Define an access control model that allows access to surveillance video data to be restricted based on the semantic content of video objects
• Secure publishing of video documents
• Privacy preserving analysis
• Detecting malicious code

Opportunities for the Community
We