Improving Analytical Tools – The Future of Interrogating Data
London, October 26, 2010

OrbisIP Technology Services
• Consultancy
• Technology scouting and evaluation
• Product Distribution
• Technology Readiness Level enhancements
• Secure Software Development
• Horizon-scanning & Technology watching
• Open innovation services

Security Technology Clusters
• Digital Forensics
• Biometrics
• Video Analytics
• Data Mining & Visualisation
• Cryptography
• Network Security
• Internet Security
• Secure Software
• Security Management & Architecture
• Data & Database Security
• Hardware, Embedded & Device Security
• Homeland Security
Over 150 items of novel security technology exist in the OrbisIP Technology Tracker, available online

The OrbisIP Model
[Diagram labels: Consumers of IP; Sources of IP; University Research Labs; National Research Labs; SMEs; IP Requirements; Security Companies; Advisory Board; Tech. Transfer; IP Licensing; Technology Advisers; Revenue Share; Operations; Primes; Governments; Payment / Royalties]

SITC conference challenge
"... During the investigations into the London bombings in 2005, 90,000 hard drives and video tapes from CCTV systems were seized, together with 100 computers, 500,000 pages of photocopying, 4000 exhibits, 70 telephones and 10,000 statements"
"... a responsibility to capture, consolidate and interrogate massive volumes of structured and unstructured data. How do you make sense out of all the data coming in and then use it to make a difference?"
"... technologies which can assist users in making informed, strategic decisions."

The challenges of large-scale data analysis
• Data tracking, capture, management and isolation
• Retrieval and analysis of large data sets
• Analytical impartiality
• Data interrogation
• Decision support
• Actionable intelligence
• Reportage and distribution

Helping to address the challenges
• Intercept Modernisation Programme (IP data capture and interrogation): Network Traffic Surveillance System
• High speed data retrieval and management: Clusterpoint XML database
• Decision support and tracking technology: SheBa
• Structured and unstructured data analysis: Leximancer

Intercept Modernisation Programme
"Every email, phone call and website visit is to be recorded and stored after the Coalition Government revived controversial Big Brother snooping plans."
The Telegraph, 20 Oct 2010

"We will introduce a programme to preserve the ability of the security, intelligence and law enforcement agencies to obtain communication data and to intercept communications within the appropriate legal framework … Communications data provides evidence in court to secure convictions of those engaged in activities that cause serious harm.
It has played a role in every major Security Service counter-terrorism operation and in 95 per cent of all serious organised crime investigations."
UK Government, Strategic Defence and Security Review, October 2010

Intercept Modernisation Programme
Technical challenges:
• Mass capture and storage of data by ISPs
• Gathering intelligence from data sets
• Interrogating the data down to the level of every packet generated at an IP address
• Producing auditable interactions with the data set that conform to existing legislation and can be submitted in court to support prosecution

Network Traffic Surveillance System (NTSS) - TRL: 9
NTSS collects data on ALL user network activities
[Diagram labels: Internet Traffic; IP packets; NTSS Decoder]
1. All TCP and UDP traffic between the customer and the Internet is forwarded to NTSS as IP packets.
2. IP packets are reengineered back into application-level information units (web pages viewed, e-mails sent, documents transferred).
3. All reengineered and analysed information is fully indexed and stored in the Clusterpoint Server database.
4. An easy-to-use web interface provides the tools to:
• get quick situation overviews,
• search through the collected data,
• receive alerts on user-defined criteria,
• follow up on network user activities,
• preview the reconstructed information.
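
Steps 1 to 3 above (capture, reconstruction, indexing) can be illustrated with a minimal sketch. This is not the NTSS implementation, which the slides do not describe. The `Packet` record, the `reassemble` and `build_index` functions and the sample data are all hypothetical, and a plain in-memory dictionary stands in for the Clusterpoint Server database. The sketch assumes packets have already been parsed into source, destination, ports, a byte offset and a payload.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified packet record (not an NTSS data structure)."""
    src: str
    dst: str
    sport: int
    dport: int
    seq: int        # byte offset of this payload within its stream
    payload: bytes


def reassemble(packets):
    """Group packets by flow and join their payloads in sequence order."""
    flows = defaultdict(list)
    for p in packets:
        flows[(p.src, p.dst, p.sport, p.dport)].append(p)

    streams = {}
    for key, pkts in flows.items():
        seen = set()
        parts = []
        for p in sorted(pkts, key=lambda pkt: pkt.seq):
            if p.seq in seen:   # skip retransmitted duplicates
                continue
            seen.add(p.seq)
            parts.append(p.payload)
        streams[key] = b"".join(parts)
    return streams


def build_index(streams):
    """Toy inverted index: lowercase word -> set of flow keys containing it."""
    index = defaultdict(set)
    for key, data in streams.items():
        text = data.decode("utf-8", errors="replace").lower()
        for word in text.split():
            index[word].add(key)
    return index


# Invented sample traffic: out-of-order arrival plus one retransmission.
packets = [
    Packet("10.0.0.5", "192.0.2.1", 40000, 80, 4, b"/report.html"),
    Packet("10.0.0.5", "192.0.2.1", 40000, 80, 0, b"GET "),
    Packet("10.0.0.5", "192.0.2.1", 40000, 80, 0, b"GET "),
]
streams = reassemble(packets)
index = build_index(streams)
print(streams)
print(sorted(index))
```

A production decoder would also need to handle sequence-number wraparound, overlapping segments, packet loss and protocol-specific parsing, none of which this sketch attempts.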
[Diagram label: Entirely searchable & scalable NTSS database]

High speed data retrieval and management
The issues:
• Size of database and scalability
• Retrieval or interrogation time
The solution: Clusterpoint DBMS - TRL 9
• Creates a fully scalable, fast-response, XML-based tagged database
• Easily and linearly scalable: no additional development is required to scale the storage and the necessary processing power
• Can improve data retrieval times in such unstructured data stores 100-fold and more, with response times under 5 seconds in multi-terabyte databases

Clusterpoint DBMS Architecture
[Diagram labels: Authorized users; Security; Application services; Clusterpoint API (XML); Entirely searchable & scalable NTSS database; Existing databases; Security auditing and monitoring; Data; Clusterpoint NTS (multi-server cluster)]

Objective & Subjective Decision support
The Challenges:
• Capture, manage and analyse information on various topics of relevance, including multiple items of information supplied at different times and from different sources;
• Consider factors that degrade the certainty of the information, such as the length of time that has passed between its collection and its use;
• Represent the reliability or credibility of information and its source, including its provenance, and the source's objectivity, access, and specificity;
• Combine multiple sources of information with varying levels of reliability, and whose reliability may change over time.

Objective & Subjective Decision support
The Solution: Sheba Estimative Intelligence Tool - TRL 5
• An application for performing predictive analysis under conditions of uncertainty
• A framework for users to structure and analyse estimative intelligence problems
• Uses advanced probability theory to manage both likelihoods and certainty for problems in estimative intelligence
• Provides the transparent analytical structure necessary to allow the analyst to defend, and the consumer to understand, the judgments reached

Structured & Unstructured Data Analysis
The challenge: How do you extract meaningful intelligence, fully automatically, from vast, scalable sets of structured and unstructured data?
• Generating concept maps and an Automatic Sentiment Lens
• Avoiding onerous set-up overhead, user manipulation, prejudices or a priori rule-sets
• Providing multilingual analysis and integration into other applications through an API
• Offering full flexibility for analysis of outcomes and exportability into reporting platforms

Structured & Unstructured Data Analysis
The solution: Leximancer - TRL 9
Some applications:
• Intelligence Profiling: rapid information gathering, correlation, validation and analysis
• Email Analysis & Security: validating and predicting the security classification of email
• i-Library Indexing & Search: indexing, search and retrieval within large databases/libraries
• Tendering: matching tender responses to questions (coverage analysis)
• Web searching and analysis: via Hypermancer

Leximancer - Conversation Analysis
As long as speaker labels are formatted correctly, Leximancer will automatically extract the speaker identifiers as variables and associate these labels with their utterances. This allows content from selected speakers to be filtered in or out, and allows comparative analysis between speakers, normally using the discovered concepts as independent variables.
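
The idea of treating speaker labels as variables can be illustrated with a minimal sketch. It does not use Leximancer or its API. It assumes a hypothetical `Name: utterance` label format (the slides do not specify the format Leximancer expects), and plain word counts stand in for Leximancer's discovered concepts. The function names and the sample transcript are invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumed label format: a name at the start of the line, followed by a colon.
LABEL = re.compile(r"^([A-Za-z][A-Za-z0-9 _-]*):\s*(.*)$")


def parse_transcript(text):
    """Return a list of (speaker, utterance) pairs in document order."""
    utterances = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = LABEL.match(line)
        if match:
            utterances.append((match.group(1).strip(), match.group(2)))
        elif utterances:
            # An unlabelled line continues the previous speaker's utterance.
            speaker, said = utterances[-1]
            utterances[-1] = (speaker, (said + " " + line).strip())
    return utterances


def term_counts_by_speaker(utterances, include=None):
    """Count words per speaker; `include` filters speakers in (None = all)."""
    counts = defaultdict(Counter)
    for speaker, said in utterances:
        if include is not None and speaker not in include:
            continue
        counts[speaker].update(re.findall(r"[a-z']+", said.lower()))
    return counts


# Invented sample dialogue.
transcript = """Agent: Thank you for calling, how can I help?
Caller: My router keeps dropping the connection.
It started last week.
Agent: Let me check the router logs."""

utterances = parse_transcript(transcript)
all_counts = term_counts_by_speaker(utterances)
caller_only = term_counts_by_speaker(utterances, include={"Caller"})
print(all_counts["Agent"].most_common(3))
print(caller_only["Caller"].most_common(3))
```

Comparing `all_counts["Agent"]` with `all_counts["Caller"]` gives a crude form of the comparative analysis between speakers described above, and the `include` argument corresponds to filtering selected speakers in or out.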
Conversation analysis can be extended to incorporate:
• Blogging, forums etc. on the internet: brand/product analysis etc.
• Email dialogues: litigation eDiscovery
• Voice-to-text translations: call centre dialogues, meetings, scenario training/simulations

OrbisIP - Contact Details
Peter Jaco
pjaco@orbisip.com
OrbisIP Limited
9-10 St. Andrew Square
Edinburgh, EH2 2AF
United Kingdom
Tel: +44 (0) 131 718 6023
Fax: +44 (0) 131 718 6100
Cell: +44 (0) 7855 308 290