Managing & Optimizing: QAD Infrastructure
Tony Winter, Chief Technology Officer, QAD

Safe Harbor Statement
The following is intended to outline QAD’s general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, functional capabilities, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functional capabilities described for QAD’s products remains at the sole discretion of QAD.

What do we know about infrastructure?

QAD Managed Customer Environments
• ~18,000 users
• > 230 sites

QAD Managed Internal Environments
• ~1500 VM per week
• ~2M VM hours per year

Performance

QAD On Demand

Compliance
• SAS 70 / SSAE 16 ~ ISAE 3402
• QAD Certified to ISO9001:2008
• Life Sciences – cGMP Qualified

QAD On Demand Services
• Complete Services
• 99.5% SLA Guaranteed
• Globally Managed
[Diagram of service areas: Service Management, Infrastructure Management, Systems Management, Functional Management, Customization Management, EDI Management, Network Management]

Service Management
• Dedicated
• SLA Reporting
• Manages Resources

Infrastructure Management
• Infrastructure
• Data Center
• Operating System

Systems Management
• Monitoring and Tuning
• System Administration
• Disaster Recovery (OE Replication+)
• Version and Change Management

Network Management
• QAD Leverages Virtela
• Infrastructure
• Monitoring
• Management

Best Practices

Managing and Monitoring QAD Systems

Best Practice Outcomes
• Keep the users and sponsors of the QAD software installation happy and productive
  - Up and down the supply chain
• Reliability
• Performance
• Visibility
• Reporting

Managing and Monitoring QAD Systems: Reliability

What do these companies have in common?
Generator Failures Caused 365 Main Outage…
“Several generators at 365 Main’s San Francisco data center failed to start when the facility lost grid power Tuesday afternoon, causing an outage that knocked many of the web’s most popular destinations offline for several hours.”
- DataCenterKnowledge.com
Computer Breakdown Halts Trading at London Exchange for 7+ Hours…
“The exchange suffered its worst breakdown in eight years as a computer glitch prevented traders from participating in a global rally after the U.S. takeover of Fannie Mae and Freddie Mac.”
- The New York Times

What are the most common causes of “Unplanned Downtime”?
[Pie chart. Categories: Software, Hardware, Human Error, Network, Unknown, Natural Disaster. Segment values: 8%, 7%, 27%, 17%, 18%, 23% (segment-to-category mapping not preserved)]

Business Impact
• eBay
  - Unplanned down time: 22 hours
  - Cost: $3-5 million; 26% decline in stock price
  - Cause: Operating system failure
• AT&T
  - Unplanned down time: 6 – 26 hours
  - Cost: $40 million; forced to file SLAs with FCC
  - Cause: Software upgrade
• MCI
  - Unplanned down time: 10 days
  - Cost: 20 days free service to 3,000 enterprise customers
  - Cause: Software upgrade
• Charles Schwab
  - Unplanned down time: 4 outages of at least 4 hours
  - Cost: Unknown – announced a $70 million infrastructure investment
  - Cause: Operator errors and upgrades
• Hershey Foods
  - Unplanned down time: Unknown
  - Cost: 12% decline in 3Q99 sales; 19% drop in net income from 3Q98
  - Cause: System failures & application rollout
Source: John Phelps, Gartner 2000

What are the Impacts to your business?
• Tangible Impact
  - Salaries
  - Lost Productivity
  - Lost or Incomplete Orders
  - Revenue Lost
• Intangible Impact
  - Time & Resources to recover/rekey
  - Dissatisfied Customers
  - Impact to Reputation
  - Loss of Market Share to competition

Business Continuity (BC) & Disaster Recovery (DR)
What is the difference?
From an IT perspective, HA & DR is a much larger part of the pyramid
[Pyramid diagram. Labels: Personnel, Business Impact Analysis, Risk Assessment, High Availability, Business Continuity, Emergency Response, Systems, Training / Awareness, Disaster Recovery, Public Relations, Continuing Operations]

Market Demands for Reliability
• Need to identify, resolve, and prevent problems before they happen
• Need root-cause analysis for application or business process problems without finger pointing between groups (5 Whys?)
• High availability really means zero downtime

Why is Reliability so important?
• Visibility into customer experience
• Assure no process steps are missed or lost
• Early detection of performance and availability issues
• Guarantee of SLAs
• Decreased time and resources to fix issues

High Availability
• Use Highly Available Architecture
  - Along with well trained staff and good management systems
• HA systems employ fault tolerance, automated failure detection, recovery, testing, problem and change management
• Duplicate everything.
  - Eliminate single points of failure

High Availability Guidelines
• High Availability Systems have the following technical design requirements
  - Heartbeat monitoring
  - Scripts or tools to start / stop / failover and failback the QAD application
  - Shared storage (SAN)
  - Non-corruption of data when the failover occurs
• After imaging / replication

Important measurements of Disaster Recovery
Where is your business today? Where does your business need to be?
[Chart: Time to Recover (RTO), from seconds to days, against Amount of Data Loss (RPO), from 0 KB to many MB]

The Seven Tiers of Disaster Recovery
• Tier 0: No offsite data
• Tier 1: Offsite backup but no “hot” site
• Tier 2: Offsite backup and “hot” site
• Tier 3: Electronic vaulting
• Tier 4: Point in time copies
  - After imaging, disk flash copy
• Tier 5: Transaction integrity
  - OE replication, disk replication
• Tier 6: Zero data loss
• Tier 7: Completely automated

What is your plan?
• Do you do backups now?
• Do you have a BC and/or DR plan?
• Have you experienced any outages or unplanned downtime? What caused it? How long was it? What was the business impact?
• Are you a 24x7 shop? What are the consequences of any business interruption?
• What is the business impact to your customers? (e.g. do you provide web ordering for your customers?)
• Can you quantify the business impact if there is interruption or loss of data for an hour, half a day, a day?
• Do you have any DR/BC compliance requirements (e.g. with SOX, ISO9001, SSAE16 …)?

Nightly Backups
• Process
  - Backup DB on some frequency
  - Store off-site
• Recovery from Outage
  - Install software
  - Restore DB to Target
  - Reconnect clients

Nightly Backups
[Chart: RTO against RPO, same axes as above. Plotted: Nightly Backups]

Nightly Backup + After Image
• Process
  - Backup DB on some frequency
  - Backup AI files on some frequency
  - Store off-site
• Recovery from Outage
  - Install software
  - Restore DB + After Image to Target
  - Apply (Roll forward) After Image
  - Reconnect clients

Nightly Backup vs After Image
[Chart: RTO against RPO, same axes as above. Plotted: Backup]

Nightly Backup + After Image Roll Forward
• Process
  - Backup DB + AI Files on some frequency
  - Backup DB on Target
  - Move AI To Target and roll forward upon arrival
• Recovery from Outage
  - Restart Target Database
  - Reconnect clients

Nightly Backup vs. After Imaging vs. Roll Forward
[Chart: RTO against RPO, same axes as above. Plotted: Backup, After Imaging 1st Level]

OpenEdge ReplicationPlus
• Process
  - Install / Configure replication
  - Continue making backups of DB & AI
• Recovery from Outage
  - Fail over (manual or automatic)
  - Reconnect clients
[Diagram label: Read Only]

Comparisons
[Chart: RTO against RPO, same axes as above. Plotted: Backup, After Imaging 1st Level, After Imaging 1st Level, OE Replication]

Disaster Recovery Objectives
• Working with the business, decide upon the following objectives
  - Recovery Time Objective (RTO)
    • How long can you afford to be without your systems?
  - Recovery Point Objective (RPO)
    • When it is recovered, how much data can you afford to recreate?
  - Degraded Operations Objective (DOO)
    • What will be the impact on operations with fewer data centers?
  - Network Recovery Objective (NRO)
    • How long to switch over the network?
• Test your process!

Managing and Monitoring QAD Systems: Performance

What is Performance?

Roles and Responsibilities
• Users and IT support:
  - application responsiveness
• Database administrators:
  - database efficiency and potential bottlenecks
• Systems administrators and engineers:
  - server capacity and utilization
• IT managers:
  - user productivity, system availability, budgets and risk avoidance

Typical Performance Problems
• Slow and unresponsive applications
• Unexplained / random application freezes
• Batch processes fail to complete (in time)
• Lack of scalability
• Lack of capacity

Impact of Poor Performance
Forrester Research has reported that among companies with revenue > $100 billion, nearly 85% reported significant application performance degradation
Best Practices in Problem Management
Nearly 85% of applications are failing to meet and sustain their performance requirements over time and under increasing load

Impact of Poor Performance
• Lost productivity
• Lost confidence and credibility
  – Customers
  – End users
  – Frustration (inconsistency)
• Lost revenue
• Low morale
• Financial penalties

Managing and Monitoring QAD Systems: Performance Engineering Methodology

How Do We Manage Performance?
• Establish performance objectives
• Identify critical requirements
• Define abnormal and normal conditions
  – Service level agreements
• Create a baseline
• Continuous monitoring and alerting
  – QAD monitoring framework
• Performance tuning
• Capacity planning and re-sizing

Performance Objectives
• Unless performance is actively managed and benchmarked, user performance expectations are hard to quantify.
“The system is running slow.”
“It takes too long to log in.”
What do these mean? Can we determine critical / objective requirements?
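
One way to turn a statement such as “It takes too long to log in” into an objective requirement is to agree on a percentile target and check measured timings against it. The sketch below is illustrative only: the 3-second target, the sample timings, and the function names `p95` and `meets_objective` are assumptions made for this example and are not part of QAD Monitoring.

```python
from statistics import quantiles


def p95(samples):
    """Return the 95th percentile of a list of timings (seconds)."""
    # quantiles(..., n=20) returns 19 cut points; the last one is the
    # 95th percentile. "inclusive" treats the samples as the full population.
    return quantiles(samples, n=20, method="inclusive")[-1]


def meets_objective(samples, target_seconds):
    """True when the 95th-percentile timing is at or below the target."""
    return p95(samples) <= target_seconds


# Hypothetical login timings in seconds; not measurements from a real system.
login_times = [1.2, 1.4, 1.1, 1.3, 1.5, 1.2, 6.8, 1.3, 1.4, 1.2]

# Hypothetical objective: 95% of logins complete within 3 seconds.
if meets_objective(login_times, target_seconds=3.0):
    print("Login objective met")
else:
    print(f"Login objective missed: p95 = {p95(login_times):.2f}s")
```

A percentile target is used here rather than an average because a single slow outlier, like the 6.8-second login above, is exactly the kind of experience users report as “slow” even when the mean looks acceptable.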

Establish a Baseline
• Using KPIs and performance requirements
  – Create a set of baseline measurements
  – Capacity requirements planning & trending
  – Determine patterns for linear growth
  – Reliability and consistency
• Load testing tools may help with creating a baseline (not a true reflection of the system)
  – Apache JMeter
  – HP LoadRunner
• QAD Monitoring

Managing and Monitoring QAD Systems: QAD Monitoring

QAD Monitoring
Ad hoc monitoring lacks transparency and can lead to emergency performance escalations
• Continuous monitoring allows
  - Advance notice of developing problems
  - Trending against the baseline
  - Extra information to aid in problem solving
  - The ability to deliver KPI information to management on demand

Key Features
• Holistic system monitoring
• Visual correlation of data
• Visibility into system trends and usage
• Ability to deliver KPI information on demand
• Powerful warning and exception alerting
• Reporting framework
• Scalable / flexible
• Helps with delta sizing
• Database lock monitoring
• Long running browses

Technical Features
• Non-intrusive, lightweight monitoring
• Technology agnostic
  - Can monitor any component in the QAD technology stack on any platform
  - Any supported database technology
• Built-in wiki with full documentation
• Industry standard open source components
  - Proprietary QAD integration pieces

Graphing and Trending Features
• Allows graphing of numerical data for trending and analysis
  - Helps identify usage patterns and trends
• Enables visual correlation of data
  - Data stored in time series (RRD) database
  - Filter by time periods of 30m to 1 year
  - Template driven deployment

Alerting Features
• Whenever a pre-defined condition is met, an alert can be sent to one or more contacts
  - Email / Pager / Twitter / Phone / Chat
  - Warning, critical and unknown alert levels
• Recovery messages
  - Time zones, rosters
• Escalation paths
  - Template driven definitions
• Inheritance and overrides
  - Stores service level agreement data for reporting

Reporting and Service Level Agreements (SLA)
• Core SLA Description
  - qaddb: prod database
  - qadadm: prod admin database
  - qadhlp: prod help database
  - qadcust: optional prod custom database
  - qadui_AS: prod UI Appserver
  - qadfin_AS: prod fin Appserver
  - qadui_WS: prod Webspeed
  - NS1: prod Nameserver
  - proadsv: prod AdminServer
  - tomcat: prod TC listening port
  - host ping: ICMP ping OK
  - telnet: Telnet login OK
  - UI Login: QAD UI Login OK
• QXtend SLA Descriptions
  - qxidb: prod QXI DB
  - qxodb: prod QXO DB
  - eventsdb: prod events DB
  - tomcat: QX prod QX TC listening port

Mobile Device Support
* 3rd party apps
** webapp on appliance

Managing and Monitoring QAD Systems: Technology & Architecture

Deployment
• Deployed as a virtual appliance (Linux VMware image)
  - No OpenEdge or commercially licensed components installed
  - ESX ready or VMware Server ready version
• Security to monitored systems
  - Communicates with the monitored servers via trusted SSH relationships
  - Keys are stored on the VM and pushed to the remote servers

Integration
• Technology agnostic:
  - it does not care what is being monitored
  - the flexibility to monitor practically anything
  - version independent
• Integration templates for:
  - mapping to the QAD Architecture
  - Tomcat, OpenEdge / other databases, Connection Manager, QAD business logic
  - any supported Operating System*
  - (* Windows support currently limited)

Availability
• QAD Monitoring is available to customers who:
  - Are on current QAD Maintenance
  - Require a Technical Q-Scan
• Establish baseline system health
• Tailor QAD Monitoring to their needs
• Training

Building the Effective Enterprise
A glimpse … into the future.

QAD Technology: Devices
[Timeline diagram: Past, Present, Future. Labels: Terminal, PC, MES Data Capture, Tablet, Mobile, TV, eReader, Ubiquitous]
• Device proliferation
• Access anywhere
• Intelligent
• Context aware
• Natural user interface
[Chart: Worldwide Smart Connected Device Shipments, 2010-2016 (Unit Millions). Series: Media Tablets, PCs, Smartphones]

QAD Technology: Deployment
[Timeline diagram: Past, Present, Future. Labels: Mainframe, Client Server, Virtualization, Private Cloud, Public Cloud, Hybrid Cloud, iOS Devices, Android Devices]
• Faster and Bigger Hardware
• All Virtualized
• Reduced Cloud Costs
• 1 Person 100’s Systems
• Multi-tenant / Multi-instance
• Rebuild Over Repair
• Automated Provisioning
[Diagram: Single Instance / Multi Instance against Single Tenant / Multi Tenant. Labels: QAD (Today) domains, QAD (future goal), Regular websites, Google Apps, SalesForce.com, Workday.com]

QAD Technology: Deployment
[Diagram. Labels: Monitoring & Management, Connectors, Customers, QAD Store, Installation & Upgrade, Partners, Customization, Service, QAD Cloud Connect, QAD On Premise, On Demand]

QAD Technology: Deployment
• Streamline Delivery
• Automated Deployment
• Standard Packages
• Store
• Monitoring
• Management

QAD Technology: Integration
[Timeline diagram: Past, Present, Future. Labels: Hard Coded, EDI, File Transfer, SOA, XML, Cloud Integration, EAI, BPM, XBRL]
• Standardized Connectors
• Web Services
• Cloud Integration
• QAD Store
• Simplified Tools
• On Premise / On Demand

Building the Effective Enterprise
Join us in San Antonio, TX, May 6-9, 2013
Early Bird Ends Soon!