Managing and optimizing your QAD infra

Managing & Optimizing: QAD Infrastructure
Tony Winter, Chief Technology Officer, QAD
Safe Harbor Statement
The following is intended to outline QAD’s general
product direction. It is intended for information
purposes only, and may not be incorporated into
any contract. It is not a commitment to deliver
any material, code, or functional capabilities, and
should not be relied upon in making purchasing
decisions. The development, release, and timing
of any features or functional capabilities
described for QAD’s products remain at the sole
discretion of QAD.
2
What do we know about infrastructure?
3
QAD Managed Customer Environments
~18,000 users
> 230 sites
4
QAD Managed Internal Environments
~1,500 VMs per week
~2M VM hours per year
5
Performance
6
QAD On Demand
7
Compliance
• SAS 70 → SSAE 16 (~ ISAE 3402)
• QAD Certified to ISO9001:2008
• Life Sciences – cGMP Qualified
SOP
8
QAD On Demand Services
• Complete Services
• 99.5% SLA Guaranteed
• Globally Managed
[Diagram: Service Management, Infrastructure Management, Systems Management, Functional Management, Customization Management, EDI Management, Network Management]
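To make the 99.5% SLA figure concrete, here is a minimal arithmetic sketch. The function names are hypothetical, and the measurement period (a full year, including any planned maintenance) is an assumption; the slides do not say how QAD measures the SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(sla_percent, period_hours=HOURS_PER_YEAR):
    """Most downtime (in hours) a period can contain while still meeting the SLA."""
    return period_hours * (100 - sla_percent) / 100


def achieved_availability(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Availability actually delivered, as a percentage of the period."""
    return 100 * (period_hours - downtime_hours) / period_hours


def meets_sla(downtime_hours, sla_percent=99.5, period_hours=HOURS_PER_YEAR):
    """True when the recorded downtime stays within the SLA allowance."""
    return downtime_hours <= allowed_downtime_hours(sla_percent, period_hours)
```

Under these assumptions, 99.5% over a year allows about 43.8 hours of downtime, or about 3.6 hours in a 30-day month (`allowed_downtime_hours(99.5, 30 * 24)`).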
9
Service Management
• Dedicated
• SLA Reporting
• Manages Resources
10
Infrastructure Management
• Infrastructure
• Data Center
• Operating System
11
Systems Management
• Monitoring and Tuning
• System Administration
• Disaster Recovery (OE Replication+)
• Version and Change Management
12
Network Management
• QAD Leverages Virtela
• Infrastructure
• Monitoring
• Management
13
Best Practices
14
Managing and Monitoring QAD Systems
Best Practice Outcomes
• Keep the users and sponsors of the QAD
software installation happy and productive
- Up and down the supply chain
• Reliability
• Performance
• Visibility
• Reporting
15
Reliability
16
Managing and Monitoring QAD Systems: Reliability
What do these companies have in common?
Generator Failures Caused 365 Main Outage…
“Several generators at 365 Main’s San Francisco data center
failed to start when the facility lost grid power Tuesday
afternoon, causing an outage that knocked many of the
web’s most popular destinations offline for several hours.”
- DataCenterKnowledge.com
Computer Breakdown
Halts Trading at London Exchange
for 7+ Hours…
“The exchange suffered its worst breakdown in eight
years as a computer glitch prevented traders from
participating in a global rally after the U.S. takeover of
Fannie Mae and Freddie Mac.”
- The New York Times
17
What are the most common causes of
“Unplanned Downtime”?
[Pie chart. Categories: Software, Hardware, Human Error, Network, Unknown, Natural Disaster. Slice values as extracted: 8%, 7%, 27%, 17%, 18%, 23%]
18
Business Impact
Company | Unplanned Down Time | Cost | Cause
eBay | 22 hours | $3-5 million; 26% decline in stock price | Operating system failure
AT&T | 6 – 26 hours | $40 million; forced to file SLAs with FCC | Software upgrade
MCI | 10 days | 20 days free service to 3,000 enterprise customers | Software upgrade
Charles Schwab | 4 outages of at least 4 hours | Unknown – announced a $70 million infrastructure investment | Operator errors and upgrades
Hershey Foods | Unknown | 12% decline in 3Q99 sales; 19% drop in net income from 3Q98 | System failures & application rollout
Source: John Phelps, Gartner 2000
19
What are the Impacts to your business?
Tangible Impact
Salaries
Lost Productivity
Lost or Incomplete Orders
Revenue Lost
Intangible Impact
Time & Resources to recover/rekey
Dissatisfied Customers
Impact to Reputation
Loss of Market Share to competition
20
Business Continuity (BC) & Disaster Recovery (DR)
What is the difference?
• From an IT perspective, HA & DR are a much larger part of the pyramid
[Pyramid diagram. Labels: Personnel, Business Impact Analysis, Risk Assessment, High Availability, Emergency Response, Training / Awareness, Disaster Recovery, Public Relations, Continuing Operations, Systems, Business Continuity]
21
Market Demands for Reliability
• Need to identify, resolve, and prevent
problems before they happen
• Need root-cause analysis for application or
business process problems without finger
pointing between groups (5 Whys?)
• High availability really means zero downtime
22
Why is Reliability so important?
• Visibility into customer experience
• Assure no process steps are missed or lost
• Early detection of performance and availability
issues
• Guarantee of SLAs
• Decreased time and resources to fix issues
23
High Availability
• Use Highly Available Architecture
- Along with well trained staff and good management systems
• HA systems employ fault tolerance, automated failure
detection, recovery, testing, problem and change
management
• Duplicate everything
- Eliminate single points of failure
24
High Availability Guidelines
• High Availability Systems have the following
technical design requirements
- Heartbeat monitoring
- Scripts or tools to start / stop / failover and
failback the QAD application
- Shared storage (SAN)
- No corruption of data when the failover occurs
• After imaging / replication
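The heartbeat requirement above can be sketched as a small failure detector. This is an illustrative model only: the `HeartbeatMonitor` class, the node names, and the timeout value are hypothetical, and a production cluster manager would add fencing and quorum logic before triggering a failover.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes that went silent."""

    timeout_s: float  # silence longer than this marks a node as failed
    last_seen: Dict[str, float] = field(default_factory=dict)

    def beat(self, node: str, now: Optional[float] = None) -> None:
        # Record a heartbeat; 'now' can be injected to make testing deterministic.
        self.last_seen[node] = time.monotonic() if now is None else now

    def failed_nodes(self, now: Optional[float] = None) -> List[str]:
        # A node is considered failed once its last heartbeat is older than the timeout.
        now = time.monotonic() if now is None else now
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout_s)
```

A failover script would poll `failed_nodes()` and, if the primary appears in the list, run the start / stop / failover tooling named in the guidelines.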
25
Important measurements of Disaster Recovery
Where is your business today?
Where does your business need to be?
[Chart axes: Time to Recover (RTO), from Seconds to Days; Amount of Data Loss (RPO), from 0 KB to many MB]
26
The Seven Tiers of Disaster Recovery
• Tier 0: No offsite data
• Tier 1: Offsite backup but no “hot” site
• Tier 2: Offsite backup and “hot” site
• Tier 3: Electronic vaulting
• Tier 4: Point-in-time copies
- After imaging, disk flash copy
• Tier 5: Transaction integrity
- OE replication, disk replication
• Tier 6: Zero data loss
• Tier 7: Completely automated
27
What is your plan?
• Do you do backups now?
• Do you have a BC and/or DR plan?
• Have you experienced any outages or unplanned downtime? What caused it? How long was it? What was the business impact?
• Are you a 24x7 shop? What are the consequences of any business interruption?
• What is the business impact to your customers? (e.g. do you provide web ordering for your customers?)
• Can you quantify the business impact if there is interruption or loss of data for an hour, half a day, a day?
• Do you have any DR/BC compliance requirements (e.g. with SOX, ISO9001, SSAE16 …)?
28
Nightly Backups
Process
• Backup DB on some frequency
• Store off-site
Recovery from Outage
• Install software
• Restore DB to Target
• Reconnect clients
29
Nightly Backups
[Chart: Nightly Backups plotted by Time to Recover (RTO), Seconds to Days, and Amount of Data Loss (RPO), 0 KB to many MB]
30
Nightly Backup + After Image
Process
• Backup DB on some frequency
• Backup AI files on some frequency
• Store off-site
Recovery from Outage
• Install software
• Restore DB + After Image to Target
• Apply (roll forward) After Image
• Reconnect clients
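The restore-then-roll-forward idea can be illustrated with a toy model. This is a conceptual sketch, not OpenEdge tooling: the database is a plain dictionary, after-image notes are hypothetical `(sequence, key, value)` tuples, and `roll_forward` is an invented name.

```python
from typing import Dict, List, Tuple

# (sequence number, key, new value): a stand-in for one after-image note
Change = Tuple[int, str, str]


def roll_forward(backup: Dict[str, str], backup_seq: int, ai_notes: List[Change]) -> Dict[str, str]:
    """Rebuild the database state from a backup plus the changes logged after it."""
    db = dict(backup)  # start from the restored backup, leaving the original untouched
    expected = backup_seq + 1
    for seq, key, value in sorted(ai_notes):
        if seq <= backup_seq:
            continue  # this change is already contained in the backup
        if seq != expected:
            # A gap means an after-image file is missing; applying later notes would corrupt data.
            raise ValueError(f"missing after-image note {expected}")
        db[key] = value
        expected += 1
    return db
```

The gap check reflects why the AI files must be backed up and stored as carefully as the database backup itself: recovery can only proceed up to the last unbroken note.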
31
Nightly Backup vs After Image
[Chart: Backup plotted by Time to Recover (RTO), Seconds to Days, and Amount of Data Loss (RPO), 0 KB to many MB]
32
Nightly Backup + After Image Roll Forward
Process
• Backup DB + AI Files on some frequency
• Backup DB on Target
• Move AI To Target and roll forward upon arrival
Recovery from Outage
• Restart Target Database
• Reconnect clients
33
Nightly Backup vs. After Imaging vs. Roll Forward
[Chart: Backup and After Imaging 1st Level plotted by Time to Recover (RTO), Seconds to Days, and Amount of Data Loss (RPO), 0 KB to many MB]
34
OpenEdge Replication Plus
Process
• Install / Configure replication
• Continue making backups of DB & AI
Recovery from Outage
• Fail over (manual or automatic)
• Reconnect clients
Read Only
35
Comparisons
[Chart: Backup, After Imaging 1st Level (labelled twice), and OE Replication plotted by Time to Recover (RTO), Seconds to Days, and Amount of Data Loss (RPO), 0 KB to many MB]
36
Disaster Recovery Objectives
• Working with the business, decide upon the
following objectives
- Recovery Time Objective (RTO)
• How long can you afford to be without your systems?
- Recovery Point Objective (RPO)
• When it is recovered, how much data can you afford to
recreate?
- Degraded Operations Objective (DOO)
• What will be the impact on operations with fewer data
centers?
- Network Recovery Objective (NRO)
• How long to switch over the network?
• Test your process!
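Once the business has agreed RTO and RPO targets, candidate recovery approaches can be filtered against them. The sketch below is illustrative only: the `Strategy` type is hypothetical, RPO is expressed in hours of lost work rather than KB or MB, and every number is a placeholder assumption, not a figure from the slides. Measure your own values by testing the process.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Strategy:
    name: str
    rto_hours: float  # assumed worst-case time to recover
    rpo_hours: float  # assumed worst-case window of lost data


# Placeholder values chosen only to preserve the ordering shown on the comparison slides.
STRATEGIES = [
    Strategy("Nightly backup", rto_hours=48, rpo_hours=24),
    Strategy("Nightly backup + after image", rto_hours=24, rpo_hours=1),
    Strategy("After image roll forward on target", rto_hours=1, rpo_hours=1),
    Strategy("OpenEdge replication", rto_hours=0.1, rpo_hours=0.0),
]


def candidates(rto_target: float, rpo_target: float, strategies=STRATEGIES) -> List[str]:
    """Names of the strategies whose worst case fits both objectives."""
    return [
        s.name
        for s in strategies
        if s.rto_hours <= rto_target and s.rpo_hours <= rpo_target
    ]
```

For example, `candidates(2, 1)` keeps only the last two approaches under these placeholder numbers.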
37
Performance
38
Managing and Monitoring QAD Systems: Performance
What is Performance?
39
Roles and Responsibilities
• Users and IT support:
- application responsiveness
• Database administrators:
- database efficiency and potential bottlenecks
• Systems administrators and engineers:
- server capacity and utilization
• IT managers:
- user productivity, system availability, budgets
and risk avoidance
40
Typical Performance Problems
• Slow and unresponsive applications
• Unexplained / random application freezes
• Batch processes fail to complete (in time)
• Lack of scalability
• Lack of capacity
41
Impact of Poor Performance
Forrester Research has reported that among companies
with revenue > $100 billion, nearly 85% reported
significant application performance degradation
Best Practices in Problem Management
Nearly 85% of applications are failing to meet and
sustain their performance requirements over time and
under increasing load
42
Impact of Poor Performance
• Lost productivity
• Lost confidence and credibility
– Customers
– End users
– Frustration (inconsistency)
• Lost revenue
• Low morale
• Financial penalties
43
Performance Engineering Methodology
44
How Do We Manage Performance?
• Establish performance objectives
• Identify critical requirements
• Define abnormal and normal conditions
– Service level agreements
• Create a baseline
• Continuous monitoring and alerting
– QAD monitoring framework
• Performance tuning
• Capacity planning and re-sizing
45
Performance Objectives
• Unless performance is actively managed
and benchmarked, user performance
expectations are hard to quantify.
“The system is running slow.”
“It takes too long to log in.”
What do these mean? Can we determine
critical / objective requirements?
46
Establish a Baseline
• Using KPIs and performance requirements
– Create a set of baseline measurements
– Capacity requirements planning & trending
– Determine patterns for linear growth
– Reliability and consistency
• Load testing tools may help with creating a
baseline (not a true reflection of the system)
– Apache JMeter
– HP LoadRunner
• QAD Monitoring
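A baseline can be as simple as a few summary statistics over measured response times, plus a fitted growth rate for trending. This is a minimal sketch under assumptions: the function names are hypothetical, the 95th percentile uses the nearest-rank method, and the growth rate is an ordinary least-squares slope over equally spaced periods (at least two values are needed).

```python
from statistics import mean
from typing import Dict, Sequence


def baseline(samples: Sequence[float]) -> Dict[str, float]:
    """Summarise a set of measurements as mean and 95th percentile (nearest rank)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return {"mean": mean(ordered), "p95": ordered[rank - 1]}


def growth_per_period(values: Sequence[float]) -> float:
    """Least-squares slope: the average change per period under a linear-growth assumption."""
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator
```

Comparing later measurements against the stored `p95` turns statements such as "the system is running slow" into something that can be checked.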
47
QAD Monitoring
48
Monitoring QAD Systems
QAD Monitoring
49
QAD Monitoring
Ad Hoc Monitoring lacks transparency and can
lead to emergency performance escalations
• Continuous monitoring allows
- Advance notice of developing problems
- Trending against the baseline
- Extra information to aid in problem solving
- The ability to deliver KPI information to
management on demand
50
Key Features
• Holistic system monitoring
• Visual correlation of data
• Visibility into system trends and usage
• Ability to deliver KPI information on demand
• Powerful warning and exception alerting
• Reporting framework
• Scalable / flexible
• Helps with delta sizing
• Database lock monitoring
• Long running browses
51
Technical Features
• Non-intrusive, lightweight monitoring
• Technology agnostic
- Can monitor any component in the QAD
technology stack on any platform
- Any supported database technology
• Built-in wiki with full documentation
• Industry standard open source components
- Proprietary QAD integration pieces
52
Graphing and Trending Features
• Allows graphing of numerical data for trending and
analysis
- Helps identify usage patterns and trends
• Enables visual correlation of data
- Data stored in time series (RRD) database
- Filter by time periods of 30m to 1 year
- Template driven deployment
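The time-series (RRD) storage mentioned above keeps a fixed number of slots and overwrites the oldest data once they are full. A minimal sketch of that one property, using a hypothetical `RoundRobinSeries` class; real round-robin databases also consolidate older samples into coarser averages, which is omitted here.

```python
from collections import deque
from typing import List, Tuple


class RoundRobinSeries:
    """Fixed-size time series: once full, each new point evicts the oldest one."""

    def __init__(self, slots: int) -> None:
        self._points = deque(maxlen=slots)  # deque drops the oldest entry automatically

    def add(self, timestamp: float, value: float) -> None:
        self._points.append((timestamp, value))

    def window(self, start: float, end: float) -> List[Tuple[float, float]]:
        # Return the retained points whose timestamp falls inside [start, end].
        return [(t, v) for t, v in self._points if start <= t <= end]
```

Because storage never grows, the same structure can hold 30 minutes of fine-grained data or a year of coarse data simply by changing the sampling interval per series.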
53
Alerting Features
• Whenever a pre-defined condition is met,
an alert can be sent to one or more
contacts
- Email / Pager / Twitter / Phone / Chat
- Warning, critical and unknown alert levels
• Recovery messages
- Time zones, rosters
• Escalation paths
- Template driven definitions
• Inheritance and overrides
- Stores service level agreement data for reporting
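The alert levels and recovery messages listed above can be sketched as a small state tracker that only notifies on a change of state. The names `classify` and `Alerter` and the threshold values are hypothetical; routing to email, pager, rosters, and escalation paths is left out.

```python
from typing import Optional


def classify(value: Optional[float], warn: float, crit: float) -> str:
    """Map a measurement to an alert level; a missing measurement is UNKNOWN."""
    if value is None:
        return "UNKNOWN"
    if value >= crit:
        return "CRITICAL"
    if value >= warn:
        return "WARNING"
    return "OK"


class Alerter:
    """Emits a message only when the state changes, including a recovery message."""

    def __init__(self, warn: float, crit: float) -> None:
        self.warn, self.crit = warn, crit
        self.state = "OK"

    def observe(self, value: Optional[float]) -> Optional[str]:
        new = classify(value, self.warn, self.crit)
        if new == self.state:
            return None  # no change, so no repeat notification
        previous, self.state = self.state, new
        if new == "OK":
            return f"RECOVERY: was {previous}"
        return f"{new}: value={value}"
```

Notifying only on transitions keeps contacts from being flooded while a condition persists.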
54
Reporting and Service Level Agreements (SLA)
• Core SLA Description
- qaddb: prod database
- qadadm: prod admin database
- qadhlp: prod help database
- qadcust: optional prod custom database
- qadui_AS: prod UI Appserver
- qadfin_AS: prod fin Appserver
- qadui_WS: prod Webspeed
- NS1: prod Nameserver
- proadsv: prod AdminServer
- tomcat: prod TC listening port
- host ping: ICMP ping OK
- telnet: Telnet login OK
- UI Login: QAD UI Login OK
• QXtend SLA Descriptions
- qxidb: prod QXI DB
- qxodb: prod QXO DB
- eventsdb: prod events DB
- tomcat: prod QX TC listening port
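A "listening port" check such as the Tomcat entries above can be approximated with a plain TCP connection attempt. This is a sketch under assumptions: `port_is_listening` and `core_sla_met` are hypothetical names, and treating the SLA as met only when every component check passes is an assumption, not a rule stated in the slides.

```python
import socket
from typing import Dict


def port_is_listening(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def core_sla_met(results: Dict[str, bool]) -> bool:
    """Assumed rule: every monitored component must be up."""
    return all(results.values())
```

A successful connection only shows that something accepts connections on the port; the separate "QAD UI Login OK" check exists because an open port does not prove the application works.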
55
Mobile Device Support
* 3rd party apps
** webapp on appliance
56
Technology & Architecture
57
Deployment
• Deployed as a virtual appliance (Linux
VMware image)
- No OpenEdge or commercially licensed
components installed
- ESX ready or VMware Server ready version
• Security to monitored systems
- Communicates with the monitored servers via
trusted SSH relationships
- Keys are stored on the VM and pushed to the
remote servers
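Running a check over a trusted SSH relationship might look like the sketch below. It is not the appliance's actual mechanism: the user name, key path, host name, and remote command are hypothetical. It uses the standard OpenSSH client options `-i`, `BatchMode`, and `ConnectTimeout`, so a missing key fails fast instead of prompting for a password.

```python
import subprocess
from typing import List, Tuple


def build_ssh_command(
    host: str,
    remote_command: str,
    user: str = "monitor",                        # hypothetical monitoring account
    key_path: str = "/home/monitor/.ssh/id_rsa",  # hypothetical key location on the VM
    connect_timeout_s: int = 10,
) -> List[str]:
    """Build the argument list for a non-interactive, key-based ssh call."""
    return [
        "ssh",
        "-i", key_path,
        "-o", "BatchMode=yes",  # never prompt; fail if key authentication is not possible
        "-o", f"ConnectTimeout={connect_timeout_s}",
        f"{user}@{host}",
        remote_command,
    ]


def run_remote_check(host: str, remote_command: str) -> Tuple[int, str]:
    """Run the command on the monitored server and return (exit code, output)."""
    result = subprocess.run(
        build_ssh_command(host, remote_command),
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.returncode, result.stdout.strip()
```

Passing the command as a list avoids local shell interpretation of the host name and options.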
58
Integration
• Technology agnostic:
- it does not care what is being monitored
- the flexibility to monitor practically anything
- version independent
• Integration templates for:
- mapping to the QAD Architecture
- Tomcat, OpenEdge / other databases,
Connection Manager, QAD business logic
- any supported Operating System*
* Windows support currently limited
59
Availability
• QAD Monitoring is available to customers
who:
- Are on current QAD Maintenance
- Require a Technical Q-Scan
• Establish baseline system health
• Tailor QAD Monitoring to their needs
• Training
60
Building the
Effective Enterprise
A glimpse …
… into the future.
61
QAD Technology
Devices
Present
Past
PC
Terminal
MES
Data Capture
• Device proliferation
• Access anywhere
• Intelligent
• Context aware
Future
Tablet
Mobile
TV
eReader
Ubiquitous
[Chart: Worldwide Smart Connected Device Shipments, 2010-2016 (units in millions). Series: Media Tablets, PCs, Smartphones]
• Natural user interface
62
Deployment
Present
Past
Mainframe
Virtualization
• Faster and Bigger Hardware
• All Virtualized
• Reduced Cloud Costs
Private Cloud
Public Cloud
Multi Instance
Client Server
Future
iOS Devices
Android Devices
QAD (future goal)
• 1 Person → 100s of Systems
• Multi-tenant / Multi-instance
Single Instance
• Rebuild Over Repair
• Automated Provisioning
Hybrid Cloud
QAD (Today) domains
Regular websites
Google Apps
SalesForce.com
Workday.com
Single Tenant
Multi Tenant
63
Deployment
Monitoring &
Management
Connectors
Customers
QAD Store
Installation
& Upgrade
Partners
Customization
Service
QAD Cloud Connect
QAD
On Premise
On Demand
64
Deployment
• Streamline Delivery
• Automated Deployment
• Standard Packages
• Store
• Monitoring
• Management
65
Integration
Present
Past
Hard Coded
EDI
File Transfer
SOA
XML
Future
Cloud Integration
EAI
BPM
XBRL
• Standardized Connectors
• Web Services
• Cloud Integration
• QAD Store
• Simplified Tools
• On Premise / On Demand
66
Building the
Effective Enterprise
Join us in San Antonio, TX
May 6-9, 2013
Early Bird Ends Soon!
67