UC Davis Central Kuali Rice Service
Service Level Agreement
By: IET
Last Revised: 7-22-2010
Purpose
This document serves as an agreement between IET and any application sponsor integrating with the UC Davis Central Kuali
Rice Service (hereafter referred to as the “Rice service”). The objective of the Service Level Agreement (SLA) is to present a
clear, concise and measurable description of the Rice service.
Effective Date: (Date SLA is put into effect)
Version History

Version   Date      Revision Description   Author
1.0       5/14/10   First Draft            Hampton Sublett
1.1       6/23/10                          Hampton Sublett
1.2       7/15/10                          Hampton Sublett
1.3       7/16/10                          Hampton Sublett
1.4       7/22/10                          Hampton Sublett
Approval History

Approver            Title            Approval Date

Agreement Termination (if necessary)

Approver            Title            Approval Date
Table of Contents

1. General Overview
2. Service Description
   • Service Scope
   • Assumptions
3. Roles and Responsibilities
   • Stakeholder Information
   • IET Responsibilities
   • Integrating Application Team Responsibilities
4. Service Enhancements and Minor Bug Requests
5. Service Availability, Response Times and Escalation Procedures
   • Hours of Coverage, Response Times and Escalation Procedures
     o Rice Team Coverage Periods
     o “Blocker” Issue Type Notification Process
     o “Non-Blocker” Issue/Enhancement Type Notification Process
     o Response Times
     o Prioritization
     o Escalation of Issues
     o Communication
   • Planned Maintenance and Service Changes
     o Release Schedule
     o Advanced Notification/Negotiation
     o Service Exceptions to Coverage
   • Uptime Commitment
6. Reporting, Reviewing and Auditing
7. Risks
8. Signatures
1. General Overview

This is an SLA between IET and any application sponsor integrating with the Rice service and is intended to document:

• The Rice service that IET provides to integrating applications
• The general levels of availability, response and maintenance associated with the service
• The responsibilities of IET and the integrating application team
• The process for requesting services or receiving support

This SLA shall remain valid until revised or terminated.
2. Service Description

Service Scope

• Rice Environments
  o Development: Each anchor application will have a central Rice development environment available, which includes a private application server and private database schema.
  o Test-Integration: All anchor applications will share a test integration environment where all applications can test integration issues in a shared environment.
  o QA-Integration: A shared QA environment used for integration and load testing.
  o Stage: Used for integrated deployment practice.
  o Production: This is the version that all production Anchor Applications will be using. While the other environments might have lower uptime levels, the Production system is expected to be available 24/7/365 with the exception of minimal planned and unplanned outages.

• Rice Support
  o Production: Production services managed by IET use Remedy to track all production-related issues.
  o Development: Development systems for services managed by IET use Jira to track all development-related bugs and enhancements.

• Rice Community
  o As members of the Rice community, integrating teams receive access to, and are expected to contribute back to, the following:
    - Developer Documentation
    - Service FAQs
    - Rice Mailing Lists

Assumptions

• Changes to the Rice service will be documented and communicated to all integrating application stakeholders as defined in the Roles and Responsibilities section of this document.
• The Rice service will be provided in adherence to any related policies, processes and procedures.

3. Roles and Responsibilities

Stakeholder Information
IET

• Rice Service Owner – Responsible for the overall Rice service
• Rice Service Manager – Responsible for the day-to-day Rice service operations and enhancements
• Rice Architect – Responsible for ensuring all technical changes are consistent with best practices and for ensuring technical compatibility between Rice and integrating systems
Contact Info

Role                 Name               Desk Phone      Mobile Phone    Email
Rice Service Owner   Deborah Lauriano   530-754-5990    530-219-4660    dalauriano@ucdavis.edu
Rice Service Mgr     Hampton Sublett    530-754-6193    530-574-7758    hsublett@ucdavis.edu
Rice Architect       Curtis Bray        530-754-6199    530-574-7794    clbray@ucdavis.edu
KFS

• KFS Operations Manager –
• KFS Project Manager –

Contact Info

Role                 Name             Desk Phone      Mobile Phone    Email
KFS Operations Mgr   Radhika Prabhu   530-754-6805                    Rprabhu@ucdavis.edu
KFS Project Mgr

IET Responsibilities

• Meet response times associated with the priority assigned to incidents and service requests.
• Meet the system uptime target.
• Generate quarterly reports on service level performance.
• Provide appropriate notification to customers for all scheduled maintenance via the shared Integrated Calendar.
• Provide an auditable migration path of Rice changes for all projects.
• Adhere to campus policies (e.g., PPM 310-22, UC Davis Cyber-Safety Program; see Appendix A).
Integrating Application Team Responsibilities

• Meet agreed-upon Anchor Application or Rice release dates, or communicate schedule changes 90 days in advance.
• Provide Anchor Application representative(s) to assist with resolution of a service-related incident.
• Participate in the documented change control process.
• Document specific service availability requirements on Confluence (https://confluence.ucdavis.edu/confluence/x/VobTAQ) and work with the IET Rice Service Manager on additions or changes to established service levels.
• Adhere to campus policies (e.g., PPM 310-22, UC Davis Cyber-Safety Program; see Appendix A).
4. Service Enhancements or Minor Bug Requests

• To request a non-Blocker modification to the Rice service, Rice customers should enter a ticket in Middleware’s Jira instance.
• All requests will be planned and scheduled among Anchor Application Project Managers into future releases.
5. Service Availability, Response Times and Escalation Procedures

The Rice service has been architected to support multiple mission-critical campus systems. Because those systems depend on it, the Rice service must maintain the highest possible availability so as not to impact the users of dependent systems. Service includes:

• 24/7/365 High Availability (HA) of virtualization and SAN infrastructure, with monitoring
• Automatic failover in the event of hardware failure or resource contention
• Nightly whole-server guest host backups for Disaster Recovery (DR), retained for three days. These backups do not allow file-level recovery; systems are backed up with all files open and will recover in a “crashed” state.
• Network load balancing
• 24x7 support managed by the Campus Data Center operations team
• Security monitoring of all systems within the Campus Data Center to guard against intrusion and other service attacks
• A web site monitoring page, updated every five minutes, indicating the status of every system and service (see the sketch below)
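As an illustration of the kind of check behind such a status page, the following is a minimal polling sketch in Python. The endpoint URLs, interval handling and output format are assumptions for illustration only; this is not the monitoring tooling IET actually runs.

import time
import urllib.request

# Hypothetical status endpoints (not the actual UC Davis monitoring targets).
SERVICES = {
    "Rice Production": "https://rice.example.ucdavis.edu/status",
    "Rice Stage": "https://rice-stage.example.ucdavis.edu/status",
}

CHECK_INTERVAL_SECONDS = 5 * 60  # matches the five-minute refresh described above

def is_up(url, timeout=10):
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

if __name__ == "__main__":
    while True:
        for name, url in SERVICES.items():
            status = "UP" if is_up(url) else "DOWN"
            print(time.strftime("%Y-%m-%d %H:%M:%S"), name + ":", status)
        time.sleep(CHECK_INTERVAL_SECONDS)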
Hours of Coverage, Response Times and Escalation Procedures

• Rice Team Coverage Periods (turnaround times during Non-Business Hours are likely to be slower than during Business Hours)
  o Business Hours: 8AM-5PM, Mon-Fri
  o Non-Business Hours: 5PM-8AM Mon-Fri, Saturdays, Sundays, holidays and campus closure days

• “Blocker” Issue Type Notification Process
  A Blocker issue, or “Major Incident,” is an unplanned outage, service degradation, or emergency maintenance outside of planned maintenance that affects or could affect multiple users and/or a mission-critical service. Any downtime during the hours that a service is expected to be available is a major incident. A development-system issue can be a “Blocker” if it impacts a critical-path task on a project, although by default all production “Blocker” tickets will be addressed first.

  Communication/Escalation Process: regardless of the time of day, whoever identifies the issue should:
  o Call IET Operations (530-752-1566). Operations will notify the appropriate support staff member and create a Remedy ticket with the highest priority (Urgent).

• “Non-Blocker” Issue/Enhancement Type Notification Process (a scripted example of this step follows the Response Times list below)
  o Create a Jira ticket:
    i. Assign it to the Rice Lead Developer (unless the correct resource is known)
    ii. Assign a priority
    iii. Assign a due date

• Response Times (response time means the Rice team has acknowledged ownership of the issue and has begun work to resolve it)
  o “Blocker” Issue Type
    The Rice Production system is auto-monitored and will alarm when the system is unavailable. The SysAdmin will be paged and will begin triaging the issue within 15 minutes. For non-Production Blocker issues, once Operations is contacted, a SysAdmin will take ownership of the issue within 15 minutes. The issue will then be seen through to resolution regardless of the time of day.
  o “Non-Blocker” Issue Types
    - “Critical” Issue Type: the Rice team will assume ownership of the issue within 4 business hours; work to resolve the issue will be the top priority until it is fixed, unless the due date allows the work to be scheduled (worked on only during business hours).
    - “Major” Issue Type: the Rice team will assume ownership of the issue within 2 business days; work will be planned and scheduled within a release (and updated within Jira).
    - “Minor” and “Trivial” Issue Types: no predefined response time. Items are unscheduled, reside in a “wish list” repository and are discussed during specific release planning meetings.
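As referenced in the “Non-Blocker” notification process above, the sketch below files such a ticket through Jira’s REST API (version 2). The base URL, project key, issue type, credentials and example values are placeholders rather than details of the Middleware Jira instance, and the fields available depend on how that instance is configured; treat this as an illustration of the manual steps, not a supported interface.

import json
import urllib.request

# Placeholder values; substitute the real Middleware Jira URL, project key and credentials.
JIRA_URL = "https://jira.example.ucdavis.edu"
AUTH_HEADER = "Basic <base64-encoded username:password>"

def create_non_blocker_ticket(summary, description, assignee, priority, due_date):
    """File a non-Blocker issue with an assignee, priority and due date via the Jira REST API v2."""
    payload = {
        "fields": {
            "project": {"key": "RICE"},      # hypothetical project key
            "issuetype": {"name": "Bug"},
            "summary": summary,
            "description": description,
            "assignee": {"name": assignee},  # Rice Lead Developer unless the correct resource is known
            "priority": {"name": priority},  # e.g. "Critical", "Major", "Minor"
            "duedate": due_date,             # "YYYY-MM-DD"
        }
    }
    request = urllib.request.Request(
        JIRA_URL + "/rest/api/2/issue",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": AUTH_HEADER},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())

# Example call (hypothetical values):
# create_non_blocker_ticket("KIM role lookup is slow", "Details...", "rice.lead", "Major", "2010-09-01")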
• Prioritization
  o “Blocker” and “Critical” Issue Types: use the stated priorities. If there is more than one ticket in either of these categories relying on the same resource, the Rice Service Manager will work with Anchor Application PMs to negotiate the priority order.
  o All other issue types will be planned and scheduled among Anchor Application Project Managers into releases.

• Escalation of Issues
  o Contact the Rice Service Manager, Rice Architect or Rice Service Owner.

• Communication
  o “Blocker” Issue Types
    - Rice Service Manager: communicate multiple times a day with affected Anchor Application Project Managers (or other stakeholders as necessary, including the Tech Lead) via email, or by phone if necessary.
    - Anchor Application PM or Tech Lead: confirm the fix ASAP and inform the Rice team.
  o “Critical” Issue Types
    - Rice Service Manager: send a summary email at the end of the business day describing the state of any open “Critical” issues.
    - Anchor Application PM or Tech Lead: confirm the fix ASAP and inform the Rice team.
  o “Non-Blocker” Issue Types
    - Communication within the “comments” field of Jira tickets is sufficient.
Planned Maintenance and Service Changes

• Release Schedule (Post Version Compatibility): The Rice service will follow the Rice Foundation Roadmap but will intentionally stay one release behind it (or two months after the release, whichever comes first).

• Advanced Notification/Negotiation
  o The Rice Service Manager will notify Anchor Application PMs and Sponsors of releases 90 days in advance via email and by updating the “Rice and Anchor Applications Calendar of Events” schedule.
  o Planned outages of other systems upon which Rice is dependent will need to be communicated and negotiated.

• Service Exceptions to Coverage Schedule
  o Exceptions to the Rice schedule/processes are possible, but they need to be negotiated at least 30 days in advance (for example, Fiscal Year End close, when KFS needs 100% uptime).
Uptime Commitment

• Target is 99.9% uptime.
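For reference, assuming availability is measured against a full 24/7/365 calendar (the measurement window is not otherwise specified here), a 99.9% target allows roughly (1 - 0.999) x 8,760 hours = 8.76 hours of total downtime per year, which works out to about 44 minutes per month or 10 minutes per week.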
6. Reporting, Reviewing and Auditing

• Reporting: Status reports on system availability, including response times for high-priority tickets (Urgent/Blocker and High/Critical), as described in this document, will be produced on a monthly basis. These reports will be assembled by the Rice Service Manager.
• Reviewing: This document will be reviewed on an annual basis, and modified accordingly, by the UC Davis Kuali Rice Technical Workgroup and approved by the Rice Oversight Committee.
• Auditing: The processes employed for system availability will be audited internally on an annual basis by the UC Davis Kuali Rice Technical Workgroup.

7. Risks

Rice system dependencies are documented in the network diagram posted on Confluence:
https://confluence.ucdavis.edu/confluence/x/qY-TAQ

8. Signatures
Rice Service Owner: _________________________________________
Date:____________________
Anchor Application Sponsor: _______________________________
Date:____________________
Appendix A:
DCCS Security Best Practices Checklist

The UC Davis Cyber-safety Policy, UC Davis Security Standards Policy (PPM Section 310-22), has been officially adopted by the campus to define both responsibilities and key practices for assuring the integrity, availability and confidentiality of UC Davis computing systems and electronic data. The following best practices list includes the key practices from the UC Davis Security Standards Policy and the Cyber-safety Program, plus additional practices that are necessary for servers residing within DCCS. For more information on the policy, see http://manuals.ucdavis.edu/ppm/310/310-22a.htm. For more information on the Cyber-safety Program, see http://security.ucdavis.edu/cybersafety.cfm. This document will be updated as necessary to incorporate changes to security practices due to changes in security risks, technology, and campus policies.
DCCS Security Standards in Support of the Campus Cyber-safety Standards
Data Center and Client Services Security Best Practices 2009 Checklist

The following requirements for network devices on the DCCS network are based on policies that are audited by the Cyber-Safety Program and/or required by DCCS policy. Requirements are listed below by Cyber-Safety 2009 policy reference category.

1. Software Vulnerabilities (Patching)
• All critical OS and application software patch updates must be applied within 7 days of release. Managers shall make the final decision about patching outside of the posted maintenance schedule.

2. Virus Infections/Anti-Spyware
• Anti-virus/anti-spyware software must be installed on Windows and Mac systems. Updates must be applied within 24 hours of release.
• Anti-virus/anti-spyware software must be configured to check for and apply updates daily.

3. Weak Authentication
• All default passwords must be modified on initial use.
• All user accounts must be password-protected or use SSH keys or a token.
• Standard accounts, such as the UCD LoginID used by system administrators, should not have privileged access.
• Group accounts must not be used when logging in to a host.
• Configure hosts to exclusively use Unix or Windows bastion host services for SSH or remote desktop access to hosts.
• If a staff member is no longer affiliated with the Data Center, all privileged-access passwords should be changed.
• Publicly accessible on-line applications must guard against brute-force attacks.
• All system and application passwords must be stored in the safe in Operations.
• All devices must be configured to “lock” and require the user to re-authenticate if left unattended for more than 20 minutes.
• Pass phrase strength shall have a minimum combination of 6 x 10^19, or 65.9 bits, of entropy (randomness) using the formula entropy = (log2(b)) * l. This value is a function of the size of the character set (b) and the length of the password (l).
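As a worked illustration of the entropy formula above, using an assumed 94-character printable ASCII set (the policy does not mandate a particular character set): each character contributes log2(94), about 6.55 bits, so a pass phrase drawn from that set needs at least 11 characters, since 10 x 6.55 is roughly 65.5 bits and falls just short of the 65.9-bit minimum. A minimal sketch of the check:

import math

def passphrase_entropy_bits(charset_size, length):
    """Policy formula: entropy = log2(b) * l, with character-set size b and pass phrase length l."""
    return math.log2(charset_size) * length

# Example with an assumed 94-character printable ASCII set.
for length in (8, 10, 11, 14):
    bits = passphrase_entropy_bits(94, length)
    verdict = "meets the 65.9-bit minimum" if bits >= 65.9 else "too weak"
    print("length", length, "=", round(bits, 1), "bits:", verdict)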
4. Insecure Personal Information
• Restricted data (e.g., SSN, credit card number) must be removed from all computers where it is not required.
• The client must inform DCCS when a system houses or has access to restricted data.
• Computers storing restricted data must be running Tripwire, must have a host-based firewall, and must log security events to the DCCS syslog server.

5. Firewall Services
• All firewall services are restrictively configured to deny all incoming traffic unless expressly permitted.
• Systems must run a host-based firewall. In addition, systems must be housed behind the DCCS firewall or a project-based firewall.

6. Unnecessary Computer Programs/Services
• Telnet and FTP should be disabled. SSH should be allowed only through the bastion hosts, except for the ISUN service.
• Insecure network services/processes must be disabled. If an insecure network service is required, use additional security measures to secure the service.

7. Backup, Recovery, and Disaster Planning
• All critical and sensitive university electronic communication records must be backed up on a regular and frequent basis to separate backup media.
• Backup media must be protected from unauthorized access and stored in a separate location from the source.
• Backup media must be tested on a regular basis to ensure recoverability.

8. Physical Security
• Physical security measures must be implemented to protect critical or sensitive university electronic communication records from theft.
• Portable storage devices must not be left unattended if publicly accessible.

9. Open Relay Email Proxies
• Hosts connected to the network must not permit open e-mail relaying.

10. Unrestricted Proxy Servers
• If a proxy server exists, users must authenticate to the server and meet the campus criteria to access campus-licensed intellectual property (e.g., online journals).

11. Audit Logs
• System logs must be configured to track system access.
• System logs must be sent to the Data Center syslog server.
• The DCCS security team shall run yearly audits using Nessus. Results identified as critical by Nessus shall be resolved within 7 days.

Security Review
• Password strength checking software such as Crack must be run yearly on all Data Center password files to check for use of weak passwords.

12. Security Training
• A technical training program is documented for systems staff responsible for security administration.
• Campus staff handling critical or sensitive university electronic communication records will receive annual information security awareness program training regarding policy and proper handling and controls: http://security.ucdavis.edu/presentations/ITSecurityTutorial.ppt

14. Release of Electronic Storage
• All data is removed from electronic storage prior to release or transfer of equipment.
• Data removal must be consistent with physical destruction of the electronic storage device, degaussing, or overwriting of the data at least 3 times.

16. Web Application Security Vulnerabilities
• Web applications developed or acquired by campus units must support secure coding practices.
• Web applications must mitigate the vulnerabilities described within the OWASP Top Ten Critical Web Application Security Vulnerabilities. Use Watchfire AppScan Enterprise to analyze the security of your web sites. Sites with high visibility, sites that access restricted data, or sites that house publicly writable pages should be analyzed first.
Any system requiring an exception to any of the above must be detailed with an explanation of why the exception is
necessary.
Procedures to follow in the event of a security incident:
REPORTING A SECURITY INCIDENT
(https://confluence.ucdavis.edu/confluence/display/IETSP/Security+Incident+Response+Plan+At-A-Glance)
IET staff should not attempt to repair problems with their own or their co-workers’ systems. In the event of a problem, immediately contact your designated desktop support person. If you are not sure who your desktop support person is, please ask your supervisor.

When reporting a security incident, you should provide as much information as possible, including:

• The date and time you discovered the incident
• A general description of the incident
• The system or data at risk, including the known or suspected presence of personal identifying information on the affected computer
• Any actions you have taken since you discovered the incident
• Your contact information