UCD IT Services
PERFORMANCE MONTHLY REPORT – NOVEMBER / DECEMBER 2011
Due to the way the final workweeks have fallen in 2011, this report covers an 8-week
period spanning November and December.
Part 1 - Week beginning 7th November – Week beginning 28th November 2011 (weeks 45-48)
Part 2 - Week beginning 5th December – Week beginning 26th December 2011 (weeks 49-52)
Part 1: Service Outages:
1. Blackboard (Elearning) – 1 Unscheduled Outage
Date: 7/11
Duration: 5 minutes
Cause: No cause was recorded.
Impact: Connectivity to hosted Blackboard systems was temporarily lost.
Blackboard was then unavailable through UCD Connect for a further
25 minutes, but users could access it through Direct Login.
Action: Direct service came back without any intervention. Restoring SSO
access required the Connect team to restart the Blackboard
connector.
Improvement: Blackboard applied an Oracle patch over the Christmas break.
The patch addresses some system-wide performance issues and may
resolve this problem.
2. UCD Connect – 1 Unscheduled Outage
Date: 25/11
Duration: 2 hours 10 minutes
Cause: LDAP stopped responding overnight because the backup was too
large and caused disk usage to reach 100%.
Impact: Access to UCD Connect was unavailable.
Action: LDAP was restarted and the LDAP database was automatically
initiated. This took 2 hours to complete, after which front end
services were restored. After 10 minutes of coping with login
demand, users could access the system normally.
Improvement: Reduced the number of backups held on disk from 5 to 4 in
order to decrease the capacity needed for backups.
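The retention change described above could be automated along the following lines. This is a minimal sketch rather than the procedure IT Services actually uses: the function names `prune_backups` and `disk_usage_percent`, the idea that each backup is a single file in one directory, and ranking backups by modification time are all assumptions made for illustration.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def prune_backups(backup_dir: Path, keep: int = 4) -> list[Path]:
    """Delete all but the `keep` newest files in backup_dir.

    Assumes (hypothetically) that every regular file in the directory is
    one backup. Returns the paths that were deleted, oldest last.
    """
    backups = sorted(
        (p for p in backup_dir.iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    stale = backups[keep:]
    for path in stale:
        path.unlink()
    return stale


def disk_usage_percent(path: Path) -> float:
    """Return the percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

A scheduled job could call `prune_backups` before each new backup is written and raise an alert when `disk_usage_percent` passes a chosen threshold, so that a full disk is noticed before LDAP stops responding.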
3. Crumlin Network unavailable – 1 Unscheduled Outage
Date: 23/11
Duration: 40 minutes
Cause: 3rd party contractors invited on site by Crumlin staff installed
cabling and network splitters on the network, which caused
network loops and shut down the entire network.
Impact: There was no network connectivity to Belfield for Crumlin Hospital.
Action: Crumlin staff were advised to remove the illegal network device
from the network.
Improvement: Crumlin staff invited a 3rd party IT company to carry out
networking work after IT Services advised them not to do what they were
proposing. In this case our advice was ignored and the end result was a
campus outage.
4. UCD Connect unavailable – 1 Unscheduled Outage
Date: 30/11
Duration: 5 minutes
Cause: CPIP connectors were failing for those users trying to log in
and use mail, Blackboard and Calendar through Connect.
Impact: Connect was unavailable for those not already logged in, and
access to other services using the CPIP connector was also unavailable.
Action: CPIP servers were restarted and normal service resumed.
Improvement: Increased monitoring has been put in place in order to notify
us sooner if one server goes down. Also, upgraded CPIP servers with
greater capacity are being implemented.
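The kind of monitoring mentioned above can be illustrated with a simple TCP reachability check. This is only a sketch under stated assumptions: the report does not describe the monitoring tool in use, and the function names `is_reachable` and `unreachable`, as well as any host names or ports passed to them, are hypothetical.

```python
from __future__ import annotations

import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def unreachable(servers: list[tuple[str, int]], timeout: float = 2.0) -> list[tuple[str, int]]:
    """Return the (host, port) pairs that did not accept a connection."""
    return [(h, p) for h, p in servers if not is_reachable(h, p, timeout)]
```

Run at a short interval against each server in a pool, a check like this reports a single failed server as soon as it stops accepting connections, rather than when users begin to notice login failures.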
Service Availability Levels 9.00am-9.00pm – Part 1: November / December 2011

Period                               45         46         47         48         Monthly
Beginning                            7-Nov-11   14-Nov-11  21-Nov-11  28-Nov-11  Avg.
Network                              100.00%    100.00%    100.00%    100.00%    100.00%
UCD Connect                          100.00%    100.00%     96.39%     99.86%     99.06%
Staff Email                          100.00%    100.00%    100.00%    100.00%    100.00%
Student Email                        100.00%    100.00%    100.00%    100.00%    100.00%
Staff File Sharing & Connect files   100.00%    100.00%    100.00%    100.00%    100.00%
Software Applications                100.00%    100.00%    100.00%    100.00%    100.00%
Staff Printing                       100.00%    100.00%    100.00%    100.00%    100.00%
Student Printing                     100.00%    100.00%    100.00%    100.00%    100.00%
Internet                             100.00%    100.00%    100.00%    100.00%    100.00%
Elearning                             99.86%    100.00%    100.00%    100.00%     99.97%
Infoview                             100.00%    100.00%    100.00%    100.00%    100.00%
Banner                               100.00%    100.00%    100.00%    100.00%    100.00%
Remote Sites                         100.00%    100.00%     98.89%    100.00%     99.72%
Overall Services                      99.99%    100.00%     99.64%     99.99%     99.90%
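The Part 1 figures are consistent with a simple downtime calculation, sketched below. The report does not state its formula, so the measurement window used here (five working days of 12 hours, 9.00am-9.00pm, giving 3,600 minutes per week) is an inference from the numbers, and the function names are hypothetical. Under that assumption, the 5-minute Blackboard outage gives 99.86%, the 2 hour 10 minute UCD Connect outage gives 96.39%, and the 40-minute Crumlin outage gives 98.89%. The "Overall Services" figure matches the mean of the 13 individual service levels.

```python
from __future__ import annotations

# Assumed measurement window: Monday to Friday, 9.00am-9.00pm.
WINDOW_MINUTES_PER_WEEK = 5 * 12 * 60  # 3600 minutes


def availability(downtime_minutes: float,
                 window_minutes: int = WINDOW_MINUTES_PER_WEEK) -> float:
    """Weekly availability as a percentage, rounded to two decimal places."""
    return round(100.0 * (1 - downtime_minutes / window_minutes), 2)


def overall(service_levels: list[float]) -> float:
    """Mean of the individual service availability percentages."""
    return round(sum(service_levels) / len(service_levels), 2)


if __name__ == "__main__":
    print(availability(5))     # Blackboard, week 45
    print(availability(130))   # UCD Connect, week 47
    print(availability(40))    # Remote Sites (Crumlin), week 47
    # Week 47: eleven services at 100%, plus UCD Connect and Remote Sites.
    print(overall([100.0] * 11 + [96.39, 98.89]))
```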
Part 2: Service Outages:
1. Daedalus Data Centre fire – 1 Unscheduled Outage for all services.
Date: 8/12
Duration: 40 minutes to 4 hours
Cause: On Thursday afternoon (3pm) there was a serious incident
involving a localised fire in the Daedalus Data Centre. A large
research cluster in Daedalus caused the fire. The specialised
fire protection systems in the data centre worked very
effectively, releasing FM200 gas into the room and shutting
down all power. As a result there was no damage to
equipment other than the cluster which caused the fire,
and the damage to that cluster is localised.
Impact: All services were unavailable for various periods of time to staff
and students.
Action: Campus network was restored in 1 hour and 10 minutes.
Essential student services (Blackboard & Email) were restored within
2.5 hours.
Essential staff services were restored within 4 hours.
Improvement: Fire suppression system and data recovery plan operated as
required.
2. Network Connection to Lyons Estate – 1 Unscheduled Outage
Date: 15/12
Duration: 15 minutes
Cause: A fault on HEAnet equipment caused this outage.
Impact: There was no network available from Belfield to Lyons Estate.
Action: HEAnet resolved the radio link issue.
Improvement: Due to the location of the equipment providing the link, it is
an identified risk that events outside our control will interfere with the
link. There is constant monitoring by both HEAnet and IT Services
so that service can be resumed quickly once an issue is identified.
3. Network shared files unavailable – 1 Unscheduled Outage
Date: 15/12
Duration: 15 minutes
Cause: Controllers on Daedalus EVA 01 and EVA 03 both rebooted
at 12.51pm, causing volumes on the Staff cluster to become
unavailable. All resources were back online by 1.10pm.
Impact: There was no access to shared files for staff during this time.
Action: The controllers rebooted automatically. The reboot caused the
connection to storage to be lost on the Staff Data Cluster. The cluster
was then brought back online.
Improvement: Recommended action: update the EVA controller firmware to
the latest version.
4. Connect files unavailable – 1 Unscheduled Outage
Date: 21/12
Duration: 10 minutes
Cause: Not all file systems had mounted completely following the full
power-on of the Daedalus data centre the previous day.
Impact: Connect files was unavailable to staff and students.
Action: The missing file systems were remounted and Xythos was stopped
and restarted on all clients.
Improvement: Recommend auto mount of the file systems.
Service Availability Levels 9.00am-9.00pm – Part 2: November / December 2011

Period                               49         50         51         52         Monthly
Beginning                            5-Dec-11   12-Dec-11  19-Dec-11  26-Dec-11  Avg.
Network                               98.06%    100.00%    100.00%    100.00%     99.51%
UCD Connect                           93.75%    100.00%    100.00%    100.00%     98.44%
Staff Email                           93.75%    100.00%    100.00%    100.00%     98.44%
Student Email                         95.42%    100.00%    100.00%    100.00%     98.85%
Staff File Sharing & Connect files    93.75%     99.58%     99.65%    100.00%     98.25%
Software Applications                 95.42%    100.00%    100.00%    100.00%     98.85%
Staff Printing                        95.42%    100.00%    100.00%    100.00%     98.85%
Student Printing                      67.08%    100.00%    100.00%    100.00%     91.77%
Internet                              96.53%    100.00%    100.00%    100.00%     99.13%
Elearning                             95.42%    100.00%    100.00%    100.00%     98.85%
Infoview                              93.75%    100.00%    100.00%    100.00%     98.44%
Banner                                93.75%    100.00%    100.00%    100.00%     98.44%
Remote Sites                          96.53%     99.58%    100.00%    100.00%     99.03%
Overall Services                      92.97%     99.94%     99.97%    100.00%     98.22%

2. Support Statistics December 2011

Overall totals
Total cases logged                   1809
Total cases logged by Students        606
Total cases logged by Staff          1203

Overall top 5 queries logged
NON ITServices Call                   445
Applications                          275
Service Outages                       210
Account Related Calls                 172
Customer Equipment                    143