UCD IT Services
PERFORMANCE MONTHLY REPORT – NOVEMBER / DECEMBER 2011

Due to the way the final work weeks have fallen in 2011, this report covers an 8-week period for November and December.

Part 1 - Week beginning 7th November – Week beginning 28th November 2011 (weeks 45-48)
Part 2 - Week beginning 5th December – Week beginning 26th December 2011 (weeks 49-52)

Part 1: Service Outages

1. Blackboard (Elearning) – 1 Unscheduled Outage

Date: 7/11
Duration: 5 minutes
Cause: No cause was recorded.
Impact: Connectivity to hosted Blackboard systems was temporarily lost. Blackboard was then unavailable through UCD Connect for a further 25 minutes, but users could access it through Direct Login.
Action: Direct service came back without any intervention. Restoring SSO access required the Connect team to restart the Blackboard connector.
Improvement: Blackboard applied an Oracle patch over the Christmas break. The patch addresses some system-wide performance issues and may also address this problem.

2. UCD Connect – 1 Unscheduled Outage

Date: 25/11
Duration: 2hr 10 minutes
Cause: LDAP stopped responding overnight because the backup was too large and filled the disk to 100%.
Impact: Access to UCD Connect was unavailable.
Action: LDAP was restarted and the LDAP database initialised automatically. This took 2 hours to complete, after which front-end services were restored. After 10 minutes of coping with login demand, users could access the system normally.
Improvement: Reduced the number of backups held on disk from 5 to 4 in order to decrease the capacity needed for backups.

3. Crumlin Network unavailable – 1 Unscheduled Outage

Date: 23/11
Duration: 40 minutes
Cause: 3rd party contractors invited on site by Crumlin staff installed cabling and network splitters on the network, which caused network loops and shut down the entire network.
Impact: There was no network connectivity to Belfield for Crumlin Hospital.
Action: Crumlin staff were advised to remove the illegal network device from the network.
Improvement: Crumlin staff invited a 3rd party IT company to do networking after IT Services had advised them not to do what they were proposing. In this case our advice was ignored and the end result was a campus outage.

4. UCD Connect unavailable – 1 Unscheduled Outage

Date: 30/11
Duration: 5 minutes
Cause: CPIP connectors were failing for users trying to log in and use mail, Blackboard and Calendar through Connect.
Impact: Connect was unavailable for those not already logged in, and access to other services using the CPIP connector was also unavailable.
Action: CPIP servers were restarted and normal service resumed.
Improvement: Increased monitoring has been put in place to notify us sooner if a server goes down. Also, upgraded CPIP servers with greater capacity are being implemented.

Service Availability Levels 9.00am-9.00pm – Part 1: November / December 2011

Week                                45         46         47         48         Monthly
Period Beginning                    7-Nov-11   14-Nov-11  21-Nov-11  28-Nov-11  Avg.
Network                             100.00%    100.00%    100.00%    100.00%    100.00%
UCD Connect                         100.00%    100.00%    96.39%     99.86%     99.06%
Staff Email                         100.00%    100.00%    100.00%    100.00%    100.00%
Student Email                       100.00%    100.00%    100.00%    100.00%    100.00%
Staff File Sharing & Connect files  100.00%    100.00%    100.00%    100.00%    100.00%
Software Applications               100.00%    100.00%    100.00%    100.00%    100.00%
Staff Printing                      100.00%    100.00%    100.00%    100.00%    100.00%
Student Printing                    100.00%    100.00%    100.00%    100.00%    100.00%
Internet                            100.00%    100.00%    100.00%    100.00%    100.00%
Elearning                           99.86%     100.00%    100.00%    100.00%    99.97%
Infoview                            100.00%    100.00%    100.00%    100.00%    100.00%
Banner                              100.00%    100.00%    100.00%    100.00%    100.00%
Remote Sites                        100.00%    100.00%    98.89%     100.00%    99.72%
Overall Services                    99.99%     100.00%    99.64%     99.99%     99.90%

Part 2: Service Outages

1. Daedalus Data Centre fire – 1 Unscheduled Outage for all services
Date: 8/12
Duration: 40 minutes – 4 hours
Cause: On Thursday afternoon (3pm) there was a serious incident: a localised fire in the Daedalus Data Centre, caused by a large research cluster. The specialised fire protection systems in the data centre worked very effectively, releasing FM200 gas into the room and shutting down all power. As a result, no equipment was damaged other than the cluster which caused the fire, and the damage to this cluster is localised.
Impact: All services were unavailable to staff and students for various periods of time.
Action: Campus network was restored in 1 hour and 10 minutes. Essential student services (Blackboard & Email) were restored within 2.5 hours. Essential staff services were restored within 4 hours.
Improvement: Fire suppression system and data recovery plan operated as required.

2. Network Connection to Lyons Estate – 1 Unscheduled Outage

Date: 15/12
Duration: 15 minutes
Cause: An outage on HEAnet equipment caused this outage.
Impact: There was no network available from Belfield to Lyons Estate.
Action: HEAnet resolved the radio link issue.
Improvement: Due to the location of the equipment providing the link, it is an identified risk that events outside our control will interfere with the link. There is constant monitoring by both HEAnet and IT Services so that service can be resumed quickly when an issue is identified.

3. Network shared files unavailable – 1 Unscheduled Outage

Date: 15/12
Duration: 15 minutes
Cause: Controllers on Daedalus EVA 01 and EVA 03 both rebooted at 12.51pm, causing volumes on the Staff cluster to become unavailable. All resources were back online by 1.10pm.
Impact: There was no access to shared files for staff during this time.
Action: Controllers rebooted automatically. The reboot caused the connection to storage to be lost on the Staff Data Cluster.
The cluster was then brought back online.
Improvement: Recommended action - update the EVA controller firmware to the latest version.

4. Connect files unavailable – 1 Unscheduled Outage

Date: 21/12
Duration: 10 minutes
Cause: Not all file systems had mounted completely after the full power-on of the Daedalus data centre the previous day.
Impact: Connect files was unavailable to staff and students.
Action: The missing file systems were remounted and Xythos was stopped and restarted on all clients.
Improvement: Recommend auto-mounting of the file systems.

Service Availability Levels 9.00am-9.00pm – Part 2: November / December 2011

Week                                49         50         51         52         Monthly
Period Beginning                    5-Dec-11   12-Dec-11  19-Dec-11  26-Dec-11  Avg.
Network                             98.06%     100.00%    100.00%    100.00%    99.51%
UCD Connect                         93.75%     100.00%    100.00%    100.00%    98.44%
Staff Email                         93.75%     100.00%    100.00%    100.00%    98.44%
Student Email                       95.42%     100.00%    100.00%    100.00%    98.85%
Staff File Sharing & Connect files  93.75%     99.58%     99.65%     100.00%    98.25%
Software Applications               95.42%     100.00%    100.00%    100.00%    98.85%
Staff Printing                      95.42%     100.00%    100.00%    100.00%    98.85%
Student Printing                    67.08%     100.00%    100.00%    100.00%    91.77%
Internet                            96.53%     100.00%    100.00%    100.00%    99.13%
Elearning                           95.42%     100.00%    100.00%    100.00%    98.85%
Infoview                            93.75%     100.00%    100.00%    100.00%    98.44%
Banner                              93.75%     100.00%    100.00%    100.00%    98.44%
Remote Sites                        96.53%     99.58%     100.00%    100.00%    99.03%
Overall Services                    92.97%     99.94%     99.97%     100.00%    98.22%

2. Support Statistics December 2011

Overall totals
Total cases logged              1809
Total cases logged by Students  606
Total cases logged by Staff     1203

Overall top 5 queries logged
NON ITServices Call             445
Applications                    275
Service Outages                 210
Account Related Calls           172
Customer Equipment              143
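
Note on the availability figures: the report does not state how the Service Availability Levels percentages are calculated. The sketch below is one reading that reproduces several of the published values. It assumes a measured window of 5 working days of 12 hours each (9.00am-9.00pm) and that the Overall Services figure is an unweighted mean across the 13 services. Both assumptions are inferred from the numbers rather than taken from the report, and the function names are illustrative only.

```python
def availability(outage_minutes, working_days=5, hours_per_day=12):
    """Percentage of the measured window during which a service was up.

    Assumption: the window is working_days x hours_per_day (9.00am-9.00pm),
    inferred from the published figures, not stated in the report.
    """
    window_minutes = working_days * hours_per_day * 60
    return round(100 * (window_minutes - outage_minutes) / window_minutes, 2)


def overall(service_percentages):
    """Unweighted mean across services (assumed basis of 'Overall Services')."""
    return round(sum(service_percentages) / len(service_percentages), 2)


# Part 1 outages, checked against the Part 1 table
print(availability(130))  # UCD Connect, week 47 (2hr 10 minutes) -> 96.39
print(availability(40))   # Remote Sites, week 47 (40 minutes)    -> 98.89
print(availability(5))    # Elearning, week 45 (5 minutes)        -> 99.86

# Overall Services for week 47: 11 services at 100% plus the two above
week_47 = [100.0] * 11 + [96.39, 98.89]
print(overall(week_47))   # -> 99.64
```

Under the same reading, the 99.65% shown for Staff File Sharing & Connect files in week 51 matches a 10-minute outage over a 4-day window (`availability(10, working_days=4)`), which suggests that week was counted as four working days. This too is an inference from the figures.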