Operations Manager and the Perfect World
Kåre Rude Andersen

The guy on stage
• Chief System Management Architect & Co-founder of Coretech
• SCOM since MOM 2000
• Automation / Management Packs
• Author of Mastering and Advanced SCOM Training

• Kåre Rude Andersen
• Partner, Coretech A/S
• kra@coretech.dk, @kracoretech
• blog.coretech.dk/kra

Agenda
• SCOM – Love/Hate relationship
• SCOM – Why it sucks
• SCOM – Why it's great
• Automation – Alerts
• Automation – Notification
• Automation – Maintenance mode
• Automation – SMA
• Automation – Groups
• Monitoring and Automation – Everything

SCOM – Love/Hate relationship
• Let's do it – it's part of our license
• Next, next, ta-da
• Import all Management Packs from Microsoft – di daaa di daaa
• We did it – now what?
• YES, we can see it all
• Dammit, we can see it all
• Alerts and Critical Health States
• Bad reputation – often because the company has no processes

SCOM – Why it sucks
• Delivers proactive information
• Discovers everything you didn't know you had
• No control over Alerts
• What, how and to whom do we need to present status?
• Do we really need to build application topologies?
• Microsoft is currently sleeping on features

SCOM – Why it's great
• ITIL / MOF / Six Sigma, Val IT, ISO 20000, SAS 70, etc.
• 3 types of MPs:
  – Microsoft
  – SW/HW Vendor
  – Yourself
• Totally open interface – you are in control
• Working together with the rest of the suite
• 2012 R2 is stable and (fast)

SCOM – Solution
• The solution is pretty simple:
• Implement Processes and Automate

SCOM – Solution – But what processes?
1. Procedure and script for installation and configuration of gateway server (depending on number of agents)
2. Procedure and script for installation and configuration of agents to communicate with gateway (depending on step 1)
3. Procedure and script for enabling failover for agents communicating with more than one server
4. Procedure and script for installation and configuration of agents to communicate with a Management Server
5. Process to create groups for customers' servers, either by registry or NetBIOS name
6. Process for Network monitoring
7. Process for Linux monitoring
8. Process and script for creating Management Packs and naming conventions
9. Procedure for creation of Distributed Applications/Groups for Services
10. Procedure for defining SLA/SLO on customers' Services
11. Procedures for Rules/Monitors
12. Procedure for presentation of reports – performance, availability etc.
13. Procedure for creation of customer views
14. Procedure for Subscriptions
15. Procedure for creation and configuration of Run As Accounts
16. Procedure for overrides
17. Procedures for customer access to Reports, Views and Dashboards
18. Procedure for GSM Monitoring for customers
19. Procedure for OpsInsights implementation
20. Configure Coretech SCOM Dashboard
21. Implement "Application groups" of servers – like Application, Web and Database
22. Implement a gold, silver, bronze support environment by the use of registry
23. Create ticketing with your Helpdesk environment

Automation – Alerts
• Please automate Alerts
• Alerts could be self-manageable
• Always make a decision:
  – Disable it
  – Override it
  – Fix it
• Introduce a 10-minute weekly Alerts meeting

Automation – Alerts
Manage Resolution State

    # Route new alerts (resolution state 0) to a queue based on their content
    $Alerts = Get-SCOMAlert -ResolutionState 0
    if ($Alerts) {
        foreach ($Alert in $Alerts) {
            $newState = $null
            # Match on the alert description
            switch -Wildcard ($Alert.Description) {
                "*Active directory*" { $newState = 10 }
                "*DNS*"              { $newState = 30 }
                "*Cisco*"            { $newState = 250 }
            }
            # Match on the alert name (overrides a description match)
            switch -Wildcard ($Alert.Name) {
                "ACME.Monitor.Event.Dummy.100" { $newState = 249 }
            }
            if ($null -ne $newState) {
                $Alert.ResolutionState = $newState
                $Alert.Update("Resolution State changed automatically by the QueueManager Robot")
            }
        }
    }

Automation – Alerts
Daily Alerts

    # Define yesterday's time window (00:00:00 to 23:59:59)
    $AlertDateYesterdayBegin = [DateTime]::Today.AddDays(-1)
    $AlertDateYesterdayEnd   = [DateTime]::Today.AddDays(-1).AddSeconds(86399)

    # Get alerts raised within that window
    $YesterdayAlerts = @(Get-SCOMAlert | Where-Object {$_.TimeRaised -gt $AlertDateYesterdayBegin -and $_.TimeRaised -lt $AlertDateYesterdayEnd})

    # Output
    Write-Host
    Write-Host "NUMBER OF ACTIVE ALERTS YESTERDAY:" $YesterdayAlerts.Count
    Write-Host
    Write-Host "CURRENT NUMBER OF ACTIVE ALL ALERTS:" @(Get-SCOMAlert | Where-Object {$_.ResolutionState -ne '255'}).Count
    Write-Host "CURRENT NUMBER OF ACTIVE CRITICAL ALERTS:" @(Get-SCOMAlert | Where-Object {$_.ResolutionState -ne '255' -and $_.Severity -eq '2'}).Count -ForegroundColor "red"
    Write-Host "CURRENT NUMBER OF ACTIVE WARNING ALERTS:" @(Get-SCOMAlert | Where-Object {$_.ResolutionState -ne '255' -and $_.Severity -eq '1'}).Count -ForegroundColor "yellow"
    Write-Host "CURRENT NUMBER OF ACTIVE INFORMATIONAL ALERTS:" @(Get-SCOMAlert | Where-Object {$_.ResolutionState -ne '255' -and $_.Severity -eq '0'}).Count
    Write-Host
    Write-Host "TOPLIST OF YESTERDAYS ALERTS SORTED BY COUNT:"

    # List and sort yesterday's alerts
    $YesterdayAlerts | Group-Object Name | Sort-Object -Descending Count | Select-Object Count, Name | Format-Table -AutoSize
    Write-Host

    # List and sort current active critical alerts
    Write-Host "CURRENT ACTIVE CRITICAL ALERT LIST:" -ForegroundColor "red"
    (Get-SCOMAlert | Where-Object {$_.ResolutionState -ne '255' -and $_.Severity -eq '2'} | Group-Object Name | Sort-Object -Descending Count | Select-Object Count, Name | Format-Table -AutoSize)

Automation – Alerts
• Introduce Resolution states like:
  – Handle by Service Owner
  – Awaiting Weekly Meeting
  – Do disable this Alert
  – Do override this Alert
  – Investigate this Alert
• Reuse solution history with:
  – SCSM
  – ServiceNow
  – Remedy
  – etc.

Automation – Notification
• Do not send Alerts directly to people
• Always use Distributed Applications or Groups as a source
• Classify your Applications – Gold, Silver and Bronze
• Prioritize your Applications – High, Medium, Low
• Only send Alerts >= Silver and High

Automation – Notification
Resend Alerts

    # Note: -ge selects alerts modified within the last 4 hours;
    # use -le instead to target alerts left untouched for more than 4 hours.
    $oldAlerts = Get-SCOMAlert | Where-Object {($_.LastModified -ge [DateTime]::Now.AddHours(-4)) -and ($_.ResolutionState -eq 0)}
    # An empty update touches the alert so that notification is triggered again
    foreach ($alert in $oldAlerts) { $alert.Update("") }

Automation – Alerts
Microsoft Alert Update Connector

DEMO – Automation – Alerts
Get more info into an Alert

    # Enrich alerts in resolution state 110 with the computer description from Active Directory
    $machine = Get-SCOMAlert | Where-Object {$_.ResolutionState -eq '110' -and $null -ne $_.MonitoringObjectDisplayName} | ForEach-Object {
        $_.ID
        $AlertID = $_.ID
        $strComputer = $_.NetbiosComputerName
        # Look up the computer object in AD
        $strFilter = "(&(objectCategory=computer)(objectClass=computer)(cn=$($strComputer)))"
        $objComputer = ([adsisearcher]$strFilter).FindOne()
        $Desc = $objComputer.properties.description
        # Write the description to the alert and set it back to New (0)
        $alert = Get-SCOMAlert | Where-Object {$_.Id -eq $AlertID}
        $alert.CustomField1 = ("Info from AD: " + $Desc)
        $alert.ResolutionState = 0
        $alert.Update("Got description from AD")
    }

Automation – Notification
Catches
• Do not send empty values = nothing happens
• Remember the Notification Account
• Notification Resource Pool

Automation – Maintenance mode
• Tim McFadden Maintenance mode script
  – http://www.scom2k7.com/scom-2012-maintenance-mode-scheduler/
• Stefan Stranger
  – https://gallery.technet.microsoft.com/scriptcenter/Put-OM2012-Computer-Group-43902672
• Remember the catch about getting out of MM
  – Get-SCOMClass -Name "Microsoft.SQLServer.2012.database" | Get-SCOMClassInstance
• Orchestrator / SMA / SCCM
• Christopher Keyaert SMA – Start SCOM MaintenanceMode
  – https://gallery.technet.microsoft.com/scriptcenter/SMA-Runbook-Start-SCOM-c594b92c

Automation with SCOM
Name | Runs | Benefits | Limitations
Orchestrator | On a local server | Graphical interface, runs all .NET scripts | Max number of runbooks running
Service Management Automation | 1 or more Windows Azure Pack servers | Access any resource in your Datacenter – manage public/private cloud, do pay $ for I/O | Interface (not really a limitation)
Azure Automation | Azure public cloud | (All) ways up | Cannot access your local SMA or Orchestrator

Automation – SMA
• %windir%\system32\windowspowershell\v1.0\powershell.exe
• -command "& {Start-SmaRunbook -WebServiceEndpoint 'https://scwap' -Port 9090 -Name 'SCOM-RecoverHelloWorld' -Parameters @{'Message'='Hello World!'}}"

Demo
• Temperature – Warmer and Colder

SCOM – Using Recovery Action

Automation – Distributed Applications
• FactFinder
• Create DAs automatically
• Use as detective tool

Automation – Distributed Applications
• Demo FactFinder

Automation – Everything
• Logical Objects
  – Document Workflows, see status of batch
  – Show Health from a logical object

Automation – Everything
• Physical Objects
  – If you can reach it, you can control it

Automation – Everything
• Logical Objects
• Demo – Control Watt and Temperature
• Create Logical Objects for Internet of Things

THANK YOU
Send me an email for information on:
• Dashboards
• Logical Objects
• SCOM Health Check
• Management Packs
• etc.
kra@coretech.dk
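
Sketch – Notification filter
The notification rule from the Automation – Notification slide could be sketched roughly as below, reading "Silver and High" as "Silver or better, and High priority". This is only an illustration and not part of the original deck: the function name Test-ShouldNotify, the $TierRank table and the sample applications are hypothetical, and it deliberately uses plain objects instead of SCOM cmdlets so it can be tried anywhere.

```powershell
# Hypothetical ranking of support tiers: a higher number means a more important application.
$TierRank = @{ Bronze = 1; Silver = 2; Gold = 3 }

function Test-ShouldNotify {
    param(
        [Parameter(Mandatory)] [ValidateSet('Gold', 'Silver', 'Bronze')] [string] $Tier,
        [Parameter(Mandatory)] [ValidateSet('High', 'Medium', 'Low')]    [string] $Priority
    )
    # Notify only when the tier is Silver or better AND the priority is High.
    $tierOk     = $TierRank[$Tier] -ge $TierRank['Silver']
    $priorityOk = $Priority -eq 'High'
    return ($tierOk -and $priorityOk)
}

# Made-up applications, for illustration only.
$apps = @(
    [pscustomobject]@{ Name = 'WebShop';  Tier = 'Gold';   Priority = 'High' }
    [pscustomobject]@{ Name = 'Intranet'; Tier = 'Silver'; Priority = 'Medium' }
    [pscustomobject]@{ Name = 'TestLab';  Tier = 'Bronze'; Priority = 'High' }
)

# Keep only the applications whose alerts should reach a person.
$notify = $apps | Where-Object { Test-ShouldNotify -Tier $_.Tier -Priority $_.Priority }
$notify | Select-Object -ExpandProperty Name   # prints: WebShop
```

In a real environment the tier and priority would come from wherever you keep the classification (for example the registry values mentioned in the process list), and the result would decide which groups or Distributed Applications a subscription is scoped to.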