Pragmatic Azure
What can the Microsoft Azure cloud do for me?
Boston Code Camp 21, 21-June-2014 (10:30-10:50)

Bill Wilder
@codingoutloud
codingoutloud@gmail.com
blog.codingoutloud.com
linkedin.com/in/billwilder
• Except where noted, slide deck is © 2014 Development Partners Software Corporation
• http://www.devpartners.com

IaaS According to Gartner, Aug 2013
http://www.gartner.com/technology/reprints.do?id=1-1IMDMZ8&ct=130819&st=sb

IaaS According to Gartner, May 2014
http://www.gartner.com/technology/reprints.do?id=1-1IMDMZ8&ct=130819&st=sb

PaaS According to Gartner, Jan 2014
http://www.gartner.com/technology/reprints.do?ct=140108&id=1-1P502BX&st=sb

Azure delivers…
• Hyper Scale
• Enterprise Grade
• Hybrid

2008-2012: Cloud Services (original PaaS-focused S+S vision)
2012: Cloud Services, Virtual Machines
2013: Web Sites, Cloud Services, Virtual Machines
2014: Web Sites, Cloud Services, Virtual Machines

Web Sites: Rapid Deploy & Scale; Sticky LB; most new features show up here first
Cloud Services: Stateless Nodes; HA; Internet Scale; AutoPatching; Continuous Delivery; if you outgrow Web Sites, migrate to Cloud Services
Virtual Machines: RDP; Full Admin; Virt Network; IP ACL; Linux/Win2k; Networking; Endpoint ACLs; On-prem equivalents; Mgmt API; Gallery; Portal; Persistent Disks; Enterprise Scale
Languages: C++, .NET, PHP, Python, Node.js, Java

Software + Services
Azure Active Directory, MySQL, SQL Database, Service Bus, Caching, Blob Storage, NoSQL Table Storage, Reliable Queue, Autoscaling, Alerting, Traffic Manager, <MORE SERVICES>…, Marketplace
Service models: SaaS, PaaS, IaaS

Questions?

Demos
• Azure Web Site #1 – Gallery
• Azure Web Site #2 – Flasky
• VM #1 – Ubuntu
• VM #2 – MSDN for DevTest
• Azure Web Site #3 – ASP.NET
• Azure Web Site #4 – … + AAD
  – Log streaming
• Azure Web Site #5 – … + ETW + Table Storage
• Azure Management Studio
• Storage #1 – Click, Click, Ta-Da!

Let’s get into demo mode…

Questions?
Microsoft Azure Data Center Regions
http://azuremap.blob.core.windows.net/apps/bingmap-geojson-display.html
http://blog.codingoutloud.com/2014/02/01/mapping-windows-azure-4-years-after-full-general-availability/
Azure Map

Monthly Costs?
• Assume 1000 hits per day

Each Storage Account (ns)...
• 200 TB capacity
• CDN-enabled
• 20,000 entities or messages per second
• Geo-replicated
• http://msdn.microsoft.com/en-us/library/azure/dn249410.aspx

Valet Key Pattern
    // URL elided on the slide; the next slide shows it
    var queueValetKeyUrl = "https://bost......";
    var cloudQueue = new CloudQueue(new Uri(queueValetKeyUrl));
    var name = "Bill Wilder";
    var msg = new CloudQueueMessage(name);
    cloudQueue.AddMessage(msg);
    // cloudQueue.PeekMessage().AsString;
    // cloudQueue.DeleteMessage(…);

Valet Key Pattern
    var queueValetKeyUrl =
        "https://bostonazureboot.queue.core.windows.net/attendees?sv=2012-02-12&
        st=2014-03-28T00%3A37%3A28Z&
        se=2014-04-04T01%3A37%3A28Z&
        sp=ra&
        sig=Jeb%2F8QmpuuPBWBp3hW5MXnQZ2NK8GXyuw

Questions?

ETW, EventSource, and SLAB for Logging, Instrumentation, and Telemetry

In Distributed Systems…
1. Cloud Services are Distributed Systems
2. Gathering and aggregating information on Distributed Systems is HARD
3. Insight via telemetry is more critical than ever to debug, monitor, diagnose, track QoS (SLA), …

The term “cloud” is nebulous…

Logging Today

Most Common Logging Today
    int x = foo.DoSomething(); // what could go wrong?

2nd Most Common Logging Today
    try {
        int x = foo.DoSomething();
    }
    catch (Exception ex) {
        // Let's hope this never happens
    }

3rd Most Common Logging Today
    try {
        int x = foo.DoSomething();
    }
    catch (Exception ex) {
        // “Handle” the exception
        Logger.Error(ex.ToString());
    }

Logging
The Challenge:
• Reactive: something unexpected happened
• Not solution-oriented: why am I logging this, and what do I hope to learn from it? Who is the audience?

Proactive Instrumentation (Telemetry?)
    var stopwatch = Stopwatch.StartNew();
    // … call FooApi
    stopwatch.Stop();
    var duration = (int)stopwatch.ElapsedMilliseconds;
    Logger.Info(
        String.Format(
            "User {0} accessed method {1} (took {2} ms)",
            Thread.CurrentPrincipal.Identity.Name,
            "FooApi",
            duration));

Some Challenges from Prior Slide…
• Formatting done at logging site
  – Unstructured
  – Performance hit
  – Not centralized / coordinated
• Severity Level decided at logging site
• Who is the customer of this logging statement?
• Who is using this code? (Distributed System)

Event Tracing for Windows (ETW)

ETW Background
• Integrated into Windows Desktop and Server
• Used by Microsoft (.NET, ASP.NET, IIS, …)
  – Your data side-by-side (by time, activity id)
• Wicked fast (kernel-level buffers)
• Semantically rich (time, stack, custom)
• Standardized tooling support (more coming)
But…
• Hard to use for .NET developers (<= .NET 4.0)

EventSource class (.NET 4.5)
• Makes ETW available to .NET developers
  – “worth the effort”
• Steps to PRODUCE ETW events
  – Derive class from EventSource (System.Diagnostics.Tracing namespace)
  – Create methods for each kind of event (annotate appropriately)
  – Log through these methods
• FAMILIAR: superset of logging frameworks
  – e.g., levels (Error, Info, etc.), other attributes

Consuming ETW Events
• Custom Code (Event Listener, such as in SLAB)
• PerfView tool
Else…
• ETW events “fall on the floor”

How is this better than log4net?

Log4net
• Can log to Azure Table synchronously
• Distributed string formatting, severity determination at log location
• Encourages variable log formats + parsing
• Very Simple

ES + SL + SLAB + Azure
• Can do it with buffering, out-of-proc, and with RX
• Centralized string formatting, severity determination
  – more flexible, DRY (Don't Repeat Yourself)
• Encourages structured log formats
• Just as simple?

How is this WAY BETTER than log4net?
Activity Id
Correlation across calls and tiers
See: http://blogs.msdn.com/b/agile/archive/2014/03/27/semantic-logging-application-block-now-supports-activity-tracing-and-elasticsearch.aspx

Limitations of ETW
• Old, but new
• Repetitive, boilerplate for EventSource
• Finicky! (Keywords, Event Id, …)
  – SLAB helps
• Limited Data Type: no TimeSpan, no user-defined
• Auto-augment with Process Id, Thread Id, Current Principal (Claims)
• Activity Id Correlation has tricky cases w/ async

ETW Tips & Tricks
• Use >1 EventSource
  – 1:N Event Trace
• Use Table vs. File vs. SQL
• Consider RX (in-proc only!)
• Focus first on ‘seams’ in architecture
• Use Activity Id and think about correlation across tiers
• Continually improve telemetry – see TDD later

Semantic Logging Application Block (SLAB)

SLAB Augments ETW with:
• Easy wire-up of Listeners to move events somewhere interesting
  – Windows Azure Storage Table (NoSQL, K/V, W/C)
  – Windows Azure or SQL Database
  – File (JSON)
• Unit testing support
  – Note “Finicky!” bullet on prior slide

When Does Logging Become Telemetry?

“It is a capital mistake to theorize before one has data.”
- Sherlock Holmes, DevOps Team Leader

Telemetry
Automatic transmission and measurement of data from remote sources.

Data
Facts and statistics collected for reference or analysis.
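A minimal sketch of the EventSource steps from the earlier slides (derive from EventSource, annotate one method per kind of event, log through those methods), paired with a bare-bones in-process EventListener as the consumer. The names FooEventSource, Demo-FooService, ApiCallCompleted, ApiCallFailed, and ConsoleEventListener are hypothetical, chosen to mirror the Stopwatch example; this is not code from the talk, and SLAB's own listeners and sinks are not shown.

```csharp
using System;
using System.Diagnostics;
using System.Diagnostics.Tracing;

// Hypothetical event source; the name and event ids are illustrative only.
[EventSource(Name = "Demo-FooService")]
public sealed class FooEventSource : EventSource
{
    public static readonly FooEventSource Log = new FooEventSource();

    private FooEventSource() { }

    // Severity and message format are declared once, here,
    // instead of being decided at every logging call site.
    [Event(1, Level = EventLevel.Informational,
        Message = "User {0} accessed method {1} (took {2} ms)")]
    public void ApiCallCompleted(string user, string method, int durationMs)
    {
        if (IsEnabled()) WriteEvent(1, user, method, durationMs);
    }

    [Event(2, Level = EventLevel.Error, Message = "Method {0} failed: {1}")]
    public void ApiCallFailed(string method, string error)
    {
        if (IsEnabled()) WriteEvent(2, method, error);
    }
}

// Minimal in-process consumer that prints each event's id, level, and payload.
public sealed class ConsoleEventListener : EventListener
{
    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        Console.WriteLine("{0} [{1}] {2}",
            eventData.EventId,
            eventData.Level,
            string.Join(", ", eventData.Payload));
    }
}

public static class Program
{
    public static void Main()
    {
        using (var listener = new ConsoleEventListener())
        {
            // Without an enabled listener or ETW session, events are dropped.
            listener.EnableEvents(FooEventSource.Log, EventLevel.Informational);

            var stopwatch = Stopwatch.StartNew();
            // ... call FooApi here ...
            stopwatch.Stop();

            FooEventSource.Log.ApiCallCompleted(
                "some-user", "FooApi", (int)stopwatch.ElapsedMilliseconds);
        }
    }
}
```

The call site passes typed values only; the message template and level live on the event method, which is the "centralized string formatting, severity determination" point from the log4net comparison.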
(Source for the Telemetry and Data definitions: The Internet)

TDD

Test-Driven Dev
• Need new feature or change in behavior
• Bug was reported
• So we…
• Write a test for it
• See the test fail
• Then proceed to…
• Write code to implement new feature or fix bug

Telemetry-Driven Dev
• Need to know how long a Web API call is taking
• Need to diagnose error
• So we…
• Instrument the code
• Observe the data
• Then proceed to…
• Answer questions & explain issues using data

Semantic Logging is a Mindset
• Planning – dev, ops, business are all potential customers
• Move effort to earlier in development process
  – better-thought-out logging (instrumentation), rather than more effort in log parsing
• Think about what your application requires:
  – Pattern: FooStart, FooEnd, FooException

Questions Telemetry Can Answer
• How long, on average, do my APIs take?
• Are my APIs meeting SLA?
• Is my site responding?
• How many users are currently on my site?
• Is everything going well?
  – Code exceptions
• Is my current capacity optimal?
  – Cloud Services

Better-Defined Automatable
• Some questions have answers that can be automated
  – SLA performance compliance
  – Up or Not
• Do X if Y – example, SLA
  – SLA violations > 5% in past hour, alert human
  – At end of month, create report and apply credit
• MUST HAVE STRUCTURED DATA to be possible
  – Processing the data is left as an exercise for the reader

Tools for Answering Questions
• ETW, SLAB, PerfView
• Windows Azure Diagnostics (WAD)
  – (quick demo if there’s time)
• log4net, NLog, Enterprise Library Logging Application Block
• …
• But wait – there’s more!

The Right Tool for the Job
• Windows Azure Portal
• Windows Azure Diagnostics
• ELMAH
• Glimpse
• Google Analytics Real Time
• (some for money like…)
• AppDynamics, New Relic, Azure Watch, …

ELMAH email
From: <monitor@pageofphotos.com>
Date: Wed, Sep 11, 2013 at 2:09 PM
Subject: ELMAH-PageOfPhotos-Error
To: codingoutloud@gmail.com

System.Web.HttpException: The controller for path '/create-error' was not found or does not implement IController.
Generated: Wed, 21 Nov 2012 19:08:59 GMT
System.Web.HttpException (0x80004005): The controller for path '/create-error' was not found or does not implement IController.
   at System.Web.Mvc.DefaultControllerFactory.GetControllerInstance(RequestContext requestContext, Type controllerType)
   at System.Web.Mvc.DefaultControllerFactory.CreateController(Req

Glimpse
www.getglimpse.com

Bill’s Logging & Telemetry Stack

OLD – still used/useful
• log4net, NLog, EntLib Logging Block
• IIS logs
• Windows Events – Event Viewer
• Existing logging from existing services

NEWER – distributed apps
• Event Tracing for Windows (ETW)
• Semantic Logging mindset
• TDD (Telemetry-Driven Dev)
  – Continual incremental improvements
• SLAB
• Platform Services: Windows Azure Portal, Windows Azure Diagnostics
• Third-Party Services: ELMAH, Glimpse, Google Analytics Real Time, New Relic, Application Insights within Azure…

So Now What?
• Realize old-school logging will be here for a loooong time
• Realize ETW has rough edges, but is still the best we have for holistic analysis, kernel-mode performance, and a standardized approach
• Embrace Semantic Logging – move the effort to where it has the most leverage
• Embrace “TDD” and continually elevate your logging to telemetry
• Don’t be a snob: use multiple tools if you can

Resources
• EventSource Class (in .NET 4.5) - http://msdn.microsoft.com/en-us/library/system.diagnostics.tracing.eventsource.aspx
• SLAB (part of EntLib 6) - http://msdn.microsoft.com/en-us/library/dn169621.aspx
• PerfView - http://www.microsoft.com/en-us/download/details.aspx?id=28567
• Telemetry defined - http://en.wikipedia.org/wiki/Telemetry
• Telemetry Basics from CAT team - http://social.technet.microsoft.com/wiki/contents/articles/17987.cloud-service-fundamentals.aspx#Telemetry_Basics_and_Troubleshooting

More Resources
• Activity Id in .NET 4.5.1 https://github.com/jonwagner/EventSourceProxy/wiki/Implementing-an-EventSource
• TOOL Tutorial: https://github.com/jonwagner/EventSourceProxy/wiki/Using-LogMan-for-ETW-Tracing

Look at (if time)

More Interesting EventSource
• https://github.com/codingoutloud/Pronounce.io/blob/master/Pronounce.Logging/PronounceEventSource.cs

Web API Tracing Action Filter
• https://gist.github.com/codingoutloud/9109e67e10aa2e97e7b6

Questions?

Find this slide deck here

See you at Boston Azure
bostonazure.org