Surviving Technical Due Diligence – Case Studies
Mike Kelly, Managing Partner

Topics
• Tech DNA and the work we do
• Top problems we find in tech due diligence
• Case studies
• Q&A

Tech DNA
• Started in 2009.
• Consultants all have 10+ years of experience with large-scale software development: Microsoft, Black Duck, various start-ups.
• About 40 projects evaluating software quality – everything from mobile apps, cloud services, and integrated hardware/software products to on-premise IT products.
• Multiple languages / technologies.
• Developed methodology and automated tools.

Tech DNA focuses on Technical Due Diligence
• Many people are aware of financial, legal, and IT due diligence during acquisitions.
• For technology acquisitions, the technical and staff quality can be just as important.
• Our clients are the acquirer or investor and are concerned about:
  • What exactly does it do (beyond the marketing hype), and how?
  • How easy will it be to integrate with other products / services I offer?
  • What areas am I going to have to invest in that I may not have expected to (valuation)?
  • What about growth – scale, DR, etc.?
  • Sometimes other specific articulated / unarticulated concerns we try to address.
• Basic question: if I'm taking on a bunch of new code and people, what should I be worried about?

Buy Side vs Sell Side
• We have mostly worked on the buy side and for PE investors.
• There is also potential value on the sell side: making sure you are better positioned for an eventual sale, and nipping a few things early that would be problematic if they went on for a long period.
• Other services:
  • Your core business may not be technology, but you have just acquired a technical component. What should you worry about?
  • A new CIO / CEO wants an evaluation of key internal technical projects – we've done this for a UK-based private equity investor.
  • Outsourced development efforts are "off the rails" – what's the path forward?

People
• We're doing technical due diligence – but a lot of it is about people.
  • Good CTOs drive solid designs and architecture.
  • Good devs are professional about practices like testing, source control, etc.
• Our technical audit really is also an audit of the people involved in building the tech.
• Sometimes that's explicit, sometimes implicit.

Approach
• Inputs: source code, tests, stories / tech doc, interviews, technical research.
• Typical engagement: 2-4 weeks.

Summary
• Our report gives a high-level summary of risks by technical area.
• We also go into detail on each of these areas.
• Goals:
  • Inform integration planning.
  • Potentially inform valuation.
  • Identify pre-close remediation work.
• Let's look at a few areas that consistently come up as problematic.

Top Problem Areas
• We analyzed about a dozen projects we've done across a variety of different technology areas – desktop apps, server products, etc.
• Top RED (problem) areas:
  • No threat models – 100%
  • No DR plan – 100%
  • Code contains "magic values" rather than having these in config files – 100%
  • UI is English only (no localized UI) – 100%
  • Code is not localization ready (no separated localizable resources) – 92%
  • No security testing regularly done – 92%
  • Code is poorly or not at all commented / inadequate tech doc – 85%
  • No automated tests – 77%
  • No "smoke test" for check-ins / commits – 54%

Top Problem Areas: Security
• No threat models – 100%
• No DR plan – 100%
• No security testing regularly done – 92%
• Three of these are related to security:
  • Threat models are an organized way of reasoning about potential threats to your system and how to address / mitigate them. They don't have to be extensive – the security consultants we work with typically develop basic ones in about a week.
  • Disaster Recovery (DR) is arguably a security issue. There is lots on the web about this. Key point: regular practice makes perfect.
  • Security testing is a must. With the Home Depot, Anthem, etc. exploits, we see the cost of not doing regular security testing. Without threat models, security testing is just random poking at things that seem to make sense intuitively.

Disaster Recovery Case Studies
• One company we looked at had a backup CoLo for its servers, but when Hurricane Sandy hit NYC, the backup CoLo was also knocked out by a multi-state power outage. They were down for about 12 hours.
• Another was an enterprise app whose DR plan relied in part on each user (who might not be in the affected area) changing the path to the server, instead of relying on multiple "A" records in the DNS path to the server (see the failover sketch below).
• Cloud hosted ≠ having a DR plan.
• Understand "availability zones" and take advantage of them when you deploy.
• Most companies we look at have no DR plan, and among the few who do, in most cases it hasn't been fully exercised to find gaps and get staff familiar with what to do.
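A minimal sketch of the client-side DNS failover idea, in Python. The service name and helper here are hypothetical; the point is that a client which walks every "A" record returned for a name, rather than pinning itself to a single address, fails over automatically instead of requiring each user to re-point at a backup server by hand.

```python
import socket

def connect_with_failover(hostname, port, timeout=5.0):
    """Try every address published for `hostname` in turn, not just the first."""
    last_error = None
    # getaddrinfo returns one entry per DNS record for the name.
    for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
            hostname, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock  # first reachable address wins
        except OSError as exc:
            last_error = exc  # this address is down; try the next record
            sock.close()
    raise ConnectionError(f"no reachable address for {hostname!r}") from last_error

# Hypothetical service name -- substitute your own:
# conn = connect_with_failover("app.example.com", 443)
```

Keep the TTLs on such records short enough that a change propagates within your recovery window; the serverfault thread in the resources below covers the trade-offs of DNS-based failover.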
Resources for Security / DR Issues
• OWASP Threat Modeling – https://www.owasp.org/index.php/Threat_Risk_Modeling
• Jesse Robbins deck on DR – http://www.slideshare.net/jesserobbins/gameday-creating-resiliency-through-destruction
• Using DNS for DR – http://serverfault.com/questions/424785/if-dns-failover-is-not-recommended-what-is
• NCC Group on security testing – https://www.nccgroup.com/en/our-services/security-testing-audit-and-compliance/security-and-penetration-testing/

Top Problem Area: Poor Code Practices
• Code contains "magic values" rather than having these in config files – 100%
• Code is not localization ready (no separated localizable resources) – 92%
• Code is poorly or not at all commented / inadequate tech doc – 85%
• Startups are busy – so it's very tempting to cut these corners.
• Most of these practices aren't really time-consuming. They're like flossing – you just have to get in the habit, and then it happens without thinking about it.
• For config files and loc – someone needs to invest in setting up the framework, and then everyone can easily take advantage of it.
• Note this is localization ready, not localized. The decision to localize is a business decision; the decision to be ready for it is a technical decision. It is much easier to do early on than to retrofit into a large codebase.

Architecture Drawing: Something Better than Nothing
• Architecture drawings don't have to be polished Graffle or Visio drawings.
• A photo of a whiteboard (as one company supplied to us) is often very helpful, and easy to capture as you modify designs.
• Helpful for new hires as well.
• One idea – have the architect narrate a video of the whiteboard drawing and capture that – 5 minutes of work and worth its weight in gold.

Config Values: Case Studies
• Here are some of the things we've found embedded in code instead of separated into config files:
  • Passwords for ancillary services (e.g. database, billing backend)
  • Email addresses of staff members
  • Private keys for encrypting data like passwords, credit card numbers, etc.
  • "Secret" keys for connecting to private APIs
  • "Special" user names for users given backdoor / superuser access to the system
  • IP addresses of ancillary servers
• For most settings, we prefer config files deployed along with the code over storing these things in a database (see the sketch below):
  • Changes have source control history associated with them.
  • If a database is restored, old values aren't inadvertently restored along with user data.
  • Some platforms have special support for "debug" values of settings in config files (e.g. ASP.Net's web.config, Java Configuration, etc.).
• Config separate from code is usually essential for effective DR.
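To make the preference concrete, here is a minimal sketch of config-file loading in Python. The file names and keys (config.json, database_host, billing_api_key) are hypothetical; the point is that every would-be "magic value" lives in a file that ships with the code, with an optional per-environment override in the spirit of the "debug settings" support mentioned above.

```python
import json
import pathlib

def load_config(env=None):
    """Load settings from a config file deployed alongside the code."""
    cfg = json.loads(pathlib.Path("config.json").read_text())
    if env:  # e.g. env="debug" picks up config.debug.json overrides
        override = pathlib.Path(f"config.{env}.json")
        if override.exists():
            cfg.update(json.loads(override.read_text()))  # shallow merge suffices here
    return cfg

cfg = load_config()
db_host = cfg["database_host"]        # was: an IP address buried in code
billing_key = cfg["billing_api_key"]  # was: a "secret" key embedded in source
```

Because the file deploys with the code, every change to a server address or key is visible in source history, and restoring a database never drags stale settings back in.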
Code Practices: Case Studies
• Bad example: no source control, multiple source trees (one per customer), no code review, no tests.
• Good example: a GitHub shared repository, with all changes managed through pull requests from per-feature branches.
  • Pull requests are reviewed by a senior developer. When accepted, they are merged to the "develop" branch.
  • Continuous integration (CI) builds of the "develop" branch run automated tests and push test results into a Slack channel the team uses.

Resources for Code Practices
• Search for "i18n" for your framework/language – typically there are frameworks already there ("i18n" is the standard abbreviation for "internationalization").
  • Ruby, Python, Java, C#, and PHP all have packages to help with this.
  • If you want to overachieve, do a "pseudo-loc" release of your product, where the values for the localized strings are all machine-generated gibberish. Then click around in your UI and see if anything English comes up – that's a missing loc item.
  • Stack Overflow on i18n – http://stackoverflow.com/questions/tagged/internationalization
• Config files – don't reinvent the wheel – use what your framework provides. YAML, JSON, and XML are all good choices.
• Make code comments part of your peer code reviews (you are doing peer code reviews, right?). Anything that has to be explained in a code review is a good candidate for an added comment. Clearly named methods and properties help avoid the need for comments.

Top Problem Area: Testing
• No security testing regularly done – 92%
• No automated tests – 77%
• No "smoke test" for check-ins / commits – 54%
• Security testing was already covered.
• Automated tests – the problem is probably understated here. The 23% who have them usually have pretty minimal suites. The typical excuse: "we don't have time."
• A good automated test suite will let you go faster:
  • You can deploy with confidence that you haven't broken anything.
  • You catch bugs early in continuous builds, rather than downstream when they are harder to diagnose / figure out.
• A smoke test is just a subset of the automated suite – run on every commit, it means devs can't push fundamentally bad code to the shared repo (see the sketch after the case studies below).
  • Initially your smoke test can just be your whole automated test suite. Subset to a "smoke test" only if running everything on every commit is too cumbersome.

Testing: Case Studies
• Worst case (unfortunately typical): no tests. Not automated, not manual – all ad-hoc testing. Very challenging as the user base and complexity grow.
• Medium case: CRM app. Manual test script tracked in Excel or Google Docs. Copy the workbook prior to deployment, follow the steps, and mark the results. Developers follow up on any failures.
• Best case: hardware storage appliance with embedded software. Fully automated tests run on all builds – both developer unit tests and E2E integration tests based on real-life scenarios. Failure injection at all levels; 72-hour burn-in tests run on weekly builds. Test coverage measured at 70% of the codebase. Customer bugs result in a new automated test added to the suite.
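One way to wire up the "smoke test as a subset of the full suite" idea is with test markers. The sketch below uses pytest's marker mechanism; the App class is a stand-in for the real application, so the names are illustrative only.

```python
# test_smoke.py -- illustrative sketch; App stands in for the real product.
import pytest

class App:
    def __init__(self):
        self._orders = {}

    def place_order(self, sku, qty):
        oid = len(self._orders) + 1
        self._orders[oid] = (sku, qty)
        return oid

    def get_order(self, oid):
        return self._orders[oid]

@pytest.mark.smoke  # fast check: part of the per-commit gate
def test_app_starts():
    assert App() is not None

@pytest.mark.smoke  # fast check: part of the per-commit gate
def test_order_roundtrip():
    app = App()
    oid = app.place_order("X1", 1)
    assert app.get_order(oid) == ("X1", 1)

def test_burn_in():
    # Stand-in for a long-running E2E / burn-in test: full suite only,
    # deliberately not marked "smoke".
    pytest.skip("runs in weekly builds, not the per-commit gate")
```

With the marker registered in pytest.ini (markers = smoke: fast subset run on every commit), `pytest -m smoke` becomes the per-commit gate, while plain `pytest` runs the whole suite in CI.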
Resources for Testing
• James Whittaker, "How Google Tests Software" – http://books.google.com/books/about/How_Google_Tests_Software.html?id=vHlTOVTKHeUC
• Stack Overflow posts on smoke tests – http://stackoverflow.com/questions/tagged/smoke-testing

Conclusions
• Tech DD is like having a home inspector come when you sell your house – if you've been keeping up on routine maintenance, you'll usually do fine.
• It's not just for technology companies buying technology companies – any acquisition where the tech is a key part could benefit from this.
  • E.g. we did a project for a transportation logistics company buying a LoB app.
• These areas aren't just nits – they can cause real integration costs for acquirers and can lead to reduced valuations.
• They are also marks of professionalism that make an acquirer feel more confident about the acquisition.
• We've just covered the top 10 areas that are most frequently problems – we look at about 60 different areas.
• We didn't cover third-party / open source licensing – but we can also do that.
• More information: www.tech-dna.net or mike@tech-dna.net.

Q&A

Appendix

Acquisition Risks
• Key risks in a software acquisition: time-to-market risk, valuation, integration problems.

Acquisition Stages
• Strategy → Identify Target(s) → Negotiation → Letter of Intent → Due Diligence → Close → Integration.
• Tech DD happens after an acquisition target has been approached and a Letter of Intent (LoI) has been signed.
• The LoI typically has an exclusivity period (45-90 days) prior to close.
• DD happens during this period to help inform final valuation negotiations, identify any remediation needed prior to close, and inform the final go/no-go decision.
• Very time-constrained – we typically conduct our investigation over a 2-3 week period.

Top Green Areas
• Is code distributed among logical/understandable components? Are interfaces used? Is information hidden effectively (e.g. the right level of access limitation on classes, members, properties, etc.)? (See the sketch after this list.)
• Can the team make changes to any prior release still in supported use, and rebuild and release the change?
• Does the source code control system accurately track every public release (including betas, community technology previews, etc.)?
• Is there a lab environment? Who supports it?
• Is there a prioritized bug list?
• Is there a process for triaging and managing change just prior to release?
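As an illustration of the first green-area bullet (interfaces and information hiding), a small hypothetical Python sketch; the names are invented, but the shape is what well-factored code tends to look like.

```python
from abc import ABC, abstractmethod

class BillingProvider(ABC):
    """Interface: callers depend on this, never on a concrete vendor class."""
    @abstractmethod
    def charge(self, account_id: str, cents: int) -> bool: ...

class ExampleBilling(BillingProvider):
    """Hypothetical concrete component; the real vendor call is elided."""
    def __init__(self, api_key: str):
        self._api_key = api_key  # leading underscore: an implementation
                                 # detail hidden from callers

    def charge(self, account_id: str, cents: int) -> bool:
        return cents > 0  # a real implementation would call the vendor here

def close_invoice(billing: BillingProvider, account_id: str, cents: int) -> bool:
    # Depending only on the interface means the component can be swapped out
    # in tests, or during post-acquisition integration, without touching
    # this code.
    return billing.charge(account_id, cents)
```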