Alfresco on AWS
Provisioning and deploying Alfresco solutions on Amazon Web Services

Advantages Of The Cloud

What is the cloud?
Our Definition
• “The Cloud” is a massively overloaded term
• Here, we use it to refer to virtualised infrastructure, wrapped in an easy-to-use web service API
• Infrastructure as a Service (IaaS)
• Can be private (see OpenStack & Ubuntu) as well as public

What is the cloud?
The Wider Sense
• Storing something “in the cloud” has come to mean using network-accessible resources to store objects (usually files)
• Marketing around the cloud has created a very broad definition: companies and products that previously operated “on the Internet” are now described as part of the cloud

How can it help us?
Advantages
• Effectively unlimited resources, available on demand
• Pay-as-you-go model – no capital expenditure on infrastructure
• Geographically distributed
• Rapid scalability
• Provisioning speed – new machines can be provisioned and deployed to almost instantly

How can it help us?
Disadvantages
• The actual hardware is managed by a third party in a central location – issues with this hardware can cause outages
• Provisioned instances don’t have concrete hardware specifications, so sizing architectures can be difficult
• Privacy – some industries have legislative restrictions on how and where they store data

How can it help us?
Alfresco
• Alfresco Cloud – the ultimate example of a successful Alfresco AWS deployment
• Client solutions – allows the infrastructure behind a client’s Alfresco solution to grow as they do
• Cost savings – a move to operational expenditure reduces the risk of a deployment
• Loaded with content – Alfresco manages content effectively; the cloud can help ensure that content is distributed efficiently, securely and reliably

AWS

The AWS console

AWS Services
• EC2 - Pay-as-you-go (or reserved) compute power, delivered as a virtual server, with a number of premade OS images to install
• S3 - Storage service, accessible over HTTP
• EBS - Block storage volumes (virtual hard drives), decoupling an instance’s data from the instance itself
• VPC - A virtual private cloud, providing VLAN-like control of network segments, and a VPN option

DevOps

DevOps
What is it?
• A methodology that makes collaboration the highest priority, ensuring that development and operations teams can deliver software quickly and efficiently
How does it relate to the cloud?
• Tools like Puppet and Chef have been built to help developers and operations get on the same page
• IaaS environments create a juicy target for these tools
• Automation means issues get resolved more quickly, and we can do things like auto scaling!

Puppet

What is Puppet?
Service Orchestration
• Puppet is a resource-oriented way of specifying what you want your nodes to look like
• The “how” of making a given node match its specification is taken care of by Puppet
• Resources in the node specification typically take the form of abstracted concepts, such as a “service”, a “file”, a “mount”, etc.

Back to front?
[Diagram labels: Puppet Master, Puppet Certificate Authority, Puppet Agents; steps numbered 1 to 3]

The Puppet Agent drives updates
• The Puppet agent needs to be installed on every node in an environment
• This agent discovers the master from DNS

Resource driven
• [Tomcat must be installed]
• [Tomcat must be started]
• [ImageMagick must be installed]
• [alfresco-global.properties must have these values:
    dir.root=/opt/alfresco
    swf.dir=/opt/alfresco/swf ]
• [The alfresco and share webapps must exist, copied from this location: /opt/puppet]

Alfresco manifest snippet – pt1

    # Keep Tomcat running, but only once its config, package and webapps directory exist
    service { 'tomcat6':
      ensure  => 'running',
      enable  => true,
      require => [
        Augeas['alfresco-global.properties'],
        Package['tomcat6'],
        File['/var/lib/tomcat6/webapps'],
      ],
    }

Alfresco manifest snippet – pt2

    augeas { 'custom-log4j.properties':
      # Parse only this file, using the Java properties lens
      incl      => '/var/lib/tomcat6/shared/classes/alfresco/extension/log4j.properties',
      lens      => 'Properties.lns',
      # Tree path of the file loaded via incl ('/files' + the absolute file path)
      context   => '/files/var/lib/tomcat6/shared/classes/alfresco/extension/log4j.properties',
      load_path => '/etc/augeas',
      changes   => [
        'set log4j.appender.File.File /var/log/tomcat6/alfresco.log',
      ],
      # Restart Tomcat whenever the logging configuration changes
      notify    => Service['tomcat6'],
    }

Provisioning from the command line

    puppet node_aws bootstrap --image=ami-f9231b8d --type=m1.medium \
      --keyname=gatewaykey --login=ec2-user \
      --keyfile=/home/ec2-user/.ssh/gatewaykey.pem \
      --puppet-version=2.7.14 \
      --puppetagent-certname=cms3.ixxus.co.uk \
      --region=eu-west-1 --tags=Name=cms3 --subnet=subnet-6c033205
    notice: Creating new instance ...
    notice: Creating new instance ... Done
    notice: Creating tags for instance ...
    notice: Creating tags for instance ... Done
    notice: Launching server i-49327f01 ...
    ######################################################################
    notice: Server i-49327f01 is now launched
    notice: Server i-49327f01 public dns name: , private ip: 10.0.0.56
    notice: Waiting for SSH response ...
    notice: Waiting for SSH response ... Done
    notice: Installing Puppet ...
    notice: Puppet is now installed on: 10.0.0.56
    notice: No classification method selected
    notice: Signing certificate ...
    notice: Signing certificate ... Done

Alfresco on AWS - Opportunities

What can we do?
Opportunities
• Auto scaling – CloudWatch, New Relic, Nagios and other monitoring / APM tools can help you trigger actions when load hits specified limits
  • These actions can include provisioning new instances in AWS
• An Alfresco Admin Console for AWS
  • Show the current status of the whole environment
  • Trigger actions to provision more servers
  • Historical usage statistics
• Much, much more!

Cloud Challenges

Infrastructure setup
DNS
• EC2 on its own doesn’t help with persistent addressability
• Depending on architecture, you may need a way to address nodes after a stop/start
VPC
• Security requirements may mandate the use of a VPC
• VPCs are great, but leave you in charge of network setup
• They give you sticky IPs – private addresses persist across a stop/start

VPC beginnings...
[Diagram labels: Internet, External tier, Gateway, Internal tier, CMS, Search, XML, RDBMS, Transform., Incoming route, Outgoing route]

Infrastructure setup
Images
• Another question of cloud architecture
• Do you bootstrap from nothing, have a base image, or maintain images for all of your services?
Storage
• EBS provides a quick answer, but reliability concerns need to be considered
• Puppet artefact storage
• S3 (alfresco-cloudstore)?

Puppet challenges
Automated provisioning and certificates
• Often, the ability to automatically deploy a node, or an entire environment, is highly desirable
• One option is node_aws
• Setup of a Puppet Master is often required before automated provisioning can begin
• The Puppet Master needs to automatically sign each new node
• The ability to resolve DNS internally is useful here

Puppet challenges
Code Artefact development
• At Ixxus, we primarily use Maven (with some Ant and Gradle thrown in)
• How do we get our developed artefacts from CI to our application servers?
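
One possible answer to the question above, as a minimal sketch rather than the deck's actual manifest: Puppet's built-in "file" resource can copy a built artefact into the application server. The sketch assumes the WAR has already been fetched from the repository to /opt/puppet on the node (the location named on the "Resource driven" slide). The file name alfresco.war is a hypothetical placeholder. Package['tomcat6'] and Service['tomcat6'] refer to the resources used in the earlier manifest snippets.

```puppet
# Sketch only: deploy a WAR with Puppet's built-in "file" resource.
# Assumption: the artefact already exists at /opt/puppet/alfresco.war on the node
# (the file name is a hypothetical placeholder).
file { '/var/lib/tomcat6/webapps/alfresco.war':
  ensure  => file,
  # A local absolute path is a valid source; Puppet copies it when the content differs
  source  => '/opt/puppet/alfresco.war',
  # Tomcat must be installed before anything is placed in its webapps directory
  require => Package['tomcat6'],
  # Restart Tomcat when a new build of the WAR is copied in
  notify  => Service['tomcat6'],
}
```

With this in place, a changed WAR would be picked up on the next agent run, and the notify would restart Tomcat. A puppet:/// URL served by the Puppet Master's file server could be used as the source instead of a local path.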
[Diagram: JAR, WAR → Repository → Puppet “File” resource → Application Server]

Thanks!

Further reading
• http://docs.puppetlabs.com/guides/cloud_pack_getting_started.html
• http://aws.amazon.com/documentation/
• http://code.google.com/p/alfresco-cloud-store/
• http://newrelic.com/
• http://augeas.net/

Questions?