Deploying a Neos 2.0 website to AWS with Elastic Beanstalk

Have you ever heard the phrase: “Why don’t you just host it in The Cloud?” and wondered if that might be possible with your CMS-driven website?
There are many benefits to using public clouds like Amazon Web Services to host your website, but there are also some obstacles that need to be overcome in order to make it work.
With Neos 2.0 it is now possible to utilize all these benefits of running your website in the Cloud.
This tutorial will guide you through setting up automatic provisioning and deployment of a Neos 2.0 website with Amazon Elastic Beanstalk.
Some knowledge of Amazon Web Services and Neos is required to follow the tutorial.

Purpose and background

The purpose of this tutorial is setting up a Neos website running on Amazon Web Services (AWS). The deployment will be done with Elastic Beanstalk, and the site will be configured not to store any resources (files etc.) on the local disk of the webserver, but instead store all resources on Amazon S3. This is required when running applications in The Cloud. You should expect that the webserver might reboot or be temporarily unavailable, but AWS will automatically create a new instance with the correct software installed and deploy your application (the Neos site) on it.

If you want to try it out, just clone the https://github.com/revsbech/neos-aws-base repository, create an environment and you are ready to go. 

Take a look at the finished product here: http://neosdemo-env.elasticbeanstalk.com/

There are 5 steps to setting up this environment, but it is very much worth the effort. 

STEP 1 - Local Neos site for development

First of all, make sure that you have a running version of the Neos website you wish to deploy. In this tutorial, you are deploying using GitHub and composer, so I suggest that you fork the Neos Base distribution from here (github) and install it in your local development environment. Make sure that your site works as expected.

STEP 2 - Install the Amazon Elastic Beanstalk CLI locally

In order to complete this tutorial, you need to install the Elastic Beanstalk CLI (hereafter shortened to eb) on your local development environment. See the eb manual for information on how to do this.

STEP 3 - Creating an AWS Elastic Beanstalk application

Using the Amazon console, navigate to Compute -> Elastic Beanstalk. Here you create a new application called NeosDemo.

Click "next" and choose "Web server". Click "next" and choose "PHP" as the predefined platform (default is PHP 5.6). For now, choose "Single instance" as your environment type. Otherwise you will install a load balancer and auto-scaling group that ensures you always have a specified number of webservers available. In production this is very much desired, as it allows you to deploy without downtime and to have high availability. For the purpose of this tutorial, "Single instance" is fine though.

Next choose "Single application", and click "next". You then have to fill in a name for the environment. It will suggest something like neosdemo-env which is fine. The idea is that you can have several environments for a single application. Production, Staging and Test for instance. It will also suggest a url neosdemo.elasticbeanstalk.com.
Notice that this URL must be globally unique, not just within your account!

Next you are asked if you wish to create an RDS database instance for this application. Unless you have your own RDS DB preinstalled, you probably want a database. For this tutorial, you will create a new DB instance, so specify that you want a single RDS DB instance. The hostname (and username/password) for this DB instance is later made available during the deployment of your application. I often create the RDS DB instance manually and not as part of the eb deployment, allowing me to have more control of the database. But for this purpose, a new DB instance for every deployment is perfect.

Next you are prompted for which instance type, SSH keypair, and e-mail notification address you wish to use. Review the settings, and click Launch. eb will then kickstart a new environment consisting of a single server for your neosdemo application.
Sweet!

Let it run for a few minutes. Meanwhile, you can read a little more about eb here. 

When this is done, click "Configuration" on your newly created neosdemo-env environment and click the gear next to "Software Configuration".

This will allow you to edit several things. For now, you need to specify the apache DocumentRoot to be /Web, since this is what Flow/Neos uses. You can choose other settings as well, and you will add the environment variable FLOW_CONTEXT = Production here too. eb will now restart your environment, and will probably fail since the Web folder does not exist yet. But currently that does not matter. You still have a lot to do before you have a running application.
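As an aside, these two settings can also be kept in version control instead of being clicked through the console. A sketch using the standard option namespaces for the Elastic Beanstalk PHP platform (the filename is my own choice):

```yaml
# .ebextensions/05-options.config -- the same settings as in the console UI
option_settings:
  - namespace: aws:elasticbeanstalk:container:php:phpini
    option_name: document_root
    value: /Web
  - namespace: aws:elasticbeanstalk:application:environment
    option_name: FLOW_CONTEXT
    value: Production
```

Keeping the settings in a file means any new environment for the application starts out configured correctly.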

If you navigate to your AWS EC2 dashboard, you will see that an extra instance has been created with the name neosdemo-env, and in your RDS dashboard you will see a new DB instance as well.

STEP 4 - Configuring eb on your local installation

Next, you have to do some setup on your local working Neos site. As you have already installed the eb CLI (right?), just navigate to the root of your Neos installation and type eb init.

Assuming authentication is correctly set up (see the eb manual), you will be asked which region you wish to deploy to. Choose your favorite region (it needs to be the same region that hosts your DB instance). Next you will be prompted for which application to use. You should see the "NeosDemo" application you created earlier. Choose that one. Done!

You should notice that the command modified your .gitignore file (assuming you have one) to include some rules for the new .elasticbeanstalk folder. It also creates the .elasticbeanstalk folder for you with a single config yml file in it. Take a look at it; it contains information about instance type etc.
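The generated file looks roughly like this (the values are examples from my setup; yours will differ, and the exact keys depend on the CLI version):

```yaml
# .elasticbeanstalk/config.yml (example values)
branch-defaults:
  master:
    environment: neosdemo-env
global:
  application_name: NeosDemo
  default_region: eu-west-1
  profile: default
```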

Next, you have to add a few things to make it work correctly. First create a new folder called ".ebextensions" in the root of your Neos application. This folder will contain settings that control how the application is deployed on your environment.

- Writing Flow Settings.yaml file

The first thing you need to fix is the DB connectivity. Since you dynamically spin up new DB instances when resetting the environment, you do not know the IP of the database server, and you do not wish to store the username/password combination directly in the Configuration/Settings. Luckily the deployment with eb knows that you have an associated RDS DB instance, and knows the password for it! The information is actually made available in the environment variables RDS_DB_NAME, RDS_USERNAME, RDS_PASSWORD, RDS_HOSTNAME and RDS_PORT. But since Flow expects these to be part of a Settings file, you write a pre-deployment hook that automatically creates the Configuration/Settings.yaml file during deployment.

This is done by creating the file .ebextensions/10-initDbAndConf.config with the following content:

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/pre/11_write_flow_settings.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      set -xe
      . /opt/elasticbeanstalk/support/envvars
      EB_APP_STAGING_DIR=$(/opt/elasticbeanstalk/bin/get-config container -k app_staging_dir)
      cd $EB_APP_STAGING_DIR
      cat <<EOF > Configuration/Settings.yaml

      TYPO3:
        Flow:
          persistence:
            backendOptions:
              dbname: '$RDS_DB_NAME'
              user: '$RDS_USERNAME'
              password: '$RDS_PASSWORD'
              host: '$RDS_HOSTNAME'
      EOF

It looks a little weird, but what it does is create a bash script in the file /opt/elasticbeanstalk/hooks/appdeploy/pre/11_write_flow_settings.sh.

This file is automatically run on every deployment. The bash script will simply read the RDS environment variables, create a Settings.yaml file and place it in Configuration/Settings.yaml. So your own application better not have that file already, since it will be overwritten! In real life, you would probably have a Production/Aws sub-context that you use on AWS and write the file to Configuration/Production/Aws/Settings.yaml.

But to keep it simple, you just overwrite the Configuration/Settings.yaml file.
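You can simulate what the hook does locally with dummy values (the exported values and the /tmp path below are made up; on the server, Elastic Beanstalk provides the RDS_* variables for you):

```shell
# Fake the variables Elastic Beanstalk would export (hypothetical values)
export RDS_DB_NAME=neosdemo
export RDS_USERNAME=flow
export RDS_PASSWORD=secret
export RDS_HOSTNAME=localhost

# Same heredoc technique as in the hook: an unquoted EOF delimiter
# means the shell expands the $RDS_* variables into the file
mkdir -p /tmp/flowdemo/Configuration
cat <<EOF > /tmp/flowdemo/Configuration/Settings.yaml
TYPO3:
  Flow:
    persistence:
      backendOptions:
        dbname: '$RDS_DB_NAME'
        user: '$RDS_USERNAME'
        password: '$RDS_PASSWORD'
        host: '$RDS_HOSTNAME'
EOF
cat /tmp/flowdemo/Configuration/Settings.yaml
```

The printed file contains the substituted values, which is exactly what the hook produces on the server with the real credentials.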

- Flow initialization

Next, you wish to have the deployment process take a few steps for you. You want it to:

  • Do a cache-warmup
  • Run doctrine-migrate
  • Prune existing site (since you reuse the database)
  • Import the neos-demo site
  • Publish resources


To accomplish this, create the file .ebextensions/20-initflow.config with the following content: 

container_commands:
  20-warmup-cache:
    command: "php flow flow:cache:warmup"
    env:
      FLOW_CONTEXT: Production
  30-doctrine-migrate:
    command: "php flow doctrine:migrate"
    env:
      FLOW_CONTEXT: Production
  35-flush-site:
    command: "php flow site:prune"
    env:
      FLOW_CONTEXT: Production
  40-import-site:
    command: "php flow site:import --package-key TYPO3.NeosDemoTypo3Org"
    env:
      FLOW_CONTEXT: Production
  45-reset-file-permissions:
    command: "chown -R webapp:webapp ."
  50-publish_resources:
    command: "php flow resource:publish"
    env:
      FLOW_CONTEXT: Production
  55-reset-file-permissions:
    command: "chown -R webapp:webapp ."
  60-remove-conffile:
    command: "rm Data/Temporary/Production/Configuration/ProductionConfigurations.php"

Here you have added a section called container_commands. These are commands that are run after the code is fetched, but before the code is loaded into Apache. What actually happens on your environment is that a folder /var/app/ondeck is created, and the content of your application is fetched from S3 (you will upload it in a short while). It then detects that a composer.lock is present and executes composer install. Then it executes the container_commands in alphabetical order. When this is done, the folder is renamed to /var/app/current, ownership is set correctly to the "webapp" user, and apache is restarted.

So here you basically specify that after composer install, but before apache is reloaded, you should run the cache warmup and doctrine migrations, prune any existing sites, import the base site, and last but not least publish all resources.

Normally you probably do not want to prune an existing site, and import a new one, but for this tutorial, you want to have a fresh new neos-base site for each deployment.
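A note for later: if you switch from "Single instance" to a load-balanced environment, commands that must only run once per deployment (doctrine:migrate, site:prune, site:import) should be marked with leader_only, so Elastic Beanstalk runs them on a single instance rather than on every server. A sketch for the migrate command (not needed for the single-instance setup in this tutorial):

```yaml
container_commands:
  30-doctrine-migrate:
    command: "php flow doctrine:migrate"
    leader_only: true
    env:
      FLOW_CONTEXT: Production
```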

- deploy!

Save the files, add the whole .ebextensions to git and commit it.

Now you are ready to deploy your NeosDemo application to the neosdemo-env environment. This is done simply by typing eb deploy on the command line.

What happens is that the eb CLI automatically detects that you have git, and uses that to create a zip file respecting your .gitignore file. So only the files that are under version control will be included in the zip file. The zip file is then uploaded to S3 and transferred to your neosdemo-env environment.
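You can reproduce this packaging behaviour with plain git, since git archive bundles only committed files. A scratch demo (the /tmp path and file names are made up):

```shell
# Scratch demo: only committed files end up in the deployment bundle
mkdir -p /tmp/ebzipdemo && cd /tmp/ebzipdemo
git init -q .
git config user.email demo@example.com
git config user.name demo

echo "<?php echo 'hi';" > index.php
echo "local scratch file" > untracked.txt   # never committed, so never bundled

git add index.php && git commit -qm "initial"

# List what would go into the bundle: index.php, but not untracked.txt
git archive --format=tar HEAD | tar -t
```

This is why you must commit your .ebextensions files before deploying: anything outside version control simply never reaches the server.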

Here the content is unpacked, a composer install is run (since it detects the composer.lock file), your custom container_commands are run, and if all goes well, the application is moved into place, apache is restarted, and boom, your website is ready.

- Accessing the webserver over SSH

A nifty little feature is that you can run the eb ssh command to SSH into your new webserver in the environment. The only catch is that the private key of the key pair you specified during environment creation must be present in your .ssh folder, named after the key pair. In my setup the key pair is named "MOC Ssh key", so I need to have the private key stored in .ssh/MOC Ssh key. This took me a while to figure out.

When SSH'ing into the server, you will find your application deployed to /var/app/current. Apache and PHP error logs are placed in /var/log/httpd/error.log, and the file /var/log/eb-activity.log contains the log of your deployment, where you should also be able to spot the container commands from the previous step.

- Accessing the site over http

Now it's time to access the site in your browser. Use the URL that was associated with your environment. Mine is neosdemo-env.elasticbeanstalk.com; yours is most certainly not (as the URLs are globally unique). If you forget the URL, you can always use the eb open command.

What a bummer, the site looks all weird :( Read on to fix this!

STEP 5 - Fixing static resources

The reason the site looks weird is that static resources by default use the SymlinkTarget, which means that the folder Web/_Resources/Static/My.Package is a symlink to Packages/Application/My.Package/Resources/Public. Normally this is fine; however, it does not work with eb due to the way the deployment is made. Here is why (it took me a while to figure out):

When you run the flow resource:publish command, you are still working in staging, meaning you work in the /var/app/ondeck folder, and hence the symlinks will point to /var/app/ondeck/Packages/Application/My.Package/Resources/Public. When eb later in the deployment renames the ondeck folder to current, the symlinks no longer work. So either you do the resource:publish after the directory has been renamed, or you simply tell Flow to use the normal FileSystemTarget, which will copy the files. In Flow 2 this option was known as mirrorMode. In Flow 3, you instead write the following in your Configuration/Production/Settings.yaml:

TYPO3:
  Flow:
    resource:
      targets:
        ## Change to not use symlinks, since deployment on AWS with EB will not work
        localWebDirectoryStaticResourcesTarget:
          target: 'TYPO3\Flow\Resource\Target\FileSystemTarget'

What about persistent resources?

Good question. Since the whole point of using The Cloud is that you can always scale with more webservers, and if an instance should crash, AWS will automatically create a new instance and deploy to it, you need to think about the resources that the CMS editor uploads to Neos. These are known as persistent resources.

To keep in the spirit of the cloud, you should never save important data on the local disk of your instance. Instead you should use a persistent and shared storage layer. Luckily AWS provides such a system, called S3. And with Flow 3.0 there is a standard package for storing and publishing resources in S3. To install this package simply run composer require flownative/aws-s3 1.0.0-beta4

The detailed manual of how to use this package can be found here: https://github.com/Flownative/flow-aws-s3

For your installation, you add the following to your Production/Settings.yaml:

TYPO3:
  Flow:
    resource:
      collections:
        persistent:
          target: 'cloudFrontPersistentResourcesTarget'
          storage: 's3PersistentResourcesStorage'

      storages:
        s3PersistentResourcesStorage:
          storage: 'Flownative\Aws\S3\S3Storage'
          storageOptions:
            bucket: 'flow.storage'
            keyPrefix: 'sites/neosdemo/'

      targets:
        ## Change to not use symlinks, since deployment on AWS with EB will not work
        localWebDirectoryStaticResourcesTarget:
          target: 'TYPO3\Flow\Resource\Target\FileSystemTarget'
        cloudFrontPersistentResourcesTarget:
          target: 'Flownative\Aws\S3\S3Target'
          targetOptions:
            bucket: 'flow.publish'
            keyPrefix: 'sites/neosdemo/persistent/'
            baseUri: 'http://d3jn4705a8oq8f.cloudfront.net/'

What this does is simply set up storage and publishing on S3 in the flow.storage and flow.publish buckets under the directory sites/neosdemo. These buckets need to be created beforehand. To authenticate, you can either add the S3 credentials directly in the Settings file as described in the AWS S3 package docs, or you can configure Amazon IAM to allow the webserver access to your buckets. The latter is the preferred way.
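For the IAM route, a minimal policy attached to the instance profile of your environment could look like this (a sketch; the bucket names are the ones from this tutorial, and you may want to narrow the actions further):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::flow.storage", "arn:aws:s3:::flow.publish"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": ["arn:aws:s3:::flow.storage/*", "arn:aws:s3:::flow.publish/*"]
    }
  ]
}
```

With this in place, no credentials need to live in your Settings file at all.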

Since S3 is not well suited for heavy load, you have set up a CloudFront CDN distribution which accesses the S3 bucket, and specified that resources should be linked with the URL http://d3jn4705a8oq8f.cloudfront.net

With this, you will have a complete Neos site running in the AWS Cloud with unlimited scalability and high availability. Enjoy!

Thoughts

One might argue that it is a bit risky to have your deployment process depend on composer and github being available. The deployment will fail if github is down, and that might be a big problem. To work around this, you could have the eb deploy script pack up all of your files, including the packages that composer normally installs, and then not run composer during deployment. This means that even if github is down, you can still deploy your application. There is a small trick to this, however: to get eb to skip composer install (pointless, since you packaged up everything), just create an empty vendor folder in the root of your project. eb will then skip the composer install step, even if a composer.lock file is present.
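The trick itself is tiny. A scratch demo (the /tmp path is made up; in practice you would run the last three commands in your real project root):

```shell
# Scratch demo of the "skip composer install" trick
mkdir -p /tmp/neos-vendor-trick && cd /tmp/neos-vendor-trick
git init -q .

# An empty vendor folder in the project root makes eb skip composer install,
# even when a composer.lock file is present
mkdir -p vendor
touch vendor/.gitkeep        # git cannot track an empty directory
git add -f vendor/.gitkeep   # -f in case vendor/ is listed in .gitignore
git status --short
```

Remember that the vendor folder must actually be committed: as noted above, only files under version control end up in the deployment bundle.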