29.06.10

The new MOC hosting setup.

By: Kim Jørgensen

I am writing this article for two reasons.
The first is that I wish to share my observations and the plan for the future of our hosting setup.
The second is to open the design up for any and all comments.

The old setup
The old setup is, I think, pretty standard.
We have a bunch of servers, each working on its own with the basic components:
Webserver, PHP, SMTP-daemon, FTP-server and Database server, all on the same box.
Some are on a virtual server, some run bare-metal.

The new setup
The new setup is something I have been planning and researching for a long time.
It started with the basic idea of wanting a complete N+1 infrastructure,
so that we would have a failover whenever something died.
Failures are not something we have experienced often, but the knowledge that they could happen is always there.

On virtualisation
We started trying to overcome this problem by moving to a virtual infrastructure.
We had some initial trouble getting everything working with virtualisation, and we still have some.
The biggest problem with virtualisation is that there is a noticeable performance degradation.
I think virtualisation is a great technology.
For someone looking to save some money by running a company's internal servers on a pretty failsafe system, without a great need to scale, it's perfect.
But I think that for us, as the platform for our most important service, it's wrong.

A different type of virtualisation
Basically, we already have a point of virtualisation and separation of workloads: our virtual hosts in Apache.
With this as our basis I have designed the new setup.
I want to split everything up on virtual hosts, and generally separate our primary service, web hosting, from everything else.
This will allow us to focus on the performance and stability of this service, and give it a layer of isolation.
To allow our webservers to shift workloads around and scale up for the web workload of a single site, we will be using a caching reverse proxy.
The caching is just a bonus. The real trick is in the proxy, which sends the request for any site to the correct webserver, based on the host named in the request.
Beyond the proxy, two other things are required for this setup to work.
All webservers must be able to access the files that make up the website.
This is accomplished somewhat simply by holding all files on a NAS.
The second item that is required is a non-local database.
Luckily MySQL allows network connections so we can place the database on a dedicated server which the webservers then connect to.
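
To make the routing idea concrete, here is a minimal Python sketch of host-based routing. It is not our proxy configuration: the class name `HostRouter`, the hostnames and the server names are all made up for illustration, and a real reverse proxy does this job far more efficiently. It only shows the principle of picking a webserver from the host in the request.

```python
import itertools


class HostRouter:
    """Maps the Host header of a request to one of the webservers for that site."""

    def __init__(self, pools):
        # pools: dict of hostname -> list of webserver names (hypothetical values).
        # Each site gets its own round-robin iterator over its webservers.
        self._cycles = {
            host.lower(): itertools.cycle(list(backends))
            for host, backends in pools.items()
        }

    def backend_for(self, host_header):
        # Drop an optional ":port" suffix and normalise case before the lookup.
        host = host_header.split(":", 1)[0].strip().lower()
        if host not in self._cycles:
            raise LookupError(f"no webserver pool configured for {host!r}")
        return next(self._cycles[host])


# Example: one site served by two webservers, another by a single one.
router = HostRouter({
    "example.dk": ["web1", "web2"],
    "shop.example.dk": ["web3"],
})
print(router.backend_for("example.dk"))     # web1
print(router.backend_for("EXAMPLE.dk:80"))  # web2
print(router.backend_for("example.dk"))     # web1
```

Because every webserver reads the same files from the NAS and talks to the same remote database, it does not matter which server in a site's pool answers a given request.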

SQL clustering
Putting our MySQL servers on their own hardware has two advantages.
We don't have to deal with the MySQL performance-killing IO of a SAN drive, and we can set up a failover system
that handles MySQL more easily when we don't have to worry about other services on the same server.
The search for "the one good solution" to scaling MySQL performance while running Typo3 has not been easy.
I have looked at every type of high-availability and scaling method I could find for MySQL.
But they all look like they have severe complications for the sites we run, and would require significant changes in how we connect to and use MySQL.
The solution that I think will win and prove itself actually useful is in two parts.
The first is the high-availability element. This will be done by running a Multi-Master setup with a shared Multi-Master-Manager.
The way we then scale performance is a bit of a kludge, but it is the best way I could find. We scale out.
We will create several MySQL groups, each consisting of two servers: an active and an inactive master.
We then move sites from one group to another when a group becomes overcrowded.
The Multi-Master-Manager will handle the monitoring and the switching of the active and inactive servers if one fails.
It will also handle synchronisation, so failover and failback should be automatic.
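
The group idea can be sketched in a few lines of Python. This is only a model of the bookkeeping, under my own assumptions: it is not how the Multi-Master-Manager is implemented or configured, and the names `MasterPair`, `writer_for` and `move_site`, as well as the server and site names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MasterPair:
    """One MySQL group: an active master and an inactive (standby) master."""
    active: str
    standby: str

    def check(self, healthy):
        # healthy: dict of server name -> bool, as reported by some monitor.
        # Swap roles only if the active master is down and the standby is up.
        active_ok = healthy.get(self.active, False)
        standby_ok = healthy.get(self.standby, False)
        if not active_ok and standby_ok:
            self.active, self.standby = self.standby, self.active
        return self.active


def writer_for(site, site_to_group, groups):
    """Return the server a site should currently connect to."""
    return groups[site_to_group[site]].active


def move_site(site, new_group, site_to_group, groups):
    """Reassign a site to another group (the data copy itself is not modelled)."""
    if new_group not in groups:
        raise KeyError(f"unknown MySQL group {new_group!r}")
    site_to_group[site] = new_group


groups = {"g1": MasterPair("db1a", "db1b"), "g2": MasterPair("db2a", "db2b")}
site_to_group = {"site-a": "g1", "site-b": "g1"}

groups["g1"].check({"db1a": False, "db1b": True})   # db1a died: db1b takes over
print(writer_for("site-a", site_to_group, groups))  # db1b
move_site("site-b", "g2", site_to_group, groups)    # g1 is overcrowded
print(writer_for("site-b", site_to_group, groups))  # db2a
```

The point of the sketch is that the sites never need to know about the pair: they only ever see one active master per group, which is why no changes to how a site uses MySQL are needed.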

Other services
With our main service on a good, stable platform, it's time to look at the less critical stuff.
These are services like FTP and email.
They share a very specific trait: they are idle most of the time, and then have sudden spikes of workload.
And while a 1-2 second lag on a website can be really bad, it's okay for FTP, and a lag of 15 minutes or more is generally accepted for email, especially newsletters.
FTP is going to be handled very simply by assigning a group of sites file-access through a single FTP-server.
All the FTP-servers will be run on a virtual host.
Email has to be handled a bit differently. Here we can't just separate by site, but have to separate by what type of email it is.
If a user has just signed up for a service on a website, that user expects the confirmation email to arrive almost instantly.
But if you are sending out a newsletter to a large group of customers, you will probably not care if it spends 15-30 minutes in transit.
We handle notification emails and normal hosted email services the same way.
They are sent either from the webserver or from the webmail server, and they go out directly.
Newsletter emails are handled quite differently. We have every hosting customer add a header to their outgoing email when it is a newsletter.
This allows us to offload all the generated emails to an email server, which builds a long queue of emails and then sends them out as fast as possible. The time used to build the queue also allows the email server to group emails going to the same domain.
Both types of email servers will be run on virtual hosts.
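
As a rough illustration of the split, the sketch below sorts outgoing messages into "send directly" and "queue as newsletter", and groups the queued ones by recipient domain. The header name `X-Newsletter` is an assumption made for this example (the actual header we use is not given here), the function names are hypothetical, and each message is assumed to have a single recipient.

```python
from collections import defaultdict
from email.message import EmailMessage
from email.utils import parseaddr

NEWSLETTER_HEADER = "X-Newsletter"  # hypothetical header name


def split_outgoing(messages):
    """Return (direct, batches): direct mail, and newsletters grouped by domain."""
    direct = []
    batches = defaultdict(list)
    for msg in messages:
        if msg.get(NEWSLETTER_HEADER) is None:
            # Notifications and normal mail go out straight away.
            direct.append(msg)
            continue
        # Newsletters are queued and grouped by the recipient's domain.
        address = parseaddr(str(msg.get("To", "")))[1]
        domain = address.rsplit("@", 1)[-1].lower()
        batches[domain].append(msg)
    return direct, dict(batches)


def make_message(to, newsletter=False):
    """Small helper for the example: build a message with the headers we need."""
    msg = EmailMessage()
    msg["To"] = to
    if newsletter:
        msg[NEWSLETTER_HEADER] = "yes"
    return msg


direct, batches = split_outgoing([
    make_message("new-user@example.dk"),
    make_message("anna@example.dk", newsletter=True),
    make_message("Bo <bo@Example.dk>", newsletter=True),
    make_message("carl@other.dk", newsletter=True),
])
print(len(direct))      # 1
print(sorted(batches))  # ['example.dk', 'other.dk']
```

Grouping by domain is what makes the waiting time useful: the longer the queue is allowed to build, the more messages for the same domain end up in the same batch.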
For some customers we also have a third form of special service: a dedicated video conversion service.
As this shares the peak-load behaviour of the other services mentioned here, it also runs on virtual hosts.
With this setup we can share resources between the servers that only see peak loads, and still get high fault tolerance.

Our current level of deployment
So, how much of this have we actually deployed?
Well, a good part of it.
The email splitting and handling is well in place and working.
The first group of sites has been moved into a setup with redundant webservers and a file server.
They are also running behind a reverse proxy, with great results in both performance and stability.

We also have some of our most demanding sites running behind the proxy.
This is not yet to use the failover ability, but to heighten performance.
You can read the other blog post about the Varnish-cache proxy on this blog to get an idea of the performance boost it gives.

We have also gotten the special video-conversion services moved to their own virtual hosts.

We are also running some sites on a dedicated MySQL server and we have started the test-phase for the failover setup for MySQL.


Expected project results
We are already seeing two results from this project:
more stability and better performance.
What I expect to get from this setup in the future is the following set of features.

The ability to easily see where a performance problem is. As the services are isolated, it's easier to spot.
For example, a climbing load on the MySQL server will most likely mean there's a need for more MySQL muscle, or a problem with the way the database is used.

The ability to easily respond to a performance problem by moving the problematic site to a new server in the group where it's causing problems.
For example, if a site is putting a lot of load on its assigned webservers, we can easily give it more muscle by extending the range of servers that handle that site, again without downtime. Equally, if the problem is MySQL load, the site can be moved to a new server,
although this would cause downtime.

The ability to take a webserver out of production, upgrade it, and put it back online, without any downtime for any site.
A failsafe in case a database or webserver fails, and better stability for FTP, email and specialised services.
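
The "more muscle" and "upgrade without downtime" points both come down to changing which webservers are in rotation for a site. A small sketch of that bookkeeping, with a hypothetical `SitePool` class and made-up server names (this is not an API of our proxy):

```python
class SitePool:
    """The webservers assigned to one site, and which of them receive traffic."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.drained = set()

    def in_rotation(self):
        return [s for s in self.servers if s not in self.drained]

    def extend(self, server):
        # Give the site more muscle by adding another webserver.
        if server not in self.servers:
            self.servers.append(server)

    def drain(self, server):
        # Take a server out for an upgrade, but never the last one serving:
        # that would mean downtime for the site.
        remaining = [s for s in self.in_rotation() if s != server]
        if not remaining:
            raise RuntimeError("cannot drain the last webserver in the pool")
        self.drained.add(server)

    def restore(self, server):
        # Put an upgraded server back online.
        self.drained.discard(server)


pool = SitePool(["web1", "web2"])
pool.extend("web3")        # site is busy: add a server
pool.drain("web1")         # upgrade web1 while web2 and web3 keep serving
print(pool.in_rotation())  # ['web2', 'web3']
pool.restore("web1")
print(pool.in_rotation())  # ['web1', 'web2', 'web3']
```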


I hope this little intro to our new server setup has been interesting.
All comments are of course welcome.
