Load balancing cluster


bigb89
06-16-2008, 11:10 PM
Hi guys,

I'm not sure if what I'm gonna ask is too complex or possible to do, but here it is: I'm trying to create a load balance cluster but I really don't know where to start.

Here's what I'm trying to do: I want to create a load balance cluster for shared hosting. I work for a company where we have multiple shared hosting servers, but of course we can't have all of the customers on one single server, so once a server has enough customers, we have to buy new hardware and also a new cPanel license (which isn't very cheap).

So my boss asked me if there's any way to create a cluster of multiple servers that would pretty much act as if it were one supercomputer. That way we would only need one cPanel license, and all we would have to do is add more servers to the cluster as more customers come in.

Here are my questions: What kind of software/application would I need to use to develop something like that? Do you guys know of any how-to article that will teach me to develop a load balancing cluster?

Any help on this will be very appreciated! :)

P.S. This doesn't need to be created using just *BSD. If you guys know of this kind of cluster implementation using Linux (CentOS to be specific) I would be happy to know about it.

Thanks in advance!

stukov
06-17-2008, 02:48 PM
bigb89, I am in a very similar situation to yours. As a hobby, I am building such a cluster.

The main problem here is cPanel. If you want to build such a cluster, you will need one cPanel license per node and then synchronize your nodes. This isn't very cool, knowing the cost of those licenses.

Anyway, the basic setup of an HP/HA Web hosting cluster is two redundant load balancers (I recommend using PF and CARP for those) in front of a set of Web server nodes serving the pages, with a common /tmp or a custom session store (your PHP sessions need to be available on every node so that they survive if one node goes down).
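
A minimal Python sketch of the shared session store idea, assuming the store is a service every web node can reach. It is not PHP and not a real session backend: `SharedSessionStore` and `WebNode` are hypothetical names, and the in-memory dict only stands in for whatever shared storage (a common /tmp, a database, a cache) the nodes would actually use.

```python
import time


class SharedSessionStore:
    """Hypothetical stand-in for a session store reachable from every node."""

    def __init__(self, ttl_seconds=1800):
        self.ttl_seconds = ttl_seconds
        self._sessions = {}  # session_id -> (expiry_timestamp, data)

    def save(self, session_id, data, now=None):
        now = time.time() if now is None else now
        self._sessions[session_id] = (now + self.ttl_seconds, dict(data))

    def load(self, session_id, now=None):
        now = time.time() if now is None else now
        entry = self._sessions.get(session_id)
        if entry is None or entry[0] < now:
            # Unknown or expired session: drop it and report nothing found.
            self._sessions.pop(session_id, None)
            return None
        return dict(entry[1])


class WebNode:
    """Hypothetical web server node that keeps no session state locally."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id, updates=None):
        session = self.store.load(session_id) or {}
        session.update(updates or {})
        session["last_node"] = self.name
        self.store.save(session_id, session)
        return session


store = SharedSessionStore()
web1 = WebNode("web1", store)
web2 = WebNode("web2", store)

# The first request lands on web1; a later one lands on web2 (say web1 died).
web1.handle("abc123", {"user": "bigb89"})
after_failover = web2.handle("abc123")
```

Because neither node holds the session itself, `after_failover` still contains the `user` value that was written through `web1`.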

After, if you want a MySQL cluster, it is another story...

I hope this is clear enough so you understand how it works and get started. If you need any help, feel free to post back here. I'm going to do my best to help you out with the best of my knowledge.

Good luck.

dk_netsvil
06-17-2008, 04:21 PM
I've had quite a lot of experience building and installing HA load balancers using Linux variants, but I've yet to do so using BSD. HAProxy has FreeBSD and OpenBSD sources available, though, and is very reliable, so you might want to give it a shot. If you're interested in a Linux solution there's also IPVS.

Either way I suggest using a pair of load balancers in a failover configuration using sysutils/heartbeat (if using BSD).
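
A rough Python sketch of the failover logic behind such a pair, assuming the standby takes over once the active balancer has been silent for longer than a timeout. This is not Heartbeat's configuration or API: `FailoverPair`, `dead_after`, and the node names `lb1`/`lb2` are made up for illustration.

```python
class FailoverPair:
    """Hypothetical model of an active/standby load balancer pair."""

    def __init__(self, dead_after=3.0):
        self.dead_after = dead_after  # seconds of silence before takeover
        self.active = "lb1"
        self.standby = "lb2"
        self.last_heartbeat = 0.0

    def heartbeat(self, now):
        """Record that the active node announced itself at time `now`."""
        self.last_heartbeat = now

    def check(self, now):
        """Return the node that should own the service address at `now`."""
        if now - self.last_heartbeat > self.dead_after:
            # The active node has been silent too long: swap roles.
            self.active, self.standby = self.standby, self.active
            self.last_heartbeat = now
        return self.active


pair = FailoverPair(dead_after=3.0)
pair.heartbeat(1.0)
owner_before = pair.check(2.0)  # lb1 was heard from 1 second ago
owner_after = pair.check(5.5)   # 4.5 seconds of silence: lb2 takes over
```

In a real setup the takeover would also move the shared IP address to the new active node; the sketch only tracks which node that should be.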

For several of my production clusters I've used 4 identical machines where 3 of them are web servers (which are almost always FreeBSD) that are served round-robin and the 4th is the master MySQL server (also usually FreeBSD). Typically I set up one of the web servers as the MySQL slave using replication.
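
A small Python sketch of round-robin selection over three web servers, assuming the balancer skips any server that has been marked down. `RoundRobinBalancer` and the names `web1` to `web3` are hypothetical; HAProxy and IPVS implement this themselves and are configured, not coded, this way.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical round-robin picker that skips backends marked down."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._ring = cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def next_backend(self):
        # Try each backend at most once per call so a fully dead pool
        # raises an error instead of looping forever.
        for _ in range(len(self._backends)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
first_pass = [balancer.next_backend() for _ in range(4)]
# first_pass is ["web1", "web2", "web3", "web1"]

balancer.mark_down("web2")
second_pass = [balancer.next_backend() for _ in range(3)]
# second_pass is ["web3", "web1", "web3"]
```

Adding a fourth web server to handle more traffic would only mean adding one more name to the backend list.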

And, in some cases, I've mounted the /www directory to each server from the MySQL server, so I can add new web servers to handle additional traffic.

cajunman4life
06-17-2008, 05:42 PM
You can always set up one machine with a SCSI disk array attached, and use NFS to export it (this will require a private VLAN over which to share the filesystems with the other machines... you wouldn't want to share them over the network that customer traffic is coming across).

So you can set up /www on the machine with the disk array and NFS export it, so that no matter what system the users log in to, the /www will always be the same.

Same goes for /home. Or maybe you can host them out of /home/$LOGIN/public_html or something similar.

On my setup I have a separate filesystem /www, and /home/$LOGIN/public_html is a soft-link to /www/$LOGIN.

And the cost of cPanel licenses is staggering :( I know cPanel is extremely popular, but maybe you can look into ISPConfig (which runs beautifully on CentOS). It's free and gives most of the functionality of cPanel (but then again, in places where I do have access to cPanel I hardly use it; I find it too clunky for me... I'd rather be on the command line personally, but not too many hosts allow this).
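
A short Python sketch of that layout, run against a scratch directory so it does not touch a real /www or /home. The function name `link_public_html` and the login `alice` are made up for illustration, and the NFS export itself is not modelled here.

```python
import os
import tempfile


def link_public_html(www_root, home_root, login):
    """Create <www_root>/<login> and point <home_root>/<login>/public_html at it."""
    site_dir = os.path.join(www_root, login)
    home_dir = os.path.join(home_root, login)
    os.makedirs(site_dir, exist_ok=True)
    os.makedirs(home_dir, exist_ok=True)

    link_path = os.path.join(home_dir, "public_html")
    if not os.path.islink(link_path):
        os.symlink(site_dir, link_path)
    return link_path


# Scratch directories standing in for the shared /www and for /home.
scratch = tempfile.mkdtemp()
www_root = os.path.join(scratch, "www")
home_root = os.path.join(scratch, "home")

link = link_public_html(www_root, home_root, "alice")

# A file written through the user's public_html ends up under www/alice.
with open(os.path.join(link, "index.html"), "w") as f:
    f.write("hello")
```

With /www shared over NFS, the same link on every node would resolve to the same files no matter which machine the user logs in to.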

bigb89
06-17-2008, 05:59 PM
Thanks for the help guys.

This should at least give me an idea of where to start.

I'm guessing that the only problem I would still have is with the cPanel licenses for each node in the cluster, like stukov mentioned. I'm gonna go to the cPanel website and ask the support team if there's any way to create such a cluster using a single license.

vermaden
06-17-2008, 06:13 PM
You should search for nginx and memcached.

lvlamb
06-17-2008, 06:45 PM
Heartbeat is now in the OpenBSD tree (among other "clustering" systems; please define your understanding of "cluster").
Clusters as "CPU sharing" still work best with the "old" Linux 2.4.* kernel (although OpenMOSIX is no longer being updated).
You also can run applications on one CPU while using another box to share files. Or boot a headless box and run processes on it. Or ...
what is your question?

bigb89
06-17-2008, 11:29 PM
Quoting lvlamb: "You also can run applications on one CPU while using another box to share files. Or boot a headless box and run processes on it. Or ... what is your question?"

I want to share the load (CPU and RAM) between the nodes in the cluster. So let's say I have two nodes in my cluster, each with dual Xeons and 3GB of RAM; I would combine those two nodes and it would behave like a single server with 6GB of RAM and the CPU power of both nodes combined (it would also have the disk space of both nodes combined).

Now here's the tricky part: if it is possible, I would like to have one single cPanel license running on that cluster (after all, it is supposed to act as a single supercomputer) so that I wouldn't need to get a new cPanel license each time a single server had too much load. Instead I would just add more nodes to the cluster, thereby increasing its power.

stukov
06-18-2008, 01:32 AM
Hey, I just stumbled on OpenSSI after reading lvlamb's post. Looks promising for you. I'm going to test this application in the next weeks. Very interesting.

lvlamb
06-18-2008, 04:09 AM
My understanding of Heartbeat is that it "steals" CPU time from any free or balanced CPU in the grid. The master runs the program regardless of how many nodes are involved.
So the number of licenses is a matter of discussion with the vendor: if you have a quad core, a single license is kosher.
Obviously, you'd need other licenses if you build a failover redundant system.

You might look at Sun clusters (OpenSolaris Open HA is still a project); I haven't looked deeply enough into OpenSolaris to see how it works.

As suggested, Free- and OpenBSD have Heartbeat in the tree.
OpenBSD also has the old IBM-style PSSP: http://openports.se/sysutils/clusterit

You might also look at http://idea.uab.es/mcreel/ParallelKnoppix/ (a LiveCD), which uses a 2.6.* kernel.

bigb89
06-18-2008, 04:43 AM
YES!!!

I haven't gotten the chance to poke around too much, but so far it seems like OpenSSI and OpenMOSIX are exactly the type of application/system that I'm looking for.

I guess that this whole time I wasn't looking for a "load balance cluster" but for a "single-system image (SSI) cluster", a term I found defined on Wikipedia. That could be the reason why I couldn't find the right application when I did a Google search.

One other thing I found interesting while reading about single-system image on Wikipedia is that DragonFly BSD has SSI as one of its long-term goals.

Things should be very interesting for the next couple of months (maybe less) as I'll be trying to implement a single-system image cluster. :)

Please let me know if any of you guys has ever done something like that. I would like to know if it's working for you.

Thanks once again guys! You really helped me out to know what I need to use.

dk_netsvil
06-18-2008, 02:49 PM
I concede that I have misused the term "cluster" which is more appropriate for a system where a job, such as rendering or calculation, is distributed over several machines simultaneously to decrease the amount of time it would take to process. I think, in this case, it might be more appropriate to call them high-availability redundant servers or something related to traffic queuing.

stukov
06-18-2008, 08:03 PM
dk_netsvil, cluster is a very broad term that describes many types of systems. From Wikipedia: "Cluster, a group of computers that achieve a common computation".

@bigb89: I have a few spare machines here. In the next week I'm going to try the SSI cluster with OpenSSI. I'll update you when I know more.

dk_netsvil
06-18-2008, 08:19 PM
About 8 months ago I picked up a lot of 12 IBM eServers for the cost of shipping - I'm sure something similar is available at your favorite auction site. With the diminishing cost of those older 1U machines it's affordable for most anyone to put something together.

bigb89
06-19-2008, 03:40 AM
Another SSI cluster application that seems to be very good is Kerrighed.

The only problem with Kerrighed is that right now it doesn't support hot node addition/removal (although they're planning to support that by November of this year).

So OpenSSI still seems to be one of the best SSI applications out there, even though it seems like they haven't released anything since 2006.

ai-danno
07-03-2008, 10:19 PM
It's 4 years old now... and runs an old Linux kernel (ugh...) but if you were looking at OpenMosix.... perhaps Cluster Knoppix (http://clusterknoppix.sw.be/) is a possibility.

ai-danno
07-03-2008, 10:28 PM
And I don't know how apropos this may or may not be... but http://www.rocksclusters.org

A lot of high-end research facilities use this as their clustering solution... and it's up-to-date.