"A powerful platform to build web and mobile apps that scale automatically" is Google's tagline for App Engine. And "automatically" is the keyword: it is difficult to overestimate the power of a platform that lets any application react to a changing user-induced load with no intervention from a sysadmin or developer. We loved that statement so much that we wanted everyone to be able to take advantage of autoscaling, and that's why AppScale was created.
The Basics of Scaling
Google has extensive documentation on the scaling of App Engine applications. In the documentation you will find references to application instances and how latency is the main factor in determining how many instances are needed to satisfy a specific load. Since App Engine applications run on the Google platform, the promise of infinite resources at their disposal is as true as it can get, limited perhaps only by the customer's wallet.
AppScale works in a very similar way: the application is allowed to scale up for as long as resources are available. While on Google the instance class determines the memory available to each instance, in AppScale a configuration option achieves the same. Similarly, latency is used to determine if and how to scale the application up or down. What's different is how resources are acquired to allow the application to scale.
|App Engine applications scale automatically based on load, whether running on Google or AppScale. Users won't notice much: just that requests are served in a timely manner.|
Within an AppScale deployment, some nodes are AppServer nodes, which means their CPUs and memory are dedicated to running application instances. Once the resources within the AppServer nodes are exhausted, new nodes can be acquired as additional AppServers (up to the desired maximum), if the underlying infrastructure allows it. AppScale supports this on cloud environments such as GCE, AWS, OpenStack, and HP Helion Eucalyptus, and there is some experimental work for vSphere.
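The placement logic described above can be illustrated with a minimal Python sketch. It is not AppScale's actual code: the `AppServerNode` type, the `place_instance` function, and the memory-only resource model are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AppServerNode:
    """Hypothetical view of an AppServer node: only free memory is tracked."""
    name: str
    free_memory_mb: int


def place_instance(
    nodes: List[AppServerNode],
    instance_memory_mb: int,
    max_nodes: int,
) -> Tuple[str, Optional[str]]:
    """Decide where a new application instance could run.

    Returns ("existing", node_name) if a current node has room,
    ("new_node", None) if a node must be acquired from the cloud,
    or ("at_capacity", None) if the configured maximum is reached.
    """
    # Prefer resources already paid for: first node with enough free memory.
    for node in nodes:
        if node.free_memory_mb >= instance_memory_mb:
            return ("existing", node.name)
    # Existing AppServers are exhausted: grow only up to the desired maximum.
    if len(nodes) < max_nodes:
        return ("new_node", None)
    return ("at_capacity", None)
```

For example, with two nodes that have 100 MB and 600 MB free and a 400 MB instance size, the sketch picks the second node; if both were full and the maximum were not yet reached, it would ask for a new node.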
Scaling in AppScale
For autoscaling to work properly, AppScale needs to be able to answer two questions: Does the application need more resources? Do we have resources available to start a new instance? Whenever an application is uploaded to an AppScale deployment, the AppController (a component of AppScale) automatically creates a load balancer configuration for it (we use haproxy). This allows the application's instances to be added or removed with no service interruption. Periodically, the AppController checks the application's statistics within haproxy to see if the application is struggling. This allows AppScale to keep the application's latency in check.
|At a very high level, AppScale is similar to a usual three-tier system. The front end handles load balancing and SSL termination; the middle tier, the AppServers, runs the application instances; and the lower tier is the Datastore.|
The biggest difference between Google and AppScale comes in the scaling of the datastore (the App Engine API to the integrated NoSQL database). AppScale implements the datastore API using Cassandra: the scaling we obtain has been extremely good, and we have tested it in excess of 17,000 datastore transactions per second (equivalent to over a quarter million transactions per second in Cassandra for that specific workload). While Google's service is limited only by the quota the user desires, scaling AppScale's datastore implementation is manual. The main reason is that adding nodes to a running database incurs a re-balancing cost that, at this time, needs to be weighed and controlled by an administrator.
|The AppController monitors the application statistics via a query to the load balancer information on the front end (1). The AppController can then inform the AppServers to start or stop an application instance if needed.|
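The two questions above can be sketched as a single decision function. This is only an illustration under stated assumptions: the `AppStats` fields, the latency thresholds, and the `scaling_decision` name are hypothetical, and the real AppController reads its numbers from haproxy rather than from a hand-built object.

```python
from dataclasses import dataclass


@dataclass
class AppStats:
    """Hypothetical summary of what the load balancer reports for one app."""
    avg_latency_ms: float
    queued_requests: int
    running_instances: int


def scaling_decision(
    stats: AppStats,
    can_start_instance: bool,
    high_latency_ms: float = 500.0,  # assumed threshold, not an AppScale default
    low_latency_ms: float = 100.0,   # assumed threshold, not an AppScale default
    min_instances: int = 1,
) -> str:
    """Return "scale_up", "scale_down", or "hold"."""
    # Question 1: does the application need more resources?
    struggling = (
        stats.avg_latency_ms > high_latency_ms or stats.queued_requests > 0
    )
    if struggling:
        # Question 2: do we have resources to start a new instance?
        return "scale_up" if can_start_instance else "hold"
    # Lightly loaded: release an instance, but never drop below the minimum.
    idle = stats.avg_latency_ms < low_latency_ms
    if idle and stats.running_instances > min_instances:
        return "scale_down"
    return "hold"
```

Calling this periodically with fresh statistics mirrors the polling loop described above; the function itself stays free of side effects, so starting or stopping instances is left to the caller.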
Tuning Scaling operations in 2.5.0
AppScale 2.4.0 and 2.5.0 bring some tuning to the scaling mechanism. In particular, a hysteresis cycle has also been introduced for scaling instances within existing nodes. We observed that under certain loads the scaling was a bit too aggressive, particularly when the application takes a long time to load (for example, a complex Java application with many dependencies).
We also increased the cool-down period for VMs started in a private cloud: we observed that in some private cloud environments the boot time can be long (depending on configuration), so we wanted to make sure we amortized the cost of starting a new instance by increasing its time to live. When increasing the time to live, we made sure to stay well within the one-hour mark, which AWS uses as its unit of time for billing.
For any questions about AppScale, find your preferred way to reach us at http://www.appscale.com/community.
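To make the idea of hysteresis concrete, here is a minimal sketch that only acts after several consecutive observations agree, so a single latency spike does not trigger a new instance. The class name, the thresholds, and the "consecutive samples" rule are assumptions for illustration; the source does not describe how AppScale implements its hysteresis cycle.

```python
class HysteresisScaler:
    """Scale only after `required_samples` consecutive readings agree."""

    def __init__(self, up_threshold_ms: float, down_threshold_ms: float,
                 required_samples: int) -> None:
        if down_threshold_ms >= up_threshold_ms:
            raise ValueError("down threshold must be below up threshold")
        self.up_threshold_ms = up_threshold_ms
        self.down_threshold_ms = down_threshold_ms
        self.required_samples = required_samples
        self._above = 0  # consecutive readings above the upper threshold
        self._below = 0  # consecutive readings below the lower threshold

    def observe(self, latency_ms: float) -> str:
        """Record one latency reading; return "scale_up", "scale_down" or "hold"."""
        if latency_ms > self.up_threshold_ms:
            self._above += 1
            self._below = 0
        elif latency_ms < self.down_threshold_ms:
            self._below += 1
            self._above = 0
        else:
            # Inside the dead band between thresholds: reset both streaks.
            self._above = 0
            self._below = 0

        if self._above >= self.required_samples:
            self._above = 0  # start a fresh streak after acting
            return "scale_up"
        if self._below >= self.required_samples:
            self._below = 0
            return "scale_down"
        return "hold"
```

The gap between the two thresholds, combined with the streak requirement, is what gives a slow-loading application time to start serving before another scale-up is considered.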
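The cool-down rule can be sketched as a simple guard on node termination. The function name and the 45-minute default are hypothetical; the only constraint taken from the text is that the cool-down stays below the one-hour billing unit.

```python
def can_terminate_node(
    uptime_s: float,
    cooldown_s: float = 45 * 60,    # assumed value, not an AppScale default
    billing_unit_s: float = 60 * 60,
) -> bool:
    """Return True once a node has been up long enough to be released."""
    # Keep the cool-down inside one billing unit, as described in the text.
    if cooldown_s >= billing_unit_s:
        raise ValueError("cool-down must be shorter than the billing unit")
    # A freshly booted node is kept alive to amortize its start-up cost.
    return uptime_s >= cooldown_s
```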