Wednesday, December 30, 2015

Life in a container: AppScale in Docker

Docker and AppScale

Docker hardly needs any introduction: it's been extremely popular as of late, for very good reasons. Based on mature, well-tested Linux container technology, it makes it easy to create fresh environments for testing and development, to isolate services within their own software stack, or to deploy a complex application stack.



App Engine is another well-tested and appreciated technology, developed in 2008 and in full production at Google since 2011. An estimated 6 million App Engine applications are running in production. AppScale implements the App Engine model, allowing your application to be deployed on virtually any infrastructure. It seems only natural then to have Docker and AppScale working together.

Developing and Testing App Engine Applications

The App Engine model makes application development easier, since the infrastructure components are already taken care of by AppScale or Google App Engine. For example, scaling the application or the databases is already part of the model. The developer is then able to focus on the logic of the application, and rely on well-known components servicing the APIs. The development cycle is the usual: develop in your favorite IDE, set up a test environment, demo/confirm the feature or bug fix, merge into master, rinse and repeat. There are of course a lot of different methodologies, but setting up a test environment is one of the phases that is always required (is anyone still developing with waterfall?!).

AppScale can be deployed on a single node, thus simplifying the dev/test phases. The deployed system is a full environment, preserving the characteristics of the production environment (minus, possibly, the scale of nodes or data). Docker is one of the infrastructures supported by AppScale, and it is particularly well suited for setting up multiple test environments. With AppScale 2.6.0 we officially released AppScale onto Docker Hub, so setting up a new environment is as easy as

      docker pull appscale/appscale
      docker run -t -i appscale/appscale /bin/bash

In a few seconds you will have your own App Engine environment.

Life in a Container

Typically Docker containers run one service, such as Nginx or Cassandra. This reflects the relatively recent push toward microservices, and it helps when integrating different systems, since it isolates the software stack each service depends upon. Hence the base images on Docker don't have the usual 'boot' sequence, and many ancillary services aren't running, since they are usually not needed to run the desired service. You typically don't need a cron or syslog daemon just to run Cassandra or Nginx.

AppScale expects to have a normal instance running; that is, it expects quite a few of the usual services a Linux box has. For example, AppScale expects the ssh and syslog daemons to be running. This is typically done during the boot sequence, and virtually all Linux boxes have them running. So we created a simple script to start a dev/test environment easily; we call it FastStart, and when it detects a Docker container it starts the needed ancillary services.
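The detection itself can be as simple as checking for Docker's telltale marker files. Here is a minimal, hypothetical Python sketch of the idea (the real fast-start.sh is a shell script and its checks may well differ):

```python
import os

def in_docker_container():
    """Heuristic: detect whether we are running inside a Docker container."""
    # Docker places a /.dockerenv marker file in the container's root.
    if os.path.exists("/.dockerenv"):
        return True
    # Otherwise inspect the cgroup membership of PID 1.
    try:
        with open("/proc/1/cgroup") as f:
            return "docker" in f.read()
    except IOError:
        return False

# Inside a container, FastStart would then bring up the ancillary
# daemons (sshd, rsyslogd, ...) that a normal boot sequence provides.
print("docker container" if in_docker_container() else "regular host")
```

Either check alone can miss some setups, which is why trying both is a reasonable approach.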

From the AppScale container as started using the previous commands, FastStart can be invoked with the following command:

      cd
      ./appscale/scripts/fast-start.sh --no-demo-app


The script creates a basic AppScalefile and starts AppScale. The AppScalefile is AppScale's configuration file; it can be found and inspected in /root (the initial cd command ensures it is put there). A sample application is also downloaded into /root. If you just want to test your container, you can deploy the sample and check it out with the following command:

      appscale deploy guestbook.tar.gz

The FastStart script works on any infrastructure AppScale supports, from Vagrant/VirtualBox to AWS and GCE, and it can set up a single-node instance on any of them. If the infrastructure supports the concept of public and private IPs (say GCE or AWS), the script detects it and configures the system properly. For some infrastructures where detection cannot easily be done (for example Vagrant), modify the AppScalefile and change the login directive to have the correct 'public' IP.

Dev/Test cycle made easy

AppScale's FastStart makes it trivial to generate a new development or testing environment. With Docker this process literally takes seconds, bringing a new level of convenience to the App Engine development cycle. Try it out.

Thursday, December 3, 2015

'Scale' in AppScale

App Engine


"A powerful platform to build web and mobile apps that scale automatically" is Google's punch line for App Engine. And automatically is the keyword: it's difficult to overstate the power of a platform that allows any application to react to varying user-induced load automatically, with no intervention from a sysadmin or developer. We loved that statement so much that we wanted everyone to be able to take advantage of autoscaling, and that's why AppScale was created.


App Engine: A powerful platform to build web and mobile apps that scale automatically.

The Basics of Scaling

Google has extensive documentation on the scaling of App Engine applications. In the documentation you will find references to application instances and how latency is the main factor in understanding how many instances are needed to satisfy a specific load. Since App Engine applications run on the Google platform, the promise of infinite resources at their disposal is as true as it can get, limited perhaps only by the customer's wallet.

AppScale works in a very similar way: the application is allowed to scale up for as long as resources are available. While in Google the instance class determines the memory available to each instance, in AppScale we have a configuration option to achieve the same. Similarly, latency is used to determine if and how to scale the application up or down. What's different is how resources are acquired to allow the application to scale.

App Engine applications scale automatically based on load, whether running in Google or AppScale. Users won't notice much: just that requests are served in a timely fashion.



Within an AppScale deployment, some nodes are AppServer nodes, which means their CPUs and memory are dedicated to running application instances. Once the resources within the AppServer nodes are exhausted, and if the underlying infrastructure allows it, new nodes can be acquired (up to the desired maximum) as new AppServers. AppScale supports this on cloud environments like GCE, AWS, OpenStack, and HP Helion Eucalyptus, and there is some experimental work for vSphere.

Scaling in AppScale

For autoscaling to work properly, AppScale needs to be able to answer two questions: does the application need more resources? Do we have resources available to start a new instance? Whenever an application is uploaded to an AppScale deployment, the AppController (a component of AppScale) automatically creates a load balancer configuration for it (we use haproxy). This allows the application's instances to be added or removed with no service interruption. Periodically, the AppController checks the application's statistics within haproxy to see if the application is struggling. This allows AppScale to keep the application latency in check.

At a very high level, AppScale is similar to a usual three-tier system: the front end acts as load balancer and SSL termination, the middle tier (the AppServers) runs the application instances, and the lower tier is the Datastore.
Within each AppScale deployment, the Login node routinely checks the available resources on each AppServer node. When a new application instance is needed, the Login node needs to find an AppServer with enough memory for that application, and with enough CPU to spare. If no AppServer node meets the requirements, a request is sent to spawn a new virtual machine, if the underlying infrastructure allows it.
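The placement decision described above can be illustrated with a toy model. The node statistics and thresholds here are made up for the example, not the AppController's actual internals:

```python
def pick_appserver(nodes, app_memory_mb, cpu_headroom=0.10):
    """Return a node with enough free memory and idle CPU for a new
    application instance, or None if a new VM should be requested."""
    for node in nodes:
        if (node["free_memory_mb"] >= app_memory_mb
                and node["idle_cpu"] >= cpu_headroom):
            return node
    return None  # no capacity left: ask the infrastructure for a new AppServer

# Hypothetical per-node statistics, as the Login node might see them.
nodes = [
    {"name": "appserver-1", "free_memory_mb": 256, "idle_cpu": 0.05},
    {"name": "appserver-2", "free_memory_mb": 2048, "idle_cpu": 0.40},
]
target = pick_appserver(nodes, app_memory_mb=512)
print(target["name"] if target else "spawn new VM")  # appserver-2
```

When `pick_appserver` returns None, the request to spawn a new virtual machine goes out only on infrastructures that support it, as noted above.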


The biggest difference between Google and AppScale comes in the scaling of the datastore (the App Engine API to the integrated NoSQL database). AppScale implements the datastore API using Cassandra: the scaling we obtain has been extremely good, and we have tested it in excess of 17,000 datastore transactions per second (equivalent to over a quarter million transactions in Cassandra for that specific workload). While Google's service is limited only by the quota the user desires, the scaling of AppScale's datastore implementation is manual. The main reason is that adding nodes to a running database incurs a re-balancing cost that, at this time, needs to be weighed and controlled by an administrator.
The AppController monitors the application statistics via a query to the load balancer information on the front end (1). The AppController can then tell the AppServers to start or stop an application instance as needed.

Tuning Scaling operations in 2.5.0

AppScale 2.4.0 and 2.5.0 bring some tuning to the scaling mechanism. In particular, a hysteresis cycle has also been introduced for scaling instances within existing nodes. We observed that under certain loads the scaling was a bit too aggressive, particularly if the application requires a long time to load (for example, a complex Java application with a lot of dependencies).
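Hysteresis here simply means using separate up and down thresholds plus a minimum dwell time, so a short latency spike doesn't cause instances to be started and stopped in rapid succession. A toy illustration, with made-up thresholds rather than AppScale's actual values:

```python
class HysteresisScaler:
    """Scale up/down on latency, but only after the condition has
    persisted for `dwell` consecutive checks (toy model)."""
    def __init__(self, up_ms=1000, down_ms=200, dwell=3):
        self.up_ms, self.down_ms, self.dwell = up_ms, down_ms, dwell
        self.streak = 0          # consecutive checks pointing the same way
        self.direction = None    # "up", "down", or None

    def decide(self, latency_ms):
        if latency_ms > self.up_ms:
            wanted = "up"
        elif latency_ms < self.down_ms:
            wanted = "down"
        else:
            wanted = None
        if wanted is not None and wanted == self.direction:
            self.streak += 1
        else:
            self.direction, self.streak = wanted, 1 if wanted else 0
        if wanted and self.streak >= self.dwell:
            self.streak = 0
            return wanted        # actually add/remove an instance
        return "hold"

scaler = HysteresisScaler()
print(scaler.decide(1500))  # a single spike: hold
print(scaler.decide(1500))  # still hold
print(scaler.decide(1500))  # sustained high latency: up
```

The dwell time is what damps the aggressiveness: a slow-loading Java application can push latency up briefly without immediately triggering a new instance.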

We also increased the cool-down period for VMs started in a private cloud: we observed that in some private cloud environments the boot time can be long (depending on configuration), so we wanted to make sure we amortized the cost of starting a new instance by increasing its time to live. For the latter, we made sure we stayed well within the one-hour mark, which AWS uses as the unit of time for charging.


For any question about AppScale find your preferred way to reach us at http://www.appscale.com/community.

Thursday, November 21, 2013

A new Media Server (flash on Linux)

I used to use an old T400s (Core 2 Duo P9400) as my household media server: as a replica of my working files (i.e., using unison to synchronize my laptop home directory to it), as a backup repository (with two RAIDed USB drives as backup storage), as a music server using samba to export my music to SONOS, and as a media server using plex to export movies to ROKU (thanks David for this suggestion!).

A reasonably simple setup, although it fell short of my needs. The media server is connected to my 1080p projector so I can watch MotoGP. The MotoGP web site has a very convenient subscription to watch the races whenever I want. Usually I steer clear of any news for a few days after the race, until I have time to watch it, and voilà: MotoGP on demand with no spoilers.

I just wanted to watch the race
The issue is that the venerable T400s didn't have enough horsepower to stream the races.

Flash (and Linux?)

The stream coming from MotoGP uses Flash, with quality up to HD (currently 720p). Until now I needed my wife's laptop, with a quad-core i7-3630QM and an NVIDIA GT750M (nope, she is not a gamer, just an architect), to be able to enjoy the races. Luckily my wife loves the races too, so there was no contention for her laptop during MotoGP season.

The venerable T400s had also started to get a bit noisy (the fan developed an annoying whine): the time had come to look for a substitute that would be able to stream the races. In case you wonder: yes, the media server had to run Linux. I started to sample all the Linux machines I had access to, to check what was the minimum configuration I would need.

There seems to have been some hardware (GPU-assisted) acceleration available for Flash on Linux some time ago, but to the best of my knowledge the latest versions no longer have it. That is, CPU brute force is the answer. And looking at CPU usage during streaming, single-core raw power is the key player.

As a side note, I do use Firefox (well, Iceweasel): Chrome seems to have better performance with Pepper for some, but during my testing I didn't notice any significant difference.

The Search for the CPU

I was hoping to use some small and quiet machine like the NUC, the very same ones used for the Eucalyptus Backpack Availability Zones. At about the same time I was looking at this, my work laptop (Core i5-2520M, not enough to watch the races) started to have issues, and the replacement laptop I got (Dell XPS-13) had exactly the same CPU as the most powerful of the NUCs (Core i7-3537U). Very convenient for my testing.

Eucalyptus Backpack AZ-2


Once the XPS-13 arrived, I did a quick test with Windows (!!), and then with Linux. Surprisingly, both OSes showed about the same performance watching the races: almost enough, but not quite. The video at times showed annoying micro-delays. Now I had a CPU I could reference as almost capable. Although the NUC and similar bricks were off the table, I could start cross-referencing various benchmarks and select CPUs with a healthy single-core lead over the low-power Core i7.

The new Media Server 

After some searching across various tiny and small-form-factor cases and CPU benchmarks, and with the constraint of a reasonable price, I ended up with a DS61 case (USB 3.0 was a nice-to-have for faster backups) and a Core i3-3245. The Core i3 has enough of a lead over the low-power Core i7 to give me comfort: only the HD 4000 (the same GPU on both) was clocked slightly lower.

I recycled a 4 GB memory DIMM and a small mSATA SSD drive (fast and silent) I already had, and there you have it: a brand new media server. A few more adjustments were made to make it better at streaming MotoGP races, in particular:
  • default to SNA for Intel GPU acceleration (without it, I observed some tearing)
  • switched to a USB wireless card (SONOS didn't provide enough bandwidth to stream the races)
  • adjusted fancontrol to keep the fan from revving too high
  • Bluetooth keyboard to couch control the machine 
Still fairly small, and fully capable

Pros

  • small (fits on top of the bookshelf)
  • fits the requirements (I can watch the races now)
  • reasonably quiet under load (after fancontrol)

Cons 

  • desktop CPU (more watts than strictly needed I think)
  • no battery (the T400s had a built-in UPS, which was nice)

Monday, March 4, 2013

Santa Barbara Cloud Meetup

A few years back I heard Santa Barbara referred to as Silicon Beach. The first Cloud Meetup confirmed the rumor. The kick-off meetup was co-organized by AppScale and Eucalyptus, and it was held at AppScale HQ. We got a recording of the meetup (a better-quality one should be on the way), at least of the presentation parts: sorry, no food or drinks in it. The first meeting was meant as a very quick introduction to both Eucalyptus and AppScale and where they respectively fit into the Cloud Computing landscape (hint: one is IaaS while the other is PaaS). In the spirit of any good developer meetup, some hacking was involved. After the presentation, laptops were fired up, demos were given, and more in-depth questions were asked.


AppScale on Eucalyptus

My share of the hacking was done ahead of time. At Eucalyptus we maintain a Community Cloud (ECC). The idea behind the ECC is to allow our community access to the latest Eucalyptus without having to install it. It has been extensively used by library and tool developers, by university classes, by current Euca users before upgrading their own cloud, and by just about anyone interested in playing with a running cloud. I use it sometimes to share big files (spin up an instance and scp the big files there), or to experiment with new images or software. The draconian SLA (instances, volumes, and buckets will be terminated or deleted at regular intervals) is not an issue for my use cases, so I found it a lot more convenient than having to poke at my home firewall.

The ECC seemed the perfect place to have an AppScale image for everyone to try, so ahead of the meetup we set out to add a newer AppScale image. After some initial setbacks due to iptables not being willing to co-operate, we had the image uploaded and ready to go (emi-76294490) with the latest AppScale (1.7 at the time of this writing). Chris has a step-by-step write-up: just follow it and you will get it up and running in no time.

Of course you are not limited to running AppScale on the ECC. If you want to run it on your own Eucalyptus cloud, Shaon has a blog post explaining how to set it up.


Heads up: enabling DNS 

Eucalyptus can be configured to provide DNS resolution for instance IP addresses and bucket names. Incidentally, the ECC is configured in such a way, so describe-instances will return hostnames instead of IP addresses, for example euca-173-205-188-102.eucalyptus.ecc.eucalyptus.com instead of 173.205.188.102. Although this is a very useful feature, there are currently two bugs against it, EUCA-1433 and EUCA-1456, which may create some difficulties. In particular, AppScale relies on the AWS split-horizon behavior observed within the instances.

The workaround here is to either create a split-horizon DNS server for your cloud, or to ensure that the instances associate the private IP with their public name.


Ready for the next meetup?

We just finished our first meetup, but we are already planning the next one! Topics are still up for grabs, so make sure you join the group and propose your favorite.

Saturday, February 23, 2013

3.2.1 and the road ahead

In my older posts, I have talked about maintainability and how I see it as a pillar of IaaS. I have talked about the homework the Cloud Admin needs to do before deploying, and about the invisible work done in Eucalyptus 3.2. Of course maintainability is not the goal of a sprint or of a single release but, not unlike security, a continued effort, a guiding principle. So, only a few weeks after the previous major release, welcome Eucalyptus 3.2.1.

This point release of Eucalyptus brings very important fixes. I will let you go through the release notes yourself, but I'm very pleased about the gratuitous ARP message and the moving of iptables-preload. The former allows for a speedier network recovery after a Cluster Controller fail-over, and the latter makes Eucalyptus more compatible with newer releases of Linux distros. Of course there are a lot more fixes, and you can go and find your favorite one.


Ramping up QA

QA has always been a focus of our Engineering team. But being the perfectionists we are, we never rest on our laurels, and a tremendous effort has been put into extending the scope and speed of our QA. If you follow our blog feed you have already noticed a lot of the work done. And with the good work comes a better list of Known Issues, and warnings for corner cases we don't cover yet (if you have an EMC SAN and you use 3.2.1, it's your turn to be the corner case). The last thing we want is for our users to be surprised by unexpected behavior.

The Road Ahead



Aka Eucalyptus 3.3. The next major release of Eucalyptus will bring quite a few new features. You can check out the list of features scoped out. Among the major ones, we have Elastic Load Balancer, AutoScaling, CloudWatch, and Maintenance mode.

We just had the end-of-sprint-3 status review: check out the demo yourself. So far the road ahead for Eucalyptus 3.3 is nice and clear, and we expect it to land on your machines by mid/late Q2.

Edited: I don't know how to do math, since I considered a quarter to have 4 months ...

Sunday, January 13, 2013

Will My Internet Be Faster?

I have been beating the Maintainability drum lately, highlighting what the latest Eucalyptus did in that regard. I'm not done yet. This time I want to change the angle of approach, focusing more on the Map Workload to Cloud Resources step, using examples and some back-of-the-envelope calculations.


Will my Internet be Faster?

Back in the day, I helped a user going through the installation of an ancient Eucalyptus (I think it was 1.4), and after some hurdles (there was no FastStart back then), he finally got instances up and running on his two-node home setup. Then he asked, "Will my Internet connection be faster now?"

I think the question points more to how Cloud Computing has been a buzzword, perceived as a panacea for all IT problems. But it is also a reminder of the need to understand the underlying hardware infrastructure in order to achieve the desired performance. An on-premise cloud is able to go only as fast as the underlying infrastructure, and will have only as much capacity as the hardware supporting it. There is tremendous flexibility in what the cloud provides, yet there is also the risk of under-performing if the physical infrastructure is not prepared for the workload.

Case In Point: Cloud Storage

At the end of the day, all cloud resources map to physical resources: RAM, CPU, network, disks. I will now focus on the cloud storage story because of its importance (anyone care about their data?), and because historically it is where there have been some interesting issues. In particular, Eucalyptus 3.1.2 was needed because high disk load caused seemingly random failures.
Some back-of-the-envelope, hand-waving calculations needed for this blog
From my chicken scratch in the first figure, you should see how Ephemeral Storage resides on the NCs (Node Controllers), while EBS (Elastic Block Storage) is handled by the SC (Storage Controller). Let's quickly rehash when the two are used:
  • Ephemeral: both instance-store and bfEBS (boot-from-EBS) instances use Ephemeral, although instance-store instances use Ephemeral Storage also for root and swap;
  • EBS: any instance with an attached Volume uses EBS, and bfEBS instances use Volumes for root and swap.
and which kind of contention there is:
  • Ephemeral: all instances running on the same NC compete for the same storage;
  • EBS: all instances using EBS within the same availability zone (Eucalyptus parlance for cluster) access and use the same SC.
I used a simple spreadsheet to aid my examples. Feel free to copy it, play with it, enhance it, but please consider it a learning tool and not a real calculator: way too many variables have been left out for the sake of simplicity.

In my examples I will measure the underlying storage speed in IOPS.

IOPS values may vary dramatically; the figures here should be used only as an indication of the expected performance.


In the following examples, I will make the very unreasonable assumption that instances access all their storage (both Ephemeral and EBS) equally, and that they use it either 20% or 100% of the time. Moreover, in the 20% case, an oracle minimizes concurrent disk access across all instances (i.e., if there are fewer than five running instances, they will not compete at all and will see the full speed of the storage).

Thus one is a very light scenario, where the instances are mostly idle, while the other (100%) assumes the instances are running benchmarks. Starting instances is a fairly disk-intensive process, first because Eucalyptus needs to prepare the image to boot (which involves copying multi-GB files), and then because the OS will have to read the disk while booting. I added a column to the spreadsheet to show the impact of starting instances on the light workload.
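The spreadsheet's core logic boils down to dividing the storage's IOPS among the instances competing for it. A toy sketch under the same unreasonable assumptions (equal access to all disks, and an oracle scheduler in the light case):

```python
def per_instance_iops(storage_iops, n_instances, busy_fraction):
    """Rough per-instance IOPS once n_instances share one storage pool.
    In the light (20%) case an 'oracle' serializes access, so fewer
    than 1/busy_fraction instances never collide at all."""
    if n_instances == 0:
        return storage_iops
    # Effective number of concurrently competing instances.
    concurrent = max(1.0, n_instances * busy_fraction)
    return storage_iops / concurrent

# Heavy load on one NC: 8 instances hammering a 150 IOPS local disk.
print(round(per_instance_iops(150, 8, 1.0)))   # ~19 IOPS each
# The same node under the light 20% load.
print(round(per_instance_iops(150, 8, 0.2)))   # ~94 IOPS each
```

Treat this, like the spreadsheet, as a learning tool: it ignores queuing, caching, request size, and everything else a real storage system cares about.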

Home setup

A small cloud installation will most likely have the SC backed by local storage. Let's use an IOPS calculator to estimate the performance. Here I will use two Seagate Cheetah 15K RPM disks in RAID 0, which gives about 343 IOPS (I will round it to 350). For the NC, I will assume 150 IOPS, which should be a reasonably fast single (non-SSD) disk.
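The classic estimate behind such IOPS calculators is one I/O per (average seek time plus half a rotation), with RAID 0 roughly summing the member disks. A back-of-the-envelope sketch using typical datasheet figures, not measurements:

```python
def disk_iops(avg_seek_ms, rpm):
    """Classic estimate: 1 / (average seek + average rotational latency)."""
    rotational_latency_ms = (60000.0 / rpm) / 2   # half a revolution
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)

# Seagate Cheetah 15K: roughly 3.4 ms average seek at 15,000 RPM.
cheetah = disk_iops(3.4, 15000)
print(round(cheetah))       # ~185 IOPS per disk
print(round(2 * cheetah))   # ~370 IOPS for a two-disk RAID 0 stripe
```

Real calculators fold in queue depth and read/write mix, which is why they land a bit lower (343) than this naive sum.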

For a home setup, three NCs seems a good number to me. Each NC should have enough cores and RAM to allow more than ten instances to run (12-24 cores and 12-24 GB RAM should do). If I run one instance-store instance, one bfEBS instance, and have one Volume per NC, the very unrealistic calculator gives:
Light load on the home setup: the slowest storage is still comparable to a 5400 RPM disk.

Not bad for my home setup. Even if the instances were to run iozone on all the disks, I would still see the performance of a slow 5400 RPM disk. Now, let me create more load: four instance-store, four bfEBS, and two Volumes used per NC.
The home setup with a heavier load doesn't do that well: instances may see performance as slow as a floppy drive.

That's a bit more interesting. If the instances are very light in disk usage, they will see the performance of a 7200 RPM disk, but under heavier load they will be using something barely faster than a floppy. Ouch!

A More Enterprisy setup

From the previous example, it is fairly obvious why bigger installations tend to use a SAN as their SC storage back-end. For this example I will use a Dell EqualLogic, with a setup that gives 5,000 IOPS. Correspondingly, the number of NCs is increased to 10.

Let's start with a light load: one instance-store, one bfEBS, and one Volume per NC (similar to the home setup, although now there is a total of 20 running instances).
A SAN-backed cloud with a light load: pretty good all around.

The results are pretty good, with access to EBS around 250 IOPS under heavy load, and very fast access under the light load. Even Ephemeral compares well with a 3.5" desktop-class disk.

Now I will run more instances: four instance-store, four bfEBS, and 4 Volumes per NC.
A SAN-backed cloud with a heavier load: EBS is now comparable to a 5400 RPM disk under heavy load.

Ephemeral still takes a beating: as in the home setup case, there are eight different instances using the same local disk (bfEBS instances have access to Ephemeral too, and in my simplistic approach all disks are used at the same rate). EBS slowed down quite a bit, and now compares to a slow desktop-class disk. Although the instances should still have enough IOPS to access storage, perhaps it is time to start thinking about adding a second availability zone to this setup.

Snapshots

The above examples didn't consider Snapshots at all. Snapshots allow you to back up Volumes and to move them across availability zones (Volumes can be created from a Snapshot in any availability zone). Snapshots reside on Walrus, which means that every time a Snapshot is created, a full copy of the Volume is taken on the SC and sent to Walrus. If Snapshots are frequent on this cloud, it is easy to see how the SC, Walrus, and the network can become taxed serving them.

Expectations

I would take all the above numbers as a best-case scenario for their respective cases. A lot of variables have been ignored, starting with the network, as well as other disk accesses. For example, Eucalyptus provides swap by default to instance-store instances, and the typical Linux installation creates swap (i.e., bfEBS instances will most likely have swap), hence any instance running out of RAM will start bogging down the respective disk.

There was also the assumption not only that the load is independent, but that the instances co-operate to make sure they play nice with the disk. Finally, in a production cloud a certain mix of operations is to be expected; thus starting, terminating, creating volumes, and creating snapshots will increase the load on both kinds of storage (Ephemeral and EBS) accordingly.

As I mentioned in my Maintainability post, having a proper workload example will allow you to properly test and tune the cloud to satisfy your users.

Making Internet Faster

In the above examples, I pulled off some back-of-the-envelope calculations which do not consider the software infrastructure at all (i.e., they don't consider Eucalyptus overhead). Eucalyptus's impact on the physical infrastructure has been constantly decreasing. Just to mention a few of the improvements: before Eucalyptus 3, the NC would make straight copies of multi-GB files, while now it uses device mapper to keep disk access to a bare minimum. And the SC, alongside the SAN plugins, now has the DASManager (Direct Attached Storage, i.e., a partition or a disk), which allows bypassing the file system when dealing with Volumes.


There has been a nice performance boost with Eucalyptus 3, but there is still room for improvement, and no option has been left unexplored, from using distributed file systems as a back-end to employing SDN. Although Eucalyptus may not be able to make the Internet faster yet, it is for sure trying hard.

Tuesday, January 1, 2013

Maintainability and Eucalyptus

I recently blogged about the importance of maintainability for on-premise clouds. Within the list of steps to a successful on-premise cloud deployment identified in that blog, Eucalyptus as IaaS software is heavily involved with the Deploy and Maintain parts.

Deploy

I already mentioned the work done to make Eucalyptus installation easy peasy, so let me summarize it here. Eucalyptus is packaged for the main Linux distributions, so installation is as easy as configuring the repository and doing a yum install or apt-get install. Configuring Eucalyptus is still a bit more complex than I would like, and requires registering the components with each other, but the steps can easily be automated, as demonstrated by our FastStart installation.

Although there is always margin for improvement, as distributed systems go I dare say we are getting as easy as possible. Moreover, any good sysadmin already uses software to manage the infrastructure, so I see script-ability as the most important feature to allow easy progress with custom installations (i.e., Eucalyptus deploy recipes to use with ansible, chef, and puppet).


Maintain

If you follow our development, you already know that Eucalyptus 3.2 was recently released. There are ample documents covering the release, either in general (Rich's and Marten's blogs) or for specific features (David's, Andrew's, and Kyo's blogs), but if I wear my Cloud Admin hat, the part that didn't get enough coverage is the amount of work that went into making Eucalyptus more maintainable.

Eucalyptus 3.2 fixed issues.
Eucalyptus 3.2 had 350 fixed bugs, and those are only the reported ones, since quite a few got fixed while restructuring parts of the code. Peek over the list: you will see the ones related to the new features, but there is also a large number of things done to make Eucalyptus more robust and hence maintainable. You don't believe me? Let me give you a sample:
  • reworked the inner code paths of the Storage Controller, now preventing the SC from being accidentally configured with an undesired backend;
  • added safety mechanisms to the HA functioning which prevent, or greatly reduce the risk of, split-brain Cloud Controllers;
  • more robust handling of orphan instances (a situation that appears if the Node Controller is not able to relay its information all the way to the CLC in a timely manner);
  • plugged memory and database connection leaks (fairly annoying, since they required restarting components under particular use cases).
Likely you got more excited about our awesome new user console, but it's features like those in the above list that give me the comfort of a solid infrastructure.

User Console screenshot taken from David's blog


Are we there yet?

As I mentioned before, there is always room for improvement. The bulk of the work for 3.2 went into hardening the code, covering all the corner cases, and improving QA coverage. I call all this the invisible work, since it is neither flashy nor apparent on cursory inspection, yet it is what allows the infrastructure to survive the test of time.

With most of the invisible work done, what is ahead of us is easier to understand and categorize. From the list of the work scoped for 3.3, I see a lot of great new features: autoscaling, cloudwatch, and elb, for example. This is still scoping work, so if you really like one feature, go and tell us, or up-vote it in our issue tracker. Yet, with all these new features, we don't lose focus on our infrastructure roots: in particular the work for Maintenance Mode and Networking, alongside a lot of other features that will make it much easier to deal with cloud resources, for example vmtypes and tagging.

So, are we there yet? As I mentioned in my previous blog, work on an infrastructure is done when the infrastructure is not in use anymore, so no, we are not there yet, but for sure we are having a great ride.