Thursday, November 21, 2013

A new Media Server (flash on Linux)

I used to use an old T400s (Core 2 Duo P9400) as my household media server. I used it as a replication target for my working files (i.e. using unison to synchronize my laptop's home directory to it), as a backup repository (with 2 RAIDed USB drives as backup storage), as a music server using samba to export my music to SONOS, and again as a media server using plex to export movies to ROKU (thanks David for this suggestion!).

A reasonably simple setup, although it fell short of my needs. That media server is connected to my 1080p projector so I can watch MotoGP. The MotoGP web site has a very convenient subscription to watch the races whenever I want to. Usually I steer clear of any news for a few days after the race, until I have time to watch it, and voilà: MotoGP on demand with no spoilers.

I just wanted to watch the race
The issue is that the venerable T400s didn't have enough horsepower to stream the races.

Flash (and Linux?)

The stream coming from MotoGP uses flash, with quality up to HD (currently 720p). Until now I needed my wife's laptop, with a quad-core i7-3630QM and an NVIDIA GT750M (nope, she's not a gamer, just an architect), to be able to enjoy the races. Luckily my wife loves the races too, so no contention on her laptop during MotoGP season.

The aging T400s had also started to get a bit noisy (the fan developed an annoying whine): the time had come to look for a replacement that could stream the races. In case you're wondering: yes, the media server had to run Linux. I started sampling all the Linux machines I had access to, to check what was the minimum configuration I would need.

There seems to have been some GPU-assisted hardware acceleration available for flash on Linux some time ago, but to the best of my knowledge the latest versions no longer have it. That is, CPU brute force is the answer. And looking at CPU usage during streaming, single-core raw power is the key player.

As a side note, I do use firefox (well, iceweasel): Chrome seems to have better performance with pepper for some people, but during my testing I didn't notice any significant difference.

The Search for the CPU

I was hoping to use some small and quiet machine like the NUC, the very same ones used for the Eucalyptus Backpack Availability Zones. At about the same time I was looking at this, my working laptop (core i5-2520M, not enough to watch the races) started to have issues, and the replacement laptop I got (Dell XPS-13) had exactly the same CPU as the most powerful of the NUCs (core i7-3537U). Very convenient for my testing.

Eucalyptus Backpack AZ-2


Once the XPS-13 arrived, I did a quick test with Windows (!!), and then with Linux. Surprisingly, both OSes showed about the same performance watching the races: almost enough, but not quite. The video showed annoying micro-delays at times. Now I had a CPU I could reference as almost capable. Although the NUC and similar bricks were off the table, I could start cross-referencing various benchmarks, and select CPUs with a healthy single-core lead over the low-power core i7.

The new Media Server 

After some searching across various tiny and small-form-factor cases and CPU benchmarks, and with the constraint of a reasonable price, I ended up with a DS61 case (USB 3.0 was a nice-to-have for faster backups) and a core i3-3245. The core i3 has enough of a lead over the low-power core i7 to give me comfort: only the HD 4000 (same GPU on both) was clocked slightly lower.

I recycled a 4GB memory DIMM and a small mSATA SSD drive (fast and silent) I already had, and there you have it: a brand new media server. A few more adjustments were made to make it more capable of streaming MotoGP races, in particular:
  • default to SNA for Intel GPU acceleration (without it, I observed some tearing)
  • switched to a USB wireless card (SONOS didn't provide enough bandwidth to stream the races)
  • adjusted fancontrol to keep the fan from revving too high
  • Bluetooth keyboard to couch-control the machine
Still fairly small, and fully capable
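
For reference, the SNA tweak mentioned above is usually a one-option snippet for the Intel X driver. This is a sketch: the file path and section identifier vary per distribution, so treat them as illustrative.

```
# /etc/X11/xorg.conf.d/20-intel.conf  (path varies by distribution)
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "AccelMethod" "sna"
EndSection
```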

Pros

  • small (fits on top of the bookshelf)
  • fits the requirements (I can watch the races now)
  • reasonably quiet under load (after fancontrol)

Cons 

  • desktop CPU (more watts than strictly needed, I suspect)
  • no battery (the T400s had a built-in UPS, which was nice)

Monday, March 4, 2013

Santa Barbara Cloud Meetup

A few years back I heard Santa Barbara referred to as Silicon Beach. The first Cloud Meetup confirmed the rumor. The kick-off meetup was co-organized by AppScale and Eucalyptus, and it was held at AppScale HQ. We got a recording of the meetup (a better quality one should be on the way), at least of the presentation parts: sorry, no food or drinks in it. The first meeting was meant as a very quick introduction to both Eucalyptus and AppScale and where they respectively fit into the Cloud Computing landscape (hint: one is IaaS while the other is PaaS). In the spirit of any good developer meetup, some hacking was involved. After the presentation, laptops were fired up, demos were given, and more in-depth questions were asked.


AppScale on Eucalyptus

My share of the hacking was done ahead of time. At Eucalyptus we maintain a Community Cloud (ECC). The idea behind the ECC is to give our community access to the latest Eucalyptus without having to install it. It has been extensively used by library and tool developers, by university classes, by current Euca-users before upgrading their own clouds, and by just about anyone interested in playing with a running cloud. I sometimes use it to share big files (spin up an instance and scp the big files there), or to experiment with new images or software. The draconian SLA (instances, volumes, and buckets are terminated or deleted at regular intervals) is not an issue for my use cases, so I found it a lot more convenient than having to poke at my home firewall.

The ECC seemed the perfect place to host an AppScale image for everyone to try, so ahead of the meetup we set out to add a newer AppScale image. After some initial setbacks due to iptables not being willing to co-operate, we had the image uploaded and ready to go (emi-76294490) with the latest AppScale (1.7 at the time of this writing). Chris has a step-by-step write-up: just follow it and you will have it up and running in no time.

Of course you are not limited to running AppScale on the ECC. If you want to run it on your own Eucalyptus cloud, Shaon has a blog post explaining how to set it up.


Heads up: enabling DNS 

Eucalyptus can be configured to provide DNS resolution for instance IP addresses and bucket names. Incidentally, the ECC is configured this way, so describe-instances returns hostnames instead of IP addresses, for example euca-173-205-188-102.eucalyptus.ecc.eucalyptus.com instead of 173.205.188.102. Although this is a very useful feature, there are currently 2 bugs filed against it, EUCA-1433 and EUCA-1456, which may create some difficulties. In particular, AppScale relies on the AWS split-horizon behavior observed within the instances.

The workaround here could be to either create a split-horizon DNS server for your cloud, or to ensure that the instances associate the private IP with their public name.
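
As a minimal sketch of the second option: inside the instance, you can pin the public DNS name to the private IP in /etc/hosts. The address and hostname below are made-up examples, not values from the ECC.

```
# inside the instance: map the public DNS name to the private IP
# (address and hostname are illustrative only)
10.0.5.17   euca-173-205-188-102.eucalyptus.ecc.eucalyptus.com
```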


Ready for the next meetup?

We just finished our first meetup, but we are already planning the next one! Topics are still up for grabs, so make sure you join the group and propose your favorite.

Saturday, February 23, 2013

3.2.1 and the road ahead

In my older posts, I have been talking about maintainability and how I see it as a pillar for IaaS. I have been talking about the homework the Cloud Admin needs to do before deploying, and I have been talking about the invisible work done in Eucalyptus 3.2. Of course maintainability is not the goal of a sprint, or of a single release; not unlike security, it is a continued effort, a guiding principle. So, only a few weeks after the previous major release, welcome Eucalyptus 3.2.1.

This point release of Eucalyptus brings very important fixes. I will let you go through the release notes yourself, but I'm very pleased about the gratuitous ARP message, and the moving of iptables-preload. The former allows for a speedier network recovery after a Cluster Controller fail-over, and the latter makes Eucalyptus more compatible with newer releases of Linux distros. Of course there are a lot more fixes, and you can go and find your favorite one.


Ramping up QA

QA has always been a focus of our Engineering team. But perfectionists that we are, we never rest on our laurels, and a tremendous effort has been put into extending the scope and speed of our QA. If you follow our blog feed you have already noticed a lot of the work done. And with the good work comes a better list of Known Issues, and warnings for corner cases we don't cover yet (if you have an EMC SAN and you use 3.2.1, it's your turn to be the corner case). The last thing we want is for our users to be surprised by unexpected behavior.

The Road Ahead



A.k.a. Eucalyptus 3.3. The next major release of Eucalyptus will bring quite a few new features. You can check out the list of features that have been scoped. Among the major ones are Elastic Load Balancer, AutoScaling, CloudWatch, and Maintenance Mode.

We just had the end-of-sprint-3 status review: check out the demo yourself. So far the road to Eucalyptus 3.3 is nice and clear, and we expect it to land on your machines by mid/late Q2.

Edited: I don't know how to do math, since I considered a quarter to have 4 months ...

Sunday, January 13, 2013

Will My Internet Be Faster?

I have been beating the Maintainability drum lately, and highlighting what the latest Eucalyptus did in that regard. I'm not done yet. This time I want to change the angle of approach, focusing more on the Map Workload to Cloud Resources step, using examples and some  back-of-the-envelope calculations. 


Will my Internet be Faster?

Back in the day, I helped a user go through the installation of an ancient Eucalyptus (I think it was 1.4), and after some hurdles (there was no FastStart back then), he finally got instances up and running on his two-node home setup. Then he asked, "Will my Internet connection be faster now?".

I think the question points more to how Cloud Computing has been a buzzword, perceived as a panacea for all IT problems. But it is also a reminder of the need to understand the underlying hardware infrastructure in order to achieve the desired performance. An on-premise cloud can only go as fast as the underlying infrastructure, and will only have as much capacity as the hardware supporting it. The cloud provides tremendous flexibility, yet there is also the risk of under-performing if the physical infrastructure is not prepared for the workload.

Case In Point: Cloud Storage

At the end of the day, all Cloud Resources map to physical resources: RAM, CPU, network, disks. I will now focus on the Cloud Storage story because of its importance (anyone care about their data?), and because historically it is where there have been some interesting issues. In particular, Eucalyptus 3.1.2 was needed because high disk load caused seemingly random failures.
Some back-of-the-envelope, hand-waving calculation needed for this blog
From my chicken scratch in the first figure, you should see how Ephemeral Storage resides on the NCs (Node Controllers), while EBS (Elastic Block Storage) is handled by the SC (Storage Controller). Let's quickly rehash when the two are used:
  • Ephemeral: both instance-store and bfEBS (boot-from-EBS) instances use Ephemeral, although instance-store instances also use Ephemeral Storage for root and swap;
  • EBS: any instance with an attached Volume uses EBS, and bfEBS instances use Volumes for root and swap.
and which kind of contention there is:
  • Ephemeral: all instances running on the same NC compete for the same storage;
  • EBS: all instances using EBS within the same availability zone (Eucalyptus parlance for a cluster) access and use the same SC.
I used a simple spreadsheet to aid my examples. Feel free to copy it, play with it, enhance it, but please consider it a learning tool and not a real calculator: way too many variables have been left out for the sake of simplicity.

In my examples I will measure the underlying storage speed in IOPS.

IOPS values may vary dramatically; the figures above should be
used only as an indication of the expected performance.


In the following examples, I will make the very unreasonable assumption that instances access all their storage (both Ephemeral and EBS) equally, and that they use it either 20% or 100% of the time. Moreover, in the 20% case, an oracle minimizes the concurrent disk access of all instances (i.e. if there are fewer than five running instances, they will not compete at all and will see the full speed of the storage).

Thus one is a very light scenario, where the instances are mostly idle, while the other (100%) assumes the instances are running benchmarks. Starting instances is a fairly disk-intensive process, first because Eucalyptus needs to prepare the image to boot (which involves copying multi-GB files), and then because the OS has to read the disk while booting. I added a column to the spreadsheet to show the impact of starting instances on the light workload.

Home setup

A small cloud installation will most likely have the SC backed by local storage. Let's use an IOPS calculator to estimate the performance. Here I will use 2 Seagate Cheetah 15K RPM drives in RAID 0, which gives about 343 IOPS (I will round it to 350). For the NC, I will assume 150 IOPS, which should be a reasonably fast single (non-SSD) disk.

For a home setup, three NCs seems a good number to me. Each NC should have enough cores and RAM to run more than ten instances (12-24 cores and 12-24 GB of RAM should do). If I run one instance-store instance, one bfEBS instance, and one Volume per NC, the very unrealistic calculator gives
Light load on the home setup: slowest
storage is still comparable to a 5400RPM disk.

Not bad for my home setup. Even if the instances were to run iozone on all the disks, I would still see the performance of a slow 5400RPM disk. Now, let me create more load: four instance-store, four bfEBS, and two Volumes used per NC
The home setup with a heavier load doesn't do that well:
instances may see performance as slow as a floppy drive.

That's a bit more interesting. If the instances are very light in disk usage, they will see the performance of a 7200RPM disk, but under heavier load they will be using something barely faster than a floppy. Ouch!
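
The arithmetic behind these figures is simple enough to sketch in a few lines of Python. This is my own toy reconstruction of the spreadsheet, under the same hand-waving assumptions; real IOPS numbers will differ.

```python
def per_instance_iops(storage_iops, consumers, duty_cycle=1.0):
    """Naive contention model: all active consumers share the storage equally.

    A duty cycle below 100% models the 'oracle' assumption above: only a
    fraction of the consumers hit the disk at the same time.
    """
    concurrent = max(1.0, consumers * duty_cycle)
    return storage_iops / concurrent

# Home setup, heavier load: 3 NCs with a 150 IOPS local disk each,
# and a 350 IOPS Storage Controller shared by the whole zone.
# Per NC, 4 instance-store + 4 bfEBS instances share the local disk;
# EBS consumers in the zone: (4 bfEBS + 2 Volumes) * 3 NCs = 18.
print(per_instance_iops(150, 8))        # 100% load: 18.75 IOPS (floppy territory)
print(per_instance_iops(150, 8, 0.2))   # light load: 93.75 IOPS (7200RPM-ish)
print(per_instance_iops(350, 18))       # EBS at 100% load: ~19.4 IOPS
```

Plugging in the light-load duty cycle recovers the 7200RPM-class number, and the 100% case lands in floppy territory, matching the figures.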

A More Enterprisy setup

From the previous example, it is fairly obvious why bigger installations tend to use a SAN as the SC storage back-end. For this example I will use a Dell EqualLogic, with a setup that gives 5000 IOPS. Correspondingly, the number of NCs is increased to 10.

Let's start with a light load: one instance-store, one bfEBS, and one Volume per NC (similar to the home setup, although now there is a total of 20 running instances).
A SAN backed cloud with a
light load: pretty good all around.

The results are pretty good, with access to EBS around 250 IOPS under heavy load, and very fast access under the light load. Even Ephemeral compares well with a 3.5" desktop-class disk.

Now I will run more instances: four instance-store, four bfEBS, and 4 Volumes per NC.
A SAN-backed cloud with a heavier load: EBS
is now comparable to a 5400 RPM disk under heavy load.

Ephemeral still takes a beating: as in the home setup case, there are eight different instances using the same local disk (bfEBS instances have access to Ephemeral too, and in my simplistic approach all disks are used at the same rate). EBS slowed down quite a bit, and now compares to a slow desktop-class disk. Although the instances should still have enough IOPS to access storage, perhaps it is time to start thinking about adding a second availability zone to this setup.

Snapshots

The above examples didn't consider Snapshots at all. Snapshots allow you to back up Volumes, and to move them across availability zones (Volumes can be created from a Snapshot in any availability zone). Snapshots reside on Walrus, which means that every time a Snapshot is created, a full copy of the Volume is taken on the SC and sent to Walrus. If Snapshots are frequent on a cloud, it is easy to see how the SC, Walrus, and the network can become taxed serving them.
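
In the same back-of-the-envelope spirit, it is easy to estimate how long a single Snapshot keeps the SC, the network, and Walrus busy. The numbers below (link speed, efficiency, volume size) are my own assumptions, not measurements.

```python
def snapshot_transfer_seconds(volume_gb, link_gbps=1.0, efficiency=0.8):
    """Rough time to ship a full Volume copy from the SC to Walrus.

    Assumes the whole Volume is copied over a single link (as described
    above); 'efficiency' discounts protocol and disk overhead.
    """
    volume_bits = volume_gb * 8 * 10**9
    return volume_bits / (link_gbps * 10**9 * efficiency)

# A 50 GB Volume over gigabit Ethernet at 80% efficiency:
print(round(snapshot_transfer_seconds(50)))  # 500 seconds, a bit over 8 minutes
```

Eight-plus minutes of sustained traffic per Snapshot makes it clear why frequent Snapshots can tax the whole path from SC to Walrus.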

Expectations

I would take all the above numbers as a best-case scenario for their respective cases. A lot of variables have been ignored, starting with the network, as well as other disk accesses. For example, Eucalyptus provides swap by default for instance-store, and the typical Linux installation creates swap (i.e. bfEBS instances will most likely have swap), hence any instance running out of RAM will start bogging down its respective disk.

There was also the assumption that not only is the load independent, but that the instances co-operate to play nice with the disk. Finally, in a production cloud a certain mix of operations is to be expected; thus starting, terminating, creating volumes, and creating snapshots will increase the load on both kinds of storage (Ephemeral and EBS) accordingly.

As I mentioned in my Maintainability post, having a proper workload example will allow you to properly test and tune the cloud to satisfy your users.

Making Internet Faster

In the above examples, I pulled off some back-of-the-envelope calculations which do not consider the software infrastructure at all (i.e. they don't consider Eucalyptus overhead). Eucalyptus's impact on the physical infrastructure has been constantly decreasing. Just to mention a few of the improvements: before Eucalyptus 3, the NC would make straight copies of multi-GB files; now it uses device mapper to keep disk access to a bare minimum. And the SC, alongside the SAN plugins, now has the DASManager (Direct Access Storage, i.e. a partition or a disk), which allows bypassing the file system when dealing with Volumes.


There was a nice performance boost with Eucalyptus 3, but there is still room for improvement, and no option has been left unexplored, from using distributed file systems as a back-end to employing SDN. Although Eucalyptus may not be able to make the Internet faster yet, it is surely trying hard.

Tuesday, January 1, 2013

Maintainability and Eucalyptus

I recently blogged about the importance of Maintainability for on-premise clouds. Within the list of steps to a successful on-premise cloud deployment identified in that blog, Eucalyptus as IaaS software is heavily involved in the Deploy and Maintain parts.

Deploy

I already mentioned the work done to make the Eucalyptus installation easy peasy, so let me summarize it here. Eucalyptus is packaged for the main Linux distributions, so the installation is as easy as configuring the repository and doing a yum install or apt-get install. Configuring Eucalyptus is still a bit more complex than I would like, and requires registering the components with each other, but the steps can easily be automated, as demonstrated by our FastStart installation.

Although there is always margin for improvement, as distributed systems go I dare say that we are getting as easy as possible. Moreover, any good sysadmin already uses software to manage the infrastructure, so I see script-ability as the most important feature to allow easy progress with custom installations (i.e. Eucalyptus deploy recipes to use with ansible, chef, and puppet).


Maintain

If you follow our development, you already know that Eucalyptus 3.2 was recently released. There are ample documents covering the release, either in general (Rich's and Marten's blogs) or for specific features (David's, Andrew's, and Kyo's blogs), but if I wear my Cloud Admin hat, the part that didn't get enough coverage is the amount of work that went into making Eucalyptus more maintainable.

Eucalyptus 3.2 fixed issues.
Eucalyptus 3.2 had 350 fixed bugs, and those are only the reported ones, since quite a few were fixed while restructuring parts of the code. Peek at the list: you will see the ones related to the new features, but there is also a large number of things done to make Eucalyptus more robust and hence maintainable. You don't believe me? Let me give you a sample:
  • reworked the inner code paths of the Storage Controller, now preventing the SC from being accidentally configured with an undesired backend;
  • added safety mechanisms to the HA machinery which prevent, or greatly reduce the risk of, split-brain Cloud Controllers;
  • more robust handling of orphan instances (the situation that arises when the Node Controller is not able to relay its information all the way to the CLC in time);
  • plugged memory and database-connection leaks (fairly annoying, since they required restarting components under particular use cases).
Likely you got more excited about our awesome new user console, but it's features like the above that give me the comfort of a solid infrastructure.

User Console screenshot taken from David's blog


Are we there yet?

As I mentioned before, there is always room for improvement. The bulk of the work for 3.2 went into hardening the code, covering all the corner cases, and improving QA coverage. I call all this the invisible work, since it is neither flashy nor apparent on cursory inspection, yet it is what allows the infrastructure to survive the test of time.

With most of the invisible work done, what is ahead of us is easier to understand and categorize. From the list of work scoped for 3.3, I see a lot of great new features: autoscaling, cloudwatch, and ELB, for example. This is still scoping work, so if you really like one feature, go and tell us, or up-vote it in our issue tracker. Yet, with all these new features, we don't lose focus on our infrastructure roots: in particular the work on Maintenance Mode and Networking, alongside a lot of other features that will make it much easier to deal with Cloud Resources, for example vmtypes and tagging.

So, are we there yet? As I mentioned in my previous blog, work on an infrastructure is done only when the infrastructure is no longer in use, so no, we are not there yet, but we are surely having a great ride.

Wednesday, December 5, 2012

It's Maintainability, Stupid

I have been playing with FastStart and Silvereye for a while, and have worked with the single-cloud script. I also wrote about the setup I use for testing Eucalyptus on a laptop. The latest tweets and blogs clocked a Eucalyptus installation at less than 15 minutes. Quite impressive, I have to say. Yet I think it's important not to lose perspective on the ultimate goal of an on-premise cloud.

On-Premise Cloud as Infrastructure

I see the installation of Eucalyptus (or any on-premise cloud for that matter) as one step in the deployment of a cloud, and neither the first nor the most important one. The on-premise clouds I am referring to are IaaS, and right now I want to focus on the Infrastructure part. When I think about infrastructure I think of satellites, the power grid, aqueducts, the Internet, and so on. There are quite a few characteristics of an infrastructure, but one in particular comes to mind to identify a successful one: Maintainability.

Why Maintainability?

An Infrastructure's purpose is to provide or support basic services: when infrastructure collapses, severe consequences are to be expected (think blackouts, blocked highways, etc.). An Infrastructure needs to be dependable (how can you build a house on unstable foundations?), to have a long lifespan (I can think of temporary only for Proof of Concept installations), and to sustain very different loads and uses throughout its useful life (think about the Internet, highways, etc., from their inception to what they are today). Hence it needs to adapt to the load (elasticity), to isolate and/or limit the scope of failures (resiliency), to be functioning and accessible (availability), to inter-operate with different versions and/or similarly minded infrastructure, and to constrain operator access (minimizing human errors). In my mind the above encapsulates the essence of Maintainability. I guess I'm taking the Cloud Admin's side by considering reliability and availability as part of maintainability, since an infrastructure that is neither reliable nor available is not maintainable from the admin's point of view.

Deploying On-Premise Cloud

So, how do we deploy an on-premise cloud successfully? I see most of the difficulty in the planning, preparing, and forecasting. One key element is understanding where the workload and the cloud will be in the future, and having a path to migrate both the underlying physical system and the cloud software incrementally. Very easy, right?


Learn your Workload

In most cases an on-premise cloud is deployed for a specific application or set of applications, or departments, or group of users, or use cases, so the workload is already implicit. Learn the needs of the workload in terms of compute, network, and storage, and how parallel or spiky it is, and, if at all possible, forecast the future workload.

Capturing a sample of the workload at this stage, or creating an artificial load which mimics the real workload, is a boon for the successive stages. Although it is not always easy or possible, having a model of the workload will be very helpful to validate the physical infrastructure and the cloud deployment.

Map Workload to Cloud Resources

Cloud Resources are basically three: compute (CPU and RAM used by the instances), storage, and network (bandwidth, security/isolation, IP addresses). Understanding the workload's needs in terms of Cloud Resources will help size the cloud appropriately.

Also, usage will most likely vary as the Cloud Users become more savvy: it's common to see very heavy reliance on EBS (for instances and volumes) when starting to use the cloud, and a move toward instance-store once the applications become cloud-aware.


Prepare your Physical Infrastructure

With the workload defined in terms of Cloud Resources, we can start isolating the possible bottlenecks and prepare the physical infrastructure to successfully run the workload. Note that some Cloud Resources may end up using multiple physical resources at the same time: for example, boot-from-EBS instances may tax at the same time the Storage Controller (the EBS service provider) and the network (which allows the Node Controller to boot the instance).

Also factor in the load incurred in fulfilling operations: for example, starting an instance may have the Node Controller fetch the image from Walrus, copy it to its local disk, then start the instance. Those operations may create contention on the network (instance traffic and image transfer), on Walrus (multiple NCs asking for different images), or on the local disk (if the caching of instances is done on the same disk where ephemeral storage resides).

The physical infrastructure also plays a very important role when thinking about scaling the cloud: if the storage cannot be easily expanded, or the network cannot be upgraded or reconfigured, growing the cloud to meet the forecast workload may be impossible without a re-install or a long downtime.


Deploy

Well, we finally got here: with the physical infrastructure properly sized, we can start installing each component (Cloud Controller, Walrus, Cluster Controller(s), Storage Controller(s), and Node Controller(s)) on its respective host. And yes, this step can take a lot less than half an hour, although in a production installation, with multi-cluster, HA, and SANs, it may take a bit longer.


Maintain

Once the cloud is deployed, it truly becomes an infrastructure, and as such we need to ensure it stays up all the time: through upgrades (cloud software, host OS, router firmware, all of which should be upgradable with hopefully no downtime for the cloud as a whole), failures (the failure of a machine or a component should not impact the cloud, although it may impact an instance, a pending request, etc.), expansions (adding Node Controllers, clusters, storage, network), and load spikes (the cloud should degrade gracefully and not collapse).

Any of the above steps may recur after deployment (you may need to profile some problematic workload, or to deploy some new components), yet they all fall under the Maintain umbrella, since they are all needed to ensure that the infrastructure fires on all cylinders and becomes invisible. Once the infrastructure has been in place long enough, its usage will be taken for granted, and the only attention it will receive is when there are deficiencies or problems (think about the power grid and how blackouts or brownouts make the news). That is, when your cloud is no longer maintainable, you will be in the news.

Wednesday, October 3, 2012

Cloud Storage Types

Persistent Storage sounds like a tautology to me. Used as I am (was?) to hard disks, USB keys, DVDs, and all the other possible ways to store information, it seems that storage is persistent by definition, and only failures or human errors can cause non-persistent and catastrophic behavior. Well, in IaaS terminology, storage comes in different flavors, and can also be Ephemeral.

Storage Types

Eucalyptus follows the AWS API, and with these APIs come 3 storage types:
  • Buckets, an object store implemented by S3 (provided by Walrus),
  • Elastic Block Storage (EBS) Volumes (provided by the Storage Controller),
  • Ephemeral Instance Store (provided by the Node Controller).
Our Storage Team started a wiki to dig into the technical aspects of the different storage types: stay tuned on GitHub for a more in-depth technical dive into how they have been implemented.

Of the above list, two are meant to be persistent (i.e. to persist across instance termination): Volumes and Buckets. Two provide the familiar block interface (i.e. they appear and are used as hard disks): Volumes and Ephemeral. One is designed to be massively scalable: Buckets. And one is meant to be temporary: Ephemeral.
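
The taxonomy above fits in a few lines of Python. This is just my own summary of the list, not an official matrix.

```python
# (persistent, block interface, provider) for each storage type,
# straight from the list above
storage_types = {
    "Buckets":   {"persistent": True,  "block": False, "provider": "Walrus"},
    "Volumes":   {"persistent": True,  "block": True,  "provider": "Storage Controller"},
    "Ephemeral": {"persistent": False, "block": True,  "provider": "Node Controller"},
}

persistent = [t for t, props in storage_types.items() if props["persistent"]]
print(persistent)  # -> ['Buckets', 'Volumes']
```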


Instances and Storage

Instances by default sit on Ephemeral Storage. Uploaded images (EMIs) are the master copies, and all instances start as a fresh copy of that very image. All changes made to the instance (e.g. packages installed, configuration, application data) will disappear once the instance terminates. Notice that the termination of an instance can be voluntary (i.e. the Cloud User issues a terminate-instance command) or accidental (e.g. the hardware running the instance fails, or the software within the instance fails badly). These instances are called instance-store instances.

Instances can also use Volumes for their root file system: they are aptly called boot-from-EBS instances. In this case, at instance creation a Volume is cloned from a specific EMI snapshot. The instance then has exclusive access to this Volume throughout its lifetime, allowing it to be stopped and re-started without loss of any changes made. The instance can be restarted wherever the Volume is available (i.e. EBS Storage is only available on a per-cluster, or availability zone, basis), and its performance is driven by the Volume's performance (e.g. network speed to a SAN, or DAS serviced by the Storage Controller).

The main difference in storage speed between instance-store and boot-from-EBS is a trade-off between the speed of the local disk (in the case of instance-store) and the speed of accessing a SAN (or DAS) across a network. Things get more complicated when multiple instances compete for a shared resource (e.g. common disks on a Node Controller, or network access to the SAN).


Cloud Admin

The Cloud Admin, although not a user of these storage types, needs to have a clear deployment plan to provide enough storage space for each type, limit contention on shared resources, and ensure that performance and reliability meet the expected levels. An understanding of the specific load will go a long way toward sizing the cloud properly.

The deployment of Walrus and the Storage Controller, respectively providers of Buckets and Volumes, is key to ensuring the right number of nines for the Persistent Storage types. Walrus's get/put interface helps ensure the scalability of the service, but a slow host (CPU is needed to decrypt uploaded images and serve concurrent streams) or limited space (Walrus stores uploaded images and Buckets) can severely cripple the normal functioning of a cloud. The Storage Controller serves Volumes to a cluster, both for EBS attachment and boot-from-EBS: under-sizing the network between the Storage Controller and the Node Controllers can slow every instance request to disk to a crawl.

Ephemeral is served by the Node Controller. Sizing the physical storage subsystem for the expected number and type of instances is needed to ensure the full load can be sustained. Also, with current multi-core CPUs, quite a few instances can run on the same Node Controller. Too many concurrent disk requests can easily overwhelm the Node Controller's host, causing instances to time out, or other unpredictable and erratic behavior: the storage subsystem needs to be properly tested for the expected concurrent load.

Cloud Application Architect

The Application Architect is reasonably isolated from the underlying hardware used to build the private cloud, insofar as the Cloud Admin has properly planned the storage types' availability, performance, and reliability. Thus the main decision for the Architect is which storage type to use and when.

Persistent vs Ephemeral 

When I started to drink champagne, I went for what I was accustomed to, that is, very well-known servers, well taken care of (i.e. very persistent). In short, an environment where a server rebuild is an exceptional case. The ancient version of Eucalyptus we used back then didn't have boot-from-EBS, so we effectively implemented it using Volumes and chroot environments. As a bonus, backups were as easy as creating a Snapshot (euca-create-snapshot).

After a few cloud moves and upgrades (both of software and hardware), I started to embrace the idea of the chaos monkey, where no single instance is central to the service. Now I rely more and more on scripts to configure default images on the fly, and on Buckets to store the backups needed to recover the last good state. In the case of essential databases I would still use a combination of Volumes and Buckets for availability and backups.

I think my experience is common, and I can see how administrators coming from the datacenter tend to start with the Storage is Persistent idea, looking for the comfort of boot-from-EBS. Administrators coming from the public cloud are already familiar with the dynamic approach of the cloud, and are more comfortable with the idea that some storage is Ephemeral, planning accordingly for instances to be disposable.

Edited December 10, 2012
Added links for the various storage types definitions, and made it clear that S3 provides Buckets. Added Cloud Storage Types properties picture.