Archive for the 'eucalyptus' Category

How to Get Credentials on Eucalyptus 4.2

10 Mar 2016

If you run euca_conf --get-credentials on eucalyptus 4.2 you will see the following warning:

warning: euca_conf is deprecated; use ``eval `clcadmin-assume-system-credentials`''
or ``euare-useraddkey''

There are numerous reasons for that command’s deprecation, but what causes confusion is the fact that it has two replacements. Setting up a new cloud now involves more than just one set of credentials, and if you’re used to having fully-functional credentials immediately this is likely to trip you up.

Why Change?

One of the most common complaints about euca_conf is that it tries to be everything to everybody. It combines multiple types of functionality that need to run in different places, adding excess dependencies and requiring one to log into systems that one normally shouldn’t have to. Eucalyptus 4.2 introduces new administration tools that break euca_conf’s functionality down into three groups with more specific purposes:

  1. Whole-cloud administration tools
  2. Cloud controller (CLC) support scripts
  3. Cluster controller (CC) support scripts

Cloud controller and cluster controller support scripts can run only on those specific systems, and thus are only installed alongside them. The rest of the administration tools are web service clients, similar to euca2ools, that can run from anywhere. All they need are access keys.

But where do those access keys come from?

Out with the Old

In the old regime, access keys and other credentials come in the form of a zip file containing several certificates as well as eucarc, a shell script that sets environment variables that include service URLs and the access keys themselves. The first zip file the cloud creates is missing several service URLs because those services have yet to be set up, and it doesn’t use DNS either because that has yet to be set up as well.

Once DNS and all of the services are ready, we then have the cloud generate a new zip file. Everything seems fine until something changes for whatever reason and we need to obtain a third one. Since we can only have two certificates at a time, though, this third zip file will not include one. This causes countless problems for automation that relies on them, including eucalyptus’s own QA scripts.

That said, the zip file still has some particularly useful properties:

  • It’s a single file for the administrator to e-mail to new users
  • It contains both access keys and service URLs
  • It (usually) contains all of the certificates needed to bundle images

A euca2ools.ini file also has the first two of those properties, while also managing to be more flexible. Any euca2ools commands that can create access keys, such as euare-useraddkey and euare-usercreate, can generate euca2ools.ini files automatically. That leaves just certificates, which we dealt with by making them all optional or possible to obtain automatically.

In with the New

In isolation, euca2ools commands alone have a chicken-and-egg problem: they require access keys to run, but a new cloud doesn’t have any access keys. We break this loop by splitting eucalyptus installation into two phases, each with different credentials.

Setup Credentials

A cloud controller support script, clcadmin-assume-system-credentials, provides temporary setup credentials. This script works similarly to euare-assumerole, but it is much more limited and it only works on a cloud controller. Setup credentials cannot be used for normal system operation; they provide access only to service registration, service configuration, and IAM services — the minimum necessary to get up and running with euca2ools.

# eval `clcadmin-assume-system-credentials`
# euserv-register-service -t user-api -h 198.51.100.2 ufs-1
# euctl system.dns.dnsdomain=mycloud.example.com

Admin Credentials

Once DNS and an IAM service are set up, you can use euca2ools to create long-lived admin credentials that let you access the cloud’s full functionality. It is these credentials that are the replacements for the zip file. Once you create them, you are unlikely to ever need setup credentials again.

# euare-usercreate -wld mycloud.example.com gholms > ~gholms/.euca/mycloud.ini

Here is an explanation of the various parts of that command:

  • gholms: Create a user named gholms
  • -w: Write out a euca2ools.ini file
  • -l: In that file, make that user the default for this cloud
  • -d mycloud.example.com: Use the domain mycloud.example.com as the cloud’s DNS domain

Normally, when this command writes a configuration file it will pull the DNS domain from the IAM service’s URL, but since this is the very first user we have to supply it by hand because it has not yet been set.

What now?

Once you have a set of admin credentials you can use them for day-to-day cloud administration the same way you would with a classic eucarc file.

% export AWS_DEFAULT_REGION=mycloud.example.com
% euare-accountcreate -wl alice > alice.ini
% mail -s "Try out this shiny, new cloud" -a alice.ini ...

Mr. TV

16 Dec 2013

I previously wrote about the big, Raspberry Pi-powered TV set at Eucalyptus HQ that displays the #eucalyptus-devel IRC channel so developers can always see what is going on and jump in if they need to. That setup has worked quite well for some time now, but I recently came up with a way to make it even better:

Mr. TV

Googly eyes have yet to fail me at improving a machine’s appearance.

Running a Text-based Kiosk with Systemd

9 Jul 2013

Eucalyptus HQ has a big TV on the wall that displays the #eucalyptus-devel IRC channel so developers can always see what is going on and jump in if they need to. Until recently, a laptop drove that display, but that seemed like overkill to me, so I decided to use my Raspberry Pi running the Raspberry Pi Fedora Remix to do that instead. Since the IRC program it’s using, irssi, is text-based, I don’t need to use any of the Pi’s precious little memory to run anything graphical, so I just needed to figure out how to make systemd spawn irssi instead of a login prompt on tty1.

I would normally do this by copying /lib/systemd/system/getty@.service to /etc/systemd/system/getty@tty1.service and then editing that, but F18’s version of systemd let me do this in an even simpler manner. By creating a directory with the same name as that file, plus .d, I can add a config file to that directory that overrides only the parts of the original unit file that I need to change:

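[Unit]
# Ordering and dependency settings belong in the [Unit] section
After=network-online.target
Wants=network-online.target

[Service]
ExecStartPre=/usr/bin/nm-online
# An empty ExecStart= clears the command inherited from getty@.service
ExecStart=
ExecStart=/usr/bin/irssi
KillSignal=SIGTERM
StandardInput=tty
StandardOutput=tty
User=kiosk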

Now I can just plug the system in and have it automatically up and running irssi in less than a minute.

Unexpected lessons

I didn’t expect to have to run nm-online here because network-online.target is supposed to wait for a service that runs that itself, but for some reason systemd didn’t order things that way and irssi came up before the network connection did. Running that command as part of this unit worked around that problem.

Use the consoleblank=0 kernel command line parameter to prevent Linux from blanking the screen after the usual ten minutes of inactivity.

I’m using the TV’s USB “service” port to power the Raspberry Pi. That usually works just fine, but when the TV turns off it cuts the power to that port as well, abruptly shutting the Raspberry Pi off. I don’t have any data loss in particular to worry about, but turning the system back on causes some annoyance: when the TV turns on the Raspberry Pi also powers on and attempts to detect what kind of screen it is plugged into. At that point the TV hasn’t figured out what it wants to display yet, so the detection fails and I’m left with a blank screen until I reboot the computer.

What’s New in Euca2ools 3, Part 2: A Developer’s Perspective

3 Apr 2013

The upcoming version of euca2ools, version 3, completely reworks the command line suite to make it both easier to write and easier to use. Part 1 of this series discussed the user-facing changes version 3 has to offer, and today we’re going to take a look at how things improve on the developer’s side of the fence.

A change in philosophy: declarative programming

The developer is very much in the driver’s seat in version 1 of euca2ools. To use a car analogy, the developer directly controls the code’s direction, speed, and gearbox. Version 2 adds cruise control by centralizing a lot of boilerplate code in the form of boto’s roboto module. Version 3 opts to let the developer give the requestbuilder framework a destination, step aside completely, and let it do the driving for the boring parts of the trip.

Requestbuilder offers a set of base classes and a domain-specific language based on python’s standard argparse library that allows the developer to say exactly how something should look at the command line in addition to how it should look when given to the server all in the same place.

What makes this so powerful is that it lets anybody with a service’s documentation and knowledge of how to use argparse write a command line tool quickly and painlessly. For instance, it took me around a day to write highly-customized command line tools for every operation Amazon’s Elastic Load Balancing service supports. Here’s the code from one of them:

class CreateLBCookieStickinessPolicy(ELBRequest):
    DESCRIPTION = ('Create a new stickiness policy for a load balancer, '
                   'whereby the load balancer automatically generates cookies '
                   'that it uses to route requests from each user to the same '
                   'back end instance. This type of policy can only be '
                   'associated with HTTP or HTTPS listeners.')
    ARGS = [Arg('LoadBalancerName', metavar='ELB',
                help='name of the load balancer to modify (required)'),
            Arg('-e', '--expiration-period', dest='CookieExpirationPeriod',
                metavar='SECONDS', type=int, required=True,
                help='''time period after which cookies should be considered
                stale (default: user's session length) (required)'''),
            Arg('-p', '--policy-name', dest='PolicyName', metavar='POLICY',
                required=True, help='name of the new policy (required)')]

The framework hands everything inside each Arg in this code to argparse to gather input from the command line and then sends the results directly to the web server using whatever name argparse gives the input it gets. For instance, whatever a user supplies using the -e option ends up getting sent to the server as a CookieExpirationPeriod parameter. With a small amount of practice it becomes quite easy to write a bunch of commands this way very quickly.
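
To make the mapping from option to parameter name concrete, here is a minimal sketch that uses only plain argparse rather than requestbuilder itself. The prog name and the to_request_params helper are hypothetical; the point is simply that each argument's dest doubles as the name of the parameter that would go to the server.

```python
import argparse


def build_parser():
    """Plain-argparse stand-in for the ARGS list above (not requestbuilder)."""
    parser = argparse.ArgumentParser(prog='create-policy-sketch')
    # For a positional argument the first string is its dest.
    parser.add_argument('LoadBalancerName', metavar='ELB')
    parser.add_argument('-e', '--expiration-period',
                        dest='CookieExpirationPeriod', metavar='SECONDS',
                        type=int, required=True)
    parser.add_argument('-p', '--policy-name', dest='PolicyName',
                        metavar='POLICY', required=True)
    return parser


def to_request_params(argv):
    """Parse argv and return the values keyed by each argument's dest."""
    namespace = build_parser().parse_args(argv)
    return dict((name, value) for name, value in vars(namespace).items()
                if value is not None)


params = to_request_params(['my-elb', '-e', '300', '-p', 'sticky'])
for name in sorted(params):
    print('{0}={1}'.format(name, params[name]))
```

Here the value given with -e comes back under the key CookieExpirationPeriod, which is the name a framework like this could pass straight through to the service.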

One request, one command

Euca2ools are built around a “one request, one command” tenet. This means that, in general, there is a dedicated command for each thing a web service can do. This philosophy naturally lends itself to the tight coupling between command line options and what gets sent to the server discussed earlier, but it also lends itself to reversing the usual relationship between web services and web service requests. Whereas one typically writes an object that represents the service and uses methods on it to send requests, in euca2ools it is the commands, and thus the requests, which are the first-class citizens. Each command that represents a request instead points to a service, rather than the other way around.

The way this works in practice is by defining a base class for each service and a base class that all methods which use that service share:

class CloudWatch(requestbuilder.service.BaseService):
    NAME = 'monitoring'
    DESCRIPTION = 'Instance monitoring service'
    API_VERSION = '2010-08-01'
    AUTH_CLASS = requestbuilder.auth.QuerySigV2Auth
    URL_ENVVAR = 'AWS_CLOUDWATCH_URL'

    ARGS = [MutuallyExclusiveArgList(
                Arg('--region', dest='userregion', metavar='USER@REGION',
                    route_to=SERVICE, help='''name of the region and/or user
                    in config files to use to connect to the service'''),
                Arg('-U', '--url', metavar='URL', route_to=SERVICE,
                    help='instance monitoring service endpoint URL'))]

class CloudWatchRequest(requestbuilder.request.AWSQueryRequest):
    SERVICE_CLASS = CloudWatch

Services can supply their own command line options in the same way as requests. After it gathers options from the command line, requestbuilder uses route_to to choose where to send each one. This also provides a convenient way to tell the framework not to send an option to the server at all when a command needs to process it specially: just use route_to=None.
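
The routing idea can be sketched in a few lines of standalone Python. This is not requestbuilder's implementation: the PARAMS and SERVICE markers, the route_options function, and the choice of PARAMS as the default destination are all assumptions made for illustration.

```python
PARAMS = 'params'    # hypothetical marker: send the value to the server
SERVICE = 'service'  # hypothetical marker: hand the value to the service object


def route_options(parsed, routes):
    """Split parsed options into one dict per destination.

    parsed maps option names to values. routes maps option names to a
    destination marker or to None. Options with no entry in routes go to
    PARAMS (an assumption of this sketch). Options routed to None are kept
    aside for the command itself and are never sent anywhere.
    """
    routed = {PARAMS: {}, SERVICE: {}, None: {}}
    for name, value in parsed.items():
        destination = routes.get(name, PARAMS)
        routed[destination][name] = value
    return routed


parsed = {'InstanceId': ['i-12345678'],
          'url': 'http://198.51.100.2:8773/',
          'group': ['sg-12345678']}
routes = {'url': SERVICE, 'group': None}
routed = route_options(parsed, routes)
print(routed[PARAMS])
print(routed[SERVICE])
```

Only InstanceId ends up in the dictionary bound for the server; url is handed to the service, and group stays with the command for special handling.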

Convention over configuration

The oft-quoted programming paradigm for frameworks is just as true for euca2ools 3 as it is elsewhere. Want to make a command print something? Just write a print_result method. The result from the server gets passed in as a dictionary.

class TerminateInstances(EucalyptusRequest):
    DESCRIPTION = 'Terminate one or more instances'
    ARGS = [Arg('InstanceId', metavar='INSTANCE', nargs='+',
                help='ID(s) of the instance(s) to terminate')]
    LIST_TAGS = ['instancesSet']

    def print_result(self, result):
        for instance in result.get('instancesSet', []):
            print(self.tabify(('INSTANCE', instance.get('instanceId'),
                               instance.get('previousState', {}).get('name'),
                               instance.get('currentState', {}).get('name'))))

Want to make a request do fancier preparations than argparse can do on its own? Just write a preprocess method that takes things from self.args and adds things to self.params to be sent to the server.

class DescribeSecurityGroups(EucalyptusRequest):
    DESCRIPTION = ('Show information about security groups\n\nNote that '
                   'filters are matched on literal strings only, so '
                   '"--filter ip-permission.from-port=22" will *not* match a '
                   'group with a port range of 20 to 30.')
    ARGS = [Arg('group', metavar='GROUP', nargs='*', route_to=None,
                default=[], help='limit results to specific security groups')]
    ...
    def preprocess(self):
        for group in self.args['group']:
            if group.startswith('sg-'):
                self.params.setdefault('GroupId', [])
                self.params['GroupId'].append(group)
            else:
                self.params.setdefault('GroupName', [])
                self.params['GroupName'].append(group)

There are also a few other methods one can plug in, such as postprocess, and, for especially early-running code, configure. Expect documentation for requestbuilder that covers this in detail in the future.
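
Until that documentation exists, the following toy base class may help show how such hooks can fit together. It is a sketch under stated assumptions, not requestbuilder's code: the SketchRequest class, its send method, and the exact order in which the hooks run are all hypothetical.

```python
class SketchRequest(object):
    """Toy base class with overridable hooks (not requestbuilder's code)."""

    def __init__(self, **args):
        self.args = args
        self.params = {}

    def configure(self):
        """Early-running setup; does nothing by default."""

    def preprocess(self):
        """Move things from self.args into self.params; no-op by default."""

    def send(self):
        # Stand-in for the HTTP round trip: echo the parameters back.
        return dict(self.params)

    def postprocess(self, result):
        """Adjust the result after the server responds; no-op by default."""

    def print_result(self, result):
        """Show the result to the user; prints nothing by default."""

    def main(self):
        # Assumed hook order for this sketch.
        self.configure()
        self.preprocess()
        result = self.send()
        self.postprocess(result)
        self.print_result(result)
        return result


class DescribeGroupsSketch(SketchRequest):
    def preprocess(self):
        # Same ID-versus-name split as the preprocess method shown earlier.
        for group in self.args.get('group', []):
            key = 'GroupId' if group.startswith('sg-') else 'GroupName'
            self.params.setdefault(key, []).append(group)

    def print_result(self, result):
        for key in sorted(result):
            print('\t'.join([key] + result[key]))


result = DescribeGroupsSketch(group=['sg-12345678', 'default']).main()
```

A subclass overrides only the hooks it cares about; everything else falls through to the do-nothing defaults, which is the convention-over-configuration idea in miniature.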

Scratching the surface

The examples above cover only a fraction of what is possible with euca2ools 3’s new infrastructure. While you can look forward to some more advanced uses of it in later blog posts, you can also take a look at the current euca2ools code in development to see some of the interesting things one can do with it. Today’s pre-release of that code carries with it commands for all three of AWS’s “triangle” services: Auto Scaling, CloudWatch, and Elastic Load Balancing. Continuing what seems to have become a euca2ools tradition, just look for the commands that start with euscale (pronounced “you scale”), euwatch (“you watch”), and eulb (“you’ll be”).

Packages for Fedora and RHEL 6 are available here. If you’re using another OS or want to build the code yourself you can simply clone euca2ools’s git repository’s requestbuilder branch. Requestbuilder itself is available on PyPI and GitHub. As always, I encourage you to test this code against AWS and Eucalyptus 3.3 and let me know what you think on the euca-users mailing list. If you encounter bugs, please file them in the project’s bug tracker.

What’s New in Euca2ools 3, Part 1: A User’s Perspective

21 Feb 2013

Version 3 of euca2ools, slated for release in just a couple months, gives the command line suite a much-needed refresh that makes it both easier to write and easier to use. Most of the innovation here involves changes to the platform upon which it is built. I will cover those changes from a developer’s perspective in future blog posts, but today I’m going to focus on what euca2ools 3 brings to the table for developers and other users alike. While there are too many small improvements to possibly cover them all, euca2ools 3 at last brings a few of the niceties power users have come to expect from their command line tools to cloud management.

A configuration file

Yes, you read that right: a configuration file. Both euca2ools and the command line tools provided by AWS themselves have astonishingly limited support for configuration, forcing people to write a separate shell script for each combination of user and cloud they might want to access and then use those scripts in place of a configuration file.

Your cries of anguish have been heard, so now we have this:

[user gholms]
key-id = AKIA93F29V0AEXAMPLE
secret-key = vcasd93cm1458un4vj84039vda78mDEXAMPLE

[user ecc-admin]
key-id = EVDB93F29V0AEXAMPLE
secret-key = 38fva93cm1458un4vj84039vda78mDEXAMPLE

[region us-east-1]
ec2-url = https://ec2.amazonaws.com/
iam-url = https://iam.amazonaws.com/
s3-url  = https://s3.amazonaws.com/
user = gholms

[region ecc]
ec2-url = https://communitycloud.eucalyptus.com:8773/services/Eucalyptus/
iam-url = https://communitycloud.eucalyptus.com:8773/services/Euare/
s3-url  = https://communitycloud.eucalyptus.com:8773/services/Walrus/
user = ecc-admin

[global]
default-region = us-east-1

A file like this, combined with the --region option that all tools share, means you can mix and match users and clouds to your heart’s content. Just throw a file like this inside of ~/.euca, end its name with .ini, and away you go! You can add as many files to ~/.euca as you want; they all get combined together.
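
As a rough illustration of how several such files can combine and how a USER@REGION string might resolve, here is a sketch built on Python's standard configparser module. It is not how euca2ools itself reads its configuration; the load_config and resolve functions are hypothetical, and the two strings stand in for two separate files in ~/.euca.

```python
import configparser

# Each string stands in for the contents of one file in ~/.euca.
FIRST = """
[user gholms]
key-id = AKIA93F29V0AEXAMPLE
secret-key = vcasd93cm1458un4vj84039vda78mDEXAMPLE

[region us-east-1]
ec2-url = https://ec2.amazonaws.com/
user = gholms
"""

SECOND = """
[user ecc-admin]
key-id = EVDB93F29V0AEXAMPLE
secret-key = 38fva93cm1458un4vj84039vda78mDEXAMPLE

[region ecc]
ec2-url = https://communitycloud.eucalyptus.com:8773/services/Eucalyptus/
user = ecc-admin

[global]
default-region = us-east-1
"""


def load_config(*texts):
    """Combine several ini documents into a single parser."""
    config = configparser.ConfigParser()
    for text in texts:
        config.read_string(text)
    return config


def resolve(config, userregion=None):
    """Resolve 'USER@REGION', 'REGION', or nothing to (key_id, ec2_url)."""
    user, _, region = (userregion or '').rpartition('@')
    region = region or config.get('global', 'default-region')
    region_section = 'region ' + region
    # Fall back to the region's own default user when none is given.
    user = user or config.get(region_section, 'user')
    return (config.get('user ' + user, 'key-id'),
            config.get(region_section, 'ec2-url'))


config = load_config(FIRST, SECOND)
print(resolve(config))
print(resolve(config, 'ecc'))
print(resolve(config, 'gholms@ecc'))
```

With real files one could pass a list of paths to the parser's read method instead of calling read_string; the merging behavior is the same.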

Friendly error feedback

Another common complaint that people had with euca2ools 2 was its behavior in the face of input that didn’t match what it expected. Some of the worst offenders had error messages ranging from confusing to irrelevant to nonexistent. Euca2ools 3 overhauls the code that does this, replacing it with standard python tools and friendlier code that makes its behavior in the face of errors much better.

Here’s how it behaves in the face of the most common case of this:

% euca-describe-availability-zones
error: missing access key ID; please supply one with -I

Also included is special treatment for “pick one from multiple alternatives” options:

% euare-useraddcert
usage: euare-useraddcert (-c CERT | -f FILE) [-u USER]
                         [--as-account ACCOUNT] [--region REGION | -U URL]
                         [-I KEY_ID] [-S KEY]
euare-useraddcert: error: one of the arguments -c/--certificate-body -f/--certificate-file is required
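
Behavior like this is what argparse's mutually exclusive groups provide. The sketch below uses only the standard library and a hypothetical prog name; the long option names are modeled on the error message above, and the rest of the real tool's options are left out.

```python
import argparse


def build_parser():
    """A 'pick exactly one' option group using plain argparse."""
    parser = argparse.ArgumentParser(prog='addcert-sketch')
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument('-c', '--certificate-body', metavar='CERT')
    source.add_argument('-f', '--certificate-file', metavar='FILE')
    parser.add_argument('-u', dest='user', metavar='USER')
    return parser


args = build_parser().parse_args(['-f', 'cert.pem'])
print(args.certificate_file)
```

Calling parse_args with neither -c nor -f makes argparse print the usage line along with a "one of the arguments ... is required" error and then exit.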

A lot of attention to detail went into dealing with some of the most common mistakes people make:

% euca-register -n myimage -b /dev/sda1=snap-12345678:false
euca-register: error: argument -b/--block-device-mapping: second element of EBS block device mapping "/dev/sda1=snap-12345678:false" must be an integer
% euca-authorize mygroup -p 8773:8777
euca-authorize: error: argument -p/--port-range: multi-port range must be separated by "-", not ":"
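
One way to get messages like these out of argparse is a custom type function that raises argparse.ArgumentTypeError. The following is a minimal sketch under that assumption, not the code euca2ools actually uses; the port_range function and the prog name are hypothetical.

```python
import argparse


def port_range(text):
    """argparse type function: turn '22' or '8773-8777' into (from, to)."""
    if ':' in text:
        raise argparse.ArgumentTypeError(
            'multi-port range must be separated by "-", not ":"')
    first, sep, last = text.partition('-')
    try:
        from_port = int(first)
        to_port = int(last) if sep else from_port
    except ValueError:
        raise argparse.ArgumentTypeError(
            'port range "{0}" must contain only integers'.format(text))
    return (from_port, to_port)


parser = argparse.ArgumentParser(prog='authorize-sketch')
parser.add_argument('group')
parser.add_argument('-p', '--port-range', type=port_range)

args = parser.parse_args(['mygroup', '-p', '8773-8777'])
print(args.port_range)
```

When the type function raises ArgumentTypeError, argparse prefixes the message with the program name and the offending option, then exits, which yields the same shape of error shown above.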

Tagging and filtering support

Euca2ools 3 at last offers full support for EC2’s massive sets of resource tags and filters:

% euca-describe-instances -h
usage: euca-describe-instances [-h] [--show-empty-fields]
...
  --filter NAME=VALUE   restrict results to those that meet criteria
...

allowed filter names:
  architecture          CPU architecture
  availability-zone
  block-device-mapping.attach-time
                        volume attachment time
  block-device-mapping.delete-on-termination
                        whether a volume is deleted upon instance
                        termination
  block-device-mapping.device-name
                        volume device name (e.g.  /dev/sdf)
  block-device-mapping.status
                        volume status
  block-device-mapping.volume-id
                        volume ID
  client-token          idempotency token provided at instance run
                        time
  dns-name              public DNS name
  group-id              security group membership
  hypervisor            hypervisor type
  image-id              machine image ID
  instance-id
  instance-lifecycle    whether this is a spot instance
  instance-state-code   numeric code identifying instance state
  instance-state-name   instance state
  instance-type
  ip-address            public IP address
  kernel-id             kernel image ID
  key-name              key pair name provided at instance launch time
  launch-index          launch index within a reservation
  launch-time           instance launch time
  monitoring-state      whether monitoring is enabled
  owner-id              instance owner's account ID
  placement-group-name
  platform              whether this is a Windows instance
  private-dns-name
  private-ip-address
  product-code
  ramdisk-id            ramdisk image ID
  reason                reason for the more recent state change
  requestor-id          ID of the entity that launched an instance
  reservation-id
  root-device-name      root device name (e.g.  /dev/sda1)
  root-device-type      root device type (ebs or instance-store)
  spot-instance-request-id
  state-reason-code     reason code for the most recent state change
  state-reason-message  
                        message for the most recent state change
  subnet-id             ID of the VPC subnet the instance is in
  tag-key               name of any tag assigned to the instance
  tag-value             value of any tag assigned to the instance
  tag:KEY               specific tag key/value combination
  virtualization-type
  vpc-id                ID of the VPC the instance is in

The new foundation this code is based upon makes it incredibly simple to extend support for these features as things change in the future.
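
For a sense of what happens to those --filter options on the way to the server, here is a small sketch that turns NAME=VALUE strings into the Filter.N.Name and Filter.N.Value.M parameters that EC2's query API uses. The filters_to_params function is hypothetical, and grouping repeated names into one filter is a choice made for this sketch rather than a description of the euca2ools code.

```python
def filters_to_params(filters):
    """Convert ['NAME=VALUE', ...] into EC2 query API filter parameters.

    Repeated names are grouped so that each name becomes a single
    Filter.N entry with several Value.M members.
    """
    grouped = {}
    order = []
    for item in filters:
        name, sep, value = item.partition('=')
        if not sep or not name:
            raise ValueError(
                'filter "{0}" must have format NAME=VALUE'.format(item))
        if name not in grouped:
            grouped[name] = []
            order.append(name)
        grouped[name].append(value)
    params = {}
    for n, name in enumerate(order, 1):
        params['Filter.{0}.Name'.format(n)] = name
        for m, value in enumerate(grouped[name], 1):
            params['Filter.{0}.Value.{1}'.format(n, m)] = value
    return params


params = filters_to_params(['instance-state-name=running',
                            'tag:Name=web',
                            'instance-state-name=pending'])
for key in sorted(params):
    print('{0}={1}'.format(key, params[key]))
```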

What else?

Some other minor, but nonetheless noteworthy, changes include:

  • euca-* tools gained a --show-empty-fields option that tweaks their output to make it friendlier for running through the column command.
  • All tools that access web services use the same options (-I and -S) for access keys.
  • euare-* tools’ --delegate option for cloud administrators is now --as-account.
  • Multiple --filter options are handled correctly.
  • Machine image device mappings are now handled correctly.

A few tools have yet to be ported to the new framework, but will be in the near future. eustore-installimage is known to be broken. The bundle management tools should work correctly, though their testing to date has been minimal. Finally, do not install them on a system that runs a Eucalyptus node controller.

Isn’t aws-cli the future? Why continue developing euca2ools?

Aws-cli is a great project. Both it and euca2ools tie what the server sees very closely to what the user sees under the hood, but the euca2ools suite does so in a way that makes it trivial to customize tools to do more complicated things behind the scenes or to make them easier to use. For instance, consider changing a security group’s permissions in EC2 with aws-cli:

% aws ec2 authorize-security-group-ingress --group-name MySecurityGroup --ip-permissions '{"from_port":22,"to_port":22,"ip_protocol":"tcp","ip_ranges":["0.0.0.0/0"]}'

The exact format we need to use to supply the info the tool needs requires relatively detailed knowledge of what EC2-the-server expects. Compared to that, the euca2ools version of that is easier to remember and much easier to type:

% euca-authorize MySecurityGroup --port 22 --source-subnet 0.0.0.0/0
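
The gap between the two commands is essentially a small translation step. As a sketch of that idea only, the function below builds the JSON structure from the aws-cli example out of the two simple values the euca2ools command takes. The ingress_rule name is hypothetical, the key names are copied from the aws-cli command quoted above, and defaulting the protocol to tcp is an assumption.

```python
import json


def ingress_rule(port, source_subnet, protocol='tcp'):
    """Build the --ip-permissions JSON from the example out of simple options.

    port may be a single port such as 22 or a range such as '8773-8777'.
    """
    from_port, sep, to_port = str(port).partition('-')
    rule = {'ip_protocol': protocol,
            'from_port': int(from_port),
            'to_port': int(to_port) if sep else int(from_port),
            'ip_ranges': [source_subnet]}
    return json.dumps(rule, sort_keys=True)


print(ingress_rule(22, '0.0.0.0/0'))
```

A dedicated command can hide this kind of structure from the user entirely, which is the customization the euca2ools approach is meant to make easy.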

Aws-cli is a very young project, so people haven’t yet had the chance to iron it out completely. Perhaps some day it will become as user-friendly as euca2ools and finally eclipse it. But we aren’t there yet.

How can I try it out?

If you’re interested in a preview of the next major version of euca2ools, an alpha release is available on GitHub. In addition to the dependencies required to run euca2ools 2, you will also need to install requests and the new requestbuilder framework that drives the new tools. It is still alpha-quality software, so be prepared to find bugs. If you encounter any, feel free to file them in the euca2ools project’s bug tracker.

If you’re interested in helping with development, we are happy to accept pull requests on GitHub. Please also consider joining the euca-users mailing list or stopping by the #eucalyptus-devel IRC channel on Freenode. I look forward to hearing your feedback. 8^)

Importing Key Pairs in Eucalyptus

15 Oct 2012

One of Eucalyptus’s oldest feature requests that people constantly ask about is the ability to import a pre-existing SSH key for use with instances. It even predates EC2’s support for doing that. I am happy to report that Eucalyptus 3.2 will at long last support it as well! (See the change on GitHub.) If you’re following Eucalyptus development, you can try this out right away with euca-import-keypair. Chances are, your version of euca2ools already contains it.

The thing that makes this feature really nice, however, looks like this:

Web UI for ImportKeypair

Just a few hours ago, the new web console that is slated to land alongside Eucalyptus 3.2 later this year gained support for importing key pairs as well. (See the change on GitHub.) I’m sure I am not alone in wanting to thank the contributors who added this.

Less Obvious Changes in Eucalyptus 3.1

27 Jun 2012

Now that Eucalyptus 3.1 is out at last and we all get to wade through tons of announcements and blog posts, I thought I would mention a few of the changes that have happened since Eucalyptus 2 that you aren’t likely to see in marketing materials.

Why Eucalyptus 2? Most of us don’t get to use Eucalyptus 3.0, so comparing against that wouldn’t exactly be fair, would it? ;-)

Centralized documentation

The documentation for Eucalyptus 2 was strewn about the Eucalyptus website on a number of wiki pages. You had to read all of them to have any hope of ending up with a working cloud.

Eucalyptus 3’s documentation comes in the form of front-to-back PDFs. HTML documentation is forthcoming. You still need to read it all, but it is now in one place so you don’t have to go digging to find it.

The documentation’s source (in DITA format, if you find that sort of thing interesting) is also up on GitHub, so there is now a way to fix errors: just send a pull request.

A new database

Eucalyptus 3.1 switches from HSQLDB to PostgreSQL. Given the number of Eucalyptus users I have seen over time who have experienced problems with HSQLDB’s behavior in the face of faults, I suspect this will make a lot of people happy.

Correct packaging

The RPM and DEB packages for Eucalyptus 2 fail to list a number of important things they depend upon, making the software needlessly complicated to install. In fact, this was so complicated that the popular FastStart distribution became the method of choice for getting started with a new Eucalyptus cloud.

This is no longer the case. Installation now consists of adding package repositories and telling one’s package manager to install a component. No more “install these dependencies first.” No more “download these packages separately and install them.” In fact, short of a script that writes Eucalyptus’s configuration for you, this completely obviates the need for FastStart.

RHEL 6 support

Eucalyptus 2 is supported only on CentOS 5. It also works on RHEL 5, but users of RHEL 6 and friends couldn’t even compile the stuff. That is now fixed; those operating systems now have full support.

A more usable configuration file

Eucalyptus 2’s configuration file jams everything into one huge list. Nothing gives any indication which Eucalyptus components actually care about each option. There is also no indication of how options are affected by one’s choice of networking mode.

Eucalyptus 3’s configuration file and documentation break options down by component. For networking-related options, they also list the networking modes in which they apply.

Bugfixes

Eucalyptus 3 fixes bugs. Lots and lots of bugs. So many bugs that release notes cannot possibly list them all. In the future this will be easier, as every future bug report will now go through a new JIRA tracker.

What else?

Those are some of my favorites, but there are lots of other little improvements all over the place. Try Eucalyptus 3 out and see what you think. You may be pleasantly surprised.

Moving the Cloud Forward

23 May 2012

It is sort of becoming a tradition for each member of the Fedora Board to declare a personal goal of some sort and then lead by doing. So now that I am the newest Board member, some people are curious about my plans.

In 2010 I helped get Fedora’s Cloud SIG off the ground. At that point in time our main goal was to get a modern version of Fedora running inside Amazon’s popular cloud, EC2. Nowadays the EC2 image is part of Fedora’s regular release process and the Cloud SIG has grown into one of Fedora’s most vibrant groups, added support for one self-hosted cloud platform, and is on the way to adding several more.

In light of that, the answer is obvious: I plan to help the Cloud SIG continue to be successful.

Of course, that’s a rather vague goal, so here are some examples of what success for the Cloud SIG has meant in the past:

  • Building and testing EC2 images
  • Adding the EC2-controlling euca2ools command line suite to Fedora
  • Porting the cloud-init boot-time scripts to systemd

Now that those are done, here are some things success for the Cloud SIG may mean today:

  • Add more cloud software, such as the relatively venerable Eucalyptus, to Fedora
  • Continue to stabilize cloud-init on Fedora
  • Help make PaaS software like OpenShift and Cloud Foundry work with IaaS software like Eucalyptus and OpenStack

Lofty? Possibly. But they are certainly all worth the effort!

We need your help!

Want to give one of these a shot? Are you interested in attending a hackfest or an activity day for moving the cloud forward? Leave a comment or stop by #fedora-cloud on Freenode!

Euca2ools: Past, Present, and Future

14 Apr 2012

For those who don’t know, I work on euca2ools, the suite of command line tools on Launchpad for interacting with Eucalyptus and Amazon Web Services clouds. As of late the project has stagnated somewhat, due in part to the sheer number of different tools it includes. Nearly every command one can send to a server that uses Amazon’s APIs should have at least one corresponding command line tool, making development of euca2ools’ code repetitive and error-prone.

Today this is going to end.

But before we get to that part, let’s chronicle how euca2ools got to where they are today.

The Past

Early euca2ools versions employed the popular boto Python library to do their heavy lifting. Each tool of this sort triggers a long chain of events:

  • The tool translates data from the command line into its internal data structures.
  • The tool translates its internal data into the form that boto expects and then hands it off to boto.
  • Boto translates the data into the form that the server expects and then sends it to the server.
  • When the server responds, boto translates its response into a packaged form that is useful for programming and returns it to the tool.
  • The tool immediately tears that version back apart and translates it into a text-based form that can go back to the command line.

Things shouldn’t be this convoluted. Not in Python.

The Present

Tackling this problem involved coming up with ways to simplify not only the code, but also the process through which it is written. This led to two major changes, upon which all of the current euca2ools code is built.

“eucacommand”

The first change consolidated all of the code involved in the first step of this process, reading data from the command line, into one location. Each tool then simply needed to describe what it expected to receive from the command line, and the shared code would take care of the rest. For example, let’s look at part of an older command, euca-create-volume:

class CreateVolume(EucaCommand):
    Description = 'Creates a volume in a specified availability zone.'
    Options = [Param(name='size', short_name='s', long_name='size',
                     optional=True, ptype='integer',
                     doc='size of the volume (in GiB).'),
               Param(name='snapshot', long_name='snapshot',
                     optional=True, ptype='string',
                     doc="""snapshot id to create the volume from.
                     Either size or snapshot can be specified (not both)."""),
               Param(name='zone', short_name='z', long_name='zone',
                     optional=False, ptype='string',
                     doc='availability zone to create the volume in')]

Because there are three Params, the shared code reads three pieces of information from the command line and hands them to the command’s code, which then hands them to boto, and so on.
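
To make the idea of "describe what you expect and let shared code do the reading" concrete, here is a minimal sketch. It is not the real EucaCommand code: the Param class, its convert method, and the read_options function are all hypothetical stand-ins, and I am assuming the standard library's getopt module does the actual parsing.

```python
import getopt


class Param:
    """Hypothetical stand-in for the Param class shown in the post."""

    def __init__(self, name, long_name, short_name=None, optional=True,
                 ptype='string'):
        self.name = name
        self.long_name = long_name
        self.short_name = short_name
        self.optional = optional
        self.ptype = ptype

    def convert(self, text):
        # Only the two ptypes used in the example above are handled here.
        return int(text) if self.ptype == 'integer' else text


def read_options(params, argv):
    """Shared code (hypothetical): read one value per Param from argv."""
    short_spec = ''.join(p.short_name + ':' for p in params if p.short_name)
    long_spec = [p.long_name + '=' for p in params]
    pairs, _ = getopt.getopt(argv, short_spec, long_spec)

    # Map every flag spelling back to the Param that declared it.
    by_flag = {}
    for p in params:
        by_flag['--' + p.long_name] = p
        if p.short_name:
            by_flag['-' + p.short_name] = p

    values = {}
    for flag, text in pairs:
        param = by_flag[flag]
        values[param.name] = param.convert(text)

    missing = [p.name for p in params
               if not p.optional and p.name not in values]
    if missing:
        raise ValueError('missing required option(s): ' + ', '.join(missing))
    return values


# The same three options that euca-create-volume declares.
OPTIONS = [Param('size', 'size', short_name='s', ptype='integer'),
           Param('snapshot', 'snapshot'),
           Param('zone', 'zone', short_name='z', optional=False)]

volume_options = read_options(OPTIONS, ['-s', '10', '--zone', 'cluster01'])
```

Here volume_options ends up as {'size': 10, 'zone': 'cluster01'}, and leaving out --zone raises an error before the tool's own code ever runs. The tool still has to translate that dictionary into whatever boto expects, which is the part this approach does not remove.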

This methodology forms the basis for all of the current euca2ools that begin with “euca”.

Roboto

For a euca2ools command line tool to be useful it has to gather data from the command line, send these data to the server, and return data from the server to the user. Roboto, a little-known boto sub-project written by boto developer (and former euca2ools developer) Mitch Garnaat, takes this statement literally and lets tools work at a lower level: instead of translating data from the command line into an intermediate format to hand to boto, tools send these data directly to the server in the form that the server expects. The effect is essentially to remove boto from the euca2ools code base altogether. With boto out of the path that data have to take to get from the command line to the server and back, tools become simpler to write and debug because there is less code to walk through and understand.

Roboto is the basis for all of the current euca2ools that begin with “euare”.

The Future

That is the state of the code today. Where do we go from here? While roboto allows one to create command line tools with a minimal amount of effort, it has several rough edges which prevented it from taking off and which make it sub-optimal for building out the hundreds of commands that the euca2ools suite will soon need to cover:

  • User-unfriendly: When a user types something wrong or forgets to include something, roboto’s messages are often too terse to be helpful.
  • A steeper learning curve than necessary: Roboto contains a large amount of custom code dedicated to fetching information from the command line, all of which people who want to contribute code or fix bugs have to learn first.
  • Too much hardcoding: Roboto assumes that all tools do certain things the same way, such as ascertaining what keys they should use to access the cloud.
  • Still more work than it has to be: Though it makes writing tools simpler, roboto still hands each tool a bucket of information and expects the tool to pick out the bits the server needs and send them onward.

Enter requestbuilder

Requestbuilder is a new Python library that attempts to rethink the way roboto works in a way that is more familiar to the typical Python developer and requires less custom code to run. The easiest way to illustrate this is with an example.

A command line tool embodies a specific request to the server, so each such tool defines a Request that describes how it works:

class ListUsers(EuareRequest):
    Description = 'List the users who start with a specific path'
    Args = [Arg('-p', '--prefix', dest='PathPrefix', metavar='PREFIX',
                help='list only users whose paths begin with a prefix'),
            Arg('--max-items', type=int, dest='MaxItems',
                help='limit the number of results')]

    def main(self):
        return self.send()

    def print_result(self, result):
        for user in result['Users']:
            print(user['Arn'])

Those familiar with Python’s argparse library will recognize the code inside Arg(...), because requestbuilder does away with roboto’s custom code for reading things off the command line and instead lets argparse do the work. This cuts down on the amount of code we need to maintain, makes tool writing easier for developers who are already familiar with the Python standard library, and makes command line-related error messages much more user-friendly.
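
For comparison, here is roughly what the same two options look like when written against argparse directly. This is only a sketch of the standard library side of things, not of what requestbuilder does internally, and the program name list-users is made up for the example.

```python
import argparse

# 'list-users' is a placeholder program name, not a real euca2ools command.
parser = argparse.ArgumentParser(
    prog='list-users',
    description='List the users who start with a specific path')

# These keyword arguments are the same ones passed to Arg(...) above.
parser.add_argument('-p', '--prefix', dest='PathPrefix', metavar='PREFIX',
                    help='list only users whose paths begin with a prefix')
parser.add_argument('--max-items', type=int, dest='MaxItems',
                    help='limit the number of results')

# vars() turns the parsed namespace into a plain dictionary.
args = vars(parser.parse_args(['--prefix', '/dev/', '--max-items', '5']))
```

After this runs, args is {'PathPrefix': '/dev/', 'MaxItems': 5}. If the user passes something like --max-items many, argparse prints a usage summary along with an error that names the offending option and then exits, which is where the friendlier error messages come from.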

When the tool starts running, requestbuilder uses data from the command line to fill in a dictionary called args and then runs the tool’s main method. That method’s job is to process this information, fill in the portions of the request that will be sent to the server (params, headers, and post_data), and then run the send method to send it all to the server and retrieve a response. Attaching each of these sets of data to the request instead of passing them around between methods allows one to send a request, tweak it, and send the tweaked version as well.

Why doesn’t the code above fill any of these things in? Since most of the data that come off the command line go directly to the server, when a tool runs send, requestbuilder automatically fills in params from the contents of args so the tool doesn’t have to: whatever the user supplied with --prefix at the command line gets sent to the server with the name PathPrefix, and so forth.
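
A rough sketch of that automatic fill-in might look like the following. The build_query function is hypothetical and only illustrates the idea; I am assuming the usual Amazon-style query format with Action and Version parameters, and I have left out request signing entirely.

```python
from urllib.parse import urlencode


def build_query(action, api_version, args):
    """Hypothetical helper: copy command line values into request params."""
    params = {'Action': action, 'Version': api_version}
    # Options the user left out arrive as None and are simply not sent.
    params.update((name, value) for name, value in args.items()
                  if value is not None)
    # Sorting only keeps the output predictable for this example.
    return urlencode(sorted(params.items()))


# The user typed --prefix /dev/ but did not supply --max-items.
query = build_query('ListUsers', '2010-05-08',
                    {'PathPrefix': '/dev/', 'MaxItems': None})
```

The resulting query is Action=ListUsers&PathPrefix=%2Fdev%2F&Version=2010-05-08. The name PathPrefix comes straight from the option's dest, which is why the tool's main method has nothing left to do but call send.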

But what if something should not be sent to the server? While data from the command line go into params to be sent to the server by default, one can tell requestbuilder to send a particular bit of data elsewhere instead:

Arg('--debug', action='store_true', route_to=None)

Routing to None instructs requestbuilder to leave the “debug” flag alone and not attempt to send it anywhere. Data can also go elsewhere, such as to the connection that gets set up as the tool contacts the server:

Arg('-I', '--access-key-id', dest='aws_access_key_id', route_to=CONNECTION)
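
One way to picture routing is as sorting parsed values into buckets. The sketch below is an assumption about the general shape of the idea rather than requestbuilder's actual code: this Arg class, the PARAMS and CONNECTION markers, and the route function are simplified stand-ins invented for the example.

```python
PARAMS = 'params'          # hypothetical marker: send to the server
CONNECTION = 'connection'  # hypothetical marker: configure the connection


class Arg:
    """Simplified stand-in that records only a destination and a route."""

    def __init__(self, dest, route_to=PARAMS):
        self.dest = dest
        self.route_to = route_to


def route(arg_defs, args):
    """Sort parsed command line values into one bucket per destination."""
    routed = {PARAMS: {}, CONNECTION: {}}
    for arg in arg_defs:
        value = args.get(arg.dest)
        if arg.route_to is None or value is None:
            continue  # the value stays in args and is sent nowhere
        routed[arg.route_to][arg.dest] = value
    return routed


ARGS = [Arg('PathPrefix'),                  # default route: params
        Arg('debug', route_to=None),
        Arg('aws_access_key_id', route_to=CONNECTION)]

routed = route(ARGS, {'PathPrefix': '/dev/', 'debug': True,
                      'aws_access_key_id': 'AKIDEXAMPLE'})
```

With these inputs, PathPrefix lands in the params bucket, the access key lands in the connection bucket, and debug appears in neither, though the tool can still read it from args.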

Astute readers will note that I haven’t described what EuareRequest in the earlier example does, so here is the code for that:

class EuareRequest(BaseRequest):
    ServiceClass = Euare
    Args = [Arg('--delegate', dest='DelegateAccount', metavar='ACCOUNT',
                help='''[Eucalyptus extension] run this command as another
                        account (only usable by cloud administrators)''')]

Requestbuilder makes tool writers’ jobs easier by allowing one type of request to inherit command line options from another type of request and then add its own by simply listing more of them. This is a little different from the way Python usually works, where a subclass that defines Args would replace its parent’s list rather than extend it; requestbuilder does some magic behind the scenes to make this possible. As a result, everything common to commands that access the EUARE service (Eucalyptus’s equivalent of Amazon’s IAM service) can go into one place to be shared with others.
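
I won't reproduce the actual mechanism here, but one plausible way to get this behavior is to walk the class hierarchy and concatenate every Args list along the way. The collect_args method below is a hypothetical name for that idea, and the option strings stand in for full Arg objects.

```python
class BaseRequest:
    Args = []

    @classmethod
    def collect_args(cls):
        """Gather Args from this class and every class it inherits from."""
        collected = []
        # Walk from the most general class to the most specific one.
        for klass in reversed(cls.__mro__):
            # vars() sees only what the class itself defines, so an
            # inherited list is never counted a second time.
            collected.extend(vars(klass).get('Args', []))
        return collected


class EuareRequest(BaseRequest):
    Args = ['--delegate']


class ListUsers(EuareRequest):
    Args = ['--prefix', '--max-items']


plain_lookup = ListUsers.Args            # ordinary attribute lookup
combined = ListUsers.collect_args()      # options from the whole hierarchy
```

Ordinary attribute lookup gives only ['--prefix', '--max-items'], because the subclass's list shadows its parent's. Walking the hierarchy gives ['--delegate', '--prefix', '--max-items'], so ListUsers picks up --delegate without mentioning it.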

The final piece of information requestbuilder needs is a ServiceClass, which describes the web service that the tool connects to. A service class is another simple bit of code that looks like this:

class Euare(BaseService):
    Description = 'Eucalyptus User, Authorization and Reporting Environment'
    APIVersion = '2010-05-08'
    EnvURL = 'EUARE_URL'

The net gain from all this is a smaller but much more flexible code base that should scale better than anything we have had before. Requestbuilder’s use of Python’s argparse library also makes tools much more informative to users.

How You Can Help

We’re developing requestbuilder on GitHub as a project under the boto organization. We’re going to start rewriting the euca2ools commands one by one, improving requestbuilder to support new things as we go. It’s still early on, so if you have ideas to share or you’re interested in helping develop this code, now is your chance!

We’re also moving development of euca2ools itself to GitHub. This will make it easier to work on euca2ools and requestbuilder in parallel. It will also make it easier to share code with the rest of the boto community.

If you’re interested in getting involved, join us on the #boto or #eucalyptus IRC channels on Freenode. You can also send e-mail to Eucalyptus’s community list.