Git Pull via Sneakernet

8 Jun 2012

Today I found myself needing to move some commits between two repositories. In general the best way to do this is by pulling changes from one into the other, but in this case the repositories did not have direct access to each other. Rather than copying an entire repository from one machine to another or mucking about with a pile of patches, we can save time by performing the sending and receiving sides of the network-enabled git fetch command by hand.

In the source repository, add the changes we want to move to a bundle that we can copy to a USB stick:

% git bundle create changes.bundle master..mybranch
Counting objects: 5, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 313 bytes, done.
Total 3 (delta 2), reused 0 (delta 0)
% cp changes.bundle /media/usbstick

In the destination repository, ensure that we have the commits necessary to use the bundle and then tell git fetch to grab the changes from it:

% git bundle verify /media/usbstick/changes.bundle
The bundle contains 1 ref
7a1d2087f10e6db33e6b4a28e2c427b65238a62c refs/heads/mybranch
The bundle requires these 1 ref
6f5fced94ef76f1b46e259db72ad6fc39c49ba72 
/media/usbstick/changes.bundle is okay
% git fetch /media/usbstick/changes.bundle mybranch
Receiving objects: 100% (3/3), done.
Resolving deltas: 100% (2/2), completed with 2 local objects.
From /media/usbstick/changes.bundle
 * branch            mybranch   -> FETCH_HEAD
% git merge FETCH_HEAD
Updating 6f5fced..7a1d208
Fast-forward
 README |    2 ++
 1 file changed, 2 insertions(+)
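
If you do this often, the two halves are easy to script. What follows is only a rough sketch in Python: the function names sneakernet_commands and run_all are made up for illustration, and it writes the bundle straight to its final path instead of copying it afterward. The git commands themselves are the ones shown above.

```python
import subprocess


def sneakernet_commands(base, branch, bundle_path):
    """Return the git commands for each side of the transfer as argv lists."""
    source_side = [
        # Pack only the commits reachable from branch but not from base.
        ["git", "bundle", "create", bundle_path, f"{base}..{branch}"],
    ]
    destination_side = [
        # Fail early if the destination lacks the commits the bundle builds on.
        ["git", "bundle", "verify", bundle_path],
        # Fetch from the bundle file as if it were a remote repository.
        ["git", "fetch", bundle_path, branch],
        ["git", "merge", "FETCH_HEAD"],
    ]
    return source_side, destination_side


def run_all(commands, repo_dir):
    """Run each command inside repo_dir, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=repo_dir, check=True)
```

Calling run_all(source_side, ...) in the source repository and run_all(destination_side, ...) in the destination repository would then reproduce the session above, assuming the USB stick is mounted at the same path on both machines.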

Useful Yum Commands: Installing by Path

25 May 2012

Every once in a while I find myself trying to install something, but not knowing what package contains it.

# yum install g++
Setting up Install Process
No package g++ available.
Error: Nothing to do

But if you’re using a modern version of yum (such as the one in RHEL 6 or Fedora), you can simply tell it to install the program you’re looking for by its path.

# yum install /usr/bin/g++
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package gcc-c++.x86_64 0:4.4.6-3.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

Moving the Cloud Forward

23 May 2012

It is sort of becoming a tradition for each member of the Fedora Board to declare a personal goal of some sort and then lead by doing. So now that I am the newest Board member, some people are curious about my plans.

In 2010 I helped get Fedora’s Cloud SIG off the ground. At that point in time our main goal was to get a modern version of Fedora running inside Amazon’s popular cloud, EC2. Nowadays the EC2 image is part of Fedora’s regular release process and the Cloud SIG has grown into one of Fedora’s most vibrant groups, added support for one self-hosted cloud platform, and is on the way to adding several more.

In light of that, the answer is obvious: I plan to help the Cloud SIG continue to be successful.

Of course, that’s a rather vague goal, so here are some examples of what success for the Cloud SIG has meant in the past:

  • Building and testing EC2 images
  • Adding the EC2-controlling euca2ools command line suite to Fedora
  • Porting the cloud-init boot-time scripts to systemd

Now that those are done, here are some things success for the Cloud SIG may mean today:

  • Add more cloud software, such as the relatively venerable Eucalyptus, to Fedora
  • Continue to stabilize cloud-init on Fedora
  • Help make PaaS software like OpenShift and Cloud Foundry work with IaaS software like Eucalyptus and OpenStack

Lofty? Possibly. But they are certainly all worth the effort!

We need your help!

Want to give one of these a shot? Are you interested in attending a hackfest or an activity day for moving the cloud forward? Leave a comment or stop by #fedora-cloud on Freenode!


Fixing a Git Branch that Started in the Wrong Place

21 May 2012

Today I did some work on a branch in git only to discover that I based it on some unstable code rather than the stable code that I usually want to use as a baseline.

You can fix this by using a bit of git’s rebase magic:

git rebase --onto master testing

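This takes the commits on the current branch that are not in testing and replays them on top of master, moving the entire branch from one base to the other.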
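
To see which commits get picked up, here is a toy model of the selection in Python. It is a simplification for illustration only: the commit names are invented, histories are strictly linear, and real git does far more (conflict handling, for one).

```python
def rebase_onto(parents, newbase, upstream, branch):
    """Return the commits that get replayed and the resulting history.

    parents maps each commit name to its single parent (None for the root),
    so this toy model handles linear histories only, with no merges.
    """
    def ancestry(commit):
        chain = []
        while commit is not None:
            chain.append(commit)
            commit = parents[commit]
        return chain  # newest first

    in_upstream = set(ancestry(upstream))
    # Commits reachable from the branch but not from upstream, oldest first.
    to_replay = [c for c in ancestry(branch) if c not in in_upstream][::-1]
    # The rewritten branch is newbase's history plus fresh copies of those commits.
    new_history = ancestry(newbase)[::-1] + [c + "'" for c in to_replay]
    return to_replay, new_history


# Hypothetical history: testing forked from master at A, and the current
# branch (tip Y) was mistakenly started from testing (tip T2).
parents = {
    "A": None,
    "B": "A",    # master
    "T1": "A",
    "T2": "T1",  # testing
    "X": "T2",
    "Y": "X",    # current branch
}
replayed, history = rebase_onto(parents, newbase="B", upstream="T2", branch="Y")
print(replayed)  # ['X', 'Y']
print(history)   # ['A', 'B', "X'", "Y'"]
```

Only X and Y, the commits unique to the branch, move. The testing commits T1 and T2 stay behind.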


Euca2ools: Past, Present, and Future

14 Apr 2012

For those who don’t know, I work on euca2ools, a suite of command line tools for interacting with Eucalyptus and Amazon Web Services clouds that is developed on Launchpad. As of late the project has stagnated somewhat, due in part to the sheer number of different tools it includes. Nearly every command one can send to a server that uses Amazon’s APIs should have at least one corresponding command line tool, making development of euca2ools’ code repetitive and error-prone.

Today this is going to end.

But before we get to that part, let’s chronicle how euca2ools got to where they are today.

The Past

Early euca2ools versions employed the popular boto Python library to do their heavy lifting. Each tool of this sort triggers a long chain of events:

  • The tool translates data from the command line into its internal data structures.
  • The tool translates its internal data into the form that boto expects and then hands it off to boto.
  • Boto translates the data into the form that the server expects and then sends it to the server.
  • When the server responds, boto translates its response into a packaged form that is useful for programming and returns it to the tool.
  • The tool immediately tears that version back apart and translates it into a text-based form that can go back to the command line.

Things shouldn’t be this convoluted. Not in Python.

The Present

Tackling this problem involved coming up with ways to simplify not only the code, but also the process through which it is written. This led to two major changes, upon which all of the current euca2ools code is built.

“eucacommand”

The first change was consolidating all of the code involved in the first step of this process (reading data from the command line) into one location. Each tool then simply needed to describe what it expected to receive from the command line, and the shared code would take care of the rest. For example, let’s look at part of an older command, euca-create-volume:

class CreateVolume(EucaCommand):
    Description = 'Creates a volume in a specified availability zone.'
    Options = [Param(name='size', short_name='s', long_name='size',
                     optional=True, ptype='integer',
                     doc='size of the volume (in GiB).'),
               Param(name='snapshot', long_name='snapshot',
                     optional=True, ptype='string',
                     doc="""snapshot id to create the volume from.
                     Either size or snapshot can be specified (not both)."""),
               Param(name='zone', short_name='z', long_name='zone',
                     optional=False, ptype='string',
                     doc='availability zone to create the volume in')]

Because there are three Params, the shared code library reads three bits of info from the command line and hands them to the command’s code, which then hands them to boto, and so on.

This methodology forms the basis for all of the current euca2ools that begin with “euca”.

Roboto

For a euca2ools command line tool to be useful it has to gather data from the command line, send these data to the server, and return data from the server to the user. A little-known boto sub-project written by boto developer (and former euca2ools developer) Mitch Garnaat, roboto, takes this statement literally and opts to let tools work at a lower level: instead of translating data from the command line into an intermediate format to send to boto, tools send these data directly to the server in the form that the server expects. This essentially removes boto from the euca2ools code base altogether. With boto out of the path that data have to take to get from the command line to the server and back, writing and debugging tools becomes simpler because there is less code to walk through and understand.

Roboto is the basis for all of the current euca2ools that begin with “euare”.

The Future

That is the state of the code today. Where do we go from here? While roboto allows one to create command line tools with a minimal amount of effort, it has several rough edges which prevented it from taking off and which make it sub-optimal for building out the hundreds of commands that the euca2ools suite will soon need to cover:

  • User-unfriendly — When a user types something wrong or forgets to include something, roboto’s messages are often uselessly terse and unhelpful.
  • A steeper learning curve than necessary — Roboto contains a large amount of custom code dedicated to fetching information from the command line. This steepens the learning curve for people who want to contribute code or fix bugs.
  • Too much hardcoding — Roboto assumes that all tools do certain things, such as ascertaining what keys they should use to access the cloud, the same way.
  • Still more work than it has to be — Though it makes writing tools simpler, roboto still hands each tool a bucket of information and expects the tool to pick out the bits the server needs and send them onward.

Enter requestbuilder

Requestbuilder is a new Python library that attempts to rethink the way roboto works in a way that is more familiar to the typical Python developer and requires less custom code to run. The easiest way to illustrate this is with an example.

A command line tool embodies a specific request to the server, so each such tool defines a Request that describes how it works:

class ListUsers(EuareRequest):
    Description = 'List the users who start with a specific path'
    Args = [Arg('-p', '--prefix', dest='PathPrefix', metavar='PREFIX',
                help='list only users whose paths begin with a prefix'),
            Arg('--max-items', type=int, dest='MaxItems',
                help='limit the number of results')]

    def main(self):
        return self.send()

    def print_result(self, result):
        for user in result['Users']:
            print(user['Arn'])

Those familiar with Python’s argparse library will recognize the code inside Arg(...), because requestbuilder does away with roboto’s custom code for reading things off the command line and instead lets argparse do the work. This cuts down on the amount of code we need to maintain, makes tool writing easier for developers who are already familiar with the Python standard library, and makes command line-related error messages much more user-friendly.
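
For comparison, here is roughly what the same two options look like when written against argparse directly. This is a standalone sketch that does not use requestbuilder at all, and the sample command line values are made up.

```python
import argparse

# Hypothetical stand-in for the ListUsers tool's parser; requestbuilder
# itself is not involved here.
parser = argparse.ArgumentParser(
    description='List the users who start with a specific path')
parser.add_argument('-p', '--prefix', dest='PathPrefix', metavar='PREFIX',
                    help='list only users whose paths begin with a prefix')
parser.add_argument('--max-items', type=int, dest='MaxItems',
                    help='limit the number of results')

# Parse a sample command line and view the results as a dictionary.
args = vars(parser.parse_args(['--prefix', '/admins/', '--max-items', '10']))
print(args['PathPrefix'], args['MaxItems'])
```

The keyword arguments line up one-to-one with those passed to Arg(...) above. Passing something like --max-items ten makes argparse print a usage message and exit, with no extra code needed in the tool.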

When the tool starts running, requestbuilder uses data from the command line to fill in a dictionary called args and runs the tool’s main method. That method’s job is to process this information, fill in the portions of the request that will be sent to the server (params, headers, and post_data), and then run the send method to send it all to the server and retrieve a response. Attaching each of these sets of data to the request instead of passing them around between methods allows one to send a request, tweak it, and send the tweaked version as well.

Why doesn’t the code above fill any of these things in? Since most of the data that come off the command line go directly to the server, when a tool runs send, requestbuilder automatically fills in params from the contents of args so the tool doesn’t have to: whatever the user supplied with --prefix at the command line gets sent to the server with the name PathPrefix, and so forth.

But what if something should not be sent to the server? While data from the command line go into params to be sent to the server by default, one can tell requestbuilder to send a particular bit of data elsewhere instead:

Arg('--debug', action='store_true', route_to=None)

None instructs requestbuilder to leave the “debug” flag alone and not attempt to send it anywhere. Data can also go elsewhere, such as to the connection that gets set up as the tool contacts the server:

Arg('-I', '--access-key-id', dest='aws_access_key_id', route_to=CONNECTION)
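
One way to picture this routing is as sorting parsed values into buckets. The sketch below is a toy model and not requestbuilder’s actual implementation: the Arg class here is a stripped-down stand-in, and the PARAMS constant, the route_args function, and the sample values are all invented for illustration.

```python
PARAMS = 'params'          # hypothetical routing constants for this sketch
CONNECTION = 'connection'


class Arg:
    """Toy stand-in for requestbuilder's Arg: a destination name plus a route."""
    def __init__(self, dest, route_to=PARAMS):
        self.dest = dest
        self.route_to = route_to


def route_args(arg_specs, args):
    """Sort parsed command line values into per-destination dictionaries."""
    routed = {PARAMS: {}, CONNECTION: {}}
    for spec in arg_specs:
        value = args.get(spec.dest)
        if value is None or spec.route_to is None:
            continue  # unset options and route_to=None values go nowhere
        routed[spec.route_to][spec.dest] = value
    return routed


specs = [Arg('PathPrefix'),
         Arg('debug', route_to=None),
         Arg('aws_access_key_id', route_to=CONNECTION)]
parsed = {'PathPrefix': '/admins/', 'debug': True,
          'aws_access_key_id': 'AKIDEXAMPLE'}
routed = route_args(specs, parsed)
print(routed[PARAMS])      # {'PathPrefix': '/admins/'}
print(routed[CONNECTION])  # {'aws_access_key_id': 'AKIDEXAMPLE'}
```

The debug flag appears in neither bucket, which is the point of routing it to None: the tool can still read it from args, but nothing sends it over the wire.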

Astute readers will note that I haven’t described what EuareRequest in the earlier example does, so here is the code for that:

class EuareRequest(BaseRequest):
    ServiceClass = Euare
    Args = [Arg('--delegate', dest='DelegateAccount', metavar='ACCOUNT',
                help='''[Eucalyptus extension] run this command as another
                        account (only usable by cloud administrators)''')]

Requestbuilder makes tool writers’ jobs easier by allowing one type of request to inherit its command line options from another type of request and then supply its own by simply listing more of them. This is a little different from the way Python usually works, where a class attribute defined in a subclass replaces the parent’s rather than adding to it; requestbuilder does some magic behind the scenes to make this possible. As a result, everything common to commands that access the EUARE service (Eucalyptus’s equivalent of Amazon’s IAM service) can go into one place to be shared with others.
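
I won’t go into how requestbuilder actually does this, but one plausible way to get the same effect is to walk the class hierarchy and concatenate each class’s own list. In the sketch below the collect_args method is hypothetical, plain strings stand in for Arg objects, and the --debug option on BaseRequest is an assumption made for the example.

```python
class BaseRequest:
    Args = ['--debug']

    @classmethod
    def collect_args(cls):
        """Gather Args from every class in the inheritance chain, base first."""
        collected = []
        for klass in reversed(cls.__mro__):
            # vars() sees only what this class itself declares, not inherited values.
            collected.extend(vars(klass).get('Args', []))
        return collected


class EuareRequest(BaseRequest):
    Args = ['--delegate']


class ListUsers(EuareRequest):
    Args = ['--prefix', '--max-items']


print(ListUsers.collect_args())
# ['--debug', '--delegate', '--prefix', '--max-items']
```

A plain ListUsers.Args lookup would return only the two options that ListUsers lists itself, which is why some extra machinery is needed.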

The final piece of information requestbuilder needs is a ServiceClass, which describes the web service that the tool connects to. A service class is another simple bit of code that looks like this:

class Euare(BaseService):
    Description = 'Eucalyptus User, Authorization and Reporting Environment'
    APIVersion = '2010-05-08'
    EnvURL = 'EUARE_URL'

The net gain from all this is a smaller, but much more flexible code base that should be able to scale better than anything we have had before. Requestbuilder’s use of Python’s argparse library also makes tools much more informative to users than ever before.

How You Can Help

We’re developing requestbuilder on GitHub as a project under the boto organization. We’re going to start rewriting the euca2ools commands one by one, improving requestbuilder to support new things as we go. It’s still early on, so if you have ideas to share or you’re interested in helping develop this code, now is your chance!

We’re also moving development of euca2ools itself to GitHub. This will make it easier to work on euca2ools and requestbuilder in parallel. It will also make it easier to share code with the rest of the boto community.

If you’re interested in getting involved, join us on the #boto or #eucalyptus IRC channels on Freenode. You can also send e-mail to Eucalyptus’s community list.


Comparing Versions in RPM Conditionals

10 Apr 2012

It’s easy to check a distribution’s version in a spec file since it is usually an integer:

%if 0%{?fedora} > 16

This scheme doesn’t usually work for comparing program versions, though, because they typically contain periods, which blow rpmbuild’s little mind. If you have rpm 4.7 or later, you can use a bit of inline Lua to do it instead:

%if %{lua:print(rpm.vercmp('%{version}', '2.0.2'))} > 0
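
To see why dotted versions need segment-by-segment comparison, here is a deliberately simplified comparison in Python. The simple_vercmp function is made up for illustration and is not rpm’s algorithm, which also copes with letters, tildes, and other separators. It only mimics the 1, 0, or -1 result convention.

```python
def simple_vercmp(a, b):
    """Compare dotted numeric versions; return 1, 0, or -1.

    A simplification: this handles purely numeric, dot-separated versions
    and nothing else.
    """
    left = [int(part) for part in a.split('.')]
    right = [int(part) for part in b.split('.')]
    return (left > right) - (left < right)


print(simple_vercmp('2.0.10', '2.0.2'))  # 1
print('2.0.10' > '2.0.2')                # False
```

Comparing the versions as plain strings gets 2.0.10 versus 2.0.2 wrong, while comparing them one numeric segment at a time gets it right.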